A new approach for separating mathematics from usual text is presented. Contrary to the existing methods, it is more oriented toward the segmentation than the recognition, isolati...
This paper presents an approach to automatically optimizing the retrieval quality of search engines using clickthrough data. Intuitively, a good information retrieval system shoul...
Abstract. Learning ranking functions is crucial for solving many problems, ranging from document retrieval to building recommendation systems based on an individual user’s prefer...
Abstract. Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Annals and chronicles may be the foundation of accounting, but writers of stories and histories have long known that they seldom render a satisfactory account of complex events. I...