This paper addresses the problem of topic distillation on the World Wide Web, namely, given a typical user query to find quality documents related to the query topic. Connectivity...
In this paper, we present an analysis based on linguistic and typographic features that allows for the identification of titles in web documents. We focus in particular on procedu...
Abstract. Several actions are usually performed when document is appended to textual database in information retrieval system. The most frequent actions are compression of the docu...
Term translation probabilities proved an effective method of semantic smoothing in the language modelling approach to information retrieval. We use Generalized Latent Semantic Ana...
In this paper we propose a novel approach to the detection of on-line handwritten text lines based on dynamic programming. We try to find the paths with the minimum cost between ...