We introduce a novel approach to combining rankings from multiple retrieval systems. We use a logistic regression model or an SVM to learn a ranking from pairwise document prefere...
This paper proposes a new algorithm for recognizing isolated handwritten Farsi numerals, which exploits the sparse and over-complete structure from the handwritten Farsi numeral ...
We propose a distribution-based pruning of n-gram backoff language models. Instead of the conventional approach of pruning n-grams that are infrequent in training data, we prune n...
Cross-lingual event tracking from a very large number of information sources (thousands of Web sites, for example) is an open challenge. In this paper we investigate effe...
Keyphrases provide semantic metadata that summarize and characterize documents. This paper describes Kea, an algorithm for automatically extracting keyphrases from text. Kea ident...
Ian H. Witten, Gordon W. Paynter, Eibe Frank, Carl...