Abstract. The number of features to be considered in a text classification system is given by the size of the vocabulary and this is normally in the range of the tens or hundreds o...
David Vilar, Hermann Ney, Alfons Juan, Enrique Vid...
The Biased Minimax Probability Machine (BMPM) constructs a classifier which deals with the imbalanced learning tasks. In this paper, we propose a Second Order Cone Programming (SO...
In this paper, we propose a robust approach for recognition of text embedded in natural scenes. Instead of using binary information as most other OCR systems do, we extract featur...
Jing Zhang, Xilin Chen, Andreas Hanneman, Jie Yang...
In this paper we improve previous work on measuring the similarity of short segments of text in two ways. First, we introduce a Web-relevance similarity measure and demonstrate it...