Abstract. The number of features to be considered in a text classification system is given by the size of the vocabulary and this is normally in the range of the tens or hundreds o...
David Vilar, Hermann Ney, Alfons Juan, Enrique Vid...
We conduct large-scale experiments to investigate optimal features for classification of verbs in biomedical texts. We introduce a range of feature sets and associated extraction ...
This paper is concerned with the problem of definition search. Specifically, given a term, we are to retrieve definitional excerpts of the term and rank the extracted excerpts acc...
We explore the problem of rapid automatic semantic tagging of video frames of unstructured (unedited) videos. We apply the Sort-Merge algorithm for feature selection on a large (&...
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...