This paper describes a hybrid model that combines machine learning with linguistic heuristics for integrating unknown word identification with Chinese word segmentation. The model...
We introduce several methods of combining feature selectors for text classification. Results from a large investigation of these combinations are summarized. Easily constructed co...
In this paper, we investigate how modeling content structure can benefit text analysis applications such as extractive summarization and sentiment analysis. This follows the lingu...
In this paper we study the problem of finding most topical named entities among all entities in a document, which we refer to as focused named entity recognition. We show that th...
We present a novel sentence reduction system for automatically removing extraneous phrases from sentences that are extracted from a document for summarization purpose. The system ...