Research on linear text segmentation has been an on-going focus in NLP for the last decade, and it has great potential for a wide range of applications such as document summarizati...
Jingbo Zhu, Na Ye, Xinzhi Chang, Wenliang Chen, Be...
In this paper, we study the problem of learning block classification models to estimate block functions. We distinguish general models, which are learned across multiple sites, an...
This paper presents a cluster-based text categorization system which uses class distributional clustering of words. We propose a new clustering model which considers the global in...
We propose new methods to exploit contemporaneous text, such as on-line news articles, to improve language models for automatic speech recognition and other natural language proce...
We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...