Abstract— In this paper we suggest a new approach to represent text document collections, integrating background knowledge to improve clustering effectiveness. Background knowled...
Word clustering is a conventional and important NLP task, and the literature has suggested two kinds of approaches to this problem. One is based on the distributional similarity a...
Abstract. In this paper we introduce a general framework for hierarchical clustering that deals with both static and dynamic data sets. From this framework, different hierarchical...
An approach to simultaneous document classification and word clustering is developed using a two-way mixture model of Poisson distributions. Each document is represented by a vect...
We present a simple semi-supervised relation extraction system with large-scale word clustering. We focus on systematically exploring the effectiveness of different cluster-based ...