Analyzing unknown data sets such as multispectral images often requires unsupervised techniques. Data clustering is a well-known and widely used approach in such cases....
A good clustering performance depends on the quality of the distance function used to assess similarity. In this paper we propose a pairwise document coreference model to improve pe...
Iustin Dornescu, Constantin Orasan, Tatiana Lesnik...
This paper describes effective object function design for combining on-line and off-line character recognizers for on-line handwritten Japanese text recognition. We combine on-l...
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
Large-scale cluster-based Internet services often host partitioned datasets to provide incremental scalability. The aggregation of results produced from multiple partitions is a f...