Sciweavers

ICDAR
2007
IEEE
14 years 13 days ago
Simultaneous Layout Style and Logical Entity Recognition in a Heterogeneous Collection of Documents
Logical entity recognition in heterogeneous collections of document page images remains a challenging problem since the performance of traditional supervised methods degrades drama...
S. Chen, S. Mao, G. Thoma
ICML
2005
IEEE
14 years 7 months ago
Multi-way distributional clustering via pairwise interactions
We present a novel unsupervised learning scheme that simultaneously clusters variables of several types (e.g., documents, words and authors) based on pairwise interactions between...
Ron Bekkerman, Ran El-Yaniv, Andrew McCallum
JCDL
2005
ACM
13 years 11 months ago
Name disambiguation in author citations using a K-way spectral clustering method
An author may have multiple names and multiple authors may share the same name simply due to name abbreviations, identical names, or name misspellings in publications or bibliogra...
Hui Han, Hongyuan Zha, C. Lee Giles
TSD
2007
Springer
14 years 6 days ago
On the Relative Hardness of Clustering Corpora
Clustering is often considered the most important unsupervised learning problem and several clustering algorithms have been proposed over the years. Many of these algorit...
David Pinto, Paolo Rosso
KDD
2005
ACM
14 years 6 months ago
On the use of linear programming for unsupervised text classification
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as the underlying process generating the corpus and utilize a novel,...
Mark Sandler