Sciweavers

» Winnowing-based text clustering
KDD
2009
ACM
16 years 12 days ago
COA: finding novel patents through text analysis
In recent years, the number of patents filed by business enterprises in the technology industry has been growing rapidly, thus providing unprecedented opportunities for knowledge d...
Mohammad Al Hasan, W. Scott Spangler, Thomas D. Gr...
KDD
2002
ACM
16 years 8 days ago
Learning to match and cluster large high-dimensional data sets for data integration
Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...
William W. Cohen, Jacob Richman
GECCO
2003
Springer
15 years 5 months ago
Dimensionality Reduction via Genetic Value Clustering
Feature extraction based on evolutionary search offers new possibilities for improving classification accuracy and reducing measurement complexity in many data mining and...
Alexander P. Topchy, William F. Punch
COLING
2008
15 years 1 months ago
Unsupervised Induction of Labeled Parse Trees by Clustering with Syntactic Features
We present an algorithm for unsupervised induction of labeled parse trees. The algorithm has three stages: bracketing, initial labeling, and label clustering. Bracketing is done f...
Roi Reichart, Ari Rappoport
ICML
2006
ACM
16 years 20 days ago
Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
Charles Elkan