Sciweavers

» A clustering method based on path similarities of XML data
SSDBM
2005
IEEE
The "Best K" for Entropy-based Categorical Data Clustering
With the growing demand on cluster analysis for categorical data, a handful of categorical clustering algorithms have been developed. Surprisingly, to our knowledge, none has sati...
Keke Chen, Ling Liu
SIGMOD
2005
ACM
DogmatiX Tracks down Duplicates in XML
Duplicate detection is the problem of detecting different entries in a data source representing the same real-world entity. While research abounds in the realm of duplicate detect...
Melanie Weis, Felix Naumann
DAWAK
2009
Springer
Dynamic Clustering-Based Estimation of Missing Values in Mixed Type Data
The appropriate choice of a method for imputation of missing data becomes especially important when the fraction of missing values is large and the data are of mixed type. The prop...
Vadim V. Ayuyev, Joseph Jupin, Philip W. Harris, Z...
KDD
2005
ACM
Using retrieval measures to assess similarity in mining dynamic web clickstreams
While scalable data mining methods are expected to cope with massive Web data, coping with evolving trends in noisy data in a continuous fashion, and without any unnecessary stopp...
Olfa Nasraoui, Cesar Cardona, Carlos Rojas
COLING
2010
Multi-Sentence Compression: Finding Shortest Paths in Word Graphs
We consider the task of summarizing a cluster of related sentences with a short sentence which we call multi-sentence compression and present a simple approach based on shortest p...
Katja Filippova