Sciweavers

5962 search results - page 1124 / 1193
» Efficient Clustering for Orders
KDD 2009, ACM
Turning down the noise in the blogosphere
In recent years, the blogosphere has experienced a substantial increase in the number of posts published daily, forcing users to cope with information overload. The task of guidin...
Khalid El-Arini, Gaurav Veda, Dafna Shahaf, Carlos...
KDD 2008, ACM
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen
KDD 2008, ACM
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
KDD 2008, ACM
A family of dissimilarity measures between nodes generalizing both the shortest-path and the commute-time distances
This work introduces a new family of link-based dissimilarity measures between nodes of a weighted directed graph. This measure, called the randomized shortest-path (RSP) dissimil...
Luh Yen, Marco Saerens, Amin Mantrach, Masashi Shi...
KDD 2008, ACM
Direct mining of discriminative and essential frequent patterns via model-based search tree
Frequent patterns provide solutions to datasets that do not have well-structured feature vectors. However, frequent pattern mining is non-trivial since the number of unique patter...
Wei Fan, Kun Zhang, Hong Cheng, Jing Gao, Xifeng Y...