Sciweavers

1804 search results - page 330 / 361
» Ranking for data repairs
KDD
2007
ACM
Mining optimal decision trees from itemset lattices
We present DL8, an exact algorithm for finding a decision tree that optimizes a ranking function under size, depth, accuracy and leaf constraints. Because the discovery of optimal...
Élisa Fromont, Siegfried Nijssen
KDD
2006
ACM
Understanding Content Reuse on the Web: Static and Dynamic Analyses
In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
Ricardo A. Baeza-Yates, Álvaro R. Pereira J...
KDD
2004
ACM
Probabilistic author-topic models for information discovery
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...
KDD
2001
ACM
Empirical Bayes screening for multi-item associations
This paper considers the framework of the so-called "market basket problem", in which a database of transactions is mined for the occurrence of unusually frequent item s...
William DuMouchel, Daryl Pregibon
WSDM
2010
ACM
Large Scale Query Log Analysis of Re-Finding
Although Web search engines are targeted towards helping people find new information, people regularly use them to re-find Web pages they have seen before. Researchers have noted ...
Jaime Teevan, Sarah K. Tyler