Sciweavers

» Ranking on Data Manifolds
KDD
2007
ACM
177 views, Data Mining
15 years 10 months ago
Mining optimal decision trees from itemset lattices
We present DL8, an exact algorithm for finding a decision tree that optimizes a ranking function under size, depth, accuracy and leaf constraints. Because the discovery of optimal...
Élisa Fromont, Siegfried Nijssen
KDD
2006
ACM
185 views, Data Mining
15 years 10 months ago
Understanding Content Reuse on the Web: Static and Dynamic Analyses
In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
Ricardo A. Baeza-Yates, Álvaro R. Pereira J...
KDD
2004
ACM
210 views, Data Mining
15 years 10 months ago
Probabilistic author-topic models for information discovery
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...
KDD
2001
ACM
150 views, Data Mining
15 years 10 months ago
Empirical Bayes screening for multi-item associations
This paper considers the framework of the so-called "market basket problem", in which a database of transactions is mined for the occurrence of unusually frequent item s...
William DuMouchel, Daryl Pregibon
WSDM
2010
ACM
251 views, Data Mining
15 years 7 months ago
Large Scale Query Log Analysis of Re-Finding
Although Web search engines are targeted towards helping people find new information, people regularly use them to re-find Web pages they have seen before. Researchers have noted ...
Jaime Teevan, Sarah K. Tyler