Sciweavers

WWW
2006
ACM
15 years 12 months ago
Towards practical genre classification of web documents
Classification of documents by genre is typically done using either linguistic analysis or term-frequency-based techniques. The former provides better classification accuracy than...
George Ferizis, Peter Bailey
WWW
2005
ACM
15 years 12 months ago
Duplicate detection in click streams
We consider the problem of finding duplicates in data streams. Duplicate detection in data streams is utilized in various applications including fraud detection. We develop a solu...
Ahmed Metwally, Divyakant Agrawal, Amr El Abbadi
WWW
2005
ACM
15 years 12 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
KDD
2006
ACM
15 years 11 months ago
Out-of-core frequent pattern mining on a commodity PC
In this work we focus on the problem of frequent itemset mining on large, out-of-core data sets. After presenting a characterization of existing out-of-core frequent itemset minin...
Gregory Buehrer, Srinivasan Parthasarathy, Amol Gh...
KDD
2006
ACM
15 years 11 months ago
Algorithms for discovering bucket orders from data
Ordering and ranking items of different types are important tasks in various applications, such as query processing and scientific data mining. A total order for the items can be ...
Aristides Gionis, Heikki Mannila, Kai Puolamä...