Sciweavers

2513 search results - page 367 / 503
» Improving Generalization by Data Categorization
Sort
View
WWW
2007
ACM
16 years 5 months ago
A new suffix tree similarity measure for document clustering
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on suffix tree document model. By applying the new suffix tree ...
Hung Chim, Xiaotie Deng
WWW
2007
ACM
16 years 5 months ago
U-REST: an unsupervised record extraction system
In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
Yuan Kui Shen, David R. Karger
WWW
2007
ACM
16 years 5 months ago
Classifying web sites
In this paper, we present a novel method for the classification of Web sites. This method exploits both structure and content of Web sites in order to discern their functionality....
Christoph Lindemann, Lars Littig
PPOPP
2009
ACM
16 years 5 months ago
Comparability graph coloring for optimizing utilization of stream register files in stream processors
A stream processor executes an application that has been decomposed into a sequence of kernels that operate on streams of data elements. During the execution of a kernel, all stre...
Xuejun Yang, Li Wang, Jingling Xue, Yu Deng, Ying ...
PODS
2009
ACM
110views Database» more  PODS 2009»
16 years 5 months ago
Optimal tracking of distributed heavy hitters and quantiles
We consider the the problem of tracking heavy hitters and quantiles in the distributed streaming model. The heavy hitters and quantiles are two important statistics for characteri...
Ke Yi, Qin Zhang