Sciweavers

WWW
2008
ACM
15 years 11 months ago
Can Chinese web pages be classified with English data source?
As the World Wide Web in China grows rapidly, mining knowledge in Chinese Web pages becomes more and more important. Mining Web information usually relies on the machine learning ...
Xiao Ling, Gui-Rong Xue, Wenyuan Dai, Yun Jiang, Q...
WWW
2008
ACM
15 years 11 months ago
Networked graphs: a declarative mechanism for SPARQL rules, SPARQL views and RDF data integration on the web
Easy reuse and integration of declaratively described information in a distributed setting is one of the main motivations for building the Semantic Web. Despite this claim, reu...
Simon Schenk, Steffen Staab
STOC
2009
ACM
Algorithms
15 years 11 months ago
Bit-probe lower bounds for succinct data structures
We prove lower bounds on the redundancy necessary to represent a set S of objects using a number of bits close to the information-theoretic minimum log2 |S|, while answering vario...
Emanuele Viola
KDD
2008
ACM
Data Mining
15 years 10 months ago
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen
KDD
2008
ACM
Data Mining
15 years 10 months ago
The cost of privacy: destruction of data-mining utility in anonymized data publishing
Re-identification is a major privacy threat to public datasets containing individual records. Many privacy protection algorithms rely on generalization and suppression of "qu...
Justin Brickell, Vitaly Shmatikov