Nowadays, companies increasingly aggregate location data from different sources on the Internet to offer location-based services such as estimating current road traffic c...
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
One of the central problems for data quality is inconsistency detection. Given a database D and a set Σ of dependencies as data quality rules, we want to identify tuples in D ...
This paper demonstrates the applicability of a recently proposed supervised dimension reduction method, hierarchical linear discriminant analysis (h-LDA), to a well-known spatial locali...
Duplicate URLs cause serious problems throughout the search engine pipeline, from crawling and indexing to result serving. URL normalization transforms duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...