Sciweavers

32 search results - page 6 / 7
» Near-duplicate detection for web-forums
PVLDB
2010
Set Similarity Join on Probabilistic Data
Set similarity join has played an important role in many real-world applications such as data cleaning, near duplication detection, data integration, and so on. In these applicati...
Xiang Lian, Lei Chen 0002
CIVR
2007
Springer
Detection of near-duplicate images for web search
Among the vast numbers of images on the web are many duplicates and near-duplicates, that is, variants derived from the same original image. Such near-duplicates appear in many we...
Jun Jie Foo, Justin Zobel, Ranjan Sinha, Seyed M. ...
AAAI
2007
Temporal and Information Flow Based Event Detection from Social Text Streams
Recently, social text streams (e.g., blogs, web forums, and emails) have become ubiquitous with the evolution of the web. In some sense, social text streams are sensors of the rea...
Qiankun Zhao, Prasenjit Mitra, Bi Chen
ICAIL
2007
ACM
Essential deduplication functions for transactional databases in law firms
As massive document repositories and knowledge management systems continue to expand, in proprietary environments as well as on the Web, the need for duplicate detection becomes i...
Jack G. Conrad, Edward L. Raymond
DEXA
2006
Springer
Cleaning Web Pages for Effective Web Content Mining
Classifying and mining noise-free web pages will improve on accuracy of search results as well as search speed, and may benefit webpage organization applications (e.g., keyword-bas...
Jing Li, Christie I. Ezeife