Sciweavers

43 search results - page 6 / 9
» Efficient similarity joins for near duplicate detection
KDD 2004, ACM (Data Mining)
Improved robustness of signature-based near-replica detection via lexicon randomization
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
CIVR 2007, Springer (Image Analysis)
Near-duplicate keyframe retrieval with visual keywords and semantic context
Near-duplicate keyframes (NDK) play a unique role in large-scale video search, news topic detection and tracking. In this paper, we propose a novel NDK retrieval approach by explo...
Xiao Wu, Wanlei Zhao, Chong-Wah Ngo
CLEF 2010, Springer
Fuzzy Semantic-Based String Similarity for Extrinsic Plagiarism Detection - Lab Report for PAN at CLEF 2010
This report explains our plagiarism detection method using a fuzzy semantic-based string similarity approach. The algorithm was developed through four main stages. First is pre-proce...
Salha Alzahrani, Naomie Salim
DEXA 2006, Springer (Database)
Cleaning Web Pages for Effective Web Content Mining
Classifying and mining noise-free web pages will improve the accuracy of search results as well as search speed, and may benefit webpage organization applications (e.g., keyword-bas...
Jing Li, Christie I. Ezeife
MM 2009, ACM (Multimedia)
MyFinder: near-duplicate detection for large image collections
The explosive growth of multimedia data poses serious challenges to data storage, management and search. Efficient near-duplicate detection is one of the required technologies for...
Xin Yang, Qiang Zhu, Kwang-Ting Cheng