Sciweavers

26 search results - page 2 / 6
» Partial duplicate detection for large book collections
CIKM
2003
Springer
13 years 10 months ago
Online duplicate document detection: signature reliability in a dynamic retrieval environment
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. Few users wish to retri...
Jack G. Conrad, Xi S. Guo, Cindy P. Schriber
MM
2009
ACM
13 years 9 months ago
MyFinder: near-duplicate detection for large image collections
The explosive growth of multimedia data poses serious challenges to data storage, management and search. Efficient near-duplicate detection is one of the required technologies for...
Xin Yang, Qiang Zhu, Kwang-Ting Cheng
WCRE
1999
IEEE
13 years 9 months ago
Partial Redesign of Java Software Systems Based on Clone Analysis
Code duplication, plausibly caused by copying source code and slightly modifying it, is often observed in large systems. Clone detection and documentation have been investigated b...
Magdalena Balazinska, Ettore Merlo, Michel Dagenai...
COLING
2010
13 years 1 days ago
Large Scale Parallel Document Mining for Machine Translation
A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...
Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...
ICAIL
2007
ACM
13 years 9 months ago
Essential deduplication functions for transactional databases in law firms
As massive document repositories and knowledge management systems continue to expand, in proprietary environments as well as on the Web, the need for duplicate detection becomes i...
Jack G. Conrad, Edward L. Raymond