Sciweavers

26 search results - page 3 / 6
» Partial duplicate detection for large book collections
DAS
2010
Springer
13 years 9 months ago
Nearest neighbor based collection OCR
Conventional optical character recognition (OCR) systems operate on individual characters and words, and do not normally exploit document or collection context. We describe a Coll...
K. Pramod Sankar, C. V. Jawahar, Raghavan Manmatha
CIVR
2007
Springer
Image Analysis
13 years 11 months ago
Scalable near identical image and shot detection
This paper proposes and compares two novel schemes for near duplicate image and video-shot detection. The first approach is based on global hierarchical colour histograms, using ...
Ondrej Chum, James Philbin, Michael Isard, Andrew ...
NSDI
2010
13 years 7 months ago
Carousel: Scalable Logging for Intrusion Prevention Systems
We address the problem of collecting unique items in a large stream of information in the context of Intrusion Prevention Systems (IPSs). IPSs detect attacks at gigabit speeds and...
Vinh The Lam, Michael Mitzenmacher, George Varghes...
WCRE
2008
IEEE
13 years 12 months ago
Detecting Clones in Business Applications
A business application automates a collection of business processes. A business process describes how a set of logically related tasks are executed, ordered and managed by followi...
Jin Guo, Ying Zou
DGO
2006
Education
13 years 6 months ago
Next steps in near-duplicate detection for eRulemaking
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
Hui Yang, Jamie Callan, Stuart W. Shulman