Conventional optical character recognition (OCR) systems operate on individual characters and words, and do not normally exploit document or collection context. We describe a Coll...
K. Pramod Sankar, C. V. Jawahar, Raghavan Manmatha
This paper proposes and compares two novel schemes for near duplicate image and video-shot detection. The first approach is based on global hierarchical colour histograms, using ...
Ondrej Chum, James Philbin, Michael Isard, Andrew ...
We address the problem of collecting unique items in a large stream of information in the context of Intrusion Prevention Systems (IPSs). IPSs detect attacks at gigabit speeds and...
Vinh The Lam, Michael Mitzenmacher, George Varghes...
A business application automates a collection of business processes. A business process describes how a set of logically related tasks are executed, ordered and managed by followi...
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...