Search Sciweavers | Sciweavers

241 search results - page 4 / 49

» Detecting Co-Derivative Documents in Large Text Collections

CHI
1997
ACM

127 views, Human Computer Interaction

Computational Models of Information Scent-Following in a Very Large Browsable Text Collection

15 years 9 months ago

Download www2.parc.com

An ecological-cognitive framework of analysis and a model-tracing architecture are presented and used in the analysis of data recorded from users browsing a large document collect...

Peter Pirolli

SIGIR
2004
ACM

136 views, Information Technology

Constructing a text corpus for inexact duplicate detection

15 years 11 months ago

Download www.conradweb.org

As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...

Jack G. Conrad, Cindy P. Schriber

SIGIR
2010
ACM

169 views, Information Technology

Efficient partial-duplicate detection based on sequence matching

15 years 8 days ago

Download homepage.fudan.edu.cn

With the ever-increasing growth of the Internet, numerous copies of documents have become a serious problem for search engines, opinion mining, and many other web applications. Since parti...

Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang

ECIR
2009
Springer

105 views, Information Technology

Revisiting N-Gram Based Models for Retrieval in Degraded Large Collections

16 years 2 months ago

Download www.dc.fi.udc.es

The traditional retrieval models based on term matching are not effective in collections of degraded documents (output of OCR or ASR systems for instance). This paper presents a n...

Javier Parapar, Ana Freire, Alvaro Barreiro

CIKM
2011
ACM

191 views, Information Technology

Partial duplicate detection for large book collections

14 years 5 months ago

Download www.cs.umass.edu

A framework is presented for discovering partial duplicates in large collections of scanned books with optical character recognition (OCR) errors. Each book in the collection is r...

Ismet Zeki Yalniz, Ethem F. Can, R. Manmatha
