Sciweavers

NAACL
2004
14 years 11 months ago
Cross-Document Coreference on a Large Scale Corpus
In this paper, we will compare and evaluate the effectiveness of different statistical methods in the task of cross-document coreference resolution. We created entity models for d...
Chung Heong Gooi, James Allan
CLEF
2010
Springer
14 years 10 months ago
MapReduce for Information Retrieval Evaluation: "Let's Quickly Test This on 12 TB of Data"
We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use ...
Djoerd Hiemstra, Claudia Hauff
ICDAR
2003
IEEE
15 years 2 months ago
Word Segmentation of Handwritten Dates in Historical Documents by Combining Semantic A-Priori-Knowledge with Local Features
The recognition of script in historical documents requires suitable techniques in order to identify single words. Segmentation of lines and words is a challenging task because lin...
Markus Feldbach, Klaus D. Tönnies
ACST
2006
14 years 11 months ago
Distributed hierarchical document clustering
This paper investigates the applicability of a distributed clustering technique called RACHET [1] to organizing large sets of distributed text data. Although the authors of RACHET c...
Debzani Deb, M. Muztaba Fuad, Rafal A. Angryk
ICDAR
2009
IEEE
15 years 4 months ago
Metadata Extraction from PDF Papers for Digital Library Ingest
In this paper we analyze our recent research on the use of document analysis techniques for metadata extraction from PDF papers. We describe a package that is designed to extract ...
Simone Marinai