Sciweavers

5 search results - page 1 / 1
EMNLP
2007
13 years 5 months ago
Topic Segmentation with Hybrid Document Indexing
We present a domain-independent unsupervised topic segmentation approach based on hybrid document indexing. Lexical chains have been successfully employed to evaluate lexical cohe...
Irina Matveeva, Gina-Anne Levow
CLEF
2010
Springer
13 years 5 months ago
External and Intrinsic Plagiarism Detection Using a Cross-Lingual Retrieval and Segmentation System - Lab Report for PAN at CLEF
We present our hybrid system for the PAN challenge at CLEF 2010. Our system performs plagiarism detection for translated and non-translated externally as well as intrinsically plag...
Markus Muhr, Roman Kern, Mario Zechner, Michael Gr...
CIKM
2004
Springer
13 years 9 months ago
Processing content-oriented XPath queries
Document-centric XML collections contain text-rich documents, marked up with XML tags that add lightweight semantics to the text. Querying such collections calls for a hybrid quer...
Börkur Sigurbjörnsson, Jaap Kamps, Maart...
ICDAR
1997
IEEE
13 years 8 months ago
Representing OCRed documents in HTML
OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...
Tao Hong, Sargur N. Srihari
ICIP
2009
IEEE
13 years 2 months ago
Semantic keyword extraction via adaptive text binarization of unstructured unsourced video
We propose a fully automatic method for summarizing and indexing unstructured presentation videos based on text extracted from the projected slides. We use changes of text in the ...
Michele Merler, John R. Kender