Sciweavers

9 search results - page 1 / 2
CICLING
2010
Springer
13 years 8 months ago
Word Length n-Grams for Text Re-use Detection
Abstract. The automatic detection of shared content in written documents (which includes text reuse and its unacknowledged commitment, plagiarism) has become an important probl...
Alberto Barrón-Cedeño, Chiara Basile...
ECIR
2009
Springer
14 years 1 month ago
On Automatic Plagiarism Detection Based on n-Grams Comparison
Abstract. When automatic plagiarism detection is carried out considering a reference corpus, a suspicious text is compared to a set of original documents in order to relate the pla...
Alberto Barrón-Cedeño, Paolo Rosso
DRR
2009
13 years 2 months ago
Text-image alignment for historical handwritten documents
We describe our work on text-image alignment in the context of building a historical document retrieval system. We aim at aligning images of words in handwritten lines with their text...
Svitlana Zinger, John Nerbonne, Lambert Schomaker
ICDAR
2007
IEEE
13 years 11 months ago
An Efficient Word Segmentation Technique for Historical and Degraded Machine-Printed Documents
Word segmentation is a crucial step for segmentation-free document analysis systems and is used for creating an index based on word matching. In this paper, we propose a novel met...
Michael Makridis, N. Nikolaou, Basilios Gatos
COLING
1996
13 years 5 months ago
The Automatic Extraction of Open Compounds from Text Corpora
This paper describes a new method for extracting open compounds (uninterrupted sequences of words) from text corpora of languages, such as Thai, Japanese and Korean, that exhibit un...
Virach Sornlertlamvanich, Hozumi Tanaka