Search Sciweavers | Sciweavers

CICLING
2010
Springer

174 views, Natural Language Processing

Word Length n-Grams for Text Re-use Detection

13 years 8 months ago

Abstract. The automatic detection of shared content in written documents –which includes text reuse and its unacknowledged commitment, plagiarism– has become an important probl...

Alberto Barrón-Cedeño, Chiara Basile...

ECIR
2009
Springer

155 views, Information Technology

On Automatic Plagiarism Detection Based on n-Grams Comparison

14 years 1 months ago

Download users.dsic.upv.es

Abstract. When automatic plagiarism detection is carried out considering a reference corpus, a suspicious text is compared to a set of original documents in order to relate the pla...

Alberto Barrón-Cedeño, Paolo Rosso

DRR
2009

166 views, Document Analysis

Text-image alignment for historical handwritten documents

13 years 2 months ago

Download vca.ele.tue.nl

We describe our work on text-image alignment in the context of building a historical document retrieval system. We aim at aligning images of words in handwritten lines with their text...

Svitlana Zinger, John Nerbonne, Lambert Schomaker

ICDAR
2007
IEEE

159 views, Document Analysis

An Efficient Word Segmentation Technique for Historical and Degraded Machine-Printed Documents

13 years 11 months ago

Download users.iit.demokritos.gr

Word segmentation is a crucial step for segmentation-free document analysis systems and is used for creating an index based on word matching. In this paper, we propose a novel met...

Michael Makridis, N. Nikolaou, Basilios Gatos

COLING
1996

160 views, Computational Linguistics

The Automatic Extraction of Open Compounds from Text Corpora

13 years 5 months ago

Download acl.ldc.upenn.edu

This paper describes a new method for extracting open compounds (uninterrupted sequences of words) from text corpora of languages, such as Thai, Japanese and Korean, that exhibit un...

Virach Sornlertlamvanich, Hozumi Tanaka
