COLING
2008

Source Language Markers in EUROPARL Translations

This paper shows that it is very often possible to identify the source language of medium-length speeches in the EUROPARL corpus on the basis of frequency counts of word n-grams (87.2%–96.7% accuracy, depending on the classification method). The paper also examines in detail which positive markers are most powerful and identifies a number of linguistic aspects as well as culture- and domain-related ones.
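As a rough illustration of the kind of feature the abstract refers to, the following minimal sketch counts word n-grams in a text. It is not the paper's implementation: the whitespace tokenization, lowercasing, the `max_n` cutoff, and the function names `word_ngrams` and `ngram_counts` are all assumptions made for this example, and the classification step itself is not shown because the abstract does not specify it.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return all word n-grams of length n as tuples (naive whitespace tokenization)."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=3):
    """Count word n-grams for n = 1..max_n; max_n is an illustrative choice."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(text, n))
    return counts


# Hypothetical sample sentence, not taken from EUROPARL.
counts = ngram_counts("the house of the people of the union", max_n=2)
```

Counts like these, computed per speech, could then serve as input features for whichever classifier is chosen.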
Hans van Halteren
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where COLING
Authors Hans van Halteren