LREC
2010

Towards a Large Parallel Corpus of Cleft Constructions

We present our efforts to create a large-scale, semi-automatically annotated parallel corpus of cleft constructions. The corpus is intended to reduce the manual work of finding examples of clefts in a corpus, or to make that work more effective. It is being developed in the context of the Collaborative Research Centre SFB 632, a large, interdisciplinary research initiative for the study of information structure. We show how state-of-the-art NLP tools, such as POS taggers and statistical dependency parsers, may facilitate powerful and precise searches, and we demonstrate through preliminary empirical findings how such a resource may provide new opportunities for linguistic research on cleft constructions.
Gerlof Bouma, Lilja Øvrelid, Jonas Kuhn
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC
Authors Gerlof Bouma, Lilja Øvrelid, Jonas Kuhn