We present a grand challenge to build a corpus that will include all of the world's languages, in a consistent structure that permits large-scale cross-linguistic processing,...
This paper describes a process of building a bilingual syntactically annotated corpus, the PCEDT (Prague Czech-English Dependency Treebank). The corpus is being created at Charles...
This paper explores the potential for annotating and enriching data for low-density languages via the alignment and projection of syntactic structure from parsed data for resource...
IIuman intervention and/or training corpora tagged with various kinds of information were often assumed in many natural language acquisition models. This assumption is a major sou...
We have designed, implemented and evaluated an end-to-end system spellchecking and autocorrection system that does not require any manually annotated training data. The World Wide...
Casey Whitelaw, Ben Hutchinson, Grace Chung, Ged E...