This paper presents an attempt at building a large-scale distributed composite language model that simultaneously accounts for local word lexical information, mid-range sentence s...
We report on tools for the extraction of German multiword expressions (MWEs) from text corpora; we extract word pairs, but also longer MWEs of different patterns, e.g. verb-nou...
In Japanese, because the predicate is placed at the end of a sentence, the content of the sentence cannot be inferred until the end is reached. However, when the content is comp...
Most of the known stochastic sentence generators use syntactically annotated corpora, performing the projection to the surface in one stage. However, in full-fledged text generati...
Bernd Bohnet, Leo Wanner, Simon Mille, Alicia Burg...
Parallel text is one of the most valuable resources for the development of statistical machine translation systems and other NLP applications. The Linguistic Data Consortium (LDC) has...