Current statistical machine translation (SMT) systems are trained on sentence-aligned and word-aligned parallel text collected from various sources. Translation model parameters ar...
Spyros Matsoukas, Antti-Veikko I. Rosti, Bing Zhan...
This paper presents the structure of a cross-linguistic database of production data. The database contains annotated texts collected from a sample of fifteen different langu...
We present a critique of language-based modelling for text input research, and propose an alternative input-based approach. Current language-based statistical models are derived fr...
Macrophone is a corpus of approximately 200,000 utterances, recorded over the telephone from a broad sample of about 5,000 American speakers. Sponsored by the Linguistic Data Cons...
The notion of a fragment was coined by Montague (1974) to illustrate the formal handling of certain puzzles, such as de dicto/de re, in a truth-conditional semantics for natural lan...