Of the ten million words of contemporary standard Dutch in the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN), a selection of one million words of natural spoken language ...
Heleen Hoekstra, Michael Moortgat, Ineke Schuurman...
We study the problem of similarity detection by sequence alignment with gaps, using a recently established theoretical framework based on the morphology of alignment paths. Alignm...
We propose the first joint model for word segmentation, POS tagging, and dependency parsing for Chinese. Based on an extension of the incremental joint model for POS tagging and ...
Jun Hatori, Takuya Matsuzaki, Yusuke Miyao, Jun-ic...
We describe a syntactically annotated parallel corpus containing English, Swedish and Turkish. The corpus consists of approximately 300 000 tokens in Swedish, 160 000 in Turkish a...
ibe various syntactic and semantic conditions for finding abstract nouns which refer to concepts of adjectives from a text, in an attempt to explore the creation of a thesaurus fr...
Kyoko Kanzaki, Francis Bond, Noriko Tomuro, Hitosh...