This paper describes an approach for inferring syntactic frames of verbs in Urdu from an untagged corpus. Urdu, like many other South Asian languages, is a free word order and cas...
After three years of work the Dutch Parallel Corpus (DPC) project has reached an end. The finalized corpus is a ten-million-word high-quality sentence-aligned bidirectional parall...
This paper describes the process and the resources used to automatically annotate a French corpus of spontaneous speech transcriptions in super-chunks. Super-chunks are enhanced c...
Olivier Blanc, Matthieu Constant, Anne Dister, Pat...
With the development of speech and language processing, speech translation systems have been developed. These studies target spoken dialogues, and employ consecutive interpretatio...
The paper presents Bulgarian National Corpus project (BulNC) - a large-scale, representative, online available corpus of Bulgarian. The BulNC is also a monolingual general corpus,...