Wikipedia provides an interesting amount of text for more than hundred languages. This also includes languages where no reference corpora or other linguistic resources are easily ...
The goal of the DARPA MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Program is to automatically convert foreign language text images into Englis...
The Linguistic Data Consortium (LDC) is currently involved in a major effort to expand its multilingual text resources, in particular for machine translation, message understandin...
In this paper, we present how creation and dynamic synthesis of linguistic resources of Greek Sign Language (GSL) may serve to support development and provide content to an educat...
Kostas Karpouzis, George Caridakis, Stavroula-Evit...
Parallel corpora encode extremely valuable linguistic knowledge, the revealing of which is facilitated by the recent advances in multilingual corpus linguistics. The linguistic dec...