The aim of the paper is to present recent -- as of March 2010 -- developments in the construction of the National Corpus of Polish (NKJP). The NKJP project was launched at the ver...
Many languages are in serious danger of being lost and, as a result, there has been a significant increase in language documentation projects, and also in attempts to preserve lang...
This paper outlines the new resource technologies, products and applications that have been constructed during the development of a multi-modal (MM hereafter) corpus tool on the D...
The ability to make progress in Computational Linguistics depends on the availability of large annotated corpora, but creating such corpora by hand annotation is very expensive an...
This paper describes efforts by the University of Pennsylvania's Linguistic Data Consortium to create and distribute shared linguistic resources -- including data, annotation...