We present a multi-lingual dictionary of dirty words. We have collected about 3,200 dirty words in several languages and built a database of these. The language with the most word...
This paper describes BABYLON, a system that attempts to overcome the shortage of parallel texts in low-density languages by supplementing existing parallel texts with texts gather...
Morfette is a modular, data-driven, probabilistic system which learns to perform joint morphological tagging and lemmatization from morphologically annotated corpora. The system i...
Grzegorz Chrupala, Georgiana Dinu, Josef van Genab...
This paper proposes a new method of the sentiment analysis utilizing inter-sentence structures especially for coping with reversal phenomenon of word polarity such as quotation of...
The ability to make progress in Computational Linguistics depends on the availability of large annotated corpora, but creating such corpora by hand annotation is very expensive an...