In this paper, we address the task of crosslingual semantic relatedness. We introduce a method that relies on the information extracted from Wikipedia, by exploiting the interlang...
The number and sizes of parallel corpora keep growing, which makes it necessary to have automatic methods of processing them: combining, checking and improving corpora quality, et...
Named Entity (NE) extraction is an important subtask of document processing such as information extraction and question answering. A typical method used for NE extraction of Japan...
This paper introduced the four tracks that WIM-Lab Fudan University had taken part in at TREC 2007. For spam track, a multi-centre model was proposed considering the characteristi...
Jun Xu, Jing Yao, Jiaqian Zheng, Qi Sun, Junyu Niu
This paper presents a quantitative performance analysis of two different approaches to the lemmatization of the Czech text data. The first one is based on manually prepared diction...