This paper presents a method for measuring the semantic similarity of texts, using corpus-based and knowledge-based measures of similarity. Previous work on this problem has focus...
A parallel corpus is a rich linguistic resource for various multilingual text management tasks, including crosslingual text retrieval, multilingual computational linguistics and mul...
We conduct large-scale search engine relevance experiments, using the 12% of queries that contain placenames, matching the placenames to places in the documents, and examining the...
We introduce a statistical model for abbreviation disambiguation in Web search, based on analysis of Web data resources, including anchor text, click logs and query logs. By combini...
We consider the application of machine learning techniques for sequence modeling to Information Retrieval (IR) and surface Information Extraction (IE) tasks. We introduce a generi...
Massih-Reza Amini, Hugo Zaragoza, Patrick Gallinar...