Finding good representations of text documents is crucial in information retrieval and classification systems. Today, the most popular document representation is based on a vector ...
General-purpose ontologies (e.g., WordNet) are convenient, but they are not always scientifically valid. We draw on techniques from semantic class learning to improve the scientific...
Roget's Thesaurus has gone through many revisions since it was first published 150 years ago. But how do these revisions affect Roget's usefulness for NLP? We examine th...
In this paper, we address the problem of discovering semantic similarities between words via statistical processing of text corpora. We propose a knowledge-poor method that exploits the sen...
Aristomenis Thanopoulos, Nikos Fakotakis, George K...
This paper presents a direct word reordering model with novel syntax-based features for statistical machine translation. Reordering models address the problem of reordering source...