This paper presents experiments on classifying web pages by genre. Firstly, a corpus of 1539 manually labeled web pages was prepared. Secondly, 502 genre features were selected ba...
Almost all automatic semantic role labeling (SRL) systems rely on a preliminary parsing step that derives a syntactic structure from the sentence being analyzed. This makes the ch...
Automatic acquisition of novel compounds is notoriously difficult because most novel compounds have relatively low frequency in a corpus. The current study proposes a new method t...
Identification of transliterations is aimed at enriching multilingual lexicons and improving performance in various Natural Language Processing (NLP) applications including Cross ...
Abstract. We present results of a new approach to detect destructive article revisions, so-called vandalism, in Wikipedia. Vandalism detection is a one-class classification problem...