An unsupervised part-of-speech (POS) tagging system that relies on graph clustering methods is described. Unlike in current state-of-the-art approaches, the kind and number of dif...
We present an approach to pronoun resolution based on syntactic paths. Through a simple bootstrapping procedure, we learn the likelihood of coreference between a pronoun and a can...
In this paper, we describe a rote extractor that learns patterns for finding semantic relationships in unrestricted text, with new procedures for pattern generalization and scorin...
Enrique Alfonseca, Pablo Castells, Manabu Okumura,...
In this paper, we describe the research using machine learning techniques to build a comma checker to be integrated in a grammar checker for Basque. After several experiments, and...
In this paper, we argue that n-gram language models are not sufficient to address word reordering required for Machine Translation. We propose a new distortion model that can be u...
This paper presents Archivus, a multimodal language-enabled meeting browsing and retrieval system. The prototype is in an early stage of development, and we are currently explorin...
Marita Ailomaa, Miroslav Melichar, Agnes Lisowska,...
Morphological disambiguation is the process of assigning one set of morphological features to each individual word in a text. When the word is ambiguous (there are several possibl...
In this paper, we present a method that improves Japanese dependency parsing by using large-scale statistical information. It takes into account two kinds of information not consi...
We investigate the unsupervised detection of semi-fixed cue phrases such as "This paper proposes a novel approach. . . 1" from unseen text, on the basis of only a handfu...