This paper investigates the use of stemming for classification of Dutch (email) texts. We introduce a stemmer that combines dictionary lookup (implemented efficiently as a finit...
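To make the dictionary-lookup step concrete, here is a minimal sketch. It assumes the dictionary is stored as a trie, which is one simple way to get lookup time proportional to word length; it is not necessarily the data structure used in the paper. The names `TrieNode`, `build_trie`, `lookup`, and the three `LEXICON` entries are hypothetical and chosen only for illustration.

```python
class TrieNode:
    """One state of the trie; `stem` is set only where a dictionary word ends."""

    def __init__(self):
        self.children = {}
        self.stem = None


def build_trie(entries):
    """Build a trie from a {word: stem} mapping."""
    root = TrieNode()
    for word, stem in entries.items():
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.stem = stem
    return root


def lookup(root, word):
    """Return the stored stem for `word`, or None if the word is not listed."""
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return None
    return node.stem


# Hypothetical entries (assumption): inflected Dutch form -> stem.
LEXICON = {"fietsen": "fiets", "huizen": "huis", "liep": "loop"}
```

A word missing from the dictionary yields `None` here; whatever the stemmer does in that case is outside the scope of this sketch.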
Collocational prepositional phrases like "ten koste van" (at the expense of), "met het oog op" (with an eye on), and "onder het mom van" (under the pretext of) are patterns of the form ...
We discuss the part-of-speech and lemma annotation of the Dutch PAROLE Internet Corpus. The PAROLE PoS tagger is a combination of statistical taggers. It includes the Markov ...
In this paper we present a definition of Performance Grammar (PG), a psycholinguistically motivated syntax formalism, in declarative terms. PG aims not only at describing and expl...
The paper describes ongoing empirical research into a fundamental problem of linguistics, viz. the architecture of grammar, or the division of labor between lexicon and grammar. W...
Finding semantically related words is a first step in the direction of automatic ontology building. Guided by the view that similar words occur in similar contexts, we looked at t...
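The idea that similar words occur in similar contexts can be illustrated with a small sketch. It assumes contexts are simply the neighbouring words within a fixed window and that similarity is measured with cosine over raw co-occurrence counts; both are illustrative choices, not necessarily the contexts or measure used in the paper. The function names and the `window` parameter are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Count, for every word, the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two context-count vectors."""
    dot = sum(count * b[ctx] for ctx, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(word, vectors, top_n=3):
    """Rank all other words by the similarity of their contexts to `word`."""
    target = vectors.get(word, Counter())
    scores = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]
```

On a toy corpus where "kat" and "hond" both appear after "de" and before "drinkt" or "eet", `most_similar("kat", vectors)` ranks "hond" first, because the two words share most of their context counts.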
In Jijkoun et al. [2004] we showed that off-line answer extraction using syntactic patterns is a successful method for answering English factoid questions. In this paper I will di...
Machine learning and statistical methods have yielded impressive results in a wide variety of natural language processing tasks. These advances have generally been regarded as eng...
This paper presents a machine learning approach to the resolution of coreferential relations between nominal constituents in Dutch. It is the first significant automatic approach ...