Corpus-based methods for natural language processing often use supervised training, requiring expensive manual annotation of training corpora. This paper investigates methods for ...
Corpus-based grammar induction generally relies on hand-parsed training data to learn the structure of the language. Unfortunately, the cost of building large annotated corpora is...
The last decade has witnessed substantial progress in speech recognition technology, with today's state-of-the-art systems being able to transcribe unrestricted broadcast news audi...
In this paper we present an active approach to annotating an Italian corpus of conversational human-human and Wizard-of-Oz dialogues with lexical and semantic labels. This procedure...
Christian Raymond, Kepa Joseba Rodriguez, Giuseppe...
Text categorization involves mapping documents to a fixed set of labels. A similar and equally important problem is that of assigning labels to large corpora. With a deluge of ...