We present a diacritization system for written Arabic that is based on a lexical resource. It combines a tagger and a lexeme language model. It improves on the best results repor...
This paper presents a maximum entropy machine translation system using a minimal set of translation blocks (phrase-pairs). While recent phrase-based statistical machine translatio...
This paper studies methods that automatically detect action-items in e-mail, an important category for assisting users in identifying new tasks, tracking ongoing ones, and searchi...
This paper describes a probabilistic answer selection framework for question answering. In contrast with previous work using individual resources such as ontologies and the Web to...
We introduce an answer typing strategy specific to quantifiable how questions. Using the web as a data source, we automatically collect answer units appropriate to a given how-q...
The task of selecting and ordering information appears in multiple contexts in text generation and summarization. For instance, methods for title generation construct a headline b...
This paper addresses the problem of classifying Chinese unknown words into fine-grained semantic categories defined in a Chinese thesaurus. We describe three novel knowledge-based...
This paper introduces the use of speech translation technology for a new type of voice-interactive Computer-Aided Language Learning (CALL) application. We describe a computer game...
A new architecture for identifying and interpreting temporal expressions is introduced, in which the large set of complex hand-crafted rules standard in systems for this task is r...