To better understand the ordering of clause aggregation operators in a text generation application, we manually annotated a small corpus. The annotated corpus supports the preferr...
In this paper we address issues related to building a large-scale Chinese corpus. We try to answer four questions: (i) how to speed up annotation, (ii) how to maintain high annota...
Spoken queries are a natural medium for searching the Web in settings where typing on a keyboard is not practical. This paper describes a speech interface to the Google search eng...
We describe a language-independent, flexible, and accurate method for the detection of abbreviations in text corpora. It is based on the idea that an abbreviation can be viewed as...
We describe an in-depth study of using a dictionary (WordNet) and web search engines (Altavista, MSN, and Google) to boost the performance of an automated question answering syste...