Trigrams'n'Tags (TnT) is an efficient statistical part-of-speech tagger. Contrary to claims found elsewhere in the literature, we argue that a tagger based on Markov mod...
We report on a method for utilising corpora collected in natural settings. It is based on distilling re-writing natural dialogues to elicit the type of dialogue that would occur i...
A finite-state method, based on leftmost longestmatch replacement, is presented for segmenting words into graphemes, and for converting graphemes into phonemes. A small set of han...
Many corpus-based machine translation systems require parallel corpora. In this paper, we present a word-for-word glossing algorithm that requires only a source language corpus. T...
We describe Talk'n'Travel, a spoken dialogue language system for making air travel plans over the telephone. Talk'n'Travel is a fully conversational, mixed-ini...