We present a novel approach to the task of word lemmatisation. We formalise lemmatisation as a category tagging task, by describing how a word-to-lemma transformation rule can be ...
As the number of learners of English is constantly growing, automatic error correction of ESL learners’ writing is an increasingly active area of research. However, most researc...
We present a joint model for Chinese word segmentation and new word detection. We present high dimensional new features, including word-based features and enriched edge (label-tra...
We demonstrate a web-based environment for development and testing of different pedestrian route instruction-giving systems. The environment contains a City Model, a TTS interface...
As one of the most popular micro-blogging services, Twitter attracts millions of users, producing millions of tweets daily. Shared information through this service spreads faster ...
We present LLCCM, a log-linear variant of the constituent context model (CCM) of grammar induction. LLCCM retains the simplicity of the original CCM but extends robustly to long s...
Statistical machine translation is often faced with the problem of combining training data from many diverse sources into a single translation model which then has to translate se...
Majid Razmara, George Foster, Baskaran Sankaran, A...
This paper describes the creation of the first large-scale corpus containing drafts and final versions of essays written by non-native speakers, with the sentences aligned acros...
We propose a new approach to characterizing the timeline of a text: temporal dependency structures, where all the events of a narrative are linked via partial ordering relations l...
Oleksandr Kolomiyets, Steven Bethard, Marie-Franci...
Temporal reasoners for document understanding typically assume that a document’s creation date is known. Algorithms to ground relative time expressions and order events often re...