Learning Information Status of Discourse Entities

In this paper we address the issue of automatically assigning information status to discourse entities. Using an annotated corpus of conversational English and exploiting morpho-syntactic and lexical features, we train a decision tree to classify entities introduced by noun phrases as old, mediated, or new. We compare its performance with hand-crafted rules that are mainly based on morpho-syntactic features and closely relate to the guidelines that had been used for the manual annotation. The decision tree model achieves an overall accuracy of 79.5%, significantly outperforming the hand-crafted algorithm (64.4%). We also experiment with binary classifications by collapsing in turn two of the three target classes into one and retraining the model. The highest accuracy achieved on binary classification is 93.1%.
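The binary experiments described above merge two of the three classes in turn. A minimal sketch of that relabeling step follows, assuming string labels `old`, `mediated`, and `new`; the helper names (`collapse`, `binary_schemes`, `accuracy`) and the toy labels are hypothetical and not taken from the paper. The paper retrains the decision tree on each collapsed label set; the sketch covers only the relabeling and the accuracy computation, not the classifier itself.

```python
from itertools import combinations

# The three information-status classes named in the abstract.
CLASSES = ("old", "mediated", "new")


def binary_schemes(classes=CLASSES):
    """Return every pair of classes that can be merged into one."""
    return list(combinations(classes, 2))


def collapse(labels, merged):
    """Replace both classes in `merged` with a joint label; keep the third."""
    joint = "+".join(merged)
    return [joint if label in merged else label for label in labels]


def accuracy(gold, predicted):
    """Fraction of positions where the prediction matches the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


if __name__ == "__main__":
    # Toy gold labels, invented for illustration only.
    gold = ["old", "new", "mediated", "old", "new"]
    for pair in binary_schemes():
        print(pair, collapse(gold, pair))
```

Each collapsed label list would then serve as the training target for a separate binary model, and `accuracy` would be computed against the gold labels collapsed in the same way.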
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Authors Malvina Nissim