Sciweavers

CICLING
2010
Springer

A Sequential Model for Discourse Segmentation

Identifying discourse relations in a text is essential for various tasks in Natural Language Processing, such as automatic text summarization, question-answering, and dialogue generation. The first step of this process is segmenting a text into elementary units. In this paper, we present a novel model of discourse segmentation based on sequential data labeling. Namely, we use Conditional Random Fields to train a discourse segmenter on the RST Discourse Treebank, using a set of lexical and syntactic features. Our system is compared to other statistical and rule-based segmenters, including one based on Support Vector Machines. Experimental results indicate that our sequential model outperforms current state-of-the-art discourse segmenters, with an F-score of 0.94. This performance level is close to the human agreement F-score of 0.98.
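The sequential-labeling formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `B`/`C` label scheme, the feature names in `token_features`, and the boundary-level `boundary_f_score` are all assumptions chosen for illustration, and the CRF training step itself is omitted. The sketch only shows how a segmented text might be turned into per-token labels and features that a sequence labeler could consume, and how predicted boundaries might be scored.

```python
from typing import Dict, List, Tuple


def segments_to_labels(segments: List[List[str]]) -> Tuple[List[str], List[str]]:
    """Flatten segments into tokens and per-token labels.

    Assumed label scheme: 'B' marks the first token of an elementary
    unit, 'C' marks a token that continues the current unit.
    """
    tokens: List[str] = []
    labels: List[str] = []
    for segment in segments:
        for position, token in enumerate(segment):
            tokens.append(token)
            labels.append("B" if position == 0 else "C")
    return tokens, labels


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Hypothetical lexical features for token i (not the paper's feature set)."""
    return {
        "word": tokens[i].lower(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
        "is_first": i == 0,
    }


def boundary_f_score(gold: List[str], pred: List[str]) -> float:
    """F-score over positions labeled 'B' (a simplified boundary metric)."""
    gold_b = {i for i, label in enumerate(gold) if label == "B"}
    pred_b = {i for i, label in enumerate(pred) if label == "B"}
    true_positives = len(gold_b & pred_b)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_b)
    recall = true_positives / len(gold_b)
    return 2 * precision * recall / (precision + recall)


# Toy example: one sentence split into two elementary units.
segments = [["The", "bank", "said"], ["it", "would", "close"]]
tokens, labels = segments_to_labels(segments)
features = [token_features(tokens, i) for i in range(len(tokens))]
```

In a full system, the `features` and `labels` sequences would be passed to a CRF toolkit for training, and syntactic features derived from a parser would presumably be added alongside the lexical ones.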
Hugo Hernault, Danushka Bollegala, Mitsuru Ishizuka
Added 18 May 2010
Updated 18 May 2010
Type Conference
Year 2010
Where CICLING
Authors Hugo Hernault, Danushka Bollegala, Mitsuru Ishizuka