A Web-Trained Extraction Summarization System

A serious bottleneck in the development of trainable text summarization systems is the shortage of training data. Constructing such data is tedious, especially because there are generally many different correct ways to summarize a text. Fortunately, the Internet can serve as a source of suitable training data. In this paper, we present a summarization system that uses the web as its source of training data. The procedure involves structuring articles downloaded from various websites, building adequate corpora of (summary, text) and (extract, text) pairs, training on positive and negative data, and automatically learning to perform extraction-based summarization at a level comparable to the best DUC systems.
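One way to picture the step from (summary, text) pairs to (extract, text) pairs with positive and negative examples is the minimal sketch below. It is an illustration under stated assumptions, not the authors' method: the word-overlap heuristic, the 0.5 threshold, and the names `tokenize`, `label_sentences`, and `build_extract` are all hypothetical and do not come from the abstract.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def label_sentences(text_sentences, summary_sentences, threshold=0.5):
    """Label each text sentence 1 (positive) or 0 (negative).

    Assumption: a text sentence counts as positive when it covers at least
    `threshold` of the words of some summary sentence. This overlap rule is
    a stand-in chosen for illustration only.
    """
    summary_tokens = [tokenize(s) for s in summary_sentences]
    labels = []
    for sentence in text_sentences:
        words = tokenize(sentence)
        best = 0.0
        for target in summary_tokens:
            if target:  # skip summary sentences with no tokens
                best = max(best, len(words & target) / len(target))
        labels.append(1 if best >= threshold else 0)
    return labels


def build_extract(text_sentences, labels):
    """Return the positively labeled sentences, in document order."""
    return [s for s, label in zip(text_sentences, labels) if label == 1]


if __name__ == "__main__":
    text = [
        "The web offers training data.",
        "It rained on Tuesday.",
        "Summarizers learn from pairs.",
    ]
    summary = ["The web offers training data."]
    labels = label_sentences(text, summary)
    print(labels)
    print(build_extract(text, labels))
```

The labeled sentences could then feed any binary sentence classifier; which learner and features the paper actually uses is not stated in this abstract.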
Liang Zhou, Eduard H. Hovy
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2003