Sciweavers

NAACL
2003

A Web-Trained Extraction Summarization System

13 years 5 months ago
A Web-Trained Extraction Summarization System
A serious bottleneck in the development of trainable text summarization systems is the shortage of training data. Constructing such data is a very tedious task, especially because there are in general many different correct ways to summarize a text. Fortunately we can utilize the Internet as a source of suitable training data. In this paper, we present a summarization system that uses the web as the source of training data. The procedure involves structuring the articles downloaded from various websites, building adequate corpora of (summary, text) and (extract, text) pairs, training on positive and negative data, and automatically learning to perform the task of extraction-based summarization at a level comparable to the best DUC systems.
Liang Zhou, Eduard H. Hovy
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2003
Where NAACL
Authors Liang Zhou, Eduard H. Hovy
Comments (0)