NAACL
2010

Quantifying the Limits and Success of Extractive Summarization Systems Across Domains

This paper analyzes the topic identification stage of single-document automatic text summarization across four different domains, consisting of newswire, literary, scientific and legal documents. We present a study that explores the summary space of each domain via an exhaustive search strategy, and finds the probability density function (pdf) of the ROUGE score distributions for each domain. We then use this pdf to calculate the percentile rank of extractive summarization systems. Our results introduce a new way to judge the success of automatic summarization systems and bring quantified explanations to questions such as why it was so hard for the systems to date to have a statistically significant improvement over the lead baseline in the news domain.
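The procedure the abstract describes (enumerate every candidate extract, score each one, then locate a system's score within the resulting distribution) can be illustrated with a minimal sketch. This is not the authors' implementation: the clipped unigram recall below is a simplified stand-in for ROUGE, the empirical score distribution stands in for the fitted pdf, and the toy document, the reference summary, and the function names `unigram_recall`, `summary_space_scores`, and `percentile_rank` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def unigram_recall(candidate, reference):
    """Simplified ROUGE-1-style recall (assumption, not the official ROUGE tool):
    clipped unigram overlap divided by the reference length."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / sum(ref.values())


def summary_space_scores(sentences, reference, k):
    """Exhaustively score every k-sentence extract of the document."""
    return [
        unigram_recall(" ".join(combo), reference)
        for combo in combinations(sentences, k)
    ]


def percentile_rank(scores, system_score):
    """Percentage of candidate extracts that score strictly below system_score."""
    below = sum(1 for s in scores if s < system_score)
    return 100.0 * below / len(scores)


# Hypothetical toy data, for illustration only.
doc = [
    "the court ruled on the appeal",
    "the weather was mild that day",
    "the appeal was rejected by the judges",
    "reporters waited outside",
]
reference = "the court rejected the appeal"

scores = summary_space_scores(doc, reference, k=2)

# Lead baseline: the first k sentences of the document.
lead_score = unigram_recall(" ".join(doc[:2]), reference)
lead_rank = percentile_rank(scores, lead_score)
```

Exhaustive enumeration grows combinatorially with document length, so a sketch like this is only practical for short documents or small values of `k`.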
Added 14 Feb 2011
Updated 14 Feb 2011
Type Journal
Year 2010
Where NAACL
Authors Hakan Ceylan, Rada Mihalcea, Umut Özertem, Elena Lloret, Manuel Palomar