Assessing the Effect of Inconsistent Assessors on Summarization Evaluation

We investigate the consistency of human assessors involved in summarization evaluation to understand its effect on system ranking and automatic evaluation techniques. Using Text Analysis Conference data, we measure annotator consistency based on human scoring of summaries for Responsiveness, Readability, and Pyramid scoring. We identify inconsistencies in the data and measure to what extent these inconsistencies affect the ranking of automatic summarization systems. Finally, we examine the stability of automatic metrics (ROUGE and CLASSY) with respect to the inconsistent assessments.

Karolina Owczarzak, Peter A. Rankel, Hoa Trang Dang, John M. Conroy

ACL 2012 | Automatic Evaluation | Automatic Summarization | Computational Linguistics | Summarization Systems |

» Evaluating audio skimming and frame rate acceleration for summarizing BBC rushes

» Information retrieval system evaluation effort sensitivity and reliability

» Summarizing BBC Rushes the Informedia Way

» Automatic summarization of rushes video using bipartite graphs

» CRAC Confidentiality risk assessment and IT infrastructure comparison

» The importance of manual assessment in link discovery

» A comparison and user-based evaluation of models of textual information structure in the co...

» Combining Capability Assessment and Value Engineering A BOOTSTRAP Example

Post Info

Added	29 Sep 2012
Updated	29 Sep 2012
Type	Journal
Year	2012
Where	ACL
Authors	Karolina Owczarzak, Peter A. Rankel, Hoa Trang Dang, John M. Conroy

Sciweavers
