NLE
2010

Formal and functional assessment of the pyramid method for summary content evaluation

Pyramid annotation makes it possible to evaluate quantitatively and qualitatively the content of machine-generated (or human) summaries. Evaluation methods must prove themselves against the same measuring stick – evaluation – as other research methods. First, a formal assessment of pyramid data from the 2003 Document Understanding Conference (DUC) is presented; this addresses whether the form of annotation is reliable and whether score results are consistent across annotators. A combination of interannotator reliability measures of the two manual annotation phases (pyramid creation and annotation of system peer summaries against pyramid models), and significance tests of the similarity of system scores from distinct annotations, produces highly reliable results. The most rigorous test consists of a comparison of peer system rankings produced from two independent sets of pyramid and peer annotations, which produce essentially the same rankings. Three years of DUC data (2003, 2005,...
Rebecca J. Passonneau
Added 29 Jan 2011
Updated 29 Jan 2011
Type Journal
Year 2010
Where NLE
Authors Rebecca J. Passonneau