Formal and functional assessment of the pyramid method for summary content evaluation

Pyramid annotation makes it possible to evaluate quantitatively and qualitatively the content of machine-generated (or human) summaries. Evaluation methods must prove themselves against the same measuring stick – evaluation – as other research methods. First, a formal assessment of pyramid data from the 2003 Document Understanding Conference (DUC) is presented; this addresses whether the form of annotation is reliable and whether score results are consistent across annotators. A combination of interannotator reliability measures of the two manual annotation phases (pyramid creation and annotation of system peer summaries against pyramid models), and significance tests of the similarity of system scores from distinct annotations, produces highly reliable results. The most rigorous test consists of a comparison of peer system rankings produced from two independent sets of pyramid and peer annotations, which produce essentially the same rankings. Three years of DUC data (2003, 2005,...

Rebecca J. Passonneau

Annotations | NLE 2010 | Pyramid | Statistical Power |

Post Info

Added	29 Jan 2011
Updated	29 Jan 2011
Type	Journal
Year	2010
Where	NLE
Authors	Rebecca J. Passonneau

Sciweavers
