ACL
2009


Correlating Human and Automatic Evaluation of a German Surface Realiser


Download www.aclweb.org

We examine correlations between native speaker judgements of automatically generated German text and automatic evaluation metrics. We look at a number of metrics from the MT and summarisation communities and find that, for a relative ranking task, most automatic metrics perform equally well and correlate fairly strongly with the human judgements. In contrast, on a naturalness judgement task, the General Text Matcher (GTM) tool correlates best overall, although in general the correlation between the human judgements and the automatic metrics was quite weak.
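As a rough illustration of the kind of comparison the abstract describes, the sketch below computes a rank correlation between human scores and one automatic metric's scores for the same sentences. The abstract does not say which correlation coefficient the paper uses, so Spearman's rho is an assumption here, and the scores, variable names, and helper functions are all made up for illustration; none of them come from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: one human score and one metric score per generated sentence.
human_scores = [4.5, 3.0, 2.0, 4.0, 1.5]
metric_scores = [0.62, 0.48, 0.30, 0.71, 0.25]

rho = spearman(human_scores, metric_scores)
print(f"Spearman rho = {rho:.2f}")  # prints "Spearman rho = 0.90"
```

A rank-based coefficient is a natural fit for a relative ranking task because it ignores the scale of the metric and only asks whether the metric orders the outputs the way the judges do.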

Aoife Cahill


ACL 2009 | Automatic Evaluation Metrics | Automatic Metrics | Computational Linguistics | Human Judgements


Related Content

» Human Evaluation of a German Surface Realisation Ranker

» Incorporating Information Status into Generation Ranking

» Finding Common Ground: Towards a Surface Realisation Shared Task

» Model Summaries for Location-related Images

» Further Meta-Evaluation of Broad-Coverage Surface Realization

» A Psychophysical Evaluation of Texture Degradation Descriptors

» A Corpus-based Account of Regular Polysemy: The Case of Context-sensitive Adjectives

» Valmet: A New Validation Tool for Assessing and Improving 3D Object Segmentation

Post Info

Added	16 Feb 2011
Updated	16 Feb 2011
Type	Journal
Year	2009
Where	ACL
Authors	Aoife Cahill
