
KCAP 2011, ACM

Let's agree to disagree: on the evaluation of vocabulary alignment

Gold standard mappings created by experts are at the core of alignment evaluation. At the same time, the process of manual evaluation is rarely discussed. While the practice of having multiple raters evaluate results is accepted, their level of agreement is often not measured. In this paper we describe three experiments in manual evaluation and study the way different raters evaluate mappings. We used alignments generated by different techniques between vocabularies of different types. In each experiment, five raters evaluated alignments and talked through their decisions using the think-aloud method. In all three experiments we found that inter-rater agreement was low, and we analyzed our data to identify the reasons for it. Our analysis shows which variables can be controlled to affect the level of agreement, including the mapping relations, the evaluation guidelines, and the background of the raters. On the other hand, differences in the perception of raters and the complexity of th...
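
As an aside on how agreement among several raters can be quantified: one common measure when multiple raters assign items (here, candidate mappings) to a fixed set of categories is Fleiss' kappa. The sketch below is illustrative only and not taken from the paper; the rating matrix, category labels, and function name are hypothetical.

```python
# Illustrative sketch (not from the paper): Fleiss' kappa for N mappings,
# each judged by n raters into k categories (e.g. "correct", "incorrect", "unsure").
# The ratings matrix below is hypothetical example data.

def fleiss_kappa(ratings):
    """ratings[i][j] = number of raters who put mapping i into category j."""
    N = len(ratings)                      # number of mappings
    n = sum(ratings[0])                   # raters per mapping (assumed constant)
    k = len(ratings[0])                   # number of categories

    # Overall proportion of all judgments falling into each category.
    p = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]

    # Per-mapping agreement: fraction of rater pairs that agree on that mapping.
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings]

    P_bar = sum(P_i) / N                  # observed agreement
    P_e = sum(pj * pj for pj in p)        # agreement expected by chance
    return (P_bar - P_e) / (1 - P_e)

# Example: 4 mappings, 5 raters, categories [correct, incorrect, unsure].
ratings = [
    [5, 0, 0],
    [3, 2, 0],
    [1, 3, 1],
    [2, 2, 1],
]
print(round(fleiss_kappa(ratings), 3))   # ~0.071 for this toy data
```

A kappa near 0 means agreement no better than chance, while values near 1 indicate near-perfect agreement; low values of this kind correspond to the sort of rater disagreement the paper investigates.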
Added: 16 Sep 2011
Updated: 16 Sep 2011
Type: Conference
Year: 2011
Where: KCAP
Authors: Anna Tordai, Jacco van Ossenbruggen, Guus Schreiber, Bob J. Wielinga