We investigate the consistency of human assessors involved in summarization evaluation to understand its effect on system ranking and automatic evaluation techniques. Using Text A...
Karolina Owczarzak, Peter A. Rankel, Hoa Trang Dan...
Assessors frequently disagree on the topical relevance of documents. How much of this disagreement is due to ambiguity in assessment instructions? We have two assessors assess TRE...
For the first time in 2007, TRECVID considered structured evaluation of automated video summarization, utilizing BBC rushes video. In 2007, we conducted user evaluations with the ...
The effectiveness of information retrieval systems is measured by comparing performance on a common set of queries and documents. Significance tests are often used to evaluate the...
For the first time in 2007, TRECVID considered structured evaluation of automated video summarization, utilizing BBC rushes video. This paper discusses in detail our approaches fo...
Alexander G. Hauptmann, Michael G. Christel, Wei-H...