
Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks

Human linguistic annotation is crucial for many natural language processing tasks but can be expensive and time-consuming. We explore the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web. We investigate five tasks: affect recognition, word similarity, recognizing textual entailment, event temporal ordering, and word sense disambiguation. For all five, we show high agreement between Mechanical Turk non-expert annotations and existing gold standard labels provided by expert labelers. For the task of affect recognition, we also show that using non-expert labels for training machine learning algorithms can be as effective as using gold standard annotations from experts. We propose a technique for bias correction that significantly improves annotation quality on two tasks. We conclude that many large labeling tasks can be effectively designed and carried out in this fashion at a fraction of the usual expense.
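The setup the abstract describes, collecting several non-expert labels per item, aggregating them, and comparing the result to expert gold labels, can be illustrated with a short sketch. The bias correction shown below (weighting each annotator's vote by the log-odds of their accuracy on a small gold calibration subset) is only a hypothetical stand-in for the paper's technique; all function names and data layouts here are assumptions for illustration.

```python
"""Sketch: aggregate non-expert labels and measure agreement with gold.

labels_per_item maps item_id -> list of (annotator_id, label);
gold maps item_id -> expert label. Not the authors' exact method.
"""
from collections import defaultdict
import math


def majority_vote(labels_per_item):
    """Unweighted majority vote over each item's non-expert labels."""
    out = {}
    for item, votes in labels_per_item.items():
        counts = defaultdict(int)
        for _, label in votes:
            counts[label] += 1
        out[item] = max(counts, key=counts.get)
    return out


def calibrated_vote(labels_per_item, gold_calibration):
    """Votes weighted by log-odds of each annotator's accuracy,
    estimated on a small subset of items with gold labels."""
    correct, total = defaultdict(int), defaultdict(int)
    for item, votes in labels_per_item.items():
        if item not in gold_calibration:
            continue
        for annotator, label in votes:
            total[annotator] += 1
            correct[annotator] += int(label == gold_calibration[item])

    out = {}
    for item, votes in labels_per_item.items():
        scores = defaultdict(float)
        for annotator, label in votes:
            # Laplace-smoothed accuracy; an unseen annotator gets
            # accuracy 0.5 and therefore weight 0.
            acc = (correct[annotator] + 1) / (total[annotator] + 2)
            scores[label] += math.log(acc / (1 - acc))
        out[item] = max(scores, key=scores.get)
    return out


def agreement(predicted, gold):
    """Fraction of gold-labeled items on which the aggregate agrees."""
    shared = [i for i in gold if i in predicted]
    return sum(predicted[i] == gold[i] for i in shared) / max(len(shared), 1)
```

With this framing, comparing `agreement(majority_vote(...), gold)` against `agreement(calibrated_vote(...), gold)` on held-out items mirrors the kind of before/after bias-correction comparison the abstract reports, though the paper's own correction and evaluation details should be taken from the paper itself.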
Type: Conference
Year: 2008
Venue: EMNLP
Authors: Rion Snow, Brendan O'Connor, Daniel Jurafsky, Andrew Y. Ng