Accurate prediction of demographic attributes from social media and other informal online content is valuable for marketing, personalization, and legal investigation. This paper d...
John D. Burger, John C. Henderson, George Kim, Gui...
Human linguistic annotation is crucial for many natural language processing tasks but can be expensive and time-consuming. We explore the use of Amazon's Mechanical Turk syst...
Rion Snow, Brendan O'Connor, Daniel Jurafsky, Andr...
Professional manual transcription of speech is an expensive and time consuming process. This paper focuses on the problem of combining noisy transcriptions from multiple non-exper...
Kartik Audhkhasi, Panayiotis G. Georgiou, Shrikant...
Crowd-sourcing is a recent framework in which human intelligence tasks are outsourced to a crowd of unknown people ("workers") as an open call (e.g., on Amazon's Me...
Unrehearsed spoken language often contains disfluencies. In order to correctly interpret a spoken utterance, any such disfluencies must be identified and removed or otherwise deal...