We address the problem of part-of-speech tagging for English data from the popular microblogging service Twitter. We develop a tagset, annotate data, develop features, and report ...
Kevin Gimpel, Nathan Schneider, Brendan O'Connor, ...
This paper describes the framework of the StatCan Daily Translation Extraction System (SDTES), a computer system that maps and compares web-based translation texts of Statistics Can...
Approximate text search is a basic technique to handle recognized text that contains recognition errors. This paper proposes an approximate string search for recognized text using...