This paper describes recent efforts at Linguistic Data Consortium at the University of Pennsylvania to create manual transcripts as a shared resource for human language technology...
Morphological analyzers and part-of-speech taggers are key technologies for most text analysis applications. Our aim is to develop a part-of-speech tagger for annotating a wide ra...
The production of rich multilingual speech corpus resources on a large scale is a requirement for many linguistic, phonetic and technological tasks, in both research and applicati...
We describe a formal model for annotating linguistic artifacts, from which we derive an application programming interface (API) to a tools for manipulating these annotations. The ...
Steven Bird, David Day, John S. Garofolo, John Hen...
There is considerable interest in interdisciplinary combinations of automatic speech recognition (ASR), machine learning, natural language processing, text classification and info...
Mark Dredze, Aren Jansen, Glen Coppersmith, Ken Wa...