CORR
2000
Springer

ATLAS: A flexible and extensible architecture for linguistic annotation

We describe a formal model for annotating linguistic artifacts, from which we derive an application programming interface (API) to tools for manipulating these annotations. The abstract logical model provides for a range of storage formats and promotes the reuse of tools that interact through this API. We focus first on "Annotation Graphs," a graph model for annotations on linear signals (such as text and speech) indexed by intervals, for which efficient database storage and querying techniques are applicable. We note how a wide range of existing annotated corpora can be mapped to this annotation graph model. This model is then generalized to encompass a wider variety of linguistic "signals," including both naturally occurring phenomena (as recorded in images, video, multi-modal interactions, etc.) and the derived resources that are increasingly important to the engineering of natural language processing systems (such as word lists, dictionaries, aligned b...
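As a rough illustration of the "annotation graph" idea described in the abstract (labeled arcs spanning intervals of a linear signal), here is a minimal Python sketch. It is not the ATLAS API: the names Node, Arc, AnnotationGraph, annotate, and overlapping are hypothetical, and the offsets are assumed to be seconds into a speech recording.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """A point in the signal; offset is None if the node is unanchored."""
    id: str
    offset: Optional[float] = None  # assumed unit: seconds


@dataclass(frozen=True)
class Arc:
    """One annotation: a typed, labeled span between two nodes."""
    start: Node
    end: Node
    kind: str   # e.g. "word" or "speaker" (hypothetical type names)
    label: str


@dataclass
class AnnotationGraph:
    arcs: List[Arc] = field(default_factory=list)

    def annotate(self, start: Node, end: Node, kind: str, label: str) -> Arc:
        arc = Arc(start, end, kind, label)
        self.arcs.append(arc)
        return arc

    def overlapping(self, lo: float, hi: float,
                    kind: Optional[str] = None) -> List[Arc]:
        """Return anchored arcs whose interval overlaps [lo, hi)."""
        hits = []
        for arc in self.arcs:
            if arc.start.offset is None or arc.end.offset is None:
                continue  # skip arcs with an unanchored endpoint
            if kind is not None and arc.kind != kind:
                continue
            if arc.start.offset < hi and arc.end.offset > lo:
                hits.append(arc)
        return hits


# Two word annotations and one speaker annotation over the same signal.
n0, n1, n2 = Node("n0", 0.0), Node("n1", 0.4), Node("n2", 0.9)
g = AnnotationGraph()
g.annotate(n0, n1, "word", "hello")
g.annotate(n1, n2, "word", "world")
g.annotate(n0, n2, "speaker", "A")

words_mid = [a.label for a in g.overlapping(0.5, 0.8, kind="word")]
all_mid = [a.label for a in g.overlapping(0.5, 0.8)]
```

Because several annotation types share the same nodes, a query over an interval can return annotations from different layers at once. The linear scan here is only for clarity; the abstract's point is that such interval queries admit efficient database techniques.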
Added 17 Dec 2010
Updated 17 Dec 2010
Type Journal
Year 2000
Where CORR
Authors Steven Bird, David Day, John S. Garofolo, John Henderson, Christophe Laprun, Mark Liberman