ACL
2007

Generating Usable Formats for Metadata and Annotations in a Large Meeting Corpus

The AMI Meeting Corpus is now publicly available, including manual annotation files generated in the NXT XML format, but it lacks explicit metadata for its 171 meetings. To increase the usability of this important resource, a representation format based on relational databases is proposed, which maximizes the informativeness, simplicity and reusability of the metadata and annotations. The annotation files are converted to a tabular format using an easily adaptable XSLT-based mechanism, and their consistency is verified in the process. Metadata files are generated directly in the IMDI XML format from implicit information, and converted to tabular format using a similar procedure. The results and tools will be freely available with the AMI Corpus. Sharing the metadata through the Open Archives network will help increase the visibility of the AMI Corpus.
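The XML-to-tabular conversion with a consistency check described in the abstract can be illustrated with a minimal sketch. It is not the authors' tool: it uses Python's standard `xml.etree.ElementTree` in place of XSLT, and the element and attribute names (`annotations`, `segment`, `speaker`, `start`, `end`) are hypothetical and do not reproduce the actual NXT schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical annotation fragment: element and attribute names are
# illustrative only and do not reproduce the actual NXT schema.
SAMPLE_XML = """
<annotations meeting="M001">
  <segment id="s1" speaker="A" start="0.00" end="2.50">okay let's start</segment>
  <segment id="s2" speaker="B" start="2.50" end="4.10">sure</segment>
  <segment id="s3" speaker="A" start="4.10" end="3.90">agenda first</segment>
</annotations>
"""

FIELDS = ["meeting", "id", "speaker", "start", "end", "text"]


def xml_to_rows(xml_text):
    """Flatten each <segment> into a dict row and collect inconsistent ids."""
    root = ET.fromstring(xml_text)
    meeting = root.get("meeting")
    rows, problems = [], []
    for seg in root.iter("segment"):
        row = {
            "meeting": meeting,
            "id": seg.get("id"),
            "speaker": seg.get("speaker"),
            "start": float(seg.get("start")),
            "end": float(seg.get("end")),
            "text": (seg.text or "").strip(),
        }
        # Consistency check (assumed rule): a segment must not end before it starts.
        if row["end"] < row["start"]:
            problems.append(row["id"])
        rows.append(row)
    return rows, problems


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text, ready to load into a database table."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows, problems = xml_to_rows(SAMPLE_XML)
print(rows_to_tsv(rows))
print("inconsistent segments:", problems)
```

Each row corresponds to one record in a relational table, so the tab-separated output could be bulk-loaded by most database systems; the list of flagged ids plays the role of the consistency report produced during conversion.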
Andrei Popescu-Belis, Paula Estrella
Type Conference
Year 2007
Where ACL
Authors Andrei Popescu-Belis, Paula Estrella