Sciweavers

BMCBI
2007

BioInfer: a corpus for information extraction in the biomedical domain

Background: Lately, there has been great interest in the application of information extraction methods to the biomedical domain, in particular, to the extraction of relationships of genes, proteins, and RNA from scientific publications. The development and evaluation of such methods requires annotated domain corpora. Results: We present BioInfer (Bio Information Extraction Resource), a new public resource providing an annotated corpus of biomedical English. We describe an annotation scheme capturing named entities and their relationships along with a dependency analysis of sentence syntax. We further present ontologies defining the types of entities and relationships annotated in the corpus. Currently, the corpus contains 1100 sentences from abstracts of biomedical research articles annotated for relationships, named entities, as well as syntactic dependencies. Supporting software is provided with the corpus. The corpus is unique in the domain in combining these annotation types for a singl...
Added 08 Dec 2010
Updated 08 Dec 2010
Type Journal
Year 2007
Where BMCBI
Authors Sampo Pyysalo, Filip Ginter, Juho Heimonen, Jari Björne, Jorma Boberg, Jouni Järvinen, Tapio Salakoski