BMCBI
2005

A simple approach for protein name identification: prospects and limits

Background: Significant parts of biological knowledge are available only as unstructured text in articles of biomedical journals. By automatically identifying gene and gene product (protein) names and mapping these to unique database identifiers, it becomes possible to extract and integrate information from articles and various data sources. We present a simple and efficient approach that identifies gene and protein names in texts and returns database identifiers for matches. It has been evaluated in the recent BioCreAtIvE entity extraction and mention normalization task by an independent jury.

Methods: Our approach is based on the use of synonym lists that map the unique database identifiers for each gene/protein to the different synonym names. For yeast and mouse, synonym lists were used as provided by the organizers who generated them from public model organism databases. The synonym list for fly was generated directly from the corresponding organism database. The lists were then e...
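
The synonym-list idea described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' implementation: the identifiers, synonyms, and the function names `build_index` and `find_mentions` are hypothetical, and the matching rule (case-insensitive, whole-token lookup) is an assumption made only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical synonym list: database identifier -> synonym names.
# These identifiers and names are invented for illustration only.
SYNONYMS = {
    "GENE:0001": ["Abc1", "abc-1"],
    "GENE:0002": ["Xyz2", "xyz protein 2"],
}

def build_index(synonym_lists):
    """Invert {identifier: [synonyms]} into {lowercased synonym: {identifiers}}."""
    index = defaultdict(set)
    for identifier, synonyms in synonym_lists.items():
        for name in synonyms:
            index[name.lower()].add(identifier)
    return index

def find_mentions(text, index):
    """Return sorted (synonym, identifier) pairs for synonyms found in the text.

    A synonym counts as a match only when it is not directly preceded or
    followed by a letter or digit (an assumed, simplistic token boundary).
    """
    hits = set()
    lowered = text.lower()
    for name, identifiers in index.items():
        pattern = r"(?<![A-Za-z0-9])" + re.escape(name) + r"(?![A-Za-z0-9])"
        if re.search(pattern, lowered):
            for identifier in identifiers:
                hits.add((name, identifier))
    return sorted(hits)

index = build_index(SYNONYMS)
print(find_mentions("Abc1 interacts with XYZ protein 2 but not abc10.", index))
```

Because a synonym may belong to more than one identifier, the index maps each name to a set of identifiers; a real system would need an additional disambiguation step that this sketch omits.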
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2005
Where BMCBI
Authors Katrin Fundel, Daniel Güttler, Ralf Zimmer, Joannis Apostolakis