Sciweavers

BMCBI
2005

Data-poor categorization and passage retrieval for Gene Ontology Annotation in Swiss-Prot

Background: In the context of the BioCreative competition, where training data were very sparse, we investigated two complementary tasks: 1) given a Swiss-Prot triplet, containing a protein, a GO (Gene Ontology) term and a relevant article, extraction of a short passage that justifies the GO category assignment; 2) given a Swiss-Prot pair, containing a protein and a relevant article, automatic assignment of a set of categories. Methods: The sentence is the basic retrieval unit. Our classifier computes a distance between each sentence and the GO category provided with the Swiss-Prot entry. The Text Categorizer computes a distance between each GO term and the text of the article. Evaluations are reported both based on annotator judgements as established by the competition and based on mean average precision measures computed using a curated sample of Swiss-Prot. Results: Our system achieved the best recall and precision combination both for passage retrieval and text categorization as eva...
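
The passage retrieval step described in the abstract (score each sentence against the GO category and keep the closest one) can be illustrated with a minimal sketch. The abstract does not say which distance measure the authors used, so the bag-of-words cosine distance below is an assumption made only for illustration, and the names `tokenize`, `cosine_distance` and `rank_sentences`, as well as the sample text, are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two token lists (bag of words).

    Assumption: the abstract does not name its distance measure;
    cosine distance is used here purely as a stand-in.
    """
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def rank_sentences(article, go_term):
    """Split the article into sentences and sort them, closest to the GO term first."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", article) if s.strip()]
    target = tokenize(go_term)
    return sorted(sentences, key=lambda s: cosine_distance(tokenize(s), target))


# Hypothetical article text and GO term, for illustration only.
article = ("The protein was purified from yeast. "
           "It shows strong protein kinase activity in vitro. "
           "Expression levels were unchanged.")
ranked = rank_sentences(article, "protein kinase activity")
best = ranked[0]
print(best)
```

The same scoring function could in principle be turned around for the categorization task, ranking candidate GO terms by their distance to the whole article text, though the abstract gives no detail on how the authors' Text Categorizer does this.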
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2005
Where BMCBI
Authors Frédéric Ehrler, Antoine Geissbühler, Antonio Jimeno-Yepes, Patrick Ruch