Sciweavers

SIGIR
2005
ACM

An application of text categorization methods to gene ontology annotation

13 years 8 months ago
An application of text categorization methods to gene ontology annotation
This paper describes an application of IR and text categorization methods to a highly practical problem in biomedicine, specifically, Gene Ontology (GO) annotation. GO annotation is a major activity in most model organism database projects and annotates gene functions using a controlled vocabulary. As a first step toward automatic GO annotation, we aim to assign GO domain codes given a specific gene and an article in which the gene appears, which is one of the task challenges at the TREC 2004 Genomics Track. We approached the task with careful consideration of the specialized terminology and paid special attention to dealing with various forms of gene synonyms, so as to exhaustively locate the occurrences of the target gene. We extracted the words around the gene occurrences and used them to represent the gene for GO domain code annotation. As a classifier, we adopted a variant of k-Nearest Neighbor (kNN) with supervised term weighting schemes to improve the performance, making ou...
Kazuhiro Seki, Javed Mostafa
Added 26 Jun 2010
Updated 26 Jun 2010
Type Conference
Year 2005
Where SIGIR
Authors Kazuhiro Seki, Javed Mostafa
Comments (0)