Sciweavers

JCP
2008

Analysis and Improved Recognition of Protein Names Using Transductive SVM

We first analyzed protein names using various dictionaries and databases and found five problems: the treatment of special characters, the treatment of homonyms, cases where one protein-name string is a substring of a different protein-name string, cases where the same protein exists in different organisms, and the treatment of modifiers. We confirmed that a machine-learning approach to protein-name recognition can address these problems. Machine-learning methods have accordingly been used in recent research to recognize protein names. A classifier trained on a specific domain, however, can overfit and become so inflexible that it is usable only in that domain. We therefore developed a new corpus on breast cancer and investigated the flexibility of classifiers trained on the GENIA [1] corpus or the breast-cancer corpus. We used a transductive support vector machine (SVM) to avoid overfitting, and we evaluated the effect of transductive learning. We foun...
Masaki Murata, Tomohiro Mitsumori, Kouichi Doi
Added 13 Dec 2010
Updated 13 Dec 2010
Type Journal