
Analysis and Improved Recognition of Protein Names Using Transductive SVM
Journal Title Journal of Computers
Journal Abbreviation jcp
Publisher Group Academy Publisher
Website http://ojs.academypublisher.com
Title Analysis and Improved Recognition of Protein Names Using Transductive SVM
Authors Mitsumori, Tomohiro; Murata, Masaki; Doi, Kouichi
Abstract We first analyzed protein names using various dictionaries and databases and found five problems: the treatment of special characters, the treatment of homonyms, cases where a protein-name string may be a substring of a different protein-name string, cases where one protein exists in different organisms, and the treatment of modifiers. We confirmed that a machine-learning approach to recognizing protein names can solve these problems, and such methods have recently been used in research on protein-name recognition. A classifier trained in a specific domain, however, can overfit and be so inflexible that it can only be used in that domain. We therefore developed a new corpus on breast cancer and investigated the flexibility of classifiers trained on the GENIA [1] corpus or the breast-cancer corpus. We used a transductive support vector machine (SVM) to avoid overfitting, and we evaluated the effect of transductive learning. In our experiments, the transductive SVM prevented overfitting and yielded higher accuracies than the conventional SVM. For the “Sub” criterion that we define in this paper, the transductive SVM increased the F-scores in our two experiments from 70.46 to 79.64 and from 70.63 to 74.61.
Publisher ACADEMY PUBLISHER
Date 2008-01-01
Source Journal of Computers Vol 3, No 1 (2008)
Rights Copyright © ACADEMY PUBLISHER - All Rights Reserved. To request permission, see http://www.academypublisher.com/copyrightpermission.html.

 
