We first analyzed protein names using various dictionaries and databases and found five problems with protein names: the treatment of special characters, the treatment of hom...
We propose three new features for MT evaluation: source-sentence constrained n-gram precision, source-sentence reordering metrics, and discriminative unigram precision, as well as...
In text management tasks, dimensionality reduction becomes necessary for computation and for the interpretability of the results generated by machine learning algorithms. This paper de...
This paper proposes a method for creating a high-quality collection of researchers’ homepages. The proposed method consists of three phases: rough filtering of the possible web p...
Semi-supervised learning (SSL) is classification in which additional unlabeled data can be used to improve accuracy. Generative approaches are appealing in this situation, as a mode...