Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the databas...
Background: In discriminant analysis of microarray data, usually a small number of samples are expressed by a large number of genes. It is not only difficult but also unnecessary ...
We provide a worst-case analysis of selective sampling algorithms for learning linear threshold functions. The algorithms considered in this paper are Perceptron-like algorithms, ...
We present a new family of linear time algorithms based on sufficient statistics for string comparison with mismatches under the string kernels framework. Our algorithms improve t...
Abstract. Trained support vector machines (SVMs) have a slow runtime classification speed if the classification problem is noisy and the sample data set is large. Approximating the...