Motivation: High-throughput methods for detecting molecular interactions have produced large sets of biological network data, with much more yet to come. Analogous to sequence alig...
The usefulness of the results produced by data mining methods can be critically impaired by several factors such as (1) low quality of data, including errors due to contamination, ...
Fang Chu, Yizhou Wang, Carlo Zaniolo, Douglas Stot...
Most approaches to classifying media content assume a fixed, closed vocabulary of labels. In contrast, we advocate machine learning approaches that take advantage of the millions...
Background: In addition to known protein-coding genes, large amounts of apparently non-coding sequence are conserved between the human and mouse genomes. It seems reasonable to as...
We present a multi-dimensional indexing approach for fast sequence similarity search in DNA and protein databases. In particular, we propose effective transformations of subsequen...