Abstract--Today's molecular biology is confronted with enormous amounts of data, generated by new high-throughput technologies, along with an increasing number of biological m...
A new statistical model for DNA considers a sequence to be a mixture of regions with little structure and regions that are approximate repeats of other subsequences, i.e. instance...
Lloyd Allison, Linda Stern, Timothy Edgoose, Trevo...
Spreadsheets applications allow data to be stored with low development overheads, but also with low data quality. Reporting on data from such sources is difficult using traditiona...
Existing sequence mining algorithms mostly focus on mining for subsequences. However, a large class of applications, such as biological DNA and protein motif mining, require effici...
The first available genome of a multicellular organism, C. elegans, was used as a test case for protein fold assignment using PSI-BLAST, followed by rational structure modeling an...