This paper addresses a natural language text alignment problem arising in textual genetic criticism, a humanities discipline in which different versions of a text must be compared. The...
While much research has been done on finding similarities between protein sequences, there has not been the same progress on finding similarities between protein structures. Here ...
Tom Milledge, Gaolin Zheng, Tim Mullins, Giri Nara...
Part of the challenge of modeling protein sequences is their discrete nature. Many of the most powerful statistical and learning techniques are applicable to points in a Euclid...
We present a novel approach to managing redundancy in sequence databanks such as GenBank. We store clusters of near-identical sequences as a representative union-sequence and a se...
Michael Cameron, Yaniv Bernstein, Hugh E. Williams
Background: Most known eukaryotic genomes contain mobile copied elements called transposable elements. In some species, these elements account for the majority of the genome seque...