We present a lossless compression algorithm, GenCompress, for genetic sequences, based on searching for approximate repeats. Our algorithm achieves the best compression ratios for...
Growing sequencing and assembly efforts have been met by the advances in high throughput machines. However, the presence of massive amounts of repeats and transposons complicates ...
Nirmalya Bandyopadhyay, A. Mark Settles, Tamer Kah...
Chaining fragments is a crucial step in genome alignment. Existing chaining algorithms compute a maximum weighted chain with no overlaps allowed between adjacent fragments. In prac...
Probabilistic models of languages are fundamental to understand and learn the profile of the subjacent code in order to estimate its entropy, enabling the verification and predicti...
Background: The ability to rapidly map millions of oligonucleotide fragments to a reference genome is crucial to many high throughput genomic technologies. Results: We propose an ...
Wei Li, Jason S. Carroll, Myles Brown, X. Shirley ...