Here we identify duplicated genes in five mammalian genomes and classify these duplicates based on the mechanisms by which they were generated. Retrotransposition accounts for at l...
Paul Ryvkin, Jin Jun, Edward Hemphill, Craig Nelso...
Multiple sequence alignment is the most fundamental task in bioinformatics and computational biology. In this paper, we present a new algorithm to conduct multiple sequence align...
Many basic tasks in computational biology involve operations on individual DNA and protein sequences. These sequences, even when anonymized, are vulnerable to re-identification a...
The conditional random field (CRF) is a popular graphical model for sequence labeling. The flexibility of CRFs poses significant computational challenges for training. Using existing o...
We propose a novel semi-supervised clustering method for the task of gene regulatory module discovery. The technique uses data on DNA binding as prior knowledge to guide the proces...