Background: Analysis of sequence composition is a routine task in genome research. Organisms are characterized by their base composition, dinucleotide relative abundance, codon us...
We have developed a new algorithm that allows the exhaustive determination of words of up to 12 nucleotides in DNA sequences. It is fast enough to be used at a genomic scale ru...
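As an illustration of exhaustive word counting, the following is a minimal sketch and not the algorithm described in the abstract, whose details are not given here. It assumes a 2-bit code per nucleotide and a rolling window; the names `count_words` and `decode` are hypothetical.

```python
from collections import Counter

# Assumed 2-bit code per nucleotide; any other character (e.g. N) resets the window.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def count_words(seq, k):
    """Count every overlapping k-letter word in seq, keyed by its packed integer."""
    if not 1 <= k <= 12:
        raise ValueError("k must be between 1 and 12")
    mask = (1 << (2 * k)) - 1  # keeps only the last k bases (2 bits each)
    counts = Counter()
    word = 0
    valid = 0  # consecutive unambiguous bases seen since the last reset
    for base in seq.upper():
        code = CODE.get(base)
        if code is None:
            word, valid = 0, 0
            continue
        word = ((word << 2) | code) & mask
        valid += 1
        if valid >= k:
            counts[word] += 1
    return counts


def decode(word, k):
    """Turn a packed integer back into its k-letter string."""
    letters = "ACGT"
    return "".join(letters[(word >> (2 * (k - 1 - i))) & 3] for i in range(k))
```

For example, `count_words("ACGTACGT", 4)` records the word ACGT twice and CGTA, GTAC and TACG once each. With k = 12 a packed word fits in 24 bits, so a dense array of 4^12 counters could replace the `Counter` when memory allows.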
In this article, we propose a new method for computing rare maximal exact matches between multiple sequences. A rare match between k sequences S1, ..., Sk is a string that occur...
Background: The number of k-words shared between two sequences is a simple and efficient alignment-free sequence comparison method. This statistic, D2, has been used for the cluste...
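The D2 statistic can be sketched in a few lines. This assumes the commonly used definition, in which D2 is the sum over all k-words of the product of that word's counts in the two sequences (equivalently, the number of matching k-word position pairs). The function names `kword_counts` and `d2` are hypothetical.

```python
from collections import Counter


def kword_counts(seq, k):
    """Count all overlapping k-words in seq (empty if seq is shorter than k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Sum, over k-words present in both sequences, of the product of their counts."""
    counts_a = kword_counts(seq_a, k)
    counts_b = kword_counts(seq_b, k)
    return sum(n * counts_b[w] for w, n in counts_a.items() if w in counts_b)
```

For instance, `d2("ACGT", "ACGA", 2)` is 2, because the words AC and CG each occur once in both sequences, while GT and GA are not shared.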
The spatial clustering of genes across different genomes has been used to study important problems in comparative genomics, from identification of operons to detection of homologo...