Sequence data is ubiquitous and finding frequent sequences in a large database is one of the most common problems when analyzing sequence data. Unfortunately many sources of seque...
We present an index structure for managing weightedsequences in large databases. A weighted-sequence is defined as a two-dimensional structure where each element in the sequence i...
Biosequences typically have a small alphabet, a long length, and patterns containing gaps (i.e., “don’t care”) of arbitrary size. Mining frequent patterns in such sequences ...
Sequential pattern mining is an active field in the domain of knowledge discovery and has been widely studied for over a decade by data mining researchers. More and more, with the ...
Background: Results of phylogenetic analysis are often visualized as phylogenetic trees. Such a tree can typically only include up to a few hundred sequences. When more than a few...
Jiajie Zhang, Amir Madany Mamlouk, Thomas Martinet...