Discovering Frequent Poly-Regions in DNA Sequences



The problem of discovering arrangements of regions of high occurrence of one or more items of a given alphabet in a sequence is studied, and two efficient approaches are proposed to solve it. The first approach is entropy-based and uses an existing recursive segmentation technique to split the input sequence into a set of homogeneous segments. The key idea of the second approach is to use a set of sliding windows over the sequence. Each sliding window keeps a set of statistics for a sequence segment, chiefly the number of occurrences of each item in that segment. Combining these statistics efficiently yields the complete set of regions of high occurrence of the items of the given alphabet. After these regions are identified, the sequence is converted to a sequence of labeled intervals (each one corresponding to a region). An efficient algorithm for mining frequent arrangements of temporal intervals on a single sequence is applied to the converted sequence to discover freq...
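The sliding-window idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it uses a single fixed-size window rather than a set of windows, and the names `dense_regions` and `labeled_intervals`, the `window` size, and the `min_density` threshold are all assumptions made for illustration. The sketch keeps per-item counts for the current window, updates them incrementally as the window slides, and merges qualifying windows into labeled intervals.

```python
from collections import Counter


def dense_regions(seq, item, window, min_density):
    """Return merged [start, end) intervals covered by windows in which
    `item` fills at least `min_density` of the positions.

    `window` and `min_density` are illustrative parameters (assumptions),
    not values taken from the paper.
    """
    if window <= 0 or window > len(seq):
        return []
    counts = Counter(seq[:window])  # statistics of the first window
    regions = []
    for start in range(len(seq) - window + 1):
        if start > 0:
            # Slide by one position: drop the item leaving, add the one entering.
            counts[seq[start - 1]] -= 1
            counts[seq[start + window - 1]] += 1
        if counts[item] / window >= min_density:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlapping or adjacent window: extend the last region.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


def labeled_intervals(seq, alphabet, window, min_density):
    """Convert `seq` into a list of (label, start, end) intervals,
    one per dense region of each item, sorted by start position."""
    intervals = [
        (item, start, end)
        for item in alphabet
        for start, end in dense_regions(seq, item, window, min_density)
    ]
    return sorted(intervals, key=lambda iv: (iv[1], iv[2], iv[0]))


if __name__ == "__main__":
    print(labeled_intervals("AAAATCGTAAAA", "ACGT", window=4, min_density=0.75))
```

The resulting interval list is the kind of input an interval-arrangement miner would consume; the mining step itself is outside the scope of this sketch.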

Panagiotis Papapetrou, Gary Benson, George Kollios


Data Mining | ICDM 2006 | Input Sequence | Sequence Segment | Sliding Window |


» REBMEC Repeat Based Maximum Entropy Classifier for Biological Sequences

» A combinatorial optimization approach for diverse motif finding applications

Post Info

Added	11 Jun 2010
Updated	11 Jun 2010
Type	Conference
Year	2006
Where	ICDM
Authors	Panagiotis Papapetrou, Gary Benson, George Kollios
