Free Online Productivity Tools
i2Speak
i2Symbol
i2OCR
iTex2Img
iWeb2Print
iWeb2Shot
i2Type
iPdf2Split
iPdf2Merge
i2Bopomofo
i2Arabic
i2Style
i2Image
i2PDF
iLatex2Rtf
Sci2ools

SIGMOD

2005

ACM

2005

ACM

We study a problem of mining frequently occurring periodic patterns with a gap requirement from sequences. Given a character sequence S of length L and a pattern P of length l, we consider P a frequently occurring pattern in S if the probability of observing P given a randomly picked length-l subsequence of S exceeds a certain threshold. In many applications, particularly those related to bioinformatics, interesting patterns are periodic with a gap requirement. That is to say, the characters in P should match subsequences of S in such a way that the matching characters in S are separated by gaps of more or less the same size. We show the complexity of the mining problem and discuss why traditional mining algorithms are computationally infeasible. We propose practical algorithms for solving the problem, and study their characteristics. We also present a case study in which we apply our algorithms on some DNA sequences. We discuss some interesting patterns obtained from the case study.

Related Content

Added |
08 Dec 2009 |

Updated |
08 Dec 2009 |

Type |
Conference |

Year |
2005 |

Where |
SIGMOD |

Authors |
Minghua Zhang, Ben Kao, David Wai-Lok Cheung, Kevin Y. Yip |

Comments (0)