DNA Motif Representation with Nucleotide Dependency

Discovering novel binding-site motifs is important for understanding gene regulatory networks. Motifs are generally represented by matrices (PWM or PSSM) or strings. However, these representations cannot model biological binding sites well because they fail to capture nucleotide interdependence. Many researchers have pointed out that the nucleotides of a DNA binding site cannot be treated independently, e.g. in the binding sites of zinc finger proteins. This paper introduces a new representation called Scored Position Specific Pattern (SPSP), a generalization of the matrix and string representations that takes into account the dependent occurrences of neighboring nucleotides. Even though the problem of discovering the optimal motif in SPSP representation is proved to be NP-hard, we introduce a heuristic algorithm called SPSP-Finder, which can effectively find optimal motifs in most simulated cases and some real cases for whic...
Francis Y. L. Chin, Henry C. M. Leung
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008
Where TCBB