ISMB
1996

The Megaprior Heuristic for Discovering Protein Sequence Patterns

Several computer algorithms for discovering patterns in groups of protein sequences are in use that are based on fitting the parameters of a statistical model to a group of related sequences. These include hidden Markov model (HMM) algorithms for multiple sequence alignment, and the MEME and Gibbs sampler algorithms for discovering motifs. These algorithms are sometimes prone to producing models that are incorrect because two or more patterns have been combined. The statistical model produced in this situation is a convex combination (weighted average) of two or more different models. This paper presents a solution to the problem of convex combinations in the form of a heuristic based on using extremely low variance Dirichlet mixture priors as part of the statistical model. This heuristic, which we call the megaprior heuristic, increases the strength (i.e., decreases the variance) of the prior in proportion to the size of the sequence dataset. This causes each column in the final model to strongly resemble the mean o...
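The effect described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it uses the standard posterior-mean estimate for a Dirichlet mixture prior, a toy 4-letter alphabet instead of the 20 amino acids, two made-up prior components, and a hypothetical proportionality constant (`SCALE`) linking prior strength to dataset size. The column counts are chosen to look like two patterns merged into one column.

```python
import math

def log_beta(alpha):
    """Log of the multivariate Beta function B(alpha)."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))

def scale_prior(components, strength):
    """Rescale each Dirichlet component so its parameters sum to `strength`.

    The component means are unchanged; only the variance shrinks as
    `strength` grows.
    """
    return [[strength * a / sum(alpha) for a in alpha] for alpha in components]

def posterior_mean(counts, weights, components):
    """Posterior mean column distribution under a Dirichlet mixture prior."""
    # Log posterior weight of each component given the observed counts.
    log_post = [
        math.log(q)
        + log_beta([a + c for a, c in zip(alpha, counts)])
        - log_beta(alpha)
        for q, alpha in zip(weights, components)
    ]
    top = max(log_post)
    resp = [math.exp(lp - top) for lp in log_post]
    total = sum(resp)
    resp = [r / total for r in resp]

    n = sum(counts)
    estimate = [0.0] * len(counts)
    for r, alpha in zip(resp, components):
        denom = n + sum(alpha)
        for i, (a, c) in enumerate(zip(alpha, counts)):
            estimate[i] += r * (a + c) / denom
    return estimate

# Hypothetical two-component prior over a toy 4-letter alphabet.
means = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]
weights = [0.5, 0.5]

# Counts for one column that mixes two patterns (30 vs. 20 observations).
counts = [30, 0, 0, 20]
n_obs = sum(counts)

SCALE = 10  # assumed proportionality constant, not taken from the paper

weak = posterior_mean(counts, weights, scale_prior(means, 1.0))
strong = posterior_mean(counts, weights, scale_prior(means, SCALE * n_obs))

print("weak prior:  ", [round(p, 3) for p in weak])
print("strong prior:", [round(p, 3) for p in strong])
```

With the weak prior the estimate stays close to the observed frequencies, so the column remains a blend of both patterns. With the prior strength tied to the dataset size, the estimate is pulled toward the mean of the single best-matching component.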
Timothy L. Bailey, Michael Gribskov
Added 02 Nov 2010
Updated 02 Nov 2010
Type Conference
Year 1996
Where ISMB
Authors Timothy L. Bailey, Michael Gribskov