Sciweavers

BMCBI
2006

Automatic discovery of cross-family sequence features associated with protein function

Background: Methods for predicting protein function directly from amino acid sequences are useful tools in the study of uncharacterised protein families and in comparative genomics. Until now, this problem has been approached using machine learning techniques that attempt to predict membership, or otherwise, to predefined functional categories or subcellular locations. A potential drawback of this approach is that the human-designated functional classes may not accurately reflect the underlying biology, and consequently important sequence-to-function relationships may be missed.

Results: We show that a self-supervised data mining approach is able to find relationships between sequence features and functional annotations. No preconceived ideas about functional categories are required, and the training data is simply a set of protein sequences and their UniProt/Swiss-Prot annotations. The main technical aspect of the approach is the co-evolution of amino acid-based regular expressions a...
Added 10 Dec 2010
Updated 10 Dec 2010
Type Journal
Year 2006
Where BMCBI
Authors Markus Brameier, Josien Haan, Andrea Krings, Robert M. MacCallum