
CIBCB
2005
IEEE

The Homology Kernel: A Biologically Motivated Sequence Embedding into Euclidean Space

Part of the challenge of modeling protein sequences is their discrete nature. Many of the most powerful statistical and learning techniques apply to points in a Euclidean space but not directly to discrete sequences. One way to apply these techniques to protein sequences is to embed the sequences into a Euclidean space and then work with the embedded points. In this paper, we introduce a biologically motivated sequence embedding, the homology kernel, which takes into account intuitions from local alignment, sequence homology, and predicted secondary structure. We apply the homology kernel in several ways. We demonstrate that the homology kernel can be used for protein family classification and that it outperforms state-of-the-art methods for remote homology detection. We show that the homology kernel can be used for secondary structure prediction and is competitive with popular secondary structure prediction methods. Finally, we show how the homology kern...
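
To make the general idea of embedding a sequence into a Euclidean space concrete, here is a minimal Python sketch. It is not the homology kernel from the paper. It uses a plain k-mer count (spectrum-style) embedding as a stand-in, and the names `AMINO_ACIDS`, `spectrum_embedding`, `dot`, and `euclidean_distance` are chosen only for this illustration. It ignores local alignment, homology, and predicted secondary structure, which the abstract says the homology kernel takes into account.

```python
from collections import Counter
from itertools import product
import math

# Assumption for this sketch: the 20 standard amino acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def spectrum_embedding(seq, k=2):
    """Map a sequence to a vector of k-mer counts.

    The vector has one coordinate per possible k-mer over AMINO_ACIDS
    (20**k coordinates). This is a generic stand-in embedding, not the
    homology kernel described in the abstract.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] for p in product(AMINO_ACIDS, repeat=k)]


def dot(u, v):
    """Inner product of two embedded points (a kernel value)."""
    return sum(a * b for a, b in zip(u, v))


def euclidean_distance(u, v):
    """Euclidean distance between two embedded points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    # Toy sequences, invented for this example.
    x = spectrum_embedding("ACACDE")
    y = spectrum_embedding("ACDEAC")
    print(dot(x, y), euclidean_distance(x, y))
```

Once sequences are points in a vector space, any method that needs only inner products or distances (for example a support vector machine or nearest-neighbor search) can be applied to them, which is the motivation the abstract gives.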
Eleazar Eskin, Sagi Snir
Added 24 Jun 2010
Updated 24 Jun 2010
Type Conference
Year 2005
Where CIBCB
Authors Eleazar Eskin, Sagi Snir