SIMAP - a comprehensive database of pre-calculated protein sequence similarities, domains, annotations and clusters

Large-scale sequence comparison, used to predict protein function and to reconstruct evolutionary genesis, is still the most powerful tool in sequence analysis. Due to the exponential growth of the number of known protein sequences and the resulting quadratic growth of the similarity matrix, computing the Similarity Matrix of Proteins (SIMAP) has become a computationally intensive task. The SIMAP database provides a comprehensive and up-to-date precalculation of the protein sequence similarity matrix, sequence-based features and sequence clusters. As of September 2009, SIMAP covers 48 million proteins and more than 23 million nonredundant sequences. Novel features of SIMAP include the expansion of the sequence space by including databases such as ENSEMBL, as well as the integration of metagenomes based on their consistent processing and annotation. Furthermore, protein function predictions by Blast2GO are precalculated for all sequences in SIMAP and the data a...
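To make the quadratic growth concrete, the short Python sketch below counts the unordered sequence pairs in an all-against-all comparison. It assumes every pair of nonredundant sequences is compared exactly once, which the abstract does not spell out, and it uses the September 2009 figure of 23 million sequences as a lower bound. The function name `pairwise_comparisons` is purely illustrative and not part of SIMAP.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n distinct sequences (n choose 2)."""
    return n * (n - 1) // 2


# Lower bound from the abstract: "more than 23 million nonredundant sequences".
n_2009 = 23_000_000
pairs = pairwise_comparisons(n_2009)

# Doubling the number of sequences roughly quadruples the number of pairs.
ratio = pairwise_comparisons(2 * n_2009) / pairs

print(f"{n_2009:,} sequences -> {pairs:,} pairs")
print(f"Doubling the sequence count multiplies the pair count by {ratio:.2f}")
```

Under these assumptions the matrix already holds on the order of 2.6 × 10^14 pairs, which is why precalculating and storing the similarities is attractive compared with recomputing them on demand.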

Thomas Rattei, Patrick Tischler, Stefan Götz,

NAR 2010 | Protein | SIMAP | Similarity Matrix |

» The protein common interface database ProtCID a comprehensive database of interactions of...

» Clustering protein sequences with a novel metric transformed from sequence similarity scor...

» The Protein Information Resource PIR

» A functional hierarchical organization of the protein sequence space

» The PIRInternational Protein Sequence Database

» DBAli tools mining the protein structure space

» Extension of the COG and arCOG databases by amino acid and nucleotide sequences

» Super paramagnetic clustering of protein sequences

Post Info

Added	20 May 2011
Updated	20 May 2011
Type	Journal
Year	2010
Where	NAR
Authors	Thomas Rattei, Patrick Tischler, Stefan Götz, Marc-André Jehl, Jonathan Hoser, Roland Arnold, Ana Conesa, Hans-Werner Mewes
