Revealing and avoiding bias in semantic similarity scores for protein pairs

Background: Semantic similarity scores for protein pairs are widely applied in functional genomics research to find functional clusters of proteins, to predict protein functions and protein-protein interactions, and to identify putative disease genes. However, because some proteins, such as those related to diseases, tend to be studied more intensively, annotations are likely to be biased, which may affect applications based on semantic similarity measures. It is therefore necessary to evaluate the effect of this bias on semantic similarity scores between proteins and to find a method to avoid it. Results: First, we evaluated 14 commonly used semantic similarity scores for protein pairs and demonstrated that they correlated significantly with the number of annotation terms for the proteins (also known as the protein annotation length). These results suggest that current applications of semantic similarity scores between proteins might be unreliable. Then, to reduce thi...
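The kind of check described in the Results can be illustrated with a small sketch. Everything here is an assumption made for illustration: the protein names and term IDs are toy data, `shared_term_score` is a deliberately naive score (a raw count of shared terms) and not one of the 14 measures evaluated in the paper, and the annotation length of a pair is taken as the sum of the two proteins' term counts. The sketch computes a Spearman rank correlation between the pair scores and the pair annotation lengths.

```python
from itertools import combinations

# Hypothetical annotation sets: protein -> set of term IDs (toy data, not from the paper).
annotations = {
    "P1": {"T1", "T2", "T3", "T4", "T5", "T6"},
    "P2": {"T1", "T2", "T3", "T7", "T8"},
    "P3": {"T1", "T4"},
    "P4": {"T2"},
    "P5": {"T3", "T5", "T6", "T9"},
}

def shared_term_score(terms_a, terms_b):
    """Naive illustrative score: number of terms the two proteins share."""
    return len(terms_a & terms_b)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

pairs = list(combinations(sorted(annotations), 2))
scores = [shared_term_score(annotations[p], annotations[q]) for p, q in pairs]
# Assumption: pair annotation length = sum of the two proteins' term counts.
lengths = [len(annotations[p]) + len(annotations[q]) for p, q in pairs]

rho = spearman(scores, lengths)
print(f"Spearman correlation between score and annotation length: {rho:.3f}")
```

A clearly positive correlation on real annotation data would indicate that a score tends to favor heavily annotated proteins, which is the kind of bias the abstract describes. A real analysis would also need a significance test and one of the actual semantic similarity measures.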

BMCBI 2010 | Proteins | Semantic Similarity | Semantic Similarity Scores |

Post Info

Added	08 Dec 2010
Updated	08 Dec 2010
Type	Journal
Year	2010
Where	BMCBI
Authors	Jing Wang 0004, Xianxiao Zhou, Jing Zhu, Chenggui Zhou, Zheng Guo
