Background: Identification of the structural domains of proteins is important for our understanding of the organizational principles and mechanisms of protein folding, and for ins...
Abstract: We examine a new approach to building decision trees by introducing a geometric splitting criterion, based on the properties of a family of metrics on the space of partiti...
Huge amounts of data are stored in autonomous, geographically distributed sources. The discovery of previously unknown, implicit and valuable knowledge is a key aspect of the expl...
Abstract. This paper shows how Wikipedia and the semantic knowledge it contains can be exploited for document clustering. We first create a concept-based document representation b...
Anna Huang, David N. Milne, Eibe Frank, Ian H. Wit...
Abstract. Bayesian spam filters, in general, compute probability estimations for tokens either without considering the email areas of occurrences except the body or treating the s...