Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
In this paper we describe a new cluster model which is based on the concept of linear manifolds. The method identifies subsets of the data which are embedded in arbitrary oriented...
Background: Non-homology based methods such as phylogenetic profiles are effective for predicting functional relationships between proteins with no considerable sequence or struct...
Ryosuke Watanabe, Enrique Morett, Edgar E. Vallejo
This paper describes one of the first attempts to model the temporal structure of massive data streams in real-time using data stream clustering. Recently, many data stream clust...
Keeping track of changes in user interests from a document stream with a few relevance judgments is not an easy task. To tackle this problem, we propose a novel method that integr...