This paper investigates a new learning model in which the input data is corrupted with noise. We present a general statistical framework to tackle this problem. Based on the stati...
The problem of automatic classification of scientific texts is considered. Methods based on statistical analysis of probability distributions of scientific terms in texts are dis...
For a wide variety of classification algorithms, scalability to large databases can be achieved by observing that most algorithms are driven by a set of sufficient statistics that...
Background: Clustering methods are widely used on gene expression data to categorize genes with similar expression profiles. Finding an appropriate (dis)similarity measure is crit...
Kyungpil Kim, Shibo Zhang, Keni Jiang, Li Cai, In-...
A moving cluster is defined by a set of objects that move close to each other for a long time interval. Real-life examples are a group of migrating animals, a convoy of cars movin...