KDD
2001
ACM
The "DGX" distribution for mining massive, skewed data
Skewed distributions appear very often in practice. Unfortunately, the traditional Zipf distribution often fails to model them well. In this paper, we propose a new probability di...
Zhiqiang Bi, Christos Faloutsos, Flip Korn
KDD
2001
ACM
Mining massively incomplete data sets by conceptual reconstruction
Charu C. Aggarwal, Srinivasan Parthasarathy
KDD
2001
ACM
Evaluating the novelty of text-mined rules using lexical knowledge
Sugato Basu, Raymond J. Mooney, Krupakar V. Pasupu...
KDD
2001
ACM
Random projection in dimensionality reduction: applications to image and text data
Random projections have recently emerged as a powerful method for dimensionality reduction. Theoretical results indicate that the method preserves distances quite nicely; however,...
Ella Bingham, Heikki Mannila
KDD
2001
ACM
Classification of genes using probabilistic models of microarray expression profiles
Paul Pavlidis, Christopher Tang, William Stafford ...
KDD
2001
ACM
Hierarchical cluster analysis of SAGE data for cancer profiling
In this paper we present a method for clustering SAGE (Serial Analysis of Gene Expression) data to detect similarities and dissimilarities between different types of cancer on the...
Jörg Sander, Monica C. Sleumer, Raymond T. Ng
KDD
2001
ACM
Learning to recognize brain specific proteins based on low-level features from on-line prediction servers
During the last decade, the area of bioinformatics has produced an overwhelming amount of data, with the recently published draft of the human genome being the most prominent exam...
Henrik Boström, Joakim Cöster, Lars Aske...
KDD
2001
ACM
A scalable algorithm for clustering protein sequences
Valerie Guralnik, George Karypis
KDD
2001
ACM
A learning algorithm for string assembly
Mark K. Goldberg, Darren T. Lim, Malik Magdon-Isma...