Sciweavers

TSMC
2010
12 years 12 months ago
Distance Approximating Dimension Reduction of Riemannian Manifolds
We study the problem of projecting high-dimensional tensor data on an unspecified Riemannian manifold onto some lower dimensional subspace without much distorting the pairwise geo...
Changyou Chen, Junping Zhang, Rudolf Fleischer
KDD
2007
ACM
14 years 5 months ago
Finding low-entropy sets and trees from binary data
The discovery of subsets with special properties from binary data has been one of the key themes in pattern discovery. Pattern classes such as frequent itemsets stress the co-occu...
Eino Hinkkanen, Hannes Heikinheimo, Heikki Mannila...
ICPP
2000
IEEE
13 years 9 months ago
A Scalable Parallel Subspace Clustering Algorithm for Massive Data Sets
Clustering is a data mining problem which finds dense regions in a sparse multi-dimensional data set. The attribute values and ranges of these regions characterize the clusters. ...
Harsha S. Nagesh, Sanjay Goil, Alok N. Choudhary
COMAD
2008
13 years 6 months ago
Disk-Based Sampling for Outlier Detection in High Dimensional Data
We propose an efficient sampling based outlier detection method for large high-dimensional data. Our method consists of two phases. In the first phase, we combine a "sampling...
Timothy de Vries, Sanjay Chawla, Pei Sun, Gia Vinh...
BMCBI
2006
13 years 5 months ago
SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms
Background: The development of algorithms to infer the structure of gene regulatory networks based on expression data is an important subject in bioinformatics research. Validatio...
Tim Van den Bulcke, Koen Van Leemput, Bart Naudts,...