Sciweavers

1314 search results - page 106 / 263
» Approximate data mining in very large relational data
ICPP
2000
IEEE
15 years 6 months ago
A Scalable Parallel Subspace Clustering Algorithm for Massive Data Sets
Clustering is a data mining problem which finds dense regions in a sparse multi-dimensional data set. The attribute values and ranges of these regions characterize the clusters. ...
Harsha S. Nagesh, Sanjay Goil, Alok N. Choudhary
EDBT
2004
ACM
16 years 1 months ago
Sketch-Based Multi-query Processing over Data Streams
Recent years have witnessed an increasing interest in designing algorithms for querying and analyzing streaming data (i.e., data that is seen only once in a fixed order) ...
Alin Dobra, Minos N. Garofalakis, Johannes Gehrke,...
SIGMOD
2006
ACM
16 years 1 months ago
Modeling skew in data streams
Data stream applications have made use of statistical summaries to reason about the data using nonparametric tools such as histograms, heavy hitters, and join sizes. However, rela...
Flip Korn, S. Muthukrishnan, Yihua Wu
DKE
2006
15 years 1 months ago
XML structural delta mining: Issues and challenges
Recently, there has been increasing research effort in XML data mining. These efforts have largely assumed that XML documents are static. However, in reality, the documents are ...
Qiankun Zhao, Ling Chen 0002, Sourav S. Bhowmick, ...
ECAI
2004
Springer
15 years 5 months ago
Avoiding Data Overfitting in Scientific Discovery: Experiments in Functional Genomics
Functional genomics is a typical scientific discovery domain characterized by a very large number of attributes (genes) relative to the number of examples (observations). The dang...
Dragan Gamberger, Nada Lavrac