Sciweavers

11 search results - page 1 / 3
KDD
1998
ACM
On the Efficient Gathering of Sufficient Statistics for Classification from Large SQL Databases
For a wide variety of classification algorithms, scalability to large databases can be achieved by observing that most algorithms are driven by a set of sufficient statistics that...
Goetz Graefe, Usama M. Fayyad, Surajit Chaudhuri
CIDR
2009
RIOT: I/O-Efficient Numerical Computing without SQL
R is a numerical computing environment that is widely popular for statistical data analysis. Like many such environments, R performs poorly for large datasets whose sizes exceed t...
Yi Zhang 0011, Herodotos Herodotou, Jun Yang 0001
ICDE
2008
IEEE
COLR-Tree: Communication-Efficient Spatio-Temporal Indexing for a Sensor Data Web Portal
We present COLR-Tree, an abstraction layer designed to support efficient spatio-temporal queries on live data gathered from a large collection of sensors. We use COLR-Tr...
Yanif Ahmad, Suman Nath
BMCBI
2011
BICEPP: an example-based statistical text mining method for predicting the binary characteristics of drugs
Background: The identification of drug characteristics is a clinically important task, but it requires much expert knowledge and consumes substantial resources. We have developed ...
Frank P. Y. Lin, Stephen Anthony, Thomas M. Polase...
KDD
2000
ACM
Interactive exploration of very large relational datasets through 3D dynamic projections
The grand tour, one of the most popular methods for multidimensional data exploration, is based on orthogonally projecting multidimensional data to a sequence of lower dimensional...
Li Yang