Sciweavers

» Clustering performance data efficiently at massive scales
KDD
2006
ACM
Spatial scan statistics: approximations and performance study
Spatial scan statistics are used to determine hotspots in spatial data, and are widely used in epidemiology and biosurveillance. In recent years, there has been much effort invest...
Deepak Agarwal, Andrew McGregor, Jeff M. Phillips,...
VLDB
2002
ACM
I/O-Conscious Data Preparation for Large-Scale Web Search Engines
Given that commercial search engines cover billions of web pages, efficiently managing the corresponding volumes of disk-resident data needed to answer user queries quickly is a f...
Maxim Lifantsev, Tzi-cker Chiueh
CLUSTER
2009
IEEE
Analyzing massive astrophysical datasets: Can Pig/Hadoop or a relational DBMS help?
As the datasets used to fuel modern scientific discovery grow increasingly large, they become increasingly difficult to manage using conventional software. Parallel d...
Sarah Loebman, Dylan Nunley, YongChul Kwon, Bill H...
ICCS
2004
Springer
Chunking-Coordinated-Synthetic Approaches to Large-Scale Kernel Machines
We consider a kernel-based approach to nonlinear classification that coordinates the generation of “synthetic” points (to be used in the kernel) with “chunking” (working wi...
Francisco J. González-Castaño, Rober...
KDD
2002
ACM
ANF: a fast and scalable tool for data mining in massive graphs
Graphs are an increasingly important data source, with such important graphs as the Internet and the Web. Other familiar graphs include CAD circuits, phone records, gene sequences...
Christopher R. Palmer, Phillip B. Gibbons, Christo...