Sciweavers

IPPS
2000
IEEE
13 years 9 months ago
PaDDMAS: Parallel and Distributed Data Mining Application Suite
Discovering complex associations, anomalies and patterns in distributed data sets is gaining popularity in a range of scientific, medical and business applications. Various algor...
Omer F. Rana, David W. Walker, Maozhen Li, Steven ...
IDEAS
2000
IEEE
13 years 9 months ago
Bulk Loading a Data Warehouse Built Upon a UB-Tree
This paper considers the issue of bulk loading large data sets for the UB-Tree, a multidimensional index structure. Especially in data warehousing (DW), data mining and OLAP it is...
Robert Fenk, Akihiko Kawakami, Volker Markl, Rudol...
ICPR
2000
IEEE
13 years 9 months ago
Scaling-Up Support Vector Machines Using Boosting Algorithm
In recent years, support vector machines (SVMs) have been successfully applied to solve a large number of classification problems. Training an SVM, usually posed as a quadrati...
Dmitry Pavlov, Jianchang Mao, Byron Dom
DASFAA
2010
IEEE
13 years 9 months ago
Transitivity-Preserving Skylines for Partially Ordered Domains
The skyline of a set P of multi-dimensional points (tuples) consists of those points in P for which no clearly better point in P exists, using component-wise comparison on domains ...
Henning Köhler, Kai Zheng, Jing Yang, Xiaofan...
IPPS
2002
IEEE
13 years 9 months ago
Predicting the Performance of Wide Area Data Transfers
As Data Grids become more commonplace, large data sets are being replicated and distributed to multiple sites, leading to the problem of determining which replica can be accessed ...
Sudharshan Vazhkudai, Jennifer M. Schopf, Ian T. F...
ICDM
2002
IEEE
13 years 9 months ago
O-Cluster: Scalable Clustering of Large High Dimensional Data Sets
Clustering large data sets of high dimensionality has always been a serious challenge for clustering algorithms. Many recently developed clustering algorithms have attempted to ad...
Boriana L. Milenova, Marcos M. Campos
IPPS
2003
IEEE
13 years 9 months ago
Simulation of Dynamic Data Replication Strategies in Data Grids
Data Grids provide geographically distributed resources for large-scale data-intensive applications that generate large data sets. However, ensuring efficient access to such huge...
Houda Lamehamedi, Zujun Shentu, Boleslaw K. Szyman...
KDD
2004
ACM
13 years 10 months ago
Programming the K-means clustering algorithm in SQL
Using SQL has not been considered an efficient and feasible way to implement data mining algorithms. Although this is true for many data mining, machine learning and statistical a...
Carlos Ordonez
SSD
2005
Springer
13 years 10 months ago
Selectivity Estimation of High Dimensional Window Queries via Clustering
Query optimization is an important functionality of modern database systems and is often based on estimating the selectivity of queries before actually executing them. Well-...
Christian Böhm, Hans-Peter Kriegel, Peer Kr...
ISMDA
2005
Springer
13 years 10 months ago
Simultaneous Scheduling of Replication and Computation for Bioinformatic Applications on the Grid
One of the first motivations for using grids comes from applications that manage large data sets, for example in High Energy Physics or Life Sciences. To improve the glob...
Frederic Desprez, Antoine Vernois, Christophe Blan...