Sciweavers

SDM
2003
SIAM
Scalable, Balanced Model-based Clustering
This paper presents a general framework for adapting any generative (model-based) clustering algorithm to provide balanced solutions, i.e., clusters of comparable sizes. Partition...
Shi Zhong, Joydeep Ghosh
Approximate Query Answering by Model Averaging
In earlier work we have introduced and explored a variety of different probabilistic models for the problem of answering selectivity queries posed to large sparse binary data set...
Dmitry Pavlov, Padhraic Smyth
STAMP: On Discovery of Statistically Important Pattern Repeats in Long Sequential Data
In this paper, we focus on mining periodic patterns allowing some degree of imperfection in the form of random replacement from a perfect periodic pattern. In InfoMiner+, we propo...
Jiong Yang, Wei Wang 0010, Philip S. Yu
ATLaS: A Native Extension of SQL for Data Mining
A lack of power and extensibility in their query languages has seriously limited the generality of DBMSs and hampered their ability to support data mining applications. Thus, ther...
Haixun Wang, Carlo Zaniolo
Detection of Underrepresented Biological Sequences using Class-Conditional Distribution Models
A labeled sequence data set related to a certain biological property is often biased and, therefore, does not completely capture its diversity in nature. To reduce this sampling b...
Slobodan Vucetic, Dragoljub Pokrajac, Hongbo Xie, ...
An Outlier-based Data Association Method for Linking Criminal Incidents
Serial criminals are a major threat in modern society. Associating incidents committed by the same offender is of great importance in studying serial criminals. In this paper,...
Song Lin, Donald E. Brown
A Comparative Study of Anomaly Detection Schemes in Network Intrusion Detection
Aleksandar Lazarevic, Levent Ertöz, Vipin Kum...
ApproxMAP: Approximate Mining of Consensus Sequential Patterns
Conventional sequential pattern mining methods may meet inherent difficulties in mining databases with long sequences and noise. They may generate a huge number of short and trivi...
Hye-Chung Kum, Jian Pei, Wei Wang 0010, Dean Dunca...
Estimation of Topological Dimension
We present two extensions of the algorithm by Broomhead et al. [2], which is based on the idea that singular values that scale linearly with the radius of the data ball can be explo...
Douglas R. Hundley, Michael J. Kirby
Mixture Models and Frequent Sets: Combining Global and Local Methods for 0-1 Data
We study the interaction between global and local techniques in data mining. Specifically, we study the collections of frequent sets in clusters produced by a probabilistic clust...
Jaakko Hollmén, Jouni K. Seppänen, Hei...