Sciweavers

SDM
2007
SIAM
Segmentations with Rearrangements
Sequence segmentation is a central problem in the analysis of sequential and time-series data. In this paper we introduce and we study a novel variation to the segmentation proble...
Aristides Gionis, Evimaria Terzi
A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions
In recent years, there have been some interesting studies on predictive modeling in data streams. However, most such studies assume relatively balanced and stable data streams but...
Jing Gao, Wei Fan, Jiawei Han, Philip S. Yu
Adaptive Concept Learning through Clustering and Aggregation of Relational Data
We introduce a new approach for Clustering and Aggregating Relational Data (CARD). We assume that data is available in a relational form, where we only have information about the ...
Hichem Frigui, Cheul Hwang
Mining Visual and Textual Data for Constructing a Multi-Modal Thesaurus
We propose an unsupervised approach to learn associations between continuous-valued attributes from different modalities. These associations are used to construct a multi-modal t...
Hichem Frigui, Joshua Caudill
Understanding and Utilizing the Hierarchy of Abnormal BGP Events
Abnormal events, such as security attacks, misconfigurations, or electricity failures, could have severe consequences toward the normal operation of the Border Gateway Protocol (...
Dejing Dou, Jun Li, Han Qin, Shiwoong Kim, Sheng Z...
Towards Attack-Resilient Geometric Data Perturbation
Data perturbation is a popular technique for privacy-preserving data mining. The major challenge of data perturbation is balancing privacy protection and data quality, which are no...
Keke Chen, Gordon Sun, Ling Liu
Preventing Information Leaks in Email
The widespread use of email has raised serious privacy concerns. A critical issue is how to prevent email information leaks, i.e., when a message is accidentally addressed to non-...
Vitor R. Carvalho, William W. Cohen
A PAC Bound for Approximate Support Vector Machines
We study a class of algorithms that speed up the training process of support vector machines (SVMs) by returning an approximate SVM. We focus on algorithms that reduce the size of...
Dongwei Cao, Daniel Boley
WAT: Finding Top-K Discords in Time Series Database
Finding discords in time series database is an important problem in a great variety of applications, such as space shuttle telemetry, mechanical industry, biomedicine, and financ...
Yingyi Bu, Oscar Tat-Wing Leung, Ada Wai-Chee Fu, ...
Maximizing the Area under the ROC Curve with Decision Lists and Rule Sets
Decision lists (or ordered rule sets) have two attractive properties compared to unordered rule sets: they require a simpler classification procedure and they allow for a more co...
Henrik Boström