Sciweavers

KDD
2002
ACM
Scaling multi-class support vector machines using inter-class confusion
Support vector machines (SVMs) excel at two-class discriminative learning problems. They often outperform generative classifiers, especially those that use inaccurate generative m...
Shantanu Godbole, Sunita Sarawagi, Soumen Chakraba...
KDD
2002
ACM
Privacy preserving mining of association rules
We present a framework for mining association rules from transactions consisting of categorical items where the data has been randomized to preserve privacy of individual transact...
Alexandre V. Evfimievski, Ramakrishnan Srikant, Ra...
KDD
2002
ACM
Web site mining: a new way to spot competitors, customers and suppliers in the world wide web
When automatically extracting information from the world wide web, most established methods focus on spotting single HTML documents. However, the problem of spotting complete web s...
Martin Ester, Hans-Peter Kriegel, Matthias Schuber...
KDD
2002
ACM
From run-time behavior to usage scenarios: an interaction-pattern mining approach
A key challenge facing IT organizations today is their evolution towards adopting e-business practices that gives rise to the need for reengineering their underlying software syst...
Mohammad El-Ramly, Eleni Stroulia, Paul G. Sorenso...
KDD
2002
ACM
SyMP: an efficient clustering approach to identify clusters of arbitrary shapes in large data sets
We propose a new clustering algorithm, called SyMP, which is based on synchronization of pulse-coupled oscillators. SyMP represents each data point by an Integrate-and-Fire oscill...
Hichem Frigui
KDD
2002
ACM
SECRET: a scalable linear regression tree algorithm
Recently there has been an increasing interest in developing regression models for large datasets that are both accurate and easy to interpret. Regressors that have these properti...
Alin Dobra, Johannes Gehrke
KDD
2002
ACM
Enhanced word clustering for hierarchical text classification
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
KDD
2002
ACM
Pattern discovery in sequences under a Markov assumption
In this paper we investigate the general problem of discovering recurrent patterns that are embedded in categorical sequences. An important real-world problem of this nature is mo...
Darya Chudova, Padhraic Smyth
KDD
2002
ACM
DualMiner: a dual-pruning algorithm for itemsets with constraints
Cristian Bucila, Johannes Gehrke, Daniel Kifer, Wa...
KDD
2002
ACM
Learning to match and cluster large high-dimensional data sets for data integration
Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...
William W. Cohen, Jacob Richman