
SDM
2008
SIAM
Similarity Measures for Categorical Data: A Comparative Evaluation
Measuring similarity or distance between two entities is a key step for several data mining and knowledge discovery tasks. The notion of similarity for continuous data is relative...
Shyam Boriah, Varun Chandola, Vipin Kumar
SDM
2008
SIAM
A RELIEF Based Feature Extraction Algorithm
RELIEF is considered one of the most successful algorithms for assessing the quality of features, owing to its simplicity and effectiveness. It has recently been proved that RELIEF i...
Yijun Sun, Dapeng Wu
SDM
2008
SIAM
Constrained Co-clustering of Gene Expression Data
In many applications, the expert interpretation of co-clustering is easier than that of mono-dimensional clustering. Co-clustering aims at computing a bi-partition that is a collection...
Ruggero G. Pensa, Jean-François Boulicaut
SDM
2008
SIAM
Large-Scale Many-Class Learning
In many multiclass learning scenarios, the number of classes is relatively large (thousands,...), or the space and time efficiency of the learning system can be crucial. We invest...
Omid Madani, Michael Connor
SDM
2008
SIAM
Simultaneous Unsupervised Learning of Disparate Clusterings
Most clustering algorithms produce a single clustering for a given data set even when the data can be clustered naturally in multiple ways. In this paper, we address the difficult...
Prateek Jain, Raghu Meka, Inderjit S. Dhillon
SDM
2008
SIAM
Semi-Supervised Learning Based on Semiparametric Regularization
Semi-supervised learning plays an important role in the recent literature on machine learning and data mining, and the developed semi-supervised learning techniques have led to many...
Zhen Guo, Zhongfei (Mark) Zhang, Eric P. Xing, Chr...
SDM
2008
SIAM
Clustering from Constraint Graphs
In constrained clustering it is common to model the pairwise constraints as edges on the graph of observations. Using results from graph theory, we analyze such constraint graphs ...
Ari Freund, Dan Pelleg, Yossi Richter
SDM
2008
SIAM
Mining and Ranking Generators of Sequential Patterns
Sequential pattern mining, first proposed by Agrawal and Srikant, has received intensive research attention due to its wide applicability in many real-life domains. Various improvements...
David Lo, Siau-Cheng Khoo, Jinyan Li
SDM
2008
SIAM
A General Framework for Estimating Similarity of Datasets and Decision Trees: Exploring Semantic Similarity of Decision Trees
Decision trees are among the most popular pattern types in data mining due to their intuitive representation. However, little attention has been given to the definition of measure...
Irene Ntoutsi, Alexandros Kalousis, Yannis Theodor...
SDM
2008
SIAM
Efficient Maximum Margin Clustering via Cutting Plane Algorithm
Maximum margin clustering (MMC) is a recently proposed clustering method that extends the theory of support vector machines to the unsupervised scenario and aims at finding the m...
Bin Zhao, Fei Wang, Changshui Zhang