Sciweavers

PAKDD
2005
ACM
Increasing Classification Accuracy by Combining Adaptive Sampling and Convex Pseudo-Data
The availability of microarray data has enabled several studies on the application of aggregated classifiers for molecular classification. We present a combination of classifier ag...
Chia Huey Ooi, Madhu Chetty
Adjusting Mixture Weights of Gaussian Mixture Model via Regularized Probabilistic Latent Semantic Analysis
Mixture models, such as Gaussian Mixture Model, have been widely used in many applications for modeling data. Gaussian mixture model (GMM) assumes that data points are generated fr...
Luo Si, Rong Jin
Conditional Random Fields for Transmembrane Helix Prediction
It is estimated that 20% of genes in the human genome encode for integral membrane proteins (IMPs) and some estimates are much higher. IMPs control a broad range of event...
Lior Lukov, Sanjay Chawla, W. Bret Church
Covariance and PCA for Categorical Variables
Covariances from categorical variables are defined using a regular simplex expression for categories. The method follows the variance definition by Gini, and it gives the covaria...
Hirotaka Niitsuma, Takashi Okada
Online Algorithms for Mining Inter-stream Associations from Large Sensor Networks
We study the problem of mining frequent value sets from a large sensor network. We discuss how sensor stream data could be represented that facilitates efficient online mining and ...
K. K. Loo, Ivy Tong, Ben Kao
A Two-Phase Algorithm for Fast Discovery of High Utility Itemsets
Traditional association rules mining cannot meet the demands arising from some real applications. By considering the different values of individual items as utilities, utility mini...
Ying Liu, Wei-keng Liao, Alok N. Choudhary
SETRED: Self-training with Editing
Self-training is a semi-supervised learning algorithm in which a learner keeps on labeling unlabeled examples and retraining itself on an enlarged labeled training set. Since the s...
Ming Li, Zhi-Hua Zhou
A Framework for Incorporating Class Priors into Discriminative Classification
Discriminative and generative methods provide two distinct approaches to machine learning classification. One advantage of generative approaches is that they naturally mo...
Rong Jin, Yi Liu
Subgroup Discovery Techniques and Applications
This paper presents the advances in subgroup discovery and the ways to use subgroup discovery to generate actionable knowledge for decision support. Actionable knowledge is explici...
Nada Lavrac
Approximated Clustering of Distributed High-Dimensional Data
In many modern application ranges high-dimensional feature vectors are used to model complex real-world objects. Often these objects reside on different local sites. In this paper,...
Hans-Peter Kriegel, Peter Kunath, Martin Pfeifle, ...