Sciweavers

SDM
2010
SIAM
The Generalized Dimensionality Reduction Problem
The dimensionality reduction problem has been widely studied in the database literature because of its application for concise data representation in a variety of database applica...
Charu C. Aggarwal
Direct Density Ratio Estimation with Dimensionality Reduction
Methods for directly estimating the ratio of two probability density functions without going through density estimation have been actively explored recently since they can be used...
Masashi Sugiyama, Satoshi Hara, Paul von Büna...
Confidence-Based Feature Acquisition to Minimize Training and Test Costs
We present Confidence-based Feature Acquisition (CFA), a novel supervised learning method for acquiring missing feature values when there is missing data at both training and test...
Marie desJardins, James MacGlashan, Kiri L. Wagsta...
Making k-means Even Faster
The k-means algorithm is widely used for clustering, compressing, and summarizing vector data. In this paper, we propose a new acceleration for exact k-means that gives the same a...
Greg Hamerly
Evaluating Query Result Significance in Databases via Randomizations
Many sorts of structured data are commonly stored in a multi-relational format of interrelated tables. Under this relational model, exploratory data analysis can be done by using ...
Markus Ojala, Gemma C. Garriga, Aristides Gionis, ...
Temporal Collaborative Filtering with Bayesian Probabilistic Tensor Factorization
Real-world relational data are seldom stationary, yet traditional collaborative filtering algorithms generally rely on this assumption. Motivated by our sales prediction problem, ...
Liang Xiong, Xi Chen, Tzu-Kuo Huang, Jeff Schneide...
Adaptive Informative Sampling for Active Learning
Many approaches to active learning involve periodically training one classifier and choosing data points with the lowest confidence. An alternative approach is to periodically cho...
Zhenyu Lu, Xindong Wu, Josh Bongard
Fast Stochastic Frank-Wolfe Algorithms for Nonlinear SVMs
The high computational cost of nonlinear support vector machines has limited their usability for large-scale problems. We propose two novel stochastic algorithms to tackle this pr...
Hua Ouyang, Alexander Gray
Residual Bayesian Co-clustering for Matrix Approximation
In recent years, matrix approximation for missing value prediction has emerged as an important problem in a variety of domains such as recommendation systems, e-commerce and onlin...
Hanhuai Shan, Arindam Banerjee
Semi-supervised Bio-named Entity Recognition with Word-Codebook Learning
We describe a novel semi-supervised method called Word-Codebook Learning (WCL), and apply it to the task of bio-named entity recognition (bioNER). Typical bioNER systems can be seen...
Pavel P. Kuksa, Yanjun Qi