Sciweavers

SDM
2008
SIAM
The Asymmetric Approximate Anytime Join: A New Primitive with Applications to Data Mining
It has long been noted that many data mining algorithms can be built on top of join algorithms. This has led to a wealth of recent work on efficiently supporting such joins with ...
Lexiang Ye, Xiaoyue Wang, Dragomir Yankov, Eamonn ...
SDM
2008
SIAM
Mining Sequence Classifiers for Early Prediction
Supervised learning on sequence data, also known as sequence classification, has been well recognized as an important data mining task with many significant applications. Since te...
Zhengzheng Xing, Jian Pei, Guozhu Dong, Philip S. ...
SDM
2008
SIAM
A Stagewise Least Square Loss Function for Classification
This paper presents a stagewise least square (SLS) loss function for classification. It uses a least square form within each stage to approximate a bounded monotonic nonconvex los...
Shuang-Hong Yang, Bao-Gang Hu
SDM
2008
SIAM
Efficient Distribution Mining and Classification
We define and solve the problem of "distribution classification", and, in general, "distribution mining". Given n distributions (i.e., clouds) of multi-dimensi...
Yasushi Sakurai, Rosalynn Chong, Lei Li, Christos ...
SDM
2008
SIAM
A General Model for Multiple View Unsupervised Learning
Multiple view data, which have multiple representations from different feature spaces or graph spaces, arise in various data mining applications such as information retrieval, bio...
Bo Long, Philip S. Yu, Zhongfei (Mark) Zhang
SDM
2008
SIAM
Graph Mining with Variational Dirichlet Process Mixture Models
Graph data such as chemical compounds and XML documents are getting more common in many application domains. A main difficulty of graph data processing lies in the intrinsic high ...
Koji Tsuda, Kenichi Kurihara
SDM
2008
SIAM
Robust Clustering in Arbitrarily Oriented Subspaces
In this paper, we propose an efficient and effective method to find arbitrarily oriented subspace clusters by mapping the data space to a parameter space defining the set of possi...
Elke Achtert, Christian Böhm, Jörn David...
SDM
2008
SIAM
Direct Density Ratio Estimation for Large-scale Covariate Shift Adaptation
Covariate shift is a situation in supervised learning where training and test inputs follow different distributions even though the functional relation remains unchanged. A common...
Yuta Tsuboi, Hisashi Kashima, Shohei Hido, Steffen...
SDM
2008
SIAM
A Spamicity Approach to Web Spam Detection
Web spam, which refers to any deliberate actions bringing to selected web pages an unjustifiable favorable relevance or importance, is one of the major obstacles for high quality ...
Bin Zhou, Jian Pei, ZhaoHui Tang