KDD
2008
ACM
Weighted graphs and disconnected components: patterns and a generator
The vast majority of earlier work has focused on graphs which are both connected (typically by ignoring all but the giant connected component), and unweighted. Here we study numer...
Mary McGlohon, Leman Akoglu, Christos Faloutsos
Structured metric learning for high dimensional problems
The success of popular algorithms such as k-means clustering or nearest neighbor searches depends on the assumption that the underlying distance functions reflect domain-specific n...
Jason V. Davis, Inderjit S. Dhillon
Mining preferences from superior and inferior examples
Mining user preferences plays a critical role in many important applications such as customer relationship management (CRM), product and service recommendation, and marketing camp...
Bin Jiang, Jian Pei, Xuemin Lin, David W. Cheung, ...
Scalable and near real-time burst detection from eCommerce queries
In large scale online systems like Search, eCommerce, or social network applications, user queries represent an important dimension of activities that can be used to study the imp...
Nish Parikh, Neel Sundaresan
Multi-class cost-sensitive boosting with p-norm loss functions
We propose a family of novel cost-sensitive boosting methods for multi-class classification by applying the theory of gradient boosting to p-norm based cost functionals. We establ...
Aurelie C. Lozano, Naoki Abe
Partitioned logistic regression for spam filtering
Naive Bayes and logistic regression perform well in different regimes. While the former is a very simple generative model which is efficient to train and performs well empirically...
Ming-wei Chang, Wen-tau Yih, Christopher Meek
Semi-supervised learning with data calibration for long-term time series forecasting
Many time series prediction methods have focused on single step or short term prediction problems due to the inherent difficulty in controlling the propagation of errors from one ...
Haibin Cheng, Pang-Ning Tan
Quantitative evaluation of approximate frequent pattern mining algorithms
Traditional association mining algorithms use a strict definition of support that requires every item in a frequent itemset to occur in each supporting transaction. In real-life d...
Rohit Gupta, Gang Fang, Blayne Field, Michael Stei...
Stream prediction using a generative model based on frequent episodes in event sequences
This paper presents a new algorithm for sequence prediction over long categorical event streams. The input to the algorithm is a set of target event types whose occurrences we wis...
Srivatsan Laxman, Vikram Tankasali, Ryen W. White