Given a dataset, each element of which labeled by one of k labels, we construct by a very fast algorithm, a k-category proximal support vector machine (PSVM) classifier. Proximal s...
Accurate, well-calibrated estimates of class membership probabilities are needed in many supervised learning applications, in particular when a cost-sensitive decision must be mad...
The biological sciences are undergoing an explosion in the amount of available data. New data analysis methods are needed to deal with the data. We present work using KDD to analys...
Surveys are an important part of marketing and customer relationship management, and open answers (i.e., answers to open questions) in particular may contain valuable information ...
Skewed distributions appear very often in practice. Unfortunately, the traditional Zipf distribution often fails to model them well. In this paper, we propose a new probability di...