Learning from imbalanced datasets presents a convoluted problem both from the modeling and cost standpoints. In particular, when a class is of great interest but occurs relatively...
Nitesh V. Chawla, David A. Cieslak, Lawrence O. Ha...
Re-identification is a major privacy threat to public datasets containing individual records. Many privacy protection algorithms rely on generalization and suppression of "qu...
y abstract, it is grounded in the experience of many markets. As I will illustrate, it explains much strategic behavior. Information spreads With a bit of effort, any technically s...
Good-Turing adjustments of word frequencies are an important tool in natural language modeling. In particular, for any sample of words, there is a set of words not occuring in tha...
Learning temporal causal structures between time series is one of the key tools for analyzing time series data. In many real-world applications, we are confronted with Irregular T...