Mining data streams is important in both science and commerce. Two major challenges are (1) the data may grow without limit so that it is difficult to retain a long history; and (...
In emerging data stream applications, data sources are typically distributed. Evaluating multi-join queries over streams from different sources may incur large communica...
Multi-instance multi-label learning (MIML) refers to learning problems in which each example is represented by a bag (collection) of instances and is labeled with multiple labels.
...
Rong Jin (Michigan State University), Shijun Wang...
Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This type of data arises natur...
We present our experience in applying a rule induction technique to an extremely imbalanced pharmaceutical data set. We focus on using a variety of performance measures to evaluate...