Multiple instance (MI) learning is a recent learning paradigm that is more flexible than standard supervised learning in handling label ambiguity. It has been u...
A fundamental problem in data management is to draw a sample of a large data set for approximate query answering, selectivity estimation, and query planning. With large, streamin...
Graham Cormode, S. Muthukrishnan, Ke Yi, Qin Zhang
We address the problem of computing approximate answers to continuous sliding-window joins over data streams when the available memory may be insufficient to keep the entire join...
Violations of functional dependencies (FDs) are common in practice, often arising in the context of data integration or Web data extraction. Resolving these violations is known to...
Regular expression pattern matching is widely used in computational biology. Searching through a database of sequences for a motif (a simple regular expression), or its variations...