The ability to efficiently discover information using partial knowledge (for example keywords, attributes or ranges) is important in large, decentralized, resource sharing distri...
—In cloud computing, clients usually outsource their data to the cloud storage servers to reduce the management costs. While those data may contain sensitive personal information...
A fundamental problem in data management is to draw a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streamin...
Graham Cormode, S. Muthukrishnan, Ke Yi, Qin Zhang
We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating o...
In many database applications involving string data, it is common to have near neighbor queries (asking for strings that are similar to a query string) or nearest neighbor queries...