In this paper we explore a progressive sampling algorithm, called Sampling Error Estimation (SEE), which aims to identify an appropriate sample size for mining association rules. S...
Recent advances in flash media have made it an attractive alternative for data storage in a wide spectrum of computing devices, such as embedded sensors, mobile phones, PDAs...
Recently, there has been growing interest in random sampling from online hidden databases. These databases reside behind form-like web interfaces which allow users to execute sear...
We consider the problem of creating a sample view of a database table. A sample view is an indexed, materialized view that permits efficient sampling from an arbitrary range query...
In this work we tackle the open problem of self-join size (SJS) estimation in a large-scale Distributed Data System, where tuples of a relation are distributed over data nodes whic...