There is clear demand for a global spatial public-domain roads data set with improved geographic and temporal coverage, consistent coding of road types, and clear documentation of...
Andrew Nelson, Alexander de Sherbinin, France...
Statistical estimation and approximate query processing have become increasingly prevalent applications for database systems. However, approximation is usually of little use witho...
We provide several new sampling-based estimators of the number of distinct values of an attribute in a relation. We compare these new estimators to estimators from the database an...
Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, L...
We address the fundamental question: what does it mean for data in a database to be of high quality? We motivate our discussion with examples, where traditional views on data quali...
We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalizatio...