Splitting compound words has proved to be useful in areas such as Machine Translation, Speech Recognition or Information Retrieval (IR). Furthermore, real-time IR systems (such as...
Abstract. We propose a framework in which query sizes can be estimated from arbitrary statistical assertions on the data. In its most general form, a statistical assertion states t...
Large highly distributed data sets are poorly supported by current query technologies. Applications such as endsystembased network management are characterized by data stored on l...
Dushyanth Narayanan, Austin Donnelly, Richard Mort...
Many scientific, financial, data mining and sensor network applications need to work with continuous, rather than discrete data e.g., temperature as a function of location, or sto...
It is often useful to get high-level views of datasets in order to identify areas of interest worthy of further exploration. In relational databases, the high-level view can be de...