Clustering is the process of locating patterns in large data sets. It is an active research area that provides value to scientific as well as business applications. Practical clust...
For large data sets in medicine and science, efficient isosurface extraction and rendering is crucial for interactive visualization. Previous GPU acceleration techniques have been...
There is growing interest in applying Bayesian techniques to NLP problems. There are a number of different estimators for Bayesian models, and it is useful to know what kinds of t...
This paper reports on the benefits of largescale statistical language modeling in machine translation. A distributed infrastructure is proposed which we use to train on up to 2 t...
Thorsten Brants, Ashok C. Popat, Peng Xu, Franz Jo...
Support Vector Machines (SVM) have gained profound interest amidst the researchers. One of the important issues concerning SVM is with its application to large data sets. It is rec...
In this paper we present algorithms for identifying interesting subsets of a given database of records. In many real life applications, it is important to automatically discover s...
Visual similarity matrices (VSMs) are a common technique for visualizing graphs and other types of relational data. While traditionally used for small data sets or well-ordered la...
Christopher Mueller, Benjamin Martin, Andrew Lumsd...
Information visualization is essential in making sense out of large data sets. Often, high-dimensional data are visualized as a collection of points in 2-dimensional space through...
Traditional optimizers fail to pick good execution plans, when faced with increasingly complex queries and large data sets. This failure is even more acute in the context of XQuery...
Riham Abdel Kader, Maurice van Keulen, Peter A. Bo...
This paper explores the use of multi-dimensional trees to provide spatial and temporal e ciencies in imaging large data sets. Each node of the tree contains a model of the data in...