All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...
Given that commercial search engines cover billions of web pages, efficiently managing the corresponding volumes of disk-resident data needed to answer user queries quickly is a f...
Tourist photographs constitute a large part of the images uploaded to photo-sharing platforms. But filtering methods are needed before one can extract useful knowledge from noisy ...
Adrian Popescu, Gregory Grefenstette, Pierre-Alain...
Motivated by the increasing need to analyze complex, uncertain multidimensional data, this paper proposes probabilistic OLAP queries that are computed using probability distributio...
Igor Timko, Curtis E. Dyreson, Torben Bach Pederse...
Outlier detection has recently become an important problem in many industrial and financial applications. In this paper, a novel feature bagging approach for detecting outliers in...