This paper investigates the problem of Partitioning Skew1 in MapReduce-based system. Our studies with Hadoop, a widely used MapReduce implementation, demonstrate that the presence ...
Shadi Ibrahim, Hai Jin, Lu Lu, Song Wu, Bingsheng ...
Motivation: High-throughput methods for detecting molecular interactions have produced large sets of biological network data with much more yet to come. Analogous to sequence alig...
An unsupervised discriminative training procedure is proposed for estimating a language model (LM) for machine translation (MT). An English-to-English synchronous context-free gra...
Zhifei Li, Ziyuan Wang, Sanjeev Khudanpur, Jason E...
Machine-learning algorithms are employed in a wide variety of applications to extract useful information from data sets, and many are known to suffer from superlinear increases in ...
Karthik Nagarajan, Brian Holland, Alan D. George, ...
Ambiguous queries constitute a significant fraction of search instances and pose real challenges to web search engines. With current approaches the top results for these queries ...