g:Profiler (http://biit.cs.ut.ee/gprofiler/) is a public web server for characterising and manipulating gene lists resulting from mining high-throughput genomic data. g:Profiler h...
The random forest classifier has become the workhorse of the visual recognition community. It predicts the class label of an input by aggregating the votes of multiple tree...
A heterogeneous information network is a network composed of multiple types of objects and links. Recently, it has been recognized that strongly-typed heterogeneous information net...
Ming Ji, Yizhou Sun, Marina Danilevsky, Jiawei Han...
Data de-duplication has become a commodity component in data-intensive systems, and these systems are required to provide high reliability comparable to others. Unfortunately, b...
Chuanyi Liu, Yu Gu, Linchun Sun, Bin Yan, Dongshen...
This paper investigates the problem of partitioning skew in MapReduce-based systems. Our studies with Hadoop, a widely used MapReduce implementation, demonstrate that the presence ...
Shadi Ibrahim, Hai Jin, Lu Lu, Song Wu, Bingsheng ...