The Iceberg SemiJoin (ISJ) of two datasets R and S returns the tuples in R which join with at least k tuples of S. The ISJ operator is essential in many practical applications incl...
Mohammed Kasim Imthiyaz, Dong Xiaoan, Panos Kalnis
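The ISJ definition above can be illustrated with a minimal sketch. It assumes a plain equality join on a key, which the truncated abstract does not confirm; the authors' actual join predicate and algorithm may differ. The function name `iceberg_semijoin` and the parameters `r_key` and `s_key` are hypothetical, introduced only for this illustration.

```python
from collections import Counter


def iceberg_semijoin(r, s, k, r_key=lambda t: t, s_key=lambda t: t):
    """Return the tuples of r whose key matches at least k tuples of s.

    Assumption: the join is an equality join on the extracted keys.
    This is a naive baseline, not the method from the paper.
    """
    # Count how many tuples of S carry each join key.
    s_counts = Counter(s_key(t) for t in s)
    # Keep a tuple of R only if its key occurs at least k times in S.
    # Counter returns 0 for keys that never appear in S.
    return [t for t in r if s_counts[r_key(t)] >= k]


# Illustrative data (made up for this example).
R = [("a", 1), ("b", 2), ("c", 3)]
S = [("a", 10), ("a", 11), ("b", 20), ("a", 12)]

first = lambda t: t[0]  # join on the first field of each tuple
result = iceberg_semijoin(R, S, k=2, r_key=first, s_key=first)
```

Here only the tuple with key "a" survives for k = 2, since "a" matches three tuples of S, "b" matches one, and "c" matches none. With k = 1 the operator reduces to an ordinary semijoin.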
Prefetching is an effective technique for improving file access performance because it reduces access latency in I/O systems. In distributed storage systems, prefetching for metadat...
Lin Lin, Xueming Li, Hong Jiang, Yifeng Zhu, Lei T...
Time series data is usually stored and processed in the form of discrete trajectories of multidimensional measurement points. In order to compare the measurements of a query traje...
The map-reduce framework has received significant attention and is being used for programming both large-scale clusters and multi-core systems. While the high productivity aspect of ...
Large high-dimensional datasets are of growing importance in many fields, and it is important to be able to visualize them for understanding the results of data mining appro...
Jong Youl Choi, Seung-Hee Bae, Xiaohong Qiu, Geoff...