Scalable execution of continuous queries over massive data streams often requires splitting input streams into parallel sub-streams over which query operators are executed in paral...
Extracting useful correlation from a dataset has been extensively studied. In this paper, we deal with the opposite, namely, a problem we call correlation hiding (CH), which is...
Yufei Tao, Jian Pei, Jiexing Li, Xiaokui Xiao, Ke ...
In parallel query-processing environments, accurate, time-oriented progress indicators could provide much utility given that inter- and intra-query execution times can ...
Massive data analysis on large clusters presents new opportunities and challenges for query optimization. Data partitioning is crucial to performance in this environment. Howev...
Commercial tuple extraction systems have had some success extracting tuples by treating HTML pages as tree structures and using XPath queries to find attributes of t...