In this paper we consider the problem of answering queries using views, with or without ontological constraints, which is important for data integration, query optimization, and d...
Web search logs contain extremely sensitive data, as evidenced by the recent AOL incident. However, storing and analyzing search logs can be very useful for many purposes (i.e. in...
MapReduce has emerged as a promising architecture for large scale data analytics on commodity clusters. The rapid adoption of Hive, a SQL-like data processing language on Hadoop (...
Conventional research on similarity search focuses on measuring the similarity between objects with the same type. However, in many real-world applications, we need to measure the...
Chuan Shi, Xiangnan Kong, Philip S. Yu, Sihong Xie...
The Web of Linked Data grows rapidly and already contains data originating from hundreds of data sources. The quality of data from those sources is very diverse, as values may be ...