Efficient join processing over uncertain data

In an uncertain database, each data item is modeled as a range associated with a probability density function. Previous work on this kind of data has focused on simple queries such as range and nearest-neighbor queries. Queries that join multiple relations have not been addressed in earlier work, despite the significance of joins in databases. In this paper, we address probabilistic joins over uncertain data: queries that augment their results with probability guarantees indicating the likelihood of each join tuple being part of the result. We extend the notion of join operators, such as equality and inequality, to uncertain data. We also study the performance of probabilistic joins. We observe that a user may only need to know whether the probability of a result exceeds a given threshold, rather than the precise probability value. By incorporating this constraint, it is possible to achieve much better performance. In particular, we develop three sets of optimization t...
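To make the idea of a threshold-based probabilistic join concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes each uncertain value is an interval with a uniform pdf, that values are independent, and that the join operator is "greater than". The names `Uncertain`, `prob_greater`, and `threshold_join` are hypothetical, chosen only for illustration. The single pruning rule shown (skip pairs whose intervals make the probability zero) merely hints at how a threshold can avoid computing exact probabilities.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Uncertain:
    """An uncertain value: uniform pdf over [lo, hi], with lo < hi (assumption)."""
    lo: float
    hi: float


def _cdf_integral(x: float, b: Uncertain) -> float:
    """Integral of b's CDF from -infinity up to x (uniform pdf assumed)."""
    width = b.hi - b.lo
    if x <= b.lo:
        return 0.0
    if x <= b.hi:
        return (x - b.lo) ** 2 / (2 * width)
    return width / 2 + (x - b.hi)


def prob_greater(a: Uncertain, b: Uncertain) -> float:
    """P(A > B) for independent uniform A and B.

    P(A > B) is the average of B's CDF over A's interval.
    """
    return (_cdf_integral(a.hi, b) - _cdf_integral(a.lo, b)) / (a.hi - a.lo)


def threshold_join(r, s, threshold: float):
    """Return (i, j, p) for pairs with P(r[i] > s[j]) >= threshold.

    Assumes 0 < threshold <= 1, so pairs with probability zero never qualify.
    """
    result = []
    for (i, a), (j, b) in product(enumerate(r), enumerate(s)):
        # Pruning: if a lies entirely at or below b, P(a > b) = 0 < threshold.
        if a.hi <= b.lo:
            continue
        p = prob_greater(a, b)
        if p >= threshold:
            result.append((i, j, p))
    return result


if __name__ == "__main__":
    r = [Uncertain(0, 2), Uncertain(2, 3)]
    s = [Uncertain(1, 3), Uncertain(0, 1)]
    for i, j, p in threshold_join(r, s, threshold=0.5):
        print(f"r[{i}] > s[{j}] with probability {p:.3f}")
```

With a threshold of 0.5, the pair `r[0]`, `s[0]` is computed but rejected (its probability is 0.125), while the other three pairs are returned with their probabilities attached.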
Added 20 Aug 2010
Updated 20 Aug 2010
Type Conference
Year 2006
Where CIKM
Authors Reynold Cheng, Sarvjeet Singh, Sunil Prabhakar, Rahul Shah, Jeffrey Scott Vitter, Yuni Xia