Lossy Reduction for Very High Dimensional Data

We consider the use of data reduction techniques for approximate query answering. We focus on applications that require accurate answers to selective queries and whose data are very high dimensional (hundreds or perhaps thousands of dimensions). We carefully examine the assumptions underlying many existing reduction techniques and show that, to ensure both speed and accuracy, these methods assume statistical characteristics that very high dimensional datasets do not in general possess. We present a new data reduction method, called the RS Kernel, that does not suffer from these limitations. Using several real datasets, we demonstrate the effectiveness of this method for answering difficult, highly selective queries over high dimensional data.
Chris Jermaine, Edward Omiecinski
Added 14 Jul 2010
Updated 14 Jul 2010
Type Conference
Year 2002
Where ICDE