We consider the Entity Resolution (ER) problem (also known as deduplication or merge-purge), in which records determined to represent the same real-world entity are successively ...
David Menestrina, Omar Benjelloun, Hector Garcia-M...
In implementations of non-standard database systems, large objects are often embedded within an aggregate of different types, i.e., a tuple. For a given size and access probabilit...
Finding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. A number of algorithms have been proposed to process ...
We present algorithms for fast quantile and frequency estimation in large data streams using graphics processor units (GPUs). We exploit the high computational power and memory ba...
Naga K. Govindaraju, Nikunj Raghuvanshi, Dinesh Ma...
Recently, Zhang et al. described an algorithm for detecting ±1 LSB steganography based on the statistics of the amplitudes of local extrema in the grey-level histogram. Exper...