The explosive growth of multimedia data poses serious challenges to data storage, management and search. Efficient near-duplicate detection is one of the required technologies for...
Many data mining tasks (e.g., Association Rules, Sequential Patterns) use complex pointer-based data structures (e.g., hash trees) that typically suffer from sub-optimal data loca...
Srinivasan Parthasarathy, Mohammed Javeed Zaki, We...
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high dimensional perception areas such as computer v...
Ting Liu, Andrew W. Moore, Alexander G. Gray, Ke Y...
In this paper, we present a novel steganographic technique for hiding data in a certain kind of PDF text. We first point out the secret channels in a certain kind of English PDF text, which ar...