Classification is an important data mining problem. Given a training database of records, each tagged with a class label, the goal of classification is to build a concise model ...
Johannes Gehrke, Venkatesh Ganti, Raghu Ramakrishn...
Accurate probability-based ranking of instances is crucial in many real-world data mining applications. KNN (k-nearest neighbor) [1] has been intensively studied as an effective c...
Caching approximate values instead of exact values presents an opportunity for performance gains in exchange for decreased precision. To maximize the performance improvement, cach...
Obtaining high-quality and up-to-date labeled data can be difficult in many real-world machine learning applications, especially for Internet classification tasks like review spam...
It is known that if a 2-universal hash function H is applied to elements of a block source (X1, . . . , XT ), where each item Xi has enough min-entropy conditioned on the previous...