The problem of selecting a sample subset sufficient to preserve diversity arises in many applications. One example is in the design of recombinant inbred lines (RIL) for genetic a...
Feng Pan, Adam Roberts, Leonard McMillan, David Th...
A web search with double checking model is proposed to explore the web as a live corpus. Five association measures including variants of Dice, Overlap Ratio, Jaccard, and Cosine, ...
Most existing privacy-preserving techniques, such as k-anonymity methods, are designed for static data sets. As such, they cannot be applied to streaming data which are cont...
Jianneng Cao, Barbara Carminati, Elena Ferrari, Ki...
Private data often comes in the form of associations between entities, such as customers and products bought from a pharmacy, which are naturally represented in the form of a larg...
Graham Cormode, Divesh Srivastava, Ting Yu, Qing Z...
This paper introduces mass estimation—a base modelling mechanism in data mining. It provides the theoretical basis of mass and an efficient method to estimate mass. We show that...
Kai Ming Ting, Guang-Tong Zhou, Fei Tony Liu, Jame...