We study runtime distributions of subsumption testing. On graph data randomly sampled from two different generative models, we observe a gradual growth of the tails of the distribut...
This paper is organized as follows. In Section 2, we formulate the proposed UMPC method for modeling nonstationary and multi-modal data. Both MPC and UPC are shown to be special ca...
Address standardization is a very challenging task in data cleansing. To provide better customer relationship management and business intelligence for customer-oriented corporations...
Social networking sites (SNS) are only as good as the content their users share. Therefore, designers of SNS seek to improve the overall user experience by encouraging members to ...
Similarity search has been widely studied in peer-to-peer environments. In this paper, we propose the Bounded Locality Sensitive Hashing (Bounded LSH) method for similarity search...