We describe an open-domain information extraction method for extracting concept-instance pairs from an HTML corpus. Most earlier approaches to this problem rely on combining cluste...
Bhavana Bharat Dalvi, William W. Cohen, Jamie Call...
In this paper, we explore the discriminating subsequence-based clustering problem. First, several effective optimization techniques are proposed to accelerate the sequence mining p...
Jianyong Wang, Yuzhou Zhang, Lizhu Zhou, George Ka...
The problem of efficiently finding the best match for a query in a given set with respect to the Euclidean distance or the cosine similarity has been extensively studied. However...
There has been increasing interest in the problem of building accurate data mining models over aggregate data, while protecting privacy at the level of individual records. One app...
Alexandre V. Evfimievski, Johannes Gehrke, Ramakri...
Randomization is an economical and efficient approach for privacy-preserving data mining (PPDM). In order to guarantee the performance of data mining and the protection of individ...