XML and other semi-structured data may have partially specified or missing schema information, motivating the use of a structural summary which can be automatically computed from ...
Raghav Kaushik, Pradeep Shenoy, Philip Bohannon, E...
Scalable similarity search is the core of many large scale learning or data mining applications. Recently, many research results demonstrate that one promising approach is creatin...
Document clustering techniques mostly rely on single term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...
Abstract. The task of similarity search is widely used in various areas of computing, including multimedia databases, data mining, bioinformatics, social networks, etc. For a long ...
The similarity join has become an important database primitive to support similarity search and data mining. A similarity join combines two sets of complex objects such that the r...