Sciweavers

KDD
2007
ACM

XProj: A Framework for Projected Structural Clustering of XML Documents

XML has become a popular method of data representation, both on the web and in databases, in recent years. One reason for its popularity is its ability to encode structural information about data records. However, this structural characteristic also makes XML data challenging for a variety of data mining problems. One such problem is clustering, in which the structural aspects of the data result in a high implicit dimensionality of the data representation. As a result, it becomes more difficult to cluster the data in a meaningful way. In this paper, we propose an effective clustering algorithm for XML data which uses substructures of the documents in order to gain insights about the important underlying structures. We propose new ways of using multiple sub-structural information in XML documents to evaluate the quality of intermediate cluster solutions, and guide the algorithms to a final solution which reflects the true structural behavior ...
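The general idea in the abstract, clustering XML documents by the substructures they share, can be illustrated with a minimal sketch. This is not the XProj algorithm from the paper: it assumes root-to-node tag paths as a simplified stand-in for substructures, and a plain partition-and-refine loop around seed documents. All function names (`tag_paths`, `frequent_paths`, `similarity`, `cluster`) and the `min_support` parameter are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_paths(xml_text):
    """Return the set of root-to-node tag paths in one XML document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def frequent_paths(docs, min_support):
    """Paths occurring in at least min_support (a fraction) of docs."""
    if not docs:
        return set()
    counts = Counter(p for d in docs for p in d)
    return {p for p, c in counts.items() if c / len(docs) >= min_support}


def similarity(doc, representatives):
    """Fraction of a cluster's representative paths found in the document."""
    if not representatives:
        return 0.0
    return len(doc & representatives) / len(representatives)


def cluster(xml_texts, seeds, min_support=0.5, max_iter=10):
    """Assumed scheme: assign each document to its most similar cluster,
    then recompute each cluster's frequent paths, until labels stabilize."""
    docs = [tag_paths(t) for t in xml_texts]
    reps = [set(docs[s]) for s in seeds]  # seed documents start each cluster
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(len(reps)), key=lambda k: similarity(d, reps[k]))
            for d in docs
        ]
        if new_labels == labels:
            break
        labels = new_labels
        reps = [
            # Keep the old representatives if a cluster ends up empty.
            frequent_paths(
                [d for d, lab in zip(docs, labels) if lab == k], min_support
            ) or reps[k]
            for k in range(len(reps))
        ]
    return labels


if __name__ == "__main__":
    sample = [
        "<book><title/><author/></book>",
        "<book><title/><author/><year/></book>",
        "<order><item><price/></item></order>",
        "<order><item><price/><qty/></item></order>",
    ]
    print(cluster(sample, seeds=[0, 2]))
```

In this sketch, documents are compared only on the paths that are frequent within a cluster, which loosely mirrors the abstract's point that structural data has a high implicit dimensionality and that cluster-specific substructures can guide the solution. The choice of seeds and of `min_support` is arbitrary here.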
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2007
Where KDD
Authors Charu C. Aggarwal, Na Ta, Jianyong Wang, Jianhua Feng, Mohammed Javeed Zaki