Kernels for Semi-Structured Data

Semi-structured data such as XML and HTML is attracting considerable attention, and it is important to develop data mining techniques that can handle it. In this paper, we discuss applications of kernel methods to semi-structured data. We model semi-structured data as labeled ordered trees and present kernels for classifying such trees based on their tag structures, generalizing the convolution kernel for parse trees introduced by Collins and Duffy (2001). We give algorithms that compute these kernels efficiently. We also apply our kernels to node marking problems, which are special cases of information extraction from trees. Preliminary experiments on artificial data and real HTML documents show encouraging results.
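For orientation, the following is a minimal sketch of the kind of convolution kernel the abstract takes as its starting point: a Collins and Duffy style count of shared tree fragments, applied here to trees of tag labels. It is not the generalized kernel proposed in the paper, whose details the abstract does not give. The names `Node`, `production`, `internal_nodes`, `tree_kernel`, and the decay parameter `lam` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A node of a labeled ordered tree (hypothetical helper type)."""
    label: str
    children: tuple = ()


def production(node):
    """The node's label together with the ordered labels of its children."""
    return (node.label, tuple(child.label for child in node.children))


def internal_nodes(root):
    """Collect every node that has at least one child."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.children:
            found.append(node)
            stack.extend(node.children)
    return found


def tree_kernel(t1, t2, lam=1.0):
    """Sum, over all pairs of internal nodes, the (decayed) number of
    common fragments rooted at that pair. Assumed baseline, not the
    paper's generalized kernel."""
    memo = {}

    def common(n1, n2):
        # Leaves root no fragments; differing productions share none.
        if not n1.children or not n2.children:
            return 0.0
        if production(n1) != production(n2):
            return 0.0
        key = (id(n1), id(n2))
        if key not in memo:
            score = lam
            # Each child pair may stay unexpanded (1) or be expanded
            # into any fragment the two children have in common.
            for c1, c2 in zip(n1.children, n2.children):
                score *= 1.0 + common(c1, c2)
            memo[key] = score
        return memo[key]

    return sum(common(a, b)
               for a in internal_nodes(t1)
               for b in internal_nodes(t2))


# Two small HTML-like trees used as a usage example.
two_items = Node("ul", (Node("li", (Node("a"),)), Node("li", (Node("a"),))))
one_item = Node("ul", (Node("li", (Node("a"),)),))
print(tree_kernel(two_items, two_items), tree_kernel(two_items, one_item))
```

Because two nodes match only when their full child-label sequences agree, this baseline treats the two `ul` nodes above as unrelated even though they differ by a single `li`. That rigidity is one plausible reason to generalize the kernel for HTML and XML trees, where the number of children under a tag varies freely.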
Hisashi Kashima, Teruo Koyanagi
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2002
Where ICML
Authors Hisashi Kashima, Teruo Koyanagi