A crucial preprocessing stage in applications such as OCR is text extraction from mixed-type documents. The present work, in contrast to most previous approaches, successfully faces the pro...
This paper presents a methodology for learning taxonomic relations from a set of documents that each explain one of the concepts. Three different feature extraction approaches with...
In the past few years, the fast proliferation of available XML documents has stimulated a great deal of interest in discovering hidden and nontrivial knowledge from XML repositori...
Ling Chen 0002, Sourav S. Bhowmick, Liang-Tien Chi...
Web-based services and applications have increased the availability and accessibility of information. XML has recently emerged as an important standard in the area of information ...
This work explores the application of clustering methods for grouping structurally similar XML documents. Modeling the XML documents as rooted ordered labeled trees, we apply clust...
Theodore Dalamagas, Tao Cheng, Klaas-Jan Winkel, T...