Although originally designed for large-scale electronic publishing, XML plays an increasingly important role in the exchange of data on the Web. In fact, it is expected that XML w...
This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
Text mining concerns the discovery of knowledge from unstructured textual data. One important task is the discovery of rules that relate specific words and phrases. Although exist...
In recent years XML has became very popular for representing semistructured data and a standard for data exchange over the web. Mining XML data from the web is becoming increasing...
This paper describes SHOE, a set of Simple HTML Ontology Extensions which allow World-Wide Web authors to annotate their pages with semantic knowledge such as “I am a graduate s...
Sean Luke, Lee Spector, David Rager, James A. Hend...