Sciweavers

WWW
2004
ACM

OntoMiner: bootstrapping ontologies from overlapping domain specific web sites

In this paper, we present automated techniques for bootstrapping and populating specialized domain ontologies by organizing and mining a set of relevant overlapping Web sites provided by the user. We develop algorithms that detect and utilize HTML regularities in the Web documents to turn them into hierarchical semantic structures encoded as XML. Next, we present tree-mining algorithms that identify key domain concepts and their taxonomical relationships. We also extract semi-structured concept instances annotated with their labels whenever they are available. Experimental evaluation for the News, Travel, and Shopping domains indicates that our algorithms can bootstrap and populate domain specific ontologies with high precision and recall.

Categories and Subject Descriptors: H.4.m [Information Systems]: Miscellaneous; I.2.6 [Artificial Intelligence]: Learning--Knowledge Acquisition
General Terms: Algorithms, Performance, Experimentation
Keywords: Web Mining, Semantic Web, Ontology, Data ...
Hasan Davulcu, Srinivas Vadrevu, Saravanakumar Nagarajan
Added 22 Nov 2009
Updated 22 Nov 2009
Type Conference
Year 2004
Where WWW
Authors Hasan Davulcu, Srinivas Vadrevu, Saravanakumar Nagarajan