

Simple Classification into Large Topic Ontology of Web Documents

The paper presents an approach to classifying Web documents into a large topic ontology. The main emphasis is on a simple approach that can handle a large ontology, supplied with enriched data: additional information on the Web page's context obtained from the link structure of the Web. The context is generated from the incoming and outgoing links of the Web document we want to classify (the target document). That is, to represent a document we use not only the text of the document itself, but also the text of the documents pointing to the target document and the text of the documents that the target document points to. The idea is that the enriched data compensates for the simplicity of the approach while keeping it efficient and capable of handling a large topic ontology.
Keywords: classification of documents, topic ontology of Web documents, Web document context, link structure of the Web
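The context-enriched representation described in the abstract can be illustrated with a minimal sketch. The abstract does not name the classifier or the term weighting, so everything here is an assumption made for illustration: the whitespace tokenizer, the raw term counts, the cosine-similarity nearest-centroid classifier, the function names, and the toy topics and texts are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Turn text into term counts (assumed: lowercase, whitespace tokens)."""
    return Counter(text.lower().split())


def context_representation(target_text, inlink_texts, outlink_texts):
    """Merge the target document's terms with those of its linked neighbours."""
    rep = bag_of_words(target_text)
    for text in list(inlink_texts) + list(outlink_texts):
        rep.update(bag_of_words(text))  # add neighbour term counts
    return rep


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(rep, topic_centroids):
    """Return the topic whose centroid is most similar (assumed classifier)."""
    return max(topic_centroids, key=lambda t: cosine(rep, topic_centroids[t]))


# Hypothetical toy ontology with one centroid per topic.
topics = {
    "Sports": bag_of_words("football match league team score"),
    "Computers": bag_of_words("software code compiler programming linux"),
}

# The target page's own text carries no topical terms ...
target = "welcome to my home page"
# ... but the pages linking to it and linked from it do.
inlinks = ["great linux programming tutorials here"]
outlinks = ["download the compiler software"]

rep = context_representation(target, inlinks, outlinks)
predicted = classify(rep, topics)
print(predicted)
```

In this toy case the target text alone shares no terms with either topic, so only the text gathered from the linked documents lets the simple classifier pick a topic. This mirrors the abstract's point that enriched data can compensate for a simple method.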
Marko Grobelnik, Dunja Mladenic
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2005
Where CIT
Authors Marko Grobelnik, Dunja Mladenic