Sciweavers

PAKDD
2009
ACM

Tree-Based Method for Classifying Websites Using Extended Hidden Markov Models

One important problem posed recently in the field of web mining is website classification. The complexity of the problem, together with the need for accurate and fast algorithms, has led to many attempts in this field, but the problem is still far from being solved efficiently. Its importance encouraged us to work on a new approach. We use the content of web pages together with the link structure between them to improve the accuracy of the results. In this work we use a naïve Bayes model for each predefined webpage class, and an extended version of the Hidden Markov Model as the website class model. A few sample websites are adopted as seeds to calculate the models' parameters. To classify websites, we represent them as tree structures and modify the Viterbi algorithm to evaluate the probability that each website model generates these tree structures. Because of the large number of pages in a website, we use a sampling technique that n...
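The general idea in the abstract (naïve Bayes scores for pages, a Markov model over a page tree, and a Viterbi-style pass to score a site against each website model) can be illustrated with a minimal sketch. This is not the authors' extended model or their modified Viterbi algorithm, whose details are not given here. It is a standard max-product recursion on a tree, and every name, page class, and probability below (`Page`, `SiteModel`, `tree_viterbi`, the "university" and "shop" models) is a hypothetical assumption chosen for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Page:
    """A node in the website tree: its words and its linked child pages."""
    words: list
    children: list = field(default_factory=list)


@dataclass
class SiteModel:
    """Hypothetical website-class model (all parameters are made up)."""
    init: dict       # page class -> P(class of the root page)
    trans: dict      # parent class -> {child class -> P(child | parent)}
    word_prob: dict  # page class -> {word -> P(word | class)}
    unseen: float = 1e-6  # assumed floor probability for unseen words

    def log_emit(self, page, state):
        # Naive Bayes log-likelihood of the page's words given a page class.
        probs = self.word_prob[state]
        return sum(math.log(probs.get(w, self.unseen)) for w in page.words)


def tree_viterbi(page, model):
    """Map each page class s to the best log-probability of the subtree
    rooted at `page`, given that `page` itself has class s."""
    child_tables = [tree_viterbi(child, model) for child in page.children]
    table = {}
    for s in model.init:
        score = model.log_emit(page, s)
        for child_table in child_tables:
            # Each child independently picks its best class given parent s.
            # Assumes all transition probabilities are non-zero.
            score += max(math.log(model.trans[s][t]) + child_table[t]
                         for t in child_table)
        table[s] = score
    return table


def site_log_score(root, model):
    """Best joint log-probability of the whole tree under one site model."""
    table = tree_viterbi(root, model)
    return max(math.log(model.init[s]) + table[s] for s in table)


def classify(root, models):
    """Return the name of the site model with the highest score."""
    return max(models, key=lambda name: site_log_score(root, models[name]))


# Toy parameters, shared structure for both hypothetical site classes.
init = {"info": 0.9, "item": 0.1}
trans = {"info": {"info": 0.3, "item": 0.7},
         "item": {"info": 0.2, "item": 0.8}}
info_words = {"welcome": 0.5, "about": 0.5}

models = {
    "university": SiteModel(init, trans, {
        "info": info_words, "item": {"course": 0.6, "lecture": 0.4}}),
    "shop": SiteModel(init, trans, {
        "info": info_words, "item": {"price": 0.6, "cart": 0.4}}),
}

# A three-page site: a home page linking to two content pages.
site = Page(["welcome"], [Page(["course", "lecture"]), Page(["course"])])
predicted = classify(site, models)
```

In this sketch, `predicted` is the name of whichever toy model assigns the tree the highest best-path log-probability. Working in log space avoids numerical underflow when a site has many pages.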
Majid Yazdani, Milad Eftekhar, Hassan Abolhassani
Added 20 May 2010
Updated 20 May 2010
Type Conference
Year 2009
Where PAKDD
Authors Majid Yazdani, Milad Eftekhar, Hassan Abolhassani