Sciweavers

188 search results - page 18 / 38
» The hybrid representation model for web document classificat...
Sort
View
93
Voted
SGAI
2004
Springer
15 years 3 months ago
Neighbourhood Exploitation in Hypertext Categorization
As the web expands exponentially, the need to put some order to its content becomes apparent. Hypertext categorization, that is the automatic classification of web documents into ...
Houda Benbrahim, Max Bramer
100
Voted
CIKM
2006
Springer
15 years 1 months ago
A comparative study on classifying the functions of web page blocks
In this paper, we study the problem of learning block classification models to estimate block functions. We distinguish general models, which are learned across multiple sites, an...
Xiangye Xiao, Qiong Luo, Xing Xie, Wei-Ying Ma
SIGIR
2008
ACM
14 years 9 months ago
Deep classification in large-scale text hierarchies
Most classification algorithms are best at categorizing the Web documents into a few categories, such as the top two levels in the Open Directory Project. Such a classification me...
Gui-Rong Xue, Dikan Xing, Qiang Yang, Yong Yu
EMNLP
2008
14 years 11 months ago
HTM: A Topic Model for Hypertexts
Previously topic models such as PLSI (Probabilistic Latent Semantic Indexing) and LDA (Latent Dirichlet Allocation) were developed for modeling the contents of plain texts. Recent...
Congkai Sun, Bin Gao, Zhenfu Cao, Hang Li
CIKM
2005
Springer
15 years 3 months ago
Web-centric language models
We investigates language models for informational and navigational web search. Retrieval on the web is a task that differs substantially from ordinary ad hoc retrieval. We perfor...
Jaap Kamps