Abstract: Search engines have greatly influenced the way people access information on the Internet, because they provide the preferred entry point to billions of pages on the Web...
Ao-Jan Su, Y. Charlie Hu, Aleksandar Kuzmanovic, C...
Given only the URL of a web page, can we identify its topic? This is the question that we examine in this paper. Usually, web pages are classified using their content [7], but a U...
Keeping people away from litigious information has become one of the most important research areas in network information security. Indeed, Web filtering is used to prevent access to u...
A Web site is a hyperlinked network environment that consists of hundreds of interconnected pages, usually without an engineered architecture. This is often a large, complex We...
This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...