Focused Web browsing activities, such as periodically looking up headline news or weather reports, require only selected fragments of particular Web pages. Such activities can be made more efficient for users of handheld mobile devices with small displays by delivering only the target fragments. Semantic bookmarks provide a robust conceptual framework for recording and retrieving such targeted content, not only from the specific pages used to create the bookmarks but also from any user-specified page with similar content semantics. This paper describes a technique for realizing semantic bookmarks by coupling machine learning with Web page segmentation to create a statistical model of the bookmarked content. The resulting models are used to identify and retrieve the bookmarked content from Web pages that share a common content domain. In contrast to ontology-based approaches, where semantic bookmarks are limited to the concepts available in the ontology, the learning-based approach allows users to book...
Saikat Mukherjee, I. V. Ramakrishnan