This paper presents a grammar-induction-based approach to partitioning a Web page into several small pages such that each small page fits not only spatially but also logically for mob...
The Internet makes it possible to share and manipulate a vast quantity of information efficiently and effectively, but the rapid and chaotic growth experienced by the Net has gener...
This paper explains the basic data structures used in a new page analysis technique to create wrappers (data extractors) for the result pages produced by web sit...
The Web is the richest source of information and knowledge. Unfortunately, the current structure of Web pages makes it difficult for users to retrieve the information or knowledge ...
This paper reports our research on the Web page filtering process in specialized search engine development. We propose a machine-learning-based approach that combines Web content a...