Users of the World-Wide Web are not only confronted by an immense overabundance of information, but also by a plethora of tools for searching for the web pages that suit their inf...
A significant fraction of Web data is available only for short periods of time. We consider methods to keep track and to record such dynamic information automatically. The main p...
A new dictionary-based text categorization approach is proposed to classify the chemical web pages efficiently. Using a chemistry dictionary, the approach can extract chemistry-re...
Chunyan Liang, Li Guo, Zhaojie Xia, Feng-Guang Nie...
Search engine technology plays an important role in Web information retrieval. However, with Internet information explosion, traditional searching techniques cannot provide satisfa...
Baile Shi, Guoyu Hao, Hongtao Xu, Mei Wang, Qi Zha...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...