Most of the challenges faced when building the Semantic Web require a substantial amount of human labor and intelligence. Despite significant advancement in ontology learning and h...
On script-generated web sites, many documents share a common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
It is necessary to provide a method to store Web information effectively so it can be utilised as a future knowledge resource. A commonly adopted approach is to classify the retri...
Given only the URL of a web page, can we identify its topic? This is the question that we examine in this paper. Usually, web pages are classified using their content [7], but a U...
An increasing number of data sources are now becoming available on the Web, but often their contents are only accessible through query interfaces. For a domain of interest, there often ...
Wensheng Wu, Clement T. Yu, AnHai Doan, Weiyi Meng