Clipping Web pages, namely extracting the informative clips (areas) from Web pages, has many applications, such as Web printing and e-reading on small handheld devices. Although m...
Lei Zhang, Linpeng Tang, Ping Luo, Enhong Chen, Li...
The proliferation of digital libraries and the large amount of existing documents raise important issues in efficient handling of documents. Printed texts in documents need to be...
Information integration systems combine data from multiple heterogeneous Web services to answer complex user queries, provided a user has semantically modeled the service first. T...
Kristina Lerman, Anon Plangprasopchok, Craig A. Kn...
A new dictionary-based text categorization approach is proposed to classify the chemical web pages efficiently. Using a chemistry dictionary, the approach can extract chemistry-re...
Chunyan Liang, Li Guo, Zhaojie Xia, Feng-Guang Nie...
: Automatically recognising which HTML documents on the Web contain items of interest for a user is non-trivial. As a step toward solving this problem, we propose an approach based...