In this paper we present a new document representation model based on implicit user feedback obtained from search engine queries. The main objective of this model is to achieve be...
When automatically extracting information from the World Wide Web, most established methods focus on spotting single HTML documents. However, the problem of spotting complete web s...
Martin Ester, Hans-Peter Kriegel, Matthias Schuber...
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...
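As a rough illustration of the distinction named in the abstract above, the sketch below contrasts a positional path from the document root with a path anchored on a landmark element. This is only one common reading of "absolute" versus "relative" XPath; the truncated abstract does not define the terms, so the sample pages, the two path strings, and the `extract` helper are all hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical page, well-formed so the standard-library parser accepts it.
PAGE_V1 = """<html><body>
<div id="header">Shop</div>
<div id="product"><span class="name">Lamp</span><span class="price">19.99</span></div>
</body></html>"""

# The same hypothetical page after a redesign: a banner is inserted
# before the product block, shifting every positional index after it.
PAGE_V2 = """<html><body>
<div id="header">Shop</div>
<div id="banner">Sale!</div>
<div id="product"><span class="name">Lamp</span><span class="price">19.99</span></div>
</body></html>"""

# "Absolute" style: a purely positional path starting at the root element
# (ElementTree addresses the root as ".").
ABSOLUTE = "./body/div[2]/span[2]"

# "Relative" style (assumed meaning): anchored on a landmark element found
# by attribute, wherever it sits in the tree.
RELATIVE = ".//div[@id='product']/span[@class='price']"


def extract(page, path):
    """Return the text of the first node matching path, or None if no match."""
    node = ET.fromstring(page).find(path)
    return None if node is None else node.text


if __name__ == "__main__":
    for name, page in (("v1", PAGE_V1), ("v2", PAGE_V2)):
        print(name, "absolute:", extract(page, ABSOLUTE),
              "relative:", extract(page, RELATIVE))
```

In this toy case the positional path stops matching once the banner is added, while the anchored path still finds the price. It says nothing about the empirical results reported in the paper.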
The limitations of the traditional SOA operational model, such as the lack of rich service descriptions, weaken the role of service registries. Their removal from the model violate...
Mohammed AbuJarour, Felix Naumann, Mircea Craculea...
Towards Combining Web Classification and Web Information Extraction: a Case Study. Ping Luo, Fen Lin, Yuhong Xiong, Yong Zhao, Zhongzhi Shi. HP Laboratories, HPL-2009-86. Classific...