The correct web site text content must be help to the visitors to find what they are looking for. However, the reality is quite different, many times the web page text content is a...
When automatically extracting information from the world wide web, most established methods focus on spotting single HTMLdocuments. However, the problem of spotting complete web s...
Martin Ester, Hans-Peter Kriegel, Matthias Schuber...
This work aims to provide a novel, site-specific web page segmentation and section importance detection algorithm, which leverages structural, content, and visual information. The...
Replication and caching mechanisms are often employed to enhance the performance of Web applications. In this article, we present a qualitative and quantitative analysis of state-...
Abstract. This paper presents first steps towards building a music information system like last.fm, but with the major difference that the data is automatically retrieved from the ...
Markus Schedl, Peter Knees, Tim Pohle, Gerhard Wid...