Abstract. In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
Web user search customization research has been fueled by the recognition that if the WWW is to attain to its optimal potential as an interactive medium the development of new and...
More and more users are contributing and sharing more and more contents on the Web via the use of content hosting sites and social media services. These user–generated contents ...
This paper presents the method for retrieving and summarizing changes in topics from online resources. Users often want to know what are the major changes in their areas of intere...
In this paper we address the problem of analyzing web log data collected at a typical online newspaper site. We propose a two-way clustering technique based on probability theory....
Hannes Wettig, Jussi Lahtinen, Tuomas Lepola, Petr...