We propose a novel approach to finding aliases of a given name on the web. We exploit a set of known names and their aliases as training data and extract lexical patterns that conv...
The web provides an unprecedented opportunity to evaluate ideas quickly using controlled experiments, also called randomized experiments (single-factor or factorial designs), A/B ...
The primary goal of Web usage mining is the discovery of patterns in the navigational behavior of Web users. Standard approaches, such as clustering of user sessions and discoveri...
The Web is a distributed network of information sources where the individual sources are autonomously created and maintained. Consequently, syntactic and semantic heterogeneity of ...
In this paper we address the problem of analyzing web log data collected at a typical online newspaper site. We propose a two-way clustering technique based on probability theory....
Hannes Wettig, Jussi Lahtinen, Tuomas Lepola, Petr...