
KDD
2005
ACM

Using retrieval measures to assess similarity in mining dynamic web clickstreams

While scalable data mining methods are expected to cope with massive Web data, coping with evolving trends in noisy data continuously, without unnecessary stoppages and reconfigurations, is still an open challenge. This dynamic, single-pass setting can be cast within the framework of mining evolving data streams. In this paper, we explore the task of mining mass user profiles by discovering evolving Web session clusters in a single pass with a recently proposed scalable immune-based clustering approach (TECNO-STREAMS), and we study the effect of the choice of similarity measure on the mining process and on the interpretation of the mined patterns. We propose a simple similarity measure that has the advantage of explicitly coupling the precision and coverage criteria to the early learning stages, and furthermore requiring that the affinity of the data to the learned profiles or summaries be defined by the minimum of their coverage or precision, hence requir...
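The "minimum of coverage or precision" idea in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes that a session and a profile are each represented as a set of URLs, that precision is the fraction of the profile's URLs found in the session, and that coverage is the fraction of the session's URLs found in the profile. The function names and the example URLs are hypothetical, and the paper's exact definitions may differ.

```python
# Illustrative sketch only: the set-based representation and the definitions
# of precision and coverage below are assumptions, not taken from the paper.

def precision(session: set, profile: set) -> float:
    """Assumed definition: share of the profile's URLs that occur in the session."""
    if not profile:
        return 0.0
    return len(session & profile) / len(profile)


def coverage(session: set, profile: set) -> float:
    """Assumed definition: share of the session's URLs that occur in the profile."""
    if not session:
        return 0.0
    return len(session & profile) / len(session)


def min_pc_similarity(session: set, profile: set) -> float:
    """Affinity as the minimum of precision and coverage.

    A high score requires BOTH criteria to be high at the same time.
    """
    return min(precision(session, profile), coverage(session, profile))


# Hypothetical example: the profile is fully contained in the session
# (precision 1.0), but it explains only half of the session (coverage 0.5).
session = {"/home", "/courses", "/faculty", "/contact"}
profile = {"/home", "/courses"}
score = min_pc_similarity(session, profile)
```

Under these assumptions, taking the minimum means a profile cannot score well by being precise but narrow, or broad but imprecise; the weaker of the two criteria bounds the affinity.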
Olfa Nasraoui, Cesar Cardona, Carlos Rojas
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2005
Where KDD
Authors Olfa Nasraoui, Cesar Cardona, Carlos Rojas