Sliding window technique for the web log analysis

14 years 7 months ago
Sliding window technique for the web log analysis
The results of the Web query log analysis may be significantly shifted depending on the fraction of agents (non-human clients), which are not excluded from the log. To detect and exclude agents the Web log studies use threshold values for a number of requests submitted by a client during the observation period. However, different studies use different observation periods, and a threshold assigned to one period is usually incomparable with the threshold assigned to the other period. We propose the uniform method equally working on the different observation periods. The method bases on the sliding window technique: a threshold is assigned to the sliding window rather than to the whole observation period. Besides, we determine the sub-optimal values of the parameters of the method: a window size and a threshold and recommend 5-7 unique queries as an upper bound of the threshold for 1-hour sliding window. Categories and Subject Descriptors H.3.3 [Information Systems]: Information Search a...
Nikolai Buzikashvili
Added 22 Nov 2009
Updated 22 Nov 2009
Type Conference
Year 2007
Where WWW
Authors Nikolai Buzikashvili
Comments (0)