Sciweavers

SIGMOD
2004
ACM

Identifying Similarities, Periodicities and Bursts for Online Search Queries

We present several methods for mining knowledge from the query logs of the MSN search engine. Using the query logs, we build a time series for each query word or phrase (e.g., 'Thanksgiving' or 'Christmas gifts'), where each element of the time series is the number of times that the query is issued on a given day. All of the methods we describe use sequences of this form and can be applied to time series data generally. Our primary goal is the discovery of semantically similar queries, and we do so by identifying queries with similar demand patterns. Utilizing the best Fourier coefficients and the energy of the omitted components, we improve upon the state of the art in time series similarity matching. The extracted sequence features are then organized in an efficient metric tree index structure. We also demonstrate how to efficiently and accurately discover the important periods in a time series. Finally, we propose a simple but effective method for identification of bursts (long or ...
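As a rough illustration of two ideas named in the abstract (keeping the best Fourier coefficients together with the energy of the omitted components, and discovering an important period), here is a minimal Python sketch. It is not the authors' implementation: the function names `dft`, `best_coefficients` and `dominant_period` are hypothetical, the daily counts are synthetic, and the abstract does not say how the omitted energy is used in the distance computation, so the sketch only computes it.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform, normalized by 1/sqrt(n)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
        / math.sqrt(n)
        for f in range(n)
    ]


def best_coefficients(x, k):
    """Keep the k largest-magnitude coefficients; report the energy of the rest."""
    coeffs = dft(x)
    order = sorted(range(len(coeffs)), key=lambda f: abs(coeffs[f]), reverse=True)
    kept = {f: coeffs[f] for f in order[:k]}
    omitted_energy = sum(abs(coeffs[f]) ** 2 for f in order[k:])
    return kept, omitted_energy


def dominant_period(x):
    """Period (in samples) of the strongest non-DC peak in the periodogram."""
    n = len(x)
    mean = sum(x) / n
    coeffs = dft([v - mean for v in x])  # remove the mean so DC does not dominate
    peak = max(range(1, n // 2 + 1), key=lambda f: abs(coeffs[f]) ** 2)
    return n / peak


# Synthetic daily query counts with a weekly cycle over 8 weeks (made-up data).
counts = [100 + 30 * math.cos(2 * math.pi * d / 7) for d in range(56)]
kept, omitted_energy = best_coefficients(counts, 3)
period = dominant_period(counts)
```

On this synthetic series the three kept coefficients are the mean and the pair of bins for the weekly cycle, so the omitted energy is essentially zero and the detected period is seven days. Real query logs would be noisier, and a production version would use a fast Fourier transform rather than the naive one shown here.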
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2004
Where SIGMOD
Authors Michail Vlachos, Christopher Meek, Zografoula Vagena, Dimitrios Gunopulos