Large web search engines have to answer thousands of queries per second with interactive response times. Due to the sizes of the data sets involved, often in the range of multiple...
Query logs of a Web search engine have been increasingly used as a vital source for data mining. This paper presents a study on large-scale domain-independent entity extraction fro...
Crawling the web is deceptively simple: the basic algorithm is (a) Fetch a page (b) Parse it to extract all linked URLs (c) For all the URLs not seen before, repeat (a)–(c). Howev...
This paper discusses how business intelligence on a website could be obtained from users’ access records instead of web logs of “hits”. Users’ access records are cap...
We present an extensive analysis of long-term statistics of the queries to websites using logs collected on several web caches in Russian academic networks and on US IRCache cache...
Serge A. Krashakov, Anton B. Teslyuk, Lev N. Shchu...