Abstract. This paper extends previous studies, which investigated the accessibility of web sites with specific content, to an analysis of the whole web of a specific country ...
A website can regulate search engine crawler access to its content using the robots exclusion protocol, specified in its robots.txt file. The rules in the protocol enable the site...
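As a rough illustration of how such rules are evaluated, the following minimal sketch uses Python's standard `urllib.robotparser` module. The robots.txt content, the crawler names (`anybot`, `examplebot`), and the `example.com` URLs are hypothetical and are not taken from the paper; the sketch only shows the general allow/deny mechanism, not the authors' analysis.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption, for illustration only):
# every crawler is kept out of /private/, and one named crawler is
# excluded from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: examplebot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler without its own section falls back to the "*" rules.
print(parser.can_fetch("anybot", "http://example.com/index.html"))
print(parser.can_fetch("anybot", "http://example.com/private/data.html"))

# The named crawler is governed by its own, stricter section.
print(parser.can_fetch("examplebot", "http://example.com/index.html"))
```

Parsing from a string keeps the sketch self-contained; against a live site one would instead call `set_url()` and `read()` on the parser before querying `can_fetch()`.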
We present WSQ/DSQ (pronounced “wisk-disk”), a new approach for combining the query facilities of traditional databases with existing search engines on the Web. WSQ, for Web-S...
Many vertical search tasks such as local search focus on specific domains. The meaning of relevance in these verticals is domain-specific and usually consists of multiple well-d...
Changsung Kang, Xuanhui Wang, Yi Chang, Belle L. T...
Although most existing research detects events by analyzing the content or structural information of Web documents, a recent direction is to study the usage data. In th...