Address geocoding, the process of finding the map location for a structured postal address, is a relatively well-studied problem. In this paper we consider the more general proble...
Tanuja Joshi, Joseph Joy, Tobias Kellner, Udayan K...
We investigate to what extent people making relevance judgements for a reusable IR test collection are exchangeable. We consider three classes of judge: "gold standard" ...
Peter Bailey, Nick Craswell, Ian Soboroff, Paul Th...
Most traditional text clustering methods are based on "bag of words" (BOW) representation based on frequency statistics in a set of documents. BOW, however, ignores the ...
Jian Hu, Lujun Fang, Yang Cao, Hua-Jun Zeng, Hua L...
Online communities have become popular for publishing and searching content, as well as for finding and connecting to other users. User-generated content includes, for example, pe...
Ralf Schenkel, Tom Crecelius, Mouna Kacimi, Sebast...
In this paper, we study the problem of Web forum crawling. Web forum has now become an important data source of many Web applications; while forum crawling is still a challenging ...
Yida Wang, Jiang-Ming Yang, Wei Lai, Rui Cai, Lei ...