Many text databases on the web are "hidden" behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the conte...
Panagiotis G. Ipeirotis, Luis Gravano, Mehran Saha...
When text categorization is applied to complex tasks, it is tedious and expensive to hand-label the large amounts of training data necessary for good performance. In this paper, we...
Although the ever-growing Web contains information relevant to virtually every user’s query, it does not guarantee effective access to that information. In many situations, the user...
The lack of a large-scale Chinese test collection is an obstacle to the development of Chinese information retrieval. To address this issue, we built such a collection compos...
Current Web search engines generally impose link analysis-based re-ranking on web-page retrieval. However, the same techniques, when applied directly to small web search such as i...