One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...
Relevance feedback has proven very effective for improving retrieval accuracy. A difficult yet important problem in all relevance feedback methods is how to optimally balance the...
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
We have studied the problem of linking event information across different languages without the use of translation systems or dictionaries. The linking is based on interlingua in...
This paper discusses the combination of collaborative and content-based filtering in the context of web-based recommender systems. In particular, we link the well-known MovieLens ...