We describe a novel, simple, and highly scalable semi-supervised method called Word-Class Distribution Learning (WCDL), and apply it to the task of information extraction (IE) by utili...
Yanjun Qi, Ronan Collobert, Pavel Kuksa, Koray Kav...
The publish-subscribe paradigm is an effective approach for data publishers to asynchronously disseminate relevant data to a large number of data subscribers. A lot of recent res...
While numerous metrics for information retrieval are available in the case of binary relevance, there is only one commonly used metric for graded relevance, namely the Discounted ...
Olivier Chapelle, Donald Metzler, Ya Zhang, Pierre...
Search engine switching describes the voluntary transition from one Web search engine to another. In this paper we present a study of search engine switching behavior that combi...
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...