This work proposes a new approach to retrieving images from text queries. In contrast to previous work, this method relies on a discriminative model: the parameters are sel...
In this paper, we describe some experiments in large-scale Information Extraction (IE) focusing on book texts. We investigate the scalability of IE techniques to full-sized books,...
A parallel corpus is a rich linguistic resource for various multilingual text management tasks, including cross-lingual text retrieval, multilingual computational linguistics and mul...
Temporal information has been the focus of recent attention in information extraction, leading to some standardization efforts, in particular for the task of relating events in a t...
In many important text classification problems, acquiring class labels for training documents is costly, while gathering large quantities of unlabeled data is cheap. This paper sh...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...