Readers on the Web often skim through text to cope with the volume of available information. In a previous study [11], readers’ eye movements were tracked as they skimmed through...
Co-training is a semi-supervised technique that allows classifiers to learn with fewer labelled documents by taking advantage of the more abundant unclassified documents. However, ...
Many text databases on the web are "hidden" behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the conte...
Panagiotis G. Ipeirotis, Luis Gravano, Mehran Saha...
The World Wide Web has a wealth of information that is related to almost any text classification task. This paper presents a method for mining the web to improve text classificati...
Selective sampling, a form of active learning, reduces the cost of labeling training data by asking only for the labels of the most informative unlabeled examples. We introduce a ...