With the large and increasing amount of visual information available in digital libraries and the Web, efficient and robust systems for image retrieval are urgently needed. In thi...
We consider the problem of finding a few representatives for a dataset, i.e., a subset of data points that efficiently describes the entire dataset. We assume that each data poi...
In order to reduce human efforts, there has been increasing interest in applying active learning for training text classifiers. This paper describes a straightforward active learni...
Zhao Xu, Kai Yu, Volker Tresp, Xiaowei Xu, Jizhi W...
In most IR clustering problems, we directly cluster the documents, working in the document space, using cosine similarity between documents as the similarity measure. In many real...
Finding a small set of representative instances for large datasets can bring various benefits to data mining practitioners so they can (1) build a learner superior to the one cons...