Large collections of scanned documents (books and journals) are now available in Digital Libraries. The most common method for retrieving relevant information from these collectio...
We present a novel approach for classifying documents that combines different pieces of evidence (e.g., textual features of documents, links, and citations) transparently, through...
Adriano Veloso, Wagner Meira Jr., Marco Cristo, Ma...
Many important problems involve clustering large datasets. Although naive implementations of clustering are computationally expensive, there are established efficient techniques f...
Often remote investigations use autonomous agents to observe an environment on behalf of absent scientists. Predictive exploration improves these systems’ efficiency with onboa...
In this paper, we propose a fast, memory-efficient, and scalable clustering algorithm for analyzing transactional data. Our approach has three unique features. First, we use the c...