From the standpoint of the automated extraction of scientific knowledge, an important but little-studied part of scientific publications is the figures and accompanying captions....
William W. Cohen, Richard C. Wang, Robert F. Murph...
The AutoFeed system automatically extracts data from semistructured web sites. Previously, researchers have developed two types of supervised learning approaches for extracting we...
A growing number of applications demand effective and efficient support for retrieval in large collections of digital images. The work presented here is an early stage resear...
Giovanna Castellano, Gianluca Sforza, Maria Alessa...
The success of popular algorithms such as k-means clustering or nearest neighbor searches depends on the assumption that the underlying distance functions reflect domain-specific n...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as the underlying process generating the corpus and utilize a novel,...