We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...
There exist practical bit-parallel algorithms for several types of pair-wise string processing, such as longest common subsequence computation or approximate string matching. The b...
Despite the fact that electronic publishing is a common activity to scholars, electronic journals are still based in the print model and do not take full advantage of the faciliti...
We apply a new active learning formulation to the problem of learning medical concepts from unstructured text. The new formulation is based on maximizing the mutual information th...
Analyzing, structuring and organizing documented knowledge is an important aspect of knowledge management. In order to ease the access to text collections, in literature so-called...