The exponential growth of data on the Internet makes it increasingly difficult for people to find desired information in a timely fashion. Information filtering and dissemina...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as the underlying process generating the corpus and utilize a novel,...
Text-line extraction is a key task in document analysis. Methods based on anisotropic Gaussian filtering and ridge detection have shown good results. This paper describes perfo...
Syed Saqib Bukhari, Faisal Shafait, Thomas M. Breu...
Information retrieval tries to identify relevant documents for an information need. The problems that an IR system should deal with include document indexing (which tries to extr...