Automated text categorization is an important technique for many web applications, such as document indexing, document filtering, and cataloging web resources. Many different appr...
The paper presents an approach to classifying Web documents into large topic ontology. The main emphasis is on having a simple approach appropriate for handling a large ontology an...
Many approaches to Information Extraction (IE) have been proposed in literature capable of finding and extract specific facts in relatively unstructured documents. Their applicatio...
Error estimation must be used to find the accuracy of a designed classifier, an issue that is critical in biomarker discovery for disease diagnosis and prognosis in genomics and p...
Amin Zollanvari, Ulisses Braga-Neto, Edward R. Dou...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...