The manipulation of large-scale document data sets often involves processing a wealth of features that correspond to the available terms in the document space. The employm...
We propose a hybrid, unsupervised document clustering approach that combines a hierarchical clustering algorithm with Expectation Maximization. We developed several heuristics to ...
This paper presents an efficient hybrid feature selection model based on Support Vector Machine (SVM) and Genetic Algorithm (GA) for large healthcare databases. Even though SVM an...
Rick Chow, Wei Zhong, Michael Blackmon, Richard St...
We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...
This paper presents the results of an adaptive region growing segmentation technique for color document images using an irregular pyramid structure. The emphasis is on the segmentat...