In this paper, we propose a document clustering method that strives to achieve (1) high document clustering accuracy, and (2) the capability of estimating the number of clus...
This short paper presents a methodological framework that allows commentaries from various sources on the same texts to be assembled in an intelligent and coherent way. Cat...
In this paper, we present a Support Vector Machine (SVM) based ensemble approach to address the extractive multi-document summarization problem. Although SVM can have a good general...
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
This paper extends the transition method for binarization based on transition pixels, a generalization of edge pixels. This method originally computes transition threshol...