Text clustering is most commonly treated as a fully automated task without user feedback. However, a variety of researchers have explored mixed-initiative clustering methods which...
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
This paper presents a new enhanced text extraction algorithm from degraded document images on the basis of the probabilistic models. The observed document image is considered as a...
: Document clustering has been used for better document retrieval and text mining. In this paper, we investigate if a biomedical ontology improves biomedical literature clustering ...
This paper demonstrates a new method for leveraging unstructured annotations to infer semantic document properties. We consider the domain of product reviews, which are often anno...
S. R. K. Branavan, Harr Chen, Jacob Eisenstein, Re...