This paper proposes a word segmentation method for machine-printed text lines. It utilizes gaps and special symbols as delimiters between words. A gap clustering technique is used...
Soo-Hyung Kim, Chang Bu Jeong, Hee K. Kwag, Ching ...
Clustering short length texts is a difficult task itself, but adding the narrow domain characteristic poses an additional challenge for current clustering methods. We addressed thi...
Keyphrases are widely used as a brief summary of documents. Since manual assignment is time-consuming, various unsupervised ranking methods based on importance scores are proposed...
While traditional research on text clustering has largely focused on grouping documents by topic, it is conceivable that a user may want to cluster documents along other dimension...
Web Clustering is useful for several activities in the WWW, from automatically building web directories to improve retrieval performance. Nevertheless, due to the huge size of the...