Automatic document classification is an important step in organizing and mining documents. Information in documents is often conveyed using both text and images that complement ea...
Text clustering typically involves clustering in a high dimensional space, which appears difficult with regard to virtually all practical settings. In addition, given a particular...
In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index image’s multi-features into a single one-dimensional structure. Both f...
Text message stream is a newly emerging type of Web data which is produced in enormous quantities with the popularity of Instant Messaging and Internet Relay Chat. It is benefici...
Abstract. We describe a semantic clustering method designed to address shortcomings in the common bag-of-words document representation for functional semantic classification tasks....