The number of patent documents is currently rising rapidly worldwide, creating the need for an automatic categorization system to replace time-consuming and labor-intensive manual...
Learning the user’s semantics for CBIR involves two different sources of information: the similarity relations entailed by the content-based features, and the relevance relatio...
Recently, there has been growth in the amount of machine-readable information pertaining to the biomedical field. With this growth comes a desire to extract informati...
This paper proposes a distributional model of word use and word meaning that is derived purely from a body of text, and then applies this model to determine whether certain words...
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...