Address standardization is a very challenging task in data cleansing. To provide better customer relationship management and business intelligence for customer-oriented cooperates...
— In vector space model (VSM), textual documents are represented as vectors in the term space. Therefore, there are two issues in this representation, i.e. (1) what should a term...
The choice of the kernel function is crucial to most applications of support vector machines. In this paper, however, we show that in the case of text classification, term-frequenc...
The amount of available Thai broadcast news transcribed text for training a language model is still very limited, comparing to other major languages. Since the construction of a b...
MetaMap is an online application that allows mapping text to UMLS Metathesaurus concepts, which is very useful interoperability among different languages and systems within the bi...