In this paper, we describe a new approach for mining concept associations from large text collections. The concepts are short sequences of words that occur frequently together acr...
This paper describes a method of detecting Japanese Katakana variants from a large corpus. Katakana words, which are mainly used as loanwords, cause problems with information retr...
This paper presents a framework for user-oriented text mining. The framework is illustrated with an example of discovering knowledge from competitors’ websites. The knowledge to be di...
It is already known that parallel multiple context-free grammar (PMCFG) [1] is an instance of the equivalent formalisms simple literal movement grammar (sLMG) [2,3] and r...
Automated Text Categorization has reached the levels of accuracy of human experts. Provided that enough training data is available, it is possible to learn accurate autom...