Topics in prior-art patent search are typically full patent applications, and the relevant items are patents, often taken from sources in different languages. Cross language patent retr...
This paper proposes a crossbar-like tree structure for use with Dynamic Markov Compression (DMC) in compressing Chinese text files. DMC had previously been found...
Automated text categorisation systems learn a generalised hypothesis from large numbers of labelled examples. However, in many domains, labelled data is scarce and expensive to obta...
Feature selection plays a vital role in text categorisation. A range of methods has been developed, each with unique properties and each selecting different features. We ...
Finding Contiguous Sequential Patterns (CSP) is an important problem in Web usage mining. In this paper we propose a new data structure, UpDown Tree, for CSP mining. An UpDown Tre...