Word-based compression over natural language text has been shown to offer a good trade-off between compression ratio and speed, obtaining compression ratios close to 30% and very fast decom...
We propose an HMM-based text-indicated writer verification method built on a challenge-and-response authentication process. In this method, a different text incl...
Automatic text chunking aims to recognize phrase structures in natural language text. It is a key technology of knowledge-based systems where phrase structures pro...
There is a close relationship between formal language theory and data compression. Since the 1990s, various types of grammar-based text compression algorithms have been introduced...
Due to the great variation of biological names in biomedical text, appropriate tokenization is an important preprocessing step for biomedical information retrieval. Despite its im...