This paper describes a new method for extracting open compounds (uninterrupted sequences of words) from text corpora of languages such as Thai, Japanese, and Korean that exhibit un...
DNA microarray experiments generate thousands of gene expression measurements simultaneously. Analyzing differences in gene expression between cell and tissue samples is useful in dia...
We consider the problem of DUST: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...
The sounds generated by a writing instrument provide a rich and under-utilized source of information for pattern recognition. We examine the feasibility of recognition of handwrit...
Following the advent of Internet technology and the rapid growth of its applications, users have spent long periods of time browsing through the ocean of information found in ...