In practice, learning from data is often hampered by the limited training examples. In this paper, as the size of training data varies, we empirically investigate several probabil...
This paper presents experiments on classifying web pages by genre. Firstly, a corpus of 1539 manually labeled web pages was prepared. Secondly, 502 genre features were selected ba...
Several solutions have been proposed to exploit the availability of heterogeneous sources of biomolecular data for gene function prediction, but few attention has been dedicated t...
The problem of automatically classifying the gender of a blog author has important applications in many commercial domains. Existing systems mainly use features such as words, wor...
In the last decade there has been an explosion of interest in mining time series data. Literally hundreds of papers have introduced new algorithms to index, classify, cluster and s...