Accessing the structured content of PDF document is a difficult task, requiring pre-processing and reverse engineering techniques. In this paper, we first present different methods...
The world wide web has a wealth of information that is related to almost any text classification task. This paper presents a method for mining the web to improve text classificati...
Graph algorithms are becoming increasingly important for solving many problems in scientific computing, data mining and other domains. As these problems grow in scale, parallel c...
Andrew Lumsdaine, Douglas Gregor, Bruce Hendrickso...
We present a framework to extract the most important features (tree fragments) from a Tree Kernel (TK) space according to their importance in the target kernelbased machine, e.g. ...
Background: Manual curation of biological databases, an expensive and labor-intensive process, is essential for high quality integrated data. In this paper we report the implement...