We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table– a data structure that better lends itself to high-level...
We present an implicit discourse relation classifier in the Penn Discourse Treebank (PDTB). Our classifier considers the context of the two arguments, word pair information, as we...
This paper describes a validation approach of a socio-technical design support system using data mining techniques. Bayesian Belief Networks (BBN) are used to assess human error an...
Andreas Gregoriades, Alistair G. Sutcliffe, Harala...
If the dataset available to machine learning results from cluster sampling (e.g. patients from a sample of hospital wards), the usual cross-validation error rate estimate can lead...
This paper explores the application of the Minimum Description Length principle for pruning decision trees. We present a new algorithm that intuitively captures the primary goal o...