Sciweavers

LREC
2008

Cost-Sensitive Learning in Answer Extraction

13 years 6 months ago
Cost-Sensitive Learning in Answer Extraction
One problem of data-driven answer extraction in open-domain factoid question answering is that the class distribution of labeled training data is fairly imbalanced. This imbalance has a deteriorating effect on the performance of resulting classifiers. In this paper, we propose a method to tackle class imbalance by applying some form of cost-sensitive learning which is preferable to sampling. We present a simple but effective way of estimating the misclassification costs on the basis of the class distribution. This approach offers three benefits. Firstly, it maintains the distribution of the classes of the labeled training data. Secondly, this form of meta-learning can be applied to a wide range of common learning algorithms. Thirdly, this approach can be easily implemented with the help of state-of-the-art machine learning software.
Michael Wiegand, Jochen L. Leidner, Dietrich Klako
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where LREC
Authors Michael Wiegand, Jochen L. Leidner, Dietrich Klakow
Comments (0)