Real-life datasets often suffer from class imbalance, which thwarts the supervised learning process. In such datasets, examples of the positive (minority) class are signific...
Enriching speech recognition output with sentence boundaries improves its human readability and enables further processing by downstream language processing modules. We have const...
Yang Liu, Nitesh V. Chawla, Mary P. Harper, Elizab...
In this paper we examine a novel approach to the difficult problem of querying video databases using visual topics with few examples. Typically with visual topics, the examples a...
Jelena Tesic, Apostol Natsev, Lexing Xie, John R. ...
One problem of data-driven answer extraction in open-domain factoid question answering is that the class distribution of labeled training data is fairly imbalanced. This imbalance...
Michael Wiegand, Jochen L. Leidner, Dietrich Klako...
Background: When analysing microarray and other small sample size biological datasets, care is needed to avoid various biases. We analyse a form of bias, stratification bias, that...