Most machine learning researchers perform quantitative experiments to estimate generalization error and compare algorithm performances. In order to draw statistically convincing c...
In the context of binary classification, we define disagreement as a measure of how often two independently-trained models differ in their classification of unlabeled data. We exp...