This paper describes the application of the PARADISE evaluation framework to the corpus of 662 human-computer dialogues collected in the June 2000 Darpa Communicator data collecti...
Marilyn A. Walker, Rebecca J. Passonneau, Julie E....
We describe a dataset containing 16,000 translations produced by four machine translation systems and manually annotated for quality by professional translators. This dataset can ...
A new framework is proposed for comparing evaluation metrics in classification applications with imbalanced datasets (i.e., the probability of one class vastly exceeds others). Fo...
Mehrdad Fatourechi, Rabab K. Ward, Steven G. Mason...
This paper presents METEOR-NEXT, an extended version of the METEOR metric designed to have high correlation with postediting measures of machine translation quality. We describe c...
—The importance of introducing distance constraints to data dependencies, such as differential dependencies (DDs) [28], has recently been recognized. The metric distance constrai...