Sciweavers

Prediction of Learning Curves in Machine Translation

Parallel data in the domain of interest is the key resource when training a statistical machine translation (SMT) system for a specific purpose. Since ad hoc manual translation can represent a significant investment of time and money, a prior assessment of the amount of training data required to achieve a satisfactory accuracy level can be very useful. In this work, we show how to predict what the learning curve would look like if we were to manually translate increasing amounts of data. We consider two scenarios: 1) monolingual samples in the source and target languages are available, and 2) a small amount of parallel corpus is additionally available. We propose methods for predicting learning curves in both scenarios.
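To make the second scenario concrete, here is a minimal sketch of extrapolating a learning curve from a few measured points. It assumes, purely for illustration, that translation quality follows a three-parameter power law y = c - a * x^(-alpha) in the training-set size x; this functional form, the function names `fit_power_law` and `predict`, and all numbers below are assumptions of the sketch, not details taken from the abstract.

```python
def fit_power_law(sizes, scores, alphas=None):
    """Fit y = c - a * x**(-alpha) to (size, score) pairs.

    Assumed curve family (not stated in the abstract). For each candidate
    alpha the model is linear in (c, a), so those two are solved in closed
    form and the alpha with the lowest squared error is kept.
    """
    if alphas is None:
        alphas = [i / 100 for i in range(1, 201)]  # grid 0.01 .. 2.00
    n = len(sizes)
    y_mean = sum(scores) / n
    best = None
    for alpha in alphas:
        # Rewrite the model as y = c + a * z with z = -(x ** -alpha).
        z = [-(x ** -alpha) for x in sizes]
        z_mean = sum(z) / n
        szz = sum((zi - z_mean) ** 2 for zi in z)
        if szz == 0:
            continue  # degenerate: all sizes identical
        a = sum((zi - z_mean) * (yi - y_mean)
                for zi, yi in zip(z, scores)) / szz
        c = y_mean - a * z_mean
        sse = sum((c + a * zi - yi) ** 2 for zi, yi in zip(z, scores))
        if best is None or sse < best[0]:
            best = (sse, c, a, alpha)
    if best is None:
        raise ValueError("need at least two distinct training sizes")
    _, c, a, alpha = best
    return c, a, alpha


def predict(params, size):
    """Predicted score at a given training-set size."""
    c, a, alpha = params
    return c - a * size ** -alpha


# Synthetic observations generated from a known curve (invented values,
# standing in for scores measured on a small parallel corpus).
true_params = (0.35, 2.0, 0.3)
observed_sizes = [1000, 2000, 5000, 10000, 20000]
observed_scores = [predict(true_params, s) for s in observed_sizes]

fitted = fit_power_law(observed_sizes, observed_scores)
print("fitted (c, a, alpha):", fitted)
print("predicted score at 100000 sentences:", predict(fitted, 100000))
```

The grid search over alpha keeps the sketch dependency-free; with real, noisy scores a proper nonlinear least-squares routine would likely be a better choice.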
Prasanth Kolachina, Nicola Cancedda, Marc Dymetman
Added 29 Sep 2012
Updated 29 Sep 2012
Type Journal
Year 2012
Where ACL
Authors Prasanth Kolachina, Nicola Cancedda, Marc Dymetman, Sriram Venkatapathy