Language model pre-training and transfer learning for very low resource languages
Journal
WMT 2021 - 6th Conference on Machine Translation, Proceedings
Date Issued
2021-01-01
Author(s)
Khatri, Jyotsana
Rudra Murthy, V.
Bhattacharyya, Pushpak
Abstract
This paper describes our submission for the shared task on Unsupervised MT and Very Low Resource Supervised MT at WMT 2021. We submitted systems for two language pairs: German ↔ Upper Sorbian (de ↔ hsb) and German ↔ Lower Sorbian (de ↔ dsb). For de ↔ hsb, we pretrain our system with the MASS (Masked Sequence to Sequence) objective and then finetune it using iterative back-translation. We perform a final finetuning step on the provided parallel data with the translation objective. For de ↔ dsb, no parallel data is provided in the task, so we use the final de ↔ hsb model as the initialization of the de ↔ dsb model and train it further with iterative back-translation, using the same vocabulary as in the de ↔ hsb model.
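The iterative back-translation mentioned in the abstract can be illustrated with a minimal sketch of a single round, assuming hypothetical helpers translate and train_on_pairs that stand in for the seq2seq model's decoding and training steps; this shows the general technique only, not the authors' actual implementation.

from typing import Dict, List, Tuple

def translate(model: Dict, sentences: List[str], direction: str) -> List[str]:
    # Hypothetical stand-in for decoding with the current seq2seq model.
    return [f"<{direction}> {s}" for s in sentences]

def train_on_pairs(model: Dict, pairs: List[Tuple[str, str]], direction: str) -> Dict:
    # Hypothetical stand-in for one pass of supervised training on synthetic pairs.
    model["updates"] = model.get("updates", 0) + len(pairs)
    return model

def back_translation_round(model: Dict, mono_de: List[str], mono_hsb: List[str]) -> Dict:
    # Improve de->hsb: back-translate monolingual hsb into synthetic de,
    # then train on (synthetic de, genuine hsb) pairs.
    synthetic_de = translate(model, mono_hsb, direction="hsb-de")
    model = train_on_pairs(model, list(zip(synthetic_de, mono_hsb)), direction="de-hsb")

    # Improve hsb->de: back-translate monolingual de into synthetic hsb,
    # then train on (synthetic hsb, genuine de) pairs.
    synthetic_hsb = translate(model, mono_de, direction="de-hsb")
    model = train_on_pairs(model, list(zip(synthetic_hsb, mono_de)), direction="hsb-de")
    return model

Repeating this round alternates the two directions, so the synthetic training data is regenerated with progressively better models as training proceeds.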