
Multi-domain Neural Machine Translation

Minh-Quang Pham
TLP - Traitement du Langage Parlé
LISN - Laboratoire Interdisciplinaire des Sciences du Numérique, STL - Sciences et Technologies des Langues
Abstract: Today, neural machine translation (NMT) systems constitute the state of the art in machine translation. However, such models require relatively large training data and struggle with texts from specific domains. A domain may consist of texts on a particular topic or texts written for a particular purpose. While an NMT system can be adapted for better translation quality in a target domain given a representative training corpus, this adaptation has adverse side effects, including brittleness against out-of-domain examples and "catastrophic forgetting" of domains represented in the original training data. Moreover, in real applications one translation system must cope with many possible domains, making the "one domain, one model" approach impractical. A promising solution is to build multi-domain NMT systems, trained on data from many domains and adapted to multiple target domains. The rationale is twofold: first, larger training corpora improve the generalization of the NMT system; second, texts from one domain can be valuable for adapting an NMT model to a similar domain. Data scarcity and the hypothetical positive transfer effect are also the two main reasons for building multilingual NMT systems: maintaining multiple bilingual MT systems requires substantial hardware resources, as the number of language pairs grows quadratically with the number of languages. Both multi-domain and multilingual NMT systems are therefore essential for saving resources in the MT industry and for improving the quality of MT services. This thesis first unifies domain adaptation and multi-domain adaptation in one mathematical framework. In addition, we review the literature on (multi-)domain adaptation through a structural approach, identifying four principal application cases and matching previous methods to each. Second, we propose a novel multi-criteria evaluation of multi-domain approaches: we spell out the requirements for a multi-domain system and perform an exhaustive comparison of a large set of methods. We also propose new methods for multi-domain adaptation, including sparse word embeddings, sparse layers, and gated residual adapters, which are cheap and able to handle many domains. To balance the heterogeneity of the training data, we explore and study techniques for dynamic data sampling, which iteratively adapt the training distribution toward a pre-determined test distribution. Finally, we investigate context-augmented translation approaches, which reuse similar translation memories to improve the translation of a sentence. We carefully analyze and compare several methods in this line and demonstrate that they are suitable for adapting an NMT system to an unknown domain at the expense of additional computational cost.
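The dynamic data sampling mentioned above can be illustrated with a common variant, temperature-based domain sampling, which up-samples small domains when drawing training batches. This is a minimal sketch of the general technique, not the exact schedule studied in the thesis; the corpus sizes and temperature value below are illustrative assumptions.

```python
import random

def domain_sampling_probs(sizes, temperature=5.0):
    """Smoothed sampling probabilities over domains.

    Each domain's probability is proportional to p_d ** (1/T), where p_d
    is the domain's share of the training data. T = 1 reproduces the
    empirical distribution; larger T moves it toward uniform, so small
    domains are sampled more often than their raw share would suggest.
    """
    total = sum(sizes)
    weights = [(s / total) ** (1.0 / temperature) for s in sizes]
    z = sum(weights)
    return [w / z for w in weights]

# Three hypothetical domains with very different corpus sizes.
sizes = [1_000_000, 100_000, 10_000]
probs = domain_sampling_probs(sizes, temperature=5.0)

# Draw the domain for the next training batch according to these weights.
domains = ["news", "medical", "it"]
next_domain = random.choices(domains, weights=probs, k=1)[0]
```

A dynamic scheme would additionally re-estimate the temperature or the per-domain weights during training, e.g. based on held-out loss per domain, so that the sampling distribution converges toward the intended test distribution.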
Submitted on : Friday, January 28, 2022 - 11:19:25 AM
Last modification on : Sunday, January 30, 2022 - 3:28:46 AM
Long-term archiving on: Friday, April 29, 2022 - 6:33:56 PM


Version validated by the jury (STAR)


  • HAL Id : tel-03546910, version 1


Minh-Quang Pham. Multi-domain Neural Machine Translation. Artificial Intelligence [cs.AI]. Université Paris-Saclay, 2021. English. ⟨NNT : 2021UPASG109⟩. ⟨tel-03546910⟩


