Newsletter – July 2021
A few months ago, the Lingua Custodia Lab submitted a research paper to the ACL (Association for Computational Linguistics). The paper, entitled “Encouraging Neural Machine Translation to Satisfy Terminology Constraints”, was accepted last May in the “Findings” category. This new recognition confirms the Lab’s position as a leader in the field of NLP. Here, Melissa Ailem, researcher and NLP expert at the Lab, tells us more about the objectives of this research.
The Association for Computational Linguistics (ACL) is the premier international scientific and professional society for people working on computational problems involving human language, a field often referred to as Natural Language Processing (NLP). It rewards the best research papers in computational linguistics worldwide and is a significant reference for experts in NLP and machine translation.
Neural translation models are the new state of the art in machine translation. They can generate excellent translations of generic texts. However, when we want to translate specialised texts (in our case, finance), difficulties arise: neural models do not allow us to create explicit links between source terms and their target translations. As a result, we do not know how to “impose” the translation of one term by another using neural translation models.
This is particularly problematic when translating texts from specialised domains such as finance, where specific terminology must be respected to produce appropriate translations. The objective of our research paper is to propose a solution to this problem: a new method for integrating terminology constraints into neural translation models. It is based on two main components: first, an augmentation of the training data; second, a modification of the objective function (cross-entropy) so that it takes the constraints into account. Our results show that the proposed method respects the terminology constraints and generates better translations than traditional neural methods.
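To make the two components more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the implementation from the paper: the function names, the `<t>` and `</t>` tags, the single-word source terms and the `alpha` weighting scheme are all hypothetical choices made for this example.

```python
import math


def annotate_source(source_tokens, lexicon, start="<t>", end="</t>"):
    """Data-augmentation sketch: after each source term found in the
    lexicon, insert the required target term between two tags.

    Assumption: source terms are single tokens; tags are hypothetical.
    """
    annotated = []
    for token in source_tokens:
        annotated.append(token)
        if token in lexicon:
            annotated.extend([start, *lexicon[token], end])
    return annotated


def weighted_cross_entropy(token_probs, is_constraint, alpha=1.0):
    """Objective-function sketch: mean negative log-likelihood in which
    target tokens belonging to a terminology constraint are weighted by
    (1 + alpha) instead of 1.

    token_probs   -- probability the model assigns to each reference token
    is_constraint -- True where the reference token is part of a constraint
    alpha         -- hypothetical extra weight for constraint tokens
    """
    total = 0.0
    for prob, constrained in zip(token_probs, is_constraint):
        weight = 1.0 + alpha if constrained else 1.0
        total += -weight * math.log(prob)
    return total / len(token_probs)


# Usage: an English source sentence with one imposed French term.
lexicon = {"bond": ["obligation"]}
source = "the bond matures in 2030".split()
print(annotate_source(source, lexicon))

# The same predictions cost more when a constraint token is poorly predicted.
print(weighted_cross_entropy([0.9, 0.2, 0.8], [False, True, False]))
```

In this sketch, the annotated sentences would be added to the training data so that the model learns to copy the tagged term, while the weighted loss penalises the model more heavily when it assigns a low probability to a constrained term.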
Over the past ten years, our teams have developed methods to offer the best specialised machine translation engines in finance. Yet, terminology control remains an open field of research. For example, our clients often ask us to impose certain translations on specific terms.
This will soon be possible thanks to our research. Clients will be able to load a bilingual lexicon specific to their company or their team’s needs. The interface will then systematically return the desired translation, respecting the constraints imposed.