Interested in machine translation research? Read R&D Director Franck Burlot's most recent articles.

R&D Director Franck Burlot, working with researchers from several labs and universities in Europe, submitted two research papers to the annual Workshop on Machine Translation (WMT) conference. Both were selected for presentation to a panel of academic and industry representatives from around the world.

Machine translation depends on continuous research and development to stay ahead of the curve and offer state-of-the-art automated translation solutions to our financial clients. Lingua Custodia dedicates significant time and resources to continually improving the quality of what we do.

The published articles are listed below:

“Using Monolingual data in Neural Machine Translation: a Systematic Study”

by Franck Burlot, R&D Director, and François Yvon, LIMSI CNRS

Abstract

Neural Machine Translation (MT) has radically changed the way systems are developed. A major difference with the previous generation (Phrase-Based MT) is the way monolingual target data, which often abounds, is used in these two paradigms. While Phrase-Based MT can seamlessly integrate very large language models trained on billions of sentences, the best option for Neural MT developers seems to be the generation of artificial parallel data through back-translation – a technique that fails to fully take advantage of existing datasets. In this paper, we conduct a systematic study of back-translation, comparing alternative uses of monolingual data, as well as multiple data generation procedures. Our findings confirm that back-translation is very effective and give new explanations as to why this is the case. We also introduce new data simulation techniques that are almost as effective, yet much cheaper to implement.

Full article on the WMT18 conference website:

http://www.statmt.org/wmt18/pdf/WMT015.pdf

 

“The WMT18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English”

by Franck Burlot, Yves Scherrer, Vinit Ravishankar, Ondřej Bojar, Stig-Arne Grönroos, Maarit Koponen, Tommi Nieminen and François Yvon

Abstract

Progress in the quality of machine translation output calls for new automatic evaluation procedures and metrics. In this paper, we extend the Morpheval protocol introduced by Burlot and Yvon (2017) for the English-to-Czech and English-to-Latvian translation directions to three additional language pairs, and report its use to analyze the results of WMT 2018’s participants for these language pairs. Considering additional, typologically varied source and target languages also enables us to draw some generalizations regarding this morphology-oriented evaluation procedure.

Full article on the WMT18 conference website:

http://www.statmt.org/wmt18/pdf/WMT060.pdf

 

 

 
