Document-level Machine Translation: Recent Progress and The Crux of Evaluation
Rico Sennrich is an SNSF Professor at the University of Zurich working on natural language processing, with a special focus on machine translation and deep learning. His SNSF project focuses on better natural language understanding with multilingual resources and multi-task learning.
He is also a Lecturer at the University of Edinburgh. He is a member of the Machine Translation Group and the Edinburgh NLP Group.
Machine translation (MT) is still predominantly modelled and evaluated at the sentence level, but neural methods have the potential to overcome this limitation and allow effective document-level modelling. However, practical challenges of document-level MT include the lack of suitable training data, the high computational cost of wider-context models, and low reward for "context-aware" translation in automatic metrics.
In his talk, he will discuss recent neural architectures that take wider context into account and address computational and data bottlenecks in different ways, and their evaluation with test sets that target discourse phenomena. While evaluation with automatic metrics such as BLEU is noisy and hard to interpret, he will show that targeted evaluation can guide the development of document-level systems by highlighting the effects of various modelling decisions.