Evaluating machine translation systems and their employed methods and techniques has a long and controversial history, and has been debated continuously and intensely over the past 60+ years. The debate has centred mainly on the volatile concept of translation output quality and its associated metrics, giving rise to terms ranging from "good enough" (for a particular purpose) to "high quality" (comparable with human translation), typically within two application scenarios: in-bound deployment, i.e. enterprise-internal use, and out-bound deployment, i.e. enterprise-external use. To date, no consensus has been reached on how to effectively and efficiently rate and rank translation automation systems.
Today, mainstream systems employ complex mathematical computations to "predict" probable translation results. Prominent examples are Google Translate and Microsoft's Bing Translator, and many other systems employed by large and small Language Service Providers are built on the open-source MOSES platform. The underlying mathematical models of all these systems identify various types of correlations in huge amounts of mono-, bi- and multilingual data and treat them as indicators of highly probable substitutions between (language) expressions. These statistical models are the so-called language and translation models. Adopting such systems marks a shift from the causal analysis of translation through exact, hand-crafted (linguistic) rules (the why) to the statistical search for patterns and correlations (the what) in very large data sets.
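To make the division of labour between these two models concrete, one classical formulation (the noisy-channel model), on which phrase-based systems such as MOSES historically build, can be sketched as follows; this is an illustrative simplification rather than the exact scoring function of any particular product. Here $f$ denotes the source-language sentence and $e$ a candidate target-language sentence:

$$\hat{e} \;=\; \operatorname*{arg\,max}_{e} \, P(e \mid f) \;=\; \operatorname*{arg\,max}_{e} \, \underbrace{P(f \mid e)}_{\text{translation model}} \cdot \underbrace{P(e)}_{\text{language model}}$$

The translation model $P(f \mid e)$ is estimated from bilingual (parallel) data, the language model $P(e)$ from monolingual data, and the system outputs the candidate sentence that maximises the product of the two.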
Although statistical systems have had a very large economic impact because they reduce development time and resource consumption, the older rule-based systems, such as Systran and Lucy LT, are still in use, and some of them have even been partly extended to also account for the statistical approach, misleadingly termed "hybrid systems."
Currently, the European Commission, an inherently multilingual institution and an early adopter of automated translation systems (dating back to the 1970s), is in the process of developing its own statistical machine translation system to cover the ever-increasing number of language pairs of the European Union. It has decided to follow the statistical approach because of its extensibility and cost-effectiveness as well as its maintainability over a system's entire lifecycle.
In this proseminar, we will shed light on the evaluation of automated translation from the user, system and quality perspectives, with a specific emphasis on the challenges for translators who are surrounded by different kinds of technologies, from translation memories to the post-editing of output generated by statistical predictions. You will also learn how to live and work in interdisciplinary, global team environments within our hyper-connected world.