Project Description

Our goal: to investigate translation variation, focusing on textual and lexico-grammatical variation brought about by translation production type (human vs. machine vs. computer-aided translation vs. post-edited translation)

Resources: a translation corpus which can be analysed for variation phenomena in terms of language typology, contrastive text/register typology and translation production types (machine vs. computer-aided vs. human).

This corpus will allow analysis of VARIATION across:
--> LANGUAGES: English vs. German
--> MODES OF PRODUCTION: original vs. translation
--> PRODUCTION TYPES: human vs. CAT vs. MT and PET?
--> REGISTERS: essays vs. speeches vs. manuals, etc.
Result: features specific for
- languages
- originals and translations
- various translation production types
- various registers
Applications:
- contrastive linguistics
- translatology and translator training
- CAT tools development
- MT (quality and error analysis)
- further ???
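
The four dimensions of variation listed above could be recorded as per-text metadata. The following is only a minimal sketch under stated assumptions, not the project's actual annotation scheme: the class name `CorpusText`, the field names, the value labels and the helper `count_by` are all hypothetical, and PET is included only tentatively, since the list above marks it with a question mark.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical value sets, derived from the dimensions listed above.
LANGUAGES = {"EN", "DE"}
MODES = {"original", "translation"}
PRODUCTION_TYPES = {"human", "CAT", "MT", "PET"}  # PET is tentative


@dataclass(frozen=True)
class CorpusText:
    """Metadata for one text in the corpus (illustrative only)."""
    text_id: str
    language: str
    mode: str
    register: str  # open set: essay, speech, manual, ...
    production_type: Optional[str] = None  # assumed to apply to translations only

    def __post_init__(self) -> None:
        if self.language not in LANGUAGES:
            raise ValueError(f"unknown language: {self.language}")
        if self.mode not in MODES:
            raise ValueError(f"unknown mode: {self.mode}")
        if self.mode == "translation":
            if self.production_type not in PRODUCTION_TYPES:
                raise ValueError("a translation needs a known production type")
        elif self.production_type is not None:
            raise ValueError("an original has no production type")


def count_by(texts: Iterable[CorpusText], dimension: str) -> Counter:
    """Count texts per value of one metadata dimension, e.g. 'register'."""
    return Counter(getattr(text, dimension) for text in texts)


if __name__ == "__main__":
    sample = [
        CorpusText("t1", "EN", "original", "essay"),
        CorpusText("t2", "DE", "translation", "essay", "human"),
        CorpusText("t3", "DE", "translation", "essay", "MT"),
    ]
    print(count_by(sample, "mode"))
    print(count_by(sample, "production_type"))
```

Keeping every dimension as a separate field means that any sub-corpus (for example, all machine-translated German manuals) can be selected by filtering on metadata alone.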


Phase 1 (June 2012 - May 2013):

Building a Corpus for Studying Variation in Translation

In the present project, we aim to investigate translation variation by means of qualitative and quantitative methods derived from register analysis, translation analysis and corpus linguistics, concentrating especially on the textual and lexico-grammatical variation influenced by machine and computer-aided translation. For this purpose, we are creating a translation corpus: a collection of texts which can be analysed for variation phenomena in terms of language typology, contrastive text typology and process types (machine vs. computer-aided vs. human). These phenomena are reflected in linguistic features of translated texts which belong to different registers and were produced by different translation processes.
To our knowledge, none of the existing corpus resources enables this kind of analysis. There are corpora built to serve similar tasks, e.g. EUROPARL (Koehn, 2005) or DARPA-94 (White et al., 1994). However, they support only partial analysis, as they contain texts translated either by machine translation systems or by humans, and do not contain any variants produced with the help of computer-aided tools. Therefore, it is important to create resources suitable for our research goals.

Project Team


This project is supervised by Prof. Dr. Elke Teich