Introduction to post-editing

Bartolomé Mesa-Lao
bm.ibc@cbs.dk
Center for Research and Innovation in Translation and Translation Technology
Copenhagen Business School, Denmark
22/05/2013
SEECAT project

This hand-out presents the basic concepts of post-editing in the localization industry.

Aims of the session:
- To acquire basic concepts about post-editing.
- To reflect on the concept of quality in localization.
- To identify different types and levels of post-editing.
- To present general post-editing guidelines.

Contents
1. Introduction: Why post-editing MT outputs?
2. Machine Translation
    2.1. MT integrated in the localization process
3. Basic concepts in post-editing
    3.1. Defining Post-editing
    3.2. Post-editing vs. Translation
    3.3. Post-editing vs. Revision
    3.4. Post-editor profile
    3.5. Pre-editing and controlled language
4. Common MT errors
5. Quality in Translation
    5.1. Quality concepts in Localization
    5.2. Quality of post-edited material: assessment
6. Types of post-editing
    6.1. Fast post-editing
    6.2. Full post-editing
7. General post-editing guidelines
    7.1. Guidelines for fast post-editing
    7.2. Guidelines for full post-editing
8. Post-editing effort and productivity
    8.1. Temporal post-editing effort
    8.2. Cognitive post-editing effort
    8.3. Technical post-editing effort
9. References

1. Introduction: Why post-editing MT outputs?

Is it really necessary for a translator to acquire post-editing skills? If the machine will replace the work of a technical translator, why acquire these new skills? The answer is simple. Technical translators need to acquire these skills, or at least be familiar with the peculiarities of this task, because there is an increasing demand in the market to post-edit texts coming from machine translation (MT) engines in order to attain different levels of quality.

From the industry perspective, there are several reasons for using MT: a) to lower production costs, b) to publish more content, c) to publish into more languages, d) to publish in less time. In a recent survey carried out by TAUS (2010), 52% of the sixty-seven companies surveyed in the US, Europe and Asia declared that they provided post-editing services to their clients on a regular basis, and that 74% of the resources they used to carry out the task were freelance translators.

As MT improves, the role of post-editors might eventually change, but there will still be a need for their involvement in the process of creating automatic output, either by editing the output or by implementing changes to the corpus or engines. For example, post-editors could be involved in selecting an adequate corpus and cleaning up the data so that the output is more suitable for a particular customer, as well as in providing constant feedback to improve the engine's performance.

There is room for translators in this new field, but there is also a need to be prepared and acquire knowledge so that translators can be the most capable resource to carry out these tasks and can contribute to the development of MT and post-editing techniques and guidelines. According to Vasconcellos and León (1985), who led the first post-editing experience at the PAHO (an organization with one of the longest traditions of MT implementation and post-editing), post-editing requires a trained professional translator because only an experienced translator will be aware of the words whose variable meanings depend on extra-linguistic context. Text disambiguation requires the attention of a translator with training, experience, good knowledge of the subject matter, vocabulary in both languages, and a technical understanding of what is meant by the text. They also explained that the post-editor is the professional best placed to give feedback about the engine and to suggest improvements.

Moreover, acquiring post-editing skills might be good practice in translation training. As Kliffer (2008) concludes, following an experiment in which translation students post-edited raw output, post-editing "impressed upon our students the importance of a holistic approach to interpreting the source text and translating the phrase rather than the word". The activity also provided them with a taste of what to expect if they undertake a career in translation. He also remarked that the experience was confidence-building for students and increased their motivation.

In conclusion, training in post-editing not only serves the purpose of acquiring new skills for MT-related tasks but also helps to open up different perspectives on already familiar translation tasks.

2. Machine Translation

The definition of machine translation on the homepage of the European Association of Machine Translation (EAMT) reads:

    Machine translation (MT) is the application of computers to the task of translating
    texts from one natural language to another. One of the very earliest pursuits in
    computer science, MT has proved to be an elusive goal, but today a number of systems
    are available which produce output which, if not perfect, is of sufficient quality to be
    useful in a number of specific domains. (EAMT 2008)

Although the definition is broad, since computers are also used to translate texts in other ways that are not called machine translation, such as translation memories, it reflects the use of MT today. MT should be useful in a number of specific domains but is not necessarily a replacement for human translation. The idea of fully automatic high quality translation (FAHQT) has been replaced by a more practical use of human-aided machine translation (HAMT) within restricted environments.

Machine translation is used more or less successfully in different industries, especially in those that produce large volumes of highly repetitive content (as is the case in the localization industry) that can be easily processed by an engine. MT is frequently associated with controlled language and controlled translation because if technical writers of source texts follow repetitive syntactical patterns, they will facilitate the implementation of MT solutions in a given company, thus increasing its translation capacity and saving costs. Even in this case, not everything is automatic in MT; there is a need for human interaction either before or after the machine has processed the data. The intervention before the machine processes the data is called pre-editing, and it occurs at the source-language level to change language structures so that the machine-translation engine is not confronted with ambiguous options. The intervention after the machine processes the data is called post-editing, and it occurs at the target-language level to correct frequent errors in the machine-translated output. Post-editing is still essential to produce an end-quality product, that is, a product free of the language mistakes frequently found in machine-translated output.

2.1. MT integrated in the localization process

The standard localization workflow consists of a pre-production or analysis phase, a production phase and a post-production phase. During the pre-production phase, files are analyzed to establish the type of files, subject matter, language combination and volume by means of word-counts, thus establishing the complexity of the project. This information serves to calculate the most frequent variables in a localization project: time, cost and quality. The word-counts are frequently done using a computer-aided translation (CAT) tool, such as SDL Trados, MemoQ, Déjà Vu or a client's proprietary tool. Project Managers or Localization Engineers, depending on the size of the agency, carry out word-counts against an existing translation memory (TM) for a specific language combination. This process determines the level of full and fuzzy matches in the text. These figures are used in all the financial transactions of a localization project (quotations, purchase orders and invoices). There are established standards for the different levels of fuzzy matches, and projects are paid and charged according to these standards (even if fuzzy match payment varies somewhat in the market).

In recent years, however, there has been a change in the workflow of localization projects. Many of the main software developers have introduced a new variable: machine translation (MT). The problem arises precisely at this point because, as with any new practice, there is a need to create new processes. These processes, in turn, are based on answers to new questions. How should MT segments be charged and paid? How much time would a translator take to complete the task of post-editing? How should this task be scheduled? What is the corresponding TM fuzzy match value for MT segments? Should the same localizers be used, or is a new professional profile needed?

Machine translation is not generally used in isolation but is included in the same workflow as existing TMs. MT is used in the localization industry as a new form of TM, assumed to be less perfect because it has not been created entirely by human translators, but it is introduced in the same workflow. In this way, translators are asked to use a given CAT tool and download (that is, pretranslate) the existing segments in order to modify or post-edit them. A particular segment could come from a TM or directly from the raw MT output. There is, in fact, a new hybrid model created using a combination of MT and TM segments.

[Figure 1. Current translation workflow for most language service providers (LSPs). Phase 1, translation memories (TMs): the source text (0% translated) is run against the translation memory, producing a hybrid text (x% translated, with retrieved matches only). Phase 2, machine translation (MT): the untranslated segments are machine translated, producing a hybrid text that is 100% translated but contains MT errors. Phase 3, post-editing by humans: a human translator/post-editor produces the target text (100% translated).]

3. Basic concepts in post-editing

In this section we will look at the basic concepts necessary to understand the nature of this task as opposed to other, already frequent tasks in translation/localization.

It is quite common for students and professional translators to be trained (academically) in translation strategies and theories, but it is rarer for them to be trained in revision and post-editing. Therefore, it is advisable to have a clear idea of the tasks involved in post-editing and revision, as well as in translation itself, and to have a basic knowledge of how MT operates. Looking at different concepts will help us to define the task and focus on its execution.

3.1. Defining Post-editing

    Post-editing can be defined as reviewing a pre-translated text generated by an MT
    engine against an original source text, correcting possible errors, in order to comply
    with a set of quality criteria in as few edits as possible (in general).

That is, the post-editor reads the output provided by the MT engine, observes possible errors, checks the original in case of doubt and corrects the text according to the quality originally agreed with the customer. It is important to underline that we are speaking of a set of quality criteria and not a personal idea of translation quality. It is also important that the post-editor performs these changes in as few edits as possible, thus increasing his or her productivity.

Other definitions given by experts in the field of post-editing and revision are: "to edit, modify and/or correct pre-translated text that has been processed by an MT system from a source language into (a) target language(s)" (Allen 2003:296) or "revising the output of a machine translation program", where revising means "the process of checking a draft translation for errors and making appropriate amendments" (Mossop 2001: 168-169, italics in original). In this last definition, we would assume that in post-editing the draft translation is the MT output and that the post-editor has a role similar to the reviser's and therefore carries out similar tasks, mainly checking for errors and correcting them. However, the nature of these errors is different, which makes the post-editor consider other factors.

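The requirement in the definition above to correct the MT output "in as few edits as possible" can be made concrete by counting the edits between the raw MT output and the post-edited text. The following is only a minimal sketch, not a method prescribed by this hand-out: it assumes that one edit equals one word inserted, deleted or substituted (a word-level Levenshtein distance), and the function name `word_edit_distance` and the sample sentences are invented for illustration.

```python
def word_edit_distance(mt_output: str, post_edited: str) -> int:
    """Return the minimum number of word insertions, deletions and
    substitutions needed to turn mt_output into post_edited.

    Assumption for this sketch: words are separated by whitespace and
    every edit has the same cost of 1.
    """
    source = mt_output.split()
    target = post_edited.split()

    # previous[j] = distance between the source words seen so far
    # (initially none) and the first j target words.
    previous = list(range(len(target) + 1))

    for i, src_word in enumerate(source, start=1):
        current = [i]  # turning i source words into nothing costs i deletions
        for j, tgt_word in enumerate(target, start=1):
            substitution_cost = 0 if src_word == tgt_word else 1
            current.append(min(
                previous[j] + 1,                      # delete src_word
                current[j - 1] + 1,                   # insert tgt_word
                previous[j - 1] + substitution_cost,  # keep or substitute
            ))
        previous = current

    return previous[-1]


# Hypothetical raw MT output and its post-edited version.
raw = "the printer are not connect to network"
edited = "the printer is not connected to the network"
print(word_edit_distance(raw, edited))  # prints 3
```

In this invented example the post-editor substitutes two words and inserts one, so the count is 3. A post-editor who rewrote the whole sentence to reach the same quality level would score a much higher count, which is the behaviour the definition discourages.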
Post-editing: examination and correction of the text resulting from an automatic or semi-automatic machine system (machine translation, translation memory) to ensure it complies with the natural laws of grammar, punctuation, spelling and meaning, according to the Draft of European Standard for Translation Services (in Joscelyne 2006).

In this last definition, post-editing also refers to the editing of TM outputs. Although post-editing MT outputs and TM outputs tend to run in parallel, they require different skills, or at least a different focus on different types of errors. We will see this later on when comparing post-editing with revision. Although it is not mentioned in Joscelyne's definition, it is important to highlight that the task of post-editing is closely related to the quality expectations set within a project.

3.2. Post-editing vs. Translation

Now that we have a definition of post-editing, how does post-editing differ from translation? And how is post-editing related to translation?

There are many theories that give different definitions of translation, such as the traditional, functionalist or communicative approaches. However, translation is seen in localization as an individual step in which the source text is given an equivalent target text. EN-15038 (the European quality standard for translation services) defines translation as the rendering of the written text in the source language into the target language. On many occasions this is only one single string of source text rendered into another string of target text. Translation, as most of us understand it, is something broader and more sophisticated, encompassing an in-depth knowledge of each language and culture in order to communicate the same meaning in both languages. In the localization industry, however, a simpler concept is used.

In technical translation, the standard translation process is as follows: translators translate the source text using a substantial amount of given reference material (style guides, glossaries, dictionaries, term banks and TMs). Then, they will or should revise their work and correct any possible mistakes. Finally, if the budget allows, a reviewer will go over the translation again and check issues to do with language (including specific terminology), transfer and layout.

The difference at this point is that, during the post-editing task, the translator already has a draft translation of the source text (the MT output) and, depending on the quality provided by the MT engine, the output might require a) translating again from scratch (if it is not useful), b) correcting quite a lot of errors, c) correcting a few errors or d) simply accepting the proposal without any change. The post-editor is therefore faced with two source texts (the actual source and the MT proposal). In this sense, post-editing is closer to reviewing than to translating. During this process, translators will use known translation and revision strategies and also new strategies (described later on) for this type of text.

On occasions, post-editing can be done by a native speaker who does not speak the source language and simply revises the target text so that it conforms to the language and layout rules of the target language. This approach can be extremely dangerous, as the monolingual reviewer might try to decipher obscure passages from the MT output and simply choose the wrong alternative. The source text does help to clarify obscure MT output, and therefore a bilingual reviser is essential.

In conclusion, post-editing is one of the many tasks that a translator can perform and that belong to the realm of translation, but it is not actually translating, although the post-editor might have to translate an entire sentence when the MT proposal has to be discarded.

3.3. Post-editing vs. Revision

According to Brian Mossop (2007), revising is "that function of professional translators in which they identify features of the draft translation that fall short of what is acceptable and make appropriate corrections and improvements".

In a commercial setting, revising is carried out in order to improve texts, supervise the quality produced by contractors, and check work done by new employees or contractors. Sometimes this step is not carried out at all because of time or budget constraints, and sometimes because the process is already defined as such and it is deemed more efficient not to revise. Although EN-15038 specifies that the revision needs to be carried out by a third party, not all translation companies follow this standard. The fall in the price of translation has also contributed greatly to the elimination of this quality step.

Post-editing also involves revising, but the main difference is the origin of the text being revised: in post-editing the text comes from an MT engine (output), whereas in revising it is a translation done by a human translator.

As a consequence, the resulting target text contains different types of errors from those found in a human translation. These errors will need to be corrected in a different way depending on the purpose of the text. As Laurian (1984) states, "post-editing is not revision, nor correction, nor rewriting. It is a new way of considering a text, a new way of working on it, for a new aim."

Krings (2001), who has carried out the most comprehensive post-editing research to date, also points out that this task deals with recurring, predictable errors, while revising checks for mistranslations or omissions. Later on we will see the most frequent errors found in raw output, but in general terms, the errors made by a human translator are random and unpredictable, while MT errors follow certain patterns that can be anticipated according to the language combination, the type of text and the engine used. On some occasions human errors are more difficult to spot, but at the same time the texts are easier to read as they follow a human logic. Post-editing involves revising a text that might follow an odd syntactical structure. This type of text puts a strain on the reviewer that is quite different from the effort required to revise human translations. As Krings points out, "working with three different texts in the post-editing situation with source text (source text, machine translation, and the subject's own target text) leads to an additional cognitive load vis-à-vis normal translation with only two texts involved". In conclusion, the task of post-editing appears to be more demanding than translation in terms of cognitive effort.

What seems to be clear is that both revising and post-editing require specific skills, and that translators are key agents in both activities.

3.4. Post-editor profile

After analyzing what post-editing is and how it differs from other translation-related tasks, the natural next step is to look into the profile needed to carry out the task and how it differs from the requirements for a translator.

As we saw before, most translation agencies use their regular pool of freelance translators to post-edit MT outputs. Each company has its own set of prerequisites when recruiting freelance translators, such as: a) being a native speaker, b) a degree in translation or related subjects, c) a certain amount of experience as a translator, d) experience in the subject matter, e) experience with a set of tools and other technology-related requirements. Once freelancers are selected as possible candidates, they usually take a series of tests and fill in questionnaires related to their experience, and finally they are tested on the job.

Are these the same requirements needed for a post-editor? Not all freelance translators make efficient post-editors, not even all efficient freelance translators. Obviously, those freelance translators who stand out in the performance of their tasks are more likely to meet the requirements of a good post-editor than those who already fall short of, or barely meet, the company's expectations in terms of translation quality.

O'Brien (2002) describes some of the post-editing skills required, adding the views of other experts in the field to her own. The skills can be summarized as follows:
- Degree in Translation and Interpreting or related subjects.
- Previous experience in localization and/or technical translation.
- Expertise in the subject area and target language.
- Proficient knowledge of the source language and contrastive knowledge of source and target languages.
- Advanced word processing skills; full keyboard proficiency and efficiency in cursor positioning. Effective use of search and replace functions.
- Positive, tolerant and open-minded predisposition towards MT.
- Confidence in abilities and technical expertise.
- Recognition of typical or repetitive MT errors.
- Ability to use macros and coded dictionaries.
- Advanced terminology management skills.
- Background knowledge of MT technology and history, including types of post-editing and different levels of expected quality.
- Pre-editing and controlled language skills. Knowledge of controlled authoring tools.
- Programming skills (for automatically correcting errors).
- Text linguistics knowledge.

Some of these skills are shared with those of a translator. However, there are additional skills, such as knowledge of and tolerance towards MT technology, pre-editing and controlled language skills, or programming skills, that are not normally required when looking for translators to take part in a post-editing project.

3.5. Pre-editing and controlled language

There are several pre-editing techniques that help to reduce the post-editing effort. These are: following a style guide (technical writers), controlled terminology (using a set of unique terms when writing) and controlled language.

Controlled language means that the source text (e.g. a technical text) is written in a standard way to avoid lexical ambiguity and complex grammatical structures, thus making it easier for the user to read and understand and consequently easier to process with technology such as TM or MT. As a consequence, texts have a consistent and direct style, can be easily reused, are easier and cheaper to translate, and are easier to read. Controlled language focuses mainly on vocabulary and grammar, and it is intended for very specific domains, even for specific companies. It is useful not only for creating high-quality MT output but also for making full use of existing TMs (avoiding segments that miss a full match only because of minor or unnecessary lexical or syntactical changes throughout a text). Basically, controlled language helps to disambiguate terms and sentences by keeping a very high level of consistency both externally (terms) and internally (grammatical structure).

In order to use controlled language, writers follow certain rules when writing so that errors do not have to be corrected at the post-editing phase. In a way, using controlled language means optimizing the whole process so as to make better use of MT.

It is useful for translators to be familiar with these controlled language rules in order to understand the possible problems the MT engine will face and therefore spot output errors more rapidly.

General rules of controlled language include (Mitamura 1999, Rico and Torrejón 2004):
- Write short sentences.
- Use simple grammatical structures: for example, avoid complex and ambiguous subordinate sentences.
- Use sentences with nouns rather than pronouns.
- For the same process, step or idea, write the same sentence.
- Write complete sentences with noun, verb and complements.
- Avoid the use of gerunds and participles, for example "-ing" forms after "when", "while", "if" or "where", or participles not introduced by "that".
- Avoid strings of more than three nouns.
- Avoid too many adjectives modifying a noun.
- Use determiners.
- Avoid spelling mistakes and make sure punctuation is correct.
- Use the active voice.
- Use "that", "in order to" and "which" after verbs that admit their omission.
- When using phrasal verbs, make sure that the preposition is as close to the verb as possible.
- Repeat prepositions in conjoined constructions.
- Use parallel structures in coordinated sentences.
- Always use the same term for the same item/product: avoid synonyms.
- Use general dictionary terms rather than obscure terms.
- Use acronyms and abbreviations that will not cause ambiguity.

For example:
    Instead of: When reading this text, make sure to take notes.
    Write: When you are reading this text, make sure that you take notes.

The consistency of the source text guarantees a smooth process when using MT or TMs and reduces costs for the companies that use it. Additionally, it saves translators from constantly having to query obscure passages in the text.

However, controlled language is not always applied to the source texts that will then be machine translated and eventually post-edited. Although the post-editing time is reduced considerably, the initial investment required to apply controlled language is high, and therefore companies might skip this step. Post-editors will find that a vast number of the texts they work with have neither been written in controlled language nor been pre-edited.

4. Common MT errors

There are several classifications of MT errors. The aim of classifying the errors is not only to improve MT output by providing feedback but also to raise awareness amongst post-editors. If they know the types of errors frequently found when performing this task, it is easier to spot them and to know what to change, thus avoiding unnecessary changes.

It is important to point out that the types of errors might change considerably depending on the type of engine, the content and the language pair. These are just examples of errors and of error typologies.

Laurian (1984) distinguishes between three types of errors:
1. Errors on isolated words.
2. Errors on the expression of relations.
3. Errors on the structure and on the information display.

These errors are subsequently classified in three tables:
    1.1. Vocabulary, terminology
    1.2. Proper names and abbreviations
    1.3. Relators: in nominal groups and in verbal groups
    1.4. Noun determinants, verbal modificators
    2.1. Verb forms (tense)
    2.2. Verb forms (passive/active)
    2.3. Expression of modality or not
    2.4. Negation
    3.1. Logical relations, phrase introducers
    3.2. Word order
    3.3. General problems of incidence

Schäffer (2003) from SAP offers the following error classification:
1. Lexical errors
    1.1. General vocabulary
        1.1.1. Function words (articles, pronouns, conjunctions)
        1.1.2. Other categories (verbs, nouns, adjectives)
    1.2. Terminology
    1.3. Homographs/polysemic words (words like "uses", "report" and "starts")
    1.4. Idioms (MT systems will tend to translate them literally)
2. Syntactic errors
    2.1. Sentence/clause analysis (wrong analysis of structures, relative pronouns, use of commas)
    2.2. Syntagmatic structures (wrong interpretation of past participle, for example)
    2.3. Word order
3. Grammatical mistakes (for example, the translation of the pronoun "it", gender in the Romance languages, or phrasal verbs: "carry out" rendered as "porter dehors" in French instead of "exécuter")
    3.1. Tense
    3.2. Number
    3.3. Active/passive voice
4. Errors due to defective input text (mistakes in the source language)

Krings (2001), along similar lines, classifies the errors found in the MT output of his study as shown below. The classification is not intended to be general; it applies to his particular output. However, it is useful to see how errors were classified in this extensive study.
- Lexical: Part of speech recognition error: verbs recognized as nouns or vice versa.
- Lexical: Other: wrong use of certain terms in the context.
- Morphology: Word formation: wrong formation of words. For example, "Drähten des Telefons" instead of "Telefondrähte".
- Morphology: Other: incorrect infinitive form, incorrect plural form.
- Syntax: Word order
- Syntax: Other: wrong use of infinitives
- Stylistic usage norms
- Punctuation: incorrect comma usage
- Textual coherence: incorrect gender of anaphoric reference form, inconsistent form of address for text addressees ("Du" and "Sie")
- Textual pragmatics: inappropriate form of address for text addressees.
- Literal transfer from ST

He rightly points out that several MT errors can overlap; each error can sometimes be assigned to different categories.

Although all these classifications are valid for the specific purposes of a particular engine or project, for the sake of simplicity and practicality we will look at examples of errors classified in four main areas (similar to Schäffer's classification):

word for word translation, the source text and equivalence, the target text and the receiving reader/culture, the communication act and the role of translator as mediator, the purpose (skopos) of the translation, or even the mental state of the translator and her cognitive processes. Of course, every theory draws from the previous one and they all seem to live together, not altogether in harmony, but at least in constant development through these same differences.

It is obvious, then, that a good translation will be classified differently depending on the translation theory. What might appear to be good for one theory might not be sufficient, and is sometimes completely wrong, for another. Quality is therefore an obscure and elusive concept. In MT the predominant theory, as Chesterman (2000) reflects, is equivalence in its purest form: "strict equivalence is a sine qua non. Instead of waffling about mystical energy, practitioners of machine translation are concerned with practical rules of language use. They have to believe that rules exist, and that they are as stable as those of gravity." Pym (2004) also points out that equivalence is the prevailing translation theory behind all processes in localization.

And it is not only in translation theory that we find divergent points of view. It seems that professionals in the translation field have their own very particular view of what a good translation is, and sometimes, if they are asked about it ("What is quality for you?"), it is hard for them to come up with a definition.

When translation is a transfer of a source string into a target string with the fewest possible changes and at maximum speed, equivalence becomes the prevailing concept in any translator's behavior when translating, even without being conscious of the theory behind it. I would add that the skopos theory also plays a very important role, as the purpose of the translation and all the players involved in the translation activity play a fundamental role in localization and in machine translation post-editing. In this context, a good translation is one that renders an equivalent target text according to the skopos of the project in question. Therefore, translation quality should be judged according to these variables and not according to an abstract notion of linguistic quality.

In the localization industry, quality is frequently seen as a series of procedures carried out in order to guarantee a linguistic quality that is itself very volatile and that tends to be simplified by classifying errors into different categories and counting them. The translation is a Pass if the overall score derived from that count reaches a set level, or a Fail if it falls below that level. In the first case, the overall quality is deemed to be good enough.

Brian Mossop, who has written a complete and intelligent guide on editing and revising for translators (2001), distinguishes between quality control and quality assessment and explains that both contribute to quality assurance.

Quality control occurs before delivering the translation to the customer, and it involves all the steps necessary to provide a translation that meets the customer's needs. Quality assessment
might occur after delivery and it consists in identifying problems in a text to establish if it meets
1. Terminological (verbs as nouns, nouns as adjectives, wrong use of term in context, abbreviations) the professional standards in the translation company.
2. Grammar and Spelling (tense, gender, number, active and passive voice)
3. Syntactical (Syntactic errors/errors on the structure and on the information display, Word order)
Therefore, quality control is text oriented and client/reader oriented and quality assessment is
business-oriented.
4. Punctuation and Style (upper case and lower case, formatting, form of address)
5. Others: Additions, omissions He also explains that when revising an overall quality levels need to be considered. He
distinguishes then four types of overall quality levels that I found extremely practical. The first
level (A) is Intelligible: a translation that is readable and clear, and roughly accurate. The
5. Quality in Translation second level (B) is Fully accurate: the translation avoids misleading the reader, it is fully
Quality in translation studies is a much debated subject. Different definitions are offered accurate, but it is only fairly readable and fairly clear. The third level (C) is Well written: the
depending on the school of thought. Defining quality is almost as elusive as defining final translation is fully accurate, clear and quite well tailored and smoothed. And finally the last
translation itself. Knowing when a translation is good is not as easy as it may seem to the level (D) is Very well written: the reading experience is in itself and interesting and enjoyable,
professionals or regular people outside the field. quite apart from the content.
There is an array of translation theories dealing with the basic concept of translation and He rightly points out that, in general, translators aim at C level, well written texts or even the D
defining what a good translation is. As Chesterman (2000) points out the current pool of level, when in fact in some cases only level A or B is required by a specific customer.
translation memes is a highly heterogeneous one. Some might give more importance to the
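
As a small illustration of Mossop's four overall quality levels (A to D) described above, the sketch below models them as an ordered scale and compares the level a translator delivers with the level the customer actually requires. The class name, the helper function and its return strings are hypothetical and only serve the example; they are not taken from Mossop or from any review tool.

```python
from enum import IntEnum


class QualityLevel(IntEnum):
    """Mossop's overall quality levels, ordered from lowest (A) to highest (D)."""
    INTELLIGIBLE = 1       # level A
    FULLY_ACCURATE = 2     # level B
    WELL_WRITTEN = 3       # level C
    VERY_WELL_WRITTEN = 4  # level D


def compare_to_brief(delivered: QualityLevel, required: QualityLevel) -> str:
    """Hypothetical helper: relate the delivered level to the level in the brief."""
    if delivered < required:
        return "below brief"
    if delivered > required:
        return "above brief"
    return "meets brief"


# A translator aiming at level C when the customer only needs level A
print(compare_to_brief(QualityLevel.WELL_WRITTEN, QualityLevel.INTELLIGIBLE))
```

Treating the levels as an ordered scale makes the point of the paragraph explicit: delivering above the brief is not an error, but it costs time the customer has not asked for.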

5.1. Quality concepts in Localization

In localization, the concept of quality is considered an implicit value provided in all translations carried out by Language Service Providers (LSPs) or freelance translators. Quality is in most cases a given value, an assumed service provided to the client. When working in a translation agency or as a translator, delivering good quality is a must. Good quality is, then, variable depending on the customer, its product, audience, style guides, reference material, and QA group, amongst others. Since quality is difficult to define, everyone refers to it in very general and abstract terms.

What customers mean by a quality translation, especially in technical translation and localization, is one that reflects exactly the content of the source text. What does "reflecting exactly the content" mean? As we saw before, translation is perceived as the rendering of an equivalent text, almost as a word-by-word exercise. The target text should contain exactly what the source text contains with minor exceptions, that is, a few adaptations to the local markets. On the other hand, LSPs use a much more functionalist approach, that is, the quality provided varies according to the translation brief discussed with the customer; the focus is on the customer's needs and what they pay for. If they do not pay for review, then the translation is not reviewed by a third party. Translators, for their part, have different approaches. On some occasions, they will work for a customer-oriented purpose and, on other occasions, they might work towards their own idea of quality, an idea that is related to the use of correct grammar and language style.

The truth is that there is not much time allowed in localization to offer a Very well written translation (in Mossop's definition), and we aim at a Well written translation in most cases, while reality more often than not obliges translation providers in general to produce a Fully accurate or even an Intelligible translation.

Most localization agencies, however, will follow procedures to guarantee the quality of the translated products. These procedures cover everything from correctly selecting the translators to checking the quality of the translation or offering the right translation brief during the project. This set of procedures is normally known as Quality Assurance (QA) and it is designed to assess the quality of products or services provided. QA implies that a series of steps are taken in order to guarantee quality and that corrective actions are in place in case errors are detected in the product or service. Normally, companies will use procedures and indicators to monitor this process.

Wikipedia offers a very clear definition of Quality Assurance: Quality Assurance refers to planned and systematic production processes that provide confidence in a product's suitability for its intended purpose. It refers to a set of activities intended to ensure that products (goods and/or services) satisfy customer requirements in a systematic, reliable fashion. QA cannot absolutely guarantee the production of quality products, unfortunately, but makes this more likely.

Two key principles characterize QA: "fit for purpose" (the product should be suitable for the intended purpose) and "right first time" (mistakes should be eliminated). QA includes regulation of the quality of raw materials, assemblies, products and components; services related to production; and management, production and inspection processes. It is important to realize also that quality is determined by the intended users, clients or customers, not by society in general.

Quality Control (QC) is the application of QA to a particular project and it will happen during the lifetime of the project, while QA will normally be part of the quality processes within the company.

There are different Quality Standards used in the localization industry. The most important ones are: ISO 9000 Series, ASTM F2575-06 and EN-15038. The advantage of complying with a Quality Standard is that the company will register all steps and procedures in the company and have periodical audits that guarantee the quality of the translation. However, an agency can have processes documented and zero customer complaints without necessarily being certified. Certification only serves as an indication that you might be able to produce a quality translation.

If we apply all the concepts that we have seen before, we can conclude that quality is not a set of grammatical rules set in stone or an ideal to try and reach; it is a variable concept that will very much depend on the characteristics of a given project as defined, on many occasions, by the customer and by the translation agency.

More often than not, there will be no clear information about the quality of the MT output. Depending on who is providing the information about the output, the quality feedback could be overly enthusiastic or extremely negative. It is rare to receive a serious analysis of the output with samples and scores. Some MT output providers might send an automatic score (BLEU, METEOR, NIST or TER) that gives information on how close the output is to human quality with a single number. Unfortunately, this number might mean very little in practical terms.

It is advisable to assess the output for each language combination using different parameters (for example, Grammar, Terminology, Format) in a randomly selected set of strings extracted from the overall content (which could be classified according to segment length), where a post-editor can then classify the quality of each segment (Excellent, Good, Poor, or even from 0 to 4, or any other classification).

Even though time is required for this assessment, it will give a clear idea of the productivity savings the team of post-editors might be expected to obtain during the project. If the post-editor and the translation team do not have this information, they are working pretty much in the dark in terms of prices and might be overwhelmed by the number of e-mails sent by post-editors complaining about the quality of the output with little data available to discuss the matter.

The customer's quality expectations for the final project need to be very specific, as post-editing can be superficial or thorough depending on the purpose of that translation.

As in general revision terms, there are different types of expected quality levels. Post-editing is generally classified in two: full post-editing, leading to human-quality translation, and rapid post-editing, with minimal corrections for text gisting. Between these two options, there is a wide range of alternatives. Establishing the quality expected by the customer will help in determining the price as well as in writing specific instructions for post-editors. If this is not done, some might correct only major errors, thinking that they are obliged to use the MT proposal as much as possible, while others will correct major and minor errors, and even acceptable proposals, because they feel the text has to be as human as possible. In general terms, customers know their readers and the type of text they want to produce. Post-editors should have a very clear idea of the expected quality. Otherwise, they will not be able to start the assignment.

5.2. Quality of post-edited material: assessment

One of the reasons to introduce MT in the localization cycle is to save costs. It would not make much sense, then, to review the post-edited text. However, on certain occasions revision might be necessary either to obtain a very high quality or to determine post-editors' competence. The reviewer should receive the same information as the post-editor. Most localization companies use review forms that comply with LISA, J2450 standards or similar ones created within the company. LISA defines 7 categories of errors. These are:

Mistranslation
Accuracy
Terminology
Language
Style
Country
Consistency
Format

Mistranslation refers to the incorrect understanding of the source text; Accuracy to omissions, additions, cross-references, headers and footers and not reflecting the source text properly; Terminology to glossary adherence; Language to grammar, semantics, spelling, punctuation; Style to adherence to style guides; Country to country standards and local suitability;

Consistency to coherence in terminology across the project and Format to correct use of tags, correct character styles, correct footnotes translation, hotkeys not duplicated, correct flagging, correct resizing, correct use of parser, template or project settings file.

The errors found are then assigned a severity that can be Minor, Major or Critical. All errors are weighted according to this severity. For example, an error classified as Minor weighs 1 point; if classified as Major, 5 points; and finally, if it is deemed to be Critical, it is worth the total amount of allowed errors plus 1.

Similarly, the J2450 errors are classified as:

Wrong term
Wrong meaning
Omission
Structural error
Misspelling
Punctuation error
Miscellaneous error

Errors are then divided into Serious and Minor, and each category is assigned different points according to these two subcategories.

Wrong term is similar to the previous Terminology, and it refers to non-adherence to the customer glossary, wrong term for the domain, inconsistent term translation, and wrong translation of a particular term throughout. Wrong meaning would be similar to Mistranslation and Language, and it refers to wrong word order, incorrect syntagmatic structure, and wrong grammatical category. Omission would be similar to Accuracy and it refers to missing text. Structural error corresponds to the previous Language, and it refers to wrong word structure (case, gender, number, tense, prefix, suffix) and agreement error. Misspelling would correspond to Language and it refers to problems of orthography in the target language. Punctuation error corresponds to Language, and it refers to whether the text complies with the target language punctuation rules. Miscellaneous errors correspond to Mistranslation and other categories in LISA, and they refer to literal translation, register issues and mistranslation issues. The J2450 does not contain a Style category because it has been mainly designed for the automotive industry.

Another alternative would be to create specific post-editing categories. The reviewer should consider at least the following points:

Accuracy: To what extent does the post-edited version contain the same information as the source text?
Language: Is the language appropriate? There are no spelling or grammar mistakes, the text follows the customer's Style Guide and the style is idiomatic.
Terminology: The terminology follows the linguistic reference material provided for the project and it is consistent.

There could be a rating (from 0 to 4) or an error count in order to provide a final result. Post-editors should receive this feedback to accelerate their learning curve.

These are only samples and other reviewing aspects can be considered (Readability, Clarity, Transfer, and Logic) as there are multiple ways of assessing quality. It is crucial, nonetheless, to establish a relation between the speed and the quality of post-editing, because there is no point in having post-editors with a high productivity rate who do not provide the expected level of quality.

6. Types of post-editing

As we saw before, there are different levels or types of post-editing. This level or type will be determined by several factors such as:

The engine used
The language pair
The desired quality specified by the customer or purpose of the translation
The volume of documents that needs to be translated
The time available for the translation
The structure of the given text
The type of readers or users for that particular text
The use of the final text

Depending on these factors, there will be different levels ranging from full post-editing, leading to human quality, to rapid post-editing, with minimal corrections for text gisting.

In MT and post-editing, it is frequent to differentiate between texts that will be read quickly, for internal use and perishable, and texts that will be published and are intended for a wider audience. In the first case, the text needs to be understandable and accurate, but the style is not fundamental and some grammatical and spelling errors are even admitted. In the second case, the text needs to be understandable and accurate, but also the style, grammar, spelling and terminology need to be similar to those provided by a human translator. The texts are also classified as needed for assimilation (roughly understanding a text from another language that is not yours) or dissemination (publishing a text for a wide audience from your native language into several others); depending on the different aims, the level of post-editing will vary.

Let's not forget that we might have MT output directly published on the Internet. This means that no post-editing is done on the text and it is published as it is. Normally this type of text will have a disclaimer explaining to the reader that the text has been translated by a machine.

6.1. Fast post-editing

Fast post-editing evidently points to the fact that very few corrections are necessary to publish the text. It is also called gist post-editing, rapid post-editing and light post-editing. It is used in general for texts that are needed urgently and will have an internal, perishable use, normally emails, reports, meeting agendas, and very specific technical reports for a small audience. Allen (2003) specifies that rapid post-editing (part of an inbound translation approach, meaning that MT is used for acquisition or assimilation or gathering of information) is to provide minimal editing on texts in order to remove blatant and significant errors, and therefore stylistic issues should not be considered. It is important to point out that fast post-editing is also meant to be done in the shortest time possible and, thus, with the minimum number of changes and keystrokes.

6.2. Full post-editing

On the other hand, full post-editing belongs to the outbound (dissemination) approach (Allen 2003) and it is aimed at a much bigger audience.

In full post-editing, the objective is to obtain a text that corresponds to a human translation, meaning that the reader will not be able to tell if what he or she is reading came from a machine or a human translator/writer. In this case, the raw output requires maximal editing, not only to remove blatant and significant errors but to correct all errors and style, so that the final text is compliant with the stylistic norms of the language and also with the customer's specific terminological and stylistic rules. As with fast post-editing, the task is meant to be done in less time than translating from scratch, so as not to defeat the purpose of using MT output.

7. General post-editing guidelines

There should not be a post-editing project without specific post-editing guidelines. These guidelines are not the same as Style Guides, Project Briefs or Localization kits describing instructions, technical and linguistic, for the project. Post-editors need language-specific guidelines created for the actual post-editing task.

What should these guidelines cover? Obviously it is difficult to answer this question as it will depend on the quality of the output, the language combination, and the usual variables in MT. Besides, post-editors cannot be burdened with a whole book on post-editing, as time is of the essence and their work needs to be profitable. The guidelines should be short and precise and they should cover the following areas:

Description of the type of engine used.
Description of the source text (type and structure of source text).
Brief description of the quality of output for that language combination.
Expected quality by the customer (as described above).
Scenarios when to discard a segment that is not useful (post-editors should have an idea of how much time to spend in order to recycle a segment or discard it altogether).
Typical types of errors for that language combination that should be corrected (including reference to tagging and links).
Changes to be avoided (according to the customer's expectations, for example certain stylistic changes).
How to deal with terminology (according to output analysis and the customer's expectations; the terminology provided by MT could be perfect or it could be obsolete, or a mix of both).

Even though time is needed to create guidelines, the more time is devoted to creating and improving them, the better the post-editing task will be performed.

O'Brien (2009) advises on general post-editing rules:

Retain as much raw translation as possible.
Don't hesitate too long over a problem.
Don't worry if style is repetitive.
Don't embark on time-consuming research.
Make changes only where absolutely necessary, i.e. correct words or phrases that are (a) nonsensical, (b) wrong, and if there's enough time left, (c) ambiguous.

7.1. Guidelines for fast post-editing

There are different types of guidelines according to the different expectations from the customer. These are only samples that can be useful for post-editors, but when working in the field, others might be included or eliminated.

In general terms, it is important that the post-editor should read the source segment first to understand the meaning of the sentence. Then, proceed to read the MT suggestion and make the necessary changes.

These are some rules that can be useful when doing fast post-editing:

Make sure the sentence is accurate.
If the terminology in the MT output is incorrect, do not spend too much time researching.
Be careful not to post-edit word order in a sentence if the sentence can be understood even if it violates language rules.
Do not change style or change any proposal for stylistic preferences.
Avoid replacing a word with a synonym if the original word is correct.
Do not correct grammar mistakes unless the target sentence does not reflect the source.

7.2. Guidelines for full post-editing

The same considerations we had before can be applicable to full post-editing, that is, there are different types of guidelines according to the different expectations from the customer, as we already mentioned earlier. These are only samples that can be useful for post-editors, but when working in the field, others might be included or eliminated.

Post-editors should read the source segment first to understand the meaning of the sentence. Then, proceed to read the MT suggestion, so that they can decide whether it can be recycled in post-editing. There are some basic pointers to help with this decision:

The suggestions should be applied if:

1. Large pieces of the sentence/term are correct (these can be reused during post-edit).
2. The raw MT quality is very high, although some minor corrections may be needed.
3. Raw MT output contains several errors which might slow down the post-editing task. However, the post-editor types slowly, so post-editing still proves to be faster than translating from scratch.
4. The MT output has the correct meaning and it is completely understandable.

You should NOT apply the suggestion if:

1. Raw MT does not make any sense and it would take longer to post-edit than to translate from scratch.
2. The user takes a few minutes trying to figure out what the raw MT is trying to say, but it doesn't make sense.
3. There are multiple errors that require rearranging most of the text.
4. There are multiple tagging problems between the source and the MT match.
5. There are too many terms to change and it will take longer than translating from scratch.

Microsoft offers some guidelines in order to make these decisions:

1. The 5-10 second evaluation rule: this is the maximum time that you should spend evaluating the validity of the MT suggestion. If it is hard to understand at the beginning, do not read the whole sentence; proceed to translate from scratch instead.
2. The "high 5 and low 5" rule. When you detect a long sentence, do the following:

Read the first 5 words. If it's good, read on until it's bad, then stop, copy the correct part, continue to translate and forget about reading on.
If the first 5 or 6 words aren't good, skip to read the last 5 or 6 words. If the last part of the phrase is correct, use it, or just start the whole thing from scratch.
If both the first 5 and the last 5 words are incorrect, do not carry on reading through the middle to try to identify correct MT segments. Just discard the MT suggestion and proceed to translate from scratch.

Once the post-editor decides to use the segment (this happens quite quickly in a real project), he or she can follow these guidelines:

If the terminology in the MT output is incorrect, do not spend time researching this, but apply the term as used in the approved term database.
Often the output from the MT will be repetitive; this can be used to your benefit as the post-edited output will be more consistent.
Be careful not to post-edit word order in a sentence that does not violate semantic intelligibility rules.
Be careful not to change grammatically or semantically correct phrases to stylistically preferred phrases.
Avoid replacing a word with a synonym if the original word is correct.
On occasions MT suggestions might help out with translator's block. This might be useful even for fuzzy segments.

8. Post-editing effort and productivity

Productivity constitutes one of the big unknown factors in projects involving MT and post-editing. This is partially due to the fact that using MT in localization projects is relatively new

and, therefore, standard metrics do not exist yet, but mainly to the number of variables to consider. At any rate, we have little information on the productivity of translators' work in general. The industry uses standards (for example, 2000 to 2500 translated words per day) but we all know these standards are hardly applicable to all translators. Moreover, there are also agreed metrics on TM editing (percentages paid according to fuzzy match level), but most translators would agree in saying that these percentages hardly represent the amount of work they need to perform on each proposed segment. The studies dealing with productivity when post-editing MT segments (such as Krings 2001, O'Brien 2006, Guerberof 2008 and 2009) do not show pronounced productivity increases when using MT. Frequently, however, MT developers will claim that their engine dramatically increases the translator's productivity without necessarily making their methodology available.

There is definitely uncertainty about the gains when using MT and post-editing. A figure that is normally used when discussing productivity in post-editing is 5,000 words per day, but the reality is that each project will have different productivity according to the different variables.

Krings (2001) discusses post-editing effort as the key element in determining if the application of MT is worthwhile and distinguishes three main concepts necessary in order to understand post-editing effort:

8.1. Temporal post-editing effort

This constitutes the time needed in order to correct the machine translated text according to the given quality. If the post-editor saves time in comparison to human translation, then MT might be a recommended tool. Depending on the type of errors, and in reality on the quality of the raw output, the time involved might be more or less.

8.2. Cognitive post-editing effort

Directly related to the previous concept, the cognitive effort describes the "brain effort" needed in order to resolve these MT errors. For example, correcting a very obvious mistake of gender, where the post-editor does not need to research or consult the source text, does not take the same effort as correcting a complex syntactical structure that renders the text ambiguous and that requires checking the original to disambiguate, thinking about the possible solutions, making a decision and actually making the correction.

8.3. Technical post-editing effort

This concept refers to the actual physical effort needed to correct a text, for example, whether we need to delete, reorder, insert or carry out a combination of all of these actions. The more cursor movements, for example, are required to correct an error, the more technical effort is necessary to post-edit.

As we can see, the easiest variable to measure in post-editing is in fact temporal effort, since for cognitive and technical effort we would require special tools (Translog or eye-tracking tools) or protocols (Think Aloud Protocols) that are more difficult to use in the commercial world.

9. References

Allen, J. 2003. Post-editing. In Computers and Translation: A Translator's Guide. Harold Somers, ed. Amsterdam & Philadelphia: Benjamins. 297-317.
Bennett, S. Gerber, L. 2003. Inside commercial machine translation. In Computers and Translation: A Translator's Guide. Harold Somers, ed. Amsterdam & Philadelphia: Benjamins. 175-190.
Chesterman, A. 2000. Memes of Translation. The spread of ideas in translation theory. Amsterdam & Philadelphia: Benjamins.
Guerberof, A. 2008. Post-editing MT and TM: a Spanish case. Multilingual. Vol. 19. Issue 6. pp 45.
Guerberof, A. 2009. Productivity and Quality in the post-editing of outputs from translation memories and machine translation. Localisation Focus. The International Journal of Localisation. Vol. 7 Issue 1.
Joscelyne, A. 2006. Best practices in post-editing. In TAUS. www.translationautomation.com
Kliffer, D. 2008. Post-Editing Machine Translation as an FSL Exercise. In Porta Linguarum. Number 9. 53-67.
Krings, H. 2001. Repairing Texts: Empirical Investigations of Machine Translation Post-editing Processes. G. S. Koby, ed. Ohio. Kent State University Press.
Mitamura, T. 1999. Controlled Language for Multilingual Machine Translation. In Proceedings of Machine Translation Summit VII. Singapore. 13-17.
Laurian, A.M. 1984. Machine Translation: What type of post-editing on what type of documents for what type of users. Proceedings of the 10th International Conference on Computational Linguistics and 22nd annual meeting on Association for Computational Linguistics. 236-238.
Mossop, B. 2001. Editing and Revising for Translators. Manchester: St. Jerome Publishing.
Mossop, B. 2007. Editing and Revising for Translators. 2nd Edition. Manchester: St. Jerome Publishing. 109.
O'Brien, S. 2002. Teaching post-editing: A proposal for course content. In Proceedings for the European Association of Machine Translation. Available from <http://mt-archive.info/EAMT-2002-OBrien.pdf>. Last accessed May 2013.
O'Brien, S. 2006. Eye-tracking and Translation Memory Matches. Perspectives: Studies in Translatology. 14 (3): 185-205.
Rico, C. Torrejón, E. 2004. Controlled Translation as a New Translation Scenario: Training the Future User. ASLIB Conference Translating and the Computer 26.
Pym, A. 2004. The Moving Text. Localization, translation, and distribution. Amsterdam & Philadelphia. Benjamins.
Schäffer, F. 2003. MT post-editing: how to shed light on the "unknown task". Experiences at SAP. Controlled language translation, EAMT-CLAW-03. Dublin City University. 133-140.
Vasconcellos, M.; León, M. 1985. SPANAM and ENGSPAN: Machine Translation at the Pan American Health Organization. In Computational Linguistics. Volume 11, Numbers 2-3. 125.

