
Redefining Translation Quality: From User Data to Business Intelligence

TAUS Signature Editions



Published by TAUS Signature Editions, Keizersgracht 74, 1015 CT Amsterdam, The Netherlands
Tel: +31 (0) 20 773 41 72
E-mail: memberservices@taus.net
www.taus.net
All rights reserved. No part of this book may be reproduced or transmitted
in any form or by any means, electronic or mechanical, including
photocopying, recording or by any information storage and retrieval
system, without written permission from the author, except for the inclusion
of brief quotations in a review.
Copyright 2016 by Attila Görög and TAUS Signature Editions


TABLE OF CONTENTS

Introduction
Translation Quality
    French Poetry and DQF
    Translation: A Nice-to-Have?
    Buy Translation Like You Book a Hotel Room!
Quality Evaluation
    Quality Evaluation on a Shoestring
    Translation Productivity Revisited
Business Intelligence
    Business Intelligence and Quality Evaluation Data
    Crowdshaping Translations
Further Reading and Reference
About
    Attila Görög
    TAUS


INTRODUCTION
Translation quality is one of the key concepts in the translation industry today.
Measuring and tracking translation quality is essential for all players in the
industry. More and more translation vendors offer different types and levels
of quality, resulting in dynamic pricing. Translation buyers want to know
whether their customized Machine Translation (MT) engine is improving, and
they would like to compare different MT providers. Finally, translators need to
set the threshold of TM/MT matches at optimal levels. These are just
a few examples of where translation quality becomes central and increasingly
tuned to user satisfaction.
The aim of this eBook is to redefine the way we look at translation quality and
evaluate translations. We are not trying to achieve this with a collection of
scientific papers or heavy argumentation. What we offer is a selection of short
articles in a reflective style on various aspects of the quality of translations.
The first three articles are on the topic of Translation Quality. We will focus
on past and new interpretations of the concept of translation quality. The
uber-ization of the translation industry is near, and we'll investigate how
approaches to translation quality reflect this new phenomenon.
The second topic, Quality Evaluation, includes two articles. I will explain how
to save time and resources by applying new techniques and methods for
translation quality evaluation. Also: how can we be fair to translators when
measuring productivity? How do we take into account the quality of the available
resources when comparing translators and post-editors? These are some of the
questions we try to answer in this chapter.
Finally, we complete our journey from quality to intelligence with two more
articles on the topic of Business Intelligence. We will consider evolving trends
in translation technology and new methods of evaluation, and we even touch on
hot topics like crowdshaping and big data.
Enjoy reading!
Attila Görög


TRANSLATION QUALITY
French Poetry and DQF
When I studied French at university in my homeland, Hungary,
I remember attending a class on translating poetry. In this class
we discussed the translation of a poem for two hours. Each time
we worked on a different poem. Everyone prepared in advance
and we would go line by line, word by word, reading out loud
our version of the poem.
Philipp Koehn once cleverly wrote that ten different translators will
almost always produce ten different translations. And indeed,
each one of us came up with a different masterpiece. Of course,
not all of them were equally good. In some cases a translated
line was semantically or stylistically incorrect, or it didn't fit into
the context, didn't match the rhyme scheme or just sounded
awful! These versions were ruled out without mercy by our
teacher, Dr. Bárdos, or by some older, more experienced fellow
students.
In the end, we created a pool of good, acceptable translations
(two, three or four, depending on the difficulty of the poem). This
is when quality evaluation comes into play: these different versions
were all considered of publishable quality. There were slight
differences in word order, word choice, tone, style etc., but they
were still good translations of the French poem. The ones outside
the pool had serious errors, but the ones selected into the
pool were all correct and well-written pieces.


What happened next was a vote. All of us, teacher included,
had to vote for our favorite version to decide which one was
the overall best translation of the poem. Our quest for the highest
quality, through votes. This happened, quite inefficiently,
around a table on a lazy afternoon somewhere in an old, historic
building. We spent time and effort that we could have spent
on real work. Well, I personally learnt a lot from it.
Since then, many things have changed. I moved to the
Netherlands and after a long career in linguistic research and
the language industry, I became product manager of the TAUS
Dynamic Quality Framework (DQF). DQF offers a set of tools
that are made for a similar but slightly different exercise than
the one above: to vote for different MT engines or human translations, to score translated segments, to count errors based on
an error-typology, and to measure post-editing productivity.
DQF offers different tools for different purposes. It is made with
the student, the developer, and the project manager in mind.
And the good news is that DQF is available for academics free
of charge. It is perhaps the only workflow tool for quality evaluation that can be used in the translation classroom today.


Translation: A Nice-to-Have?
Today's fast-changing landscape of goods and services is
filled with past must-haves that are evolving into nice-to-haves.
Otherwise they become superfluous. At the other end
of the spectrum, we find things we've never dreamed of that
we desperately need today.
Technological innovation, coupled with an unstable economy
and topped with globalization and increasing environmental
awareness, awakens new needs in us and extinguishes old
necessities. Pen and paper are becoming a nice-to-have;
bookshelves too. In big cities, owning a car is being replaced by
car-sharing concepts and app-based transportation pooling.
Cash has made room for plastic money, but maybe not for
long; mobile payment is going to take up more of that room.
In a couple of months, I'm giving up preparing food on gas
and I will become the happy user of an induction cooker. I will
actually say goodbye to gas altogether because I'm moving
into a house with district heating.
Old must-haves are turning into nice-to-haves or no-need-to-haves.
At the same time, in just a couple of years, Wi-Fi availability
has become self-evident. Smartphones, tablets and
smartwatches are fast becoming extensions of our bodies.
We can't live without electricity anymore and we will die if we
can't read our emails for a day.
Stuff and fluff (i.e. products and services) become indispensable
or superfluous or just nice-to-haves. This new law
of nature seems to influence our lives at an ever-accelerating
tempo. The translation industry reflects the same trend.
Translation used to be superfluous: tradesmen used a lingua
franca. Then it became a must-have. If you launch your product
globally, you should address your target audience in their
mother tongue. You should translate your user guides, your
website and your marketing content and, at the very least, you
should subtitle your videos.


What is translation today? In a narrow sense:
Translation transfers a written source text into a written target
text of roughly equivalent length. Such a translation conveys all
the source text's meaning, making only those adjustments necessary
for cultural appropriateness without adding, omitting,
condensing, or adapting anything else.
- Defining the landscape of translation, by Melby, Fields, Hague,
Koby & Lommel
Is such a translation a must-have today? Sometimes. In a broader sense:
Translation departs from or is inspired by source content in one
language with the aim of providing fit-for-purpose, comprehensible
content in a foreign language.
How do you define translation? Broadly or narrowly? To what
extent are adequacy and fluency important to you?
In some areas or situations, translation (in the narrow sense)
is evolving into a nice-to-have. An expensive nice-to-have;
sometimes a superfluous one, too. More and more companies
are introducing preliminary content profiling in order to determine
whether the source in question deserves a translation
at all and, if yes, whether it should be in the narrow or the broader sense.
With the increase in crowdsourcing, post-editing, interactive
and adaptive machine translation and learning management
tools, new workflows and technologies are entering the market.
Different quality requirements define different translation
processes, giving each an equal right to exist. As Monica
Guy cleverly articulates in her blog post:
No type of translation is inherently better than another, but
each is appropriate for a different and specific purpose. And all
should work together in harmony to provide a powerful tool to
reach target markets across the world.


In our age of hyper-globalization, translation in the narrow
sense is becoming a nice-to-have. On the other hand, translation
in the broader sense, encompassing transcreation, localization,
gist translation, raw machine translation, summary/extract
translation etc., is and always will be a must-have. Content
profiling can help you select the right process, the right quality
level and the right evaluation type. To learn more about content
profiling, please watch this webinar or read the Dynamic
Quality Framework report referenced at the end of this book.


Buy Translation Like You Book a Hotel Room!


In 2014, at the VViN conference (the Dutch version of ATA), I led
two breakout sessions on translation quality. It was interesting
to hear how Language Service Providers (LSPs) experience what
I would call the quality paradox: most clients desire top quality
but want to pay budget prices. Why is the translation industry
so different from the well-known hospitality business, where
you don't expect to get a cheap room in a 5-star hotel? And
when a cheap room is offered through some campaign, you
become suspicious. Some participants suggested that translation
is a service, whereas hotel ratings have to do with the features
of an accommodation. I would say offering a hotel room is also
a service. It's certainly not a product I'm buying when I stay
at a hotel. When we evaluate a translation, don't we do that by
focusing on some features of the text (style, terminology, accuracy,
fluency, readability, usability etc.)? I personally don't see
the difference.
The translation industry lacks an equivalent to the 5-star hotel
rating. One of the main topics and aims of the 2014 Translation
QE Summit in Vancouver was to try to come up with a similar
system. A bold aim, isn't it? Unfortunately, it proved
too difficult a challenge for a break-out session of one hour.
We wanted to get there by first looking at the different processes
and quality levels already offered in the translation and localization
business. A first and very limited investigation showed
that there are three camps today: LSPs that openly promote
different quality levels and offer the choice to customers; LSPs
that offer the same but not openly (i.e. they promote premium
quality on their websites but will also offer lower quality levels
if the customer wants); and, finally, LSPs that are as yet unwilling
to offer anything but premium quality.
The magic word that comes into play is benchmarking. No
quality levels without comparison. No weights and no penalties
either. One should have at least a vague idea of what makes
premium quality 5-star and what the minimum requirements are
before we can call a translation a translation (i.e. 1-star). For
the basic level of quality, I would say readability, usability and
comprehensibility tests would be good yardsticks. First, define
the purpose of the translation and of course your budget. For
some purposes, a 1-star quality is enough. For others, it's not.
For the top level, obviously a top score based on a combination of adequacy, fluency and error-typology evaluations would
be required. And this leads us to the next topic of the QE
Summit in Vancouver: quality in the translation workflow. A
different workflow will usually result in a different quality level:
MT + light post-edit will be a lower level than MT + full post-edit,
which in turn will be lower than human translation + review,
etc. Depending on the different steps, different results can be
expected. I would be curious to know how quality control and
evaluation are built into the subsequent steps of the translation
workflow and how you train your personnel for evaluation.
Finally, what is the cost and benefit of quality evaluation? How
much do you lose by delivering random quality? How do you
calculate ROI? Let's face it: we rarely know the quality level of a
translation we deliver. "But it has been made by my best translator and reviewed by my best proofreader!" Translators have bad
days, reviewers too. We all do. Without evaluating the product
at least by using samples, you can't be sure you are delivering
the required quality.



QUALITY EVALUATION
Quality Evaluation on a Shoestring
Quality is a hot topic today for all players in the translation industry:
translation buyers want different types of quality and
flexible ways of pricing; LSPs would like to know whether their
customized MT solution is improving; and translators are keen
on setting the threshold of fuzzy matches/MT suggestions at the
optimal level. These are just a few examples of where quality
evaluation plays a crucial role.
Unfortunately, there's no such thing as a free lunch! Quality
Evaluation (QE) can save money but it also costs money.
Assessing the quality of a translation can sometimes cost you
even more than producing the translation itself! Nonetheless,
continuous monitoring of translation quality and sharing evaluation
data are indispensable practices for developing metrics
for automated QE. Without that, no advances will occur in the
translation industry. As Maxim Khalilov of Booking.com put it
very cleverly in one of TAUS's translation quality webinars:
Improving automated metrics for QE will also improve the quality
of existing MT solutions.
Better QE means better tools.
Obviously, a hassle-free improvement of MT output is a fairy
tale. In order to develop and improve, we need to measure
quality constantly. But how can we achieve that when budgets
and resources set aside for this purpose are so tight? How do
we become efficient in QE?
Let's talk efficiency!
Efficiency, in general, describes the extent to which time, effort or
cost is well used for the intended task or purpose (Wikipedia).
The main theme of the 2014 QE Summit held in Dublin was efficiency:
how to save time and resources by applying new
techniques and methods for translation QE. Speakers at the
event elaborated on five topics, which are also five proposed
ways of saving on budget when it comes to QE. The outcomes
of the break-out sessions and the presentations were bundled
into the following five best practices:
Quality Estimation
Community Evaluation
Readability
Usability Evaluation
Sampling



Translation Productivity Revisited


Once upon a time in the Land of Translations...
... we wanted to know how many words we could produce per
month, per day, per hour. How much time we needed to post-edit
machine-translated segments. And we wanted to track the edit
distance. Why on Earth?! Well, to find ways to profile translators
and post-editors, to set prices, compare vendors, categorize
content, evaluate MT engine performance... the list is endless.
But are we doing it right?
Productivity tells you how fast a translation was completed. Due
to many variables, however, it will never be a reliable measurement when it comes to profiling post-editors and translators,
comparing vendors or evaluating MT output. And by the way,
will it ever give us valid insights into the quality and difficulty of
the specific content we receive from our customers?
Productivity defined
According to Wikipedia:
Productivity is an average measure of the efficiency of production. It can be expressed as the ratio of output to inputs used in
the production process, i.e. output per unit of input.
This formula works well when all the variables on the input and
output side are listed, well defined and measured consistently.
Problems arise when only a limited number of variables are taken
into account. Unfortunately, that's exactly what is happening in the
translation industry today: we take time as the only input, and
words as the only output. As a result, the more words produced
in a shorter amount of time, the higher the productivity will be.
This is just too simplistic if you ask me. I'm just wondering how
our industry has got away with it for so long!
There is much more to productivity than the number of words
per hour. Why not also take into account the number of (final)
edits per hour? This means calculating one unique score
that is based on the total number of words translated by the
translator in an hour, combined with the number of final edits
done in the whole process of producing the translation (and
calculated from the character-based edit distance). This gives a
more reliable productivity score. It's easier to translate fast when
the translation memory gives many exact matches or context
matches, and when the MT engine is in top shape and we hardly
need to translate anything from scratch, as opposed to the
situation where there are no available resources, or only ones of
very poor quality.
For this reason, it is a good step forward to include the number
of edits per hour in the productivity score (and I will talk about
this later), but one should also take into account the following
variables:
Difficulty of the source content (using some measurement
independent of language)
Quality of the source content (based on human assessment by the translator or the reviewer)
Available resources (also called the translation process):
whether the translator did or didn't use an MT engine, a translation memory, a glossary, etc.
Quality of these resources (using fuzzy match and MT confidence information combined with edit distance)
Number of corrections applied by the reviewer(s)
Number of errors, weights and penalties applied by the
reviewer(s) in the review cycle(s)
Now, I don't say this is all easy to measure, keep track of, or
aggregate into one single score. But still, let's try and see what
happens!
TAUS Efficiency Score
That is also what one of the developers (Nikos Argyropoulos)
thought when he came up with a new metric to measure productivity,
called the TAUS Efficiency Score. This score replaces
traditional productivity measurement because it can be applied
to every form of translation: translation from scratch, translation
with translation memory, PEMT or a mix of these three. More and
more translation jobs have a mixed nature: one can post-edit
MT suggestions, insert TM matches or translate segments from
scratch within the very same translation job. There is no hard divide
anymore between MT, TM and human translation. This should
be reflected in a new metric measuring productivity.
In the TAUS Efficiency Score, time is measured for producing
(and, if needed, updating) each segment regardless of the segment
origin (MT, PE, glossary, scratch, etc.). The Efficiency Score
is flexible in that the number of variables used to calculate it,
and the ways the different measurements are taken into account,
vary based on user requirements and available data. The score
is also relative, because it is calculated from the data present
in the underlying database at the moment of calculation.
Variables
The variables involved in producing the Efficiency Score are, in
the first place, the two obligatory variables (core variables) and
any additional variables (optional variables) that are added to
the calculation to increase precision and credibility. The score
can be calculated to measure translator efficiency, but the focus can also be on CAT/TMS efficiency or MT engine efficiency.
While edit distance and edits per hour are calculated in
many translation tools, these measurements tend to be applied
only to evaluate MT engines and less so to evaluate post-editing
productivity. This is simply because no one has come up
with a method that combines a productivity score with
edit distance information and normalizes the score in a dynamic
way. This is exactly what the TAUS Efficiency Score does when it
is based on the core variables.
In order to unify the two measurements (processed words per
hour and final edits per hour, the latter based on edit distance),
one needs to convert relative scores into absolute scores. The
Efficiency Score is calculated on an ongoing basis using data
from the DQF database, which is fed with data from real-life
projects. The score is displayed in the TAUS Quality Dashboard.
The more data, and the more homogeneous the data, used to
calculate the score, the more precise and meaningful that score
will be.



The Efficiency Score is not yet implemented in the TAUS
Quality Dashboard. If you want to read more about the Quality
Dashboard, please click here.
Use case
The Efficiency Score based on the core variables is calculated using
the following data:
1. The number of words that a translator processed. (Note:
each time a translator returns to a segment, the extra time will
be added to that segment.)
2. The edit distance, calculated using the Wagner-Fischer
algorithm after the translation process (a minimal sketch follows below).
In the example below, four translators have been involved in
similar translation projects. The table offers information on the
actual number of words processed, the actual time spent, the
speed expressed in words per hour (WPH), the aggregated edit
distance based on all segments, and this figure normalized to
the number of edits per hour.
Name           Number of Words   Time (seconds)   WPH    Edit Distance   Edits per Hour
Translator 1   100               120              3000   50              1500
Translator 2   150               140              3857   80              2057
Translator 3   80                120              2400   70              2100
Translator 4   120               130              3323   30              831

Table 1: Translator data - Example


The normalization of all the variables is calculated using
Min-Max normalization, because it is simple, it preserves all
relationships in the data exactly, and it provides an easy way to
compare values that are measured on different scales.
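For reference, Min-Max normalization maps each value $x$ in a distribution onto the $[0.0, 1.0]$ scale:

$$x' = \frac{x - \min(x)}{\max(x) - \min(x)}$$

Applied to the WPH column of Table 1, for example, Translator 1 gets $(3000 - 2400) / (3857 - 2400) \approx 0.411$, while the slowest translator (2400 WPH) gets 0.0 and the fastest (3857 WPH) gets 1.0.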



Using Min-Max normalization, the following scores are obtained
(Table 2). With these results, it becomes clear where each
translator sits in the distribution for the words-per-hour and
edit-distance measurements, and the difference between them
can be seen on a scale of [0.0, 1.0]. Both measurements have
an equal share in assessing translators.
Based on the normalized values above, the Efficiency Score can be
calculated: the total of the two normalized scores divided by 2.
Name           WPH    WPH Min-Max norm.   Edit Distance   Edits per Hour   Edit-distance Min-Max norm.
Translator 1   3000   0.411               50              1500             0.4
Translator 2   3857   1.0                 80              2057             1.0
Translator 3   2400   0.0                 70              2100             0.8
Translator 4   3323   0.633               30              831              0.0

Table 2: Translators - Probabilities


Name           WPH norm.   Edit-distance norm.   Sum     Efficiency Score
Translator 1   0.411       0.4                   0.811   0.405
Translator 2   1.0         1.0                   2.0     1.0
Translator 3   0.0         0.8                   0.8     0.4
Translator 4   0.633       0.0                   0.633   0.316

Table 3: Composite indicator for producing the Efficiency Score
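For readers who want to reproduce the numbers, here is a minimal Python sketch of the core-variable calculation using the Table 1 data. The data layout and function names are illustrative assumptions, not the DQF implementation that runs behind the Quality Dashboard.

```python
# Illustrative only: reproduces Tables 2 and 3 from the raw numbers in Table 1.
translators = {
    "Translator 1": {"words": 100, "seconds": 120, "edit_distance": 50},
    "Translator 2": {"words": 150, "seconds": 140, "edit_distance": 80},
    "Translator 3": {"words": 80,  "seconds": 120, "edit_distance": 70},
    "Translator 4": {"words": 120, "seconds": 130, "edit_distance": 30},
}

def min_max(values):
    """Min-Max normalization: map each value onto the [0.0, 1.0] scale."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

wph = [t["words"] / t["seconds"] * 3600 for t in translators.values()]
edits = [t["edit_distance"] for t in translators.values()]
wph_norm, ed_norm = min_max(wph), min_max(edits)

for name, w, e in zip(translators, wph_norm, ed_norm):
    # Both measurements get an equal share: sum the two normalized scores, divide by 2.
    print(f"{name}: efficiency score = {(w + e) / 2:.3f}")
# Matches Table 3 up to rounding: 0.406, 1.000, 0.400, 0.317
```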


Summary
For the Efficiency Score based on the core variables, we measure time for processing segments while tracking the segment
origin. Next, we measure the edit distance and calculate the edit
distance per segment (minimum number of edits needed to get
from A to B) and produce the number of edits per hour. Finally,
we normalize and unify the two measurements. For more precision and credibility, we can base our calculation of the score on
additional (optional) features.
There are a number of reasons for developing a composite indicator
for productivity based on the words-per-hour measurement and
the edit-distance scores:
1. It can offer a rounded assessment of performance.
2. It presents the big picture and can be understood more
easily than trying to find an answer in two (or more) separate
measurements.
3. It can help with the implementation of better analytical
methods and better-quality data.
The two data points are used to generate a numerical score that
shows the efficiency of the translator among other translators
who worked on similar projects (technology, process and
content). As I mentioned earlier, you can also use the score to
compare technologies, processes etc. Before the Efficiency Score
is calculated, the data needs to be preprocessed and transformed
to fall within a smaller, common range for all the
metrics, such as [0.0, 1.0]. This way we give the data points equal
weight.
Future work will involve adding the Efficiency Score to the TAUS
Quality Dashboard. Initially this score will be calculated based
on the core variables. In a later phase, the possibility of adding
quality and content difficulty scores is envisioned.
Let's see whether this will reform the way we look at translation
productivity and determine our prices. In any case, one thing is
for sure: the traditional way of measuring productivity is dead.



BUSINESS INTELLIGENCE
Business Intelligence and Quality
Evaluation Data
Business Intelligence (BI) in the translation industry is about engineering
an environment of answers by selecting, collecting
and interpreting data derived at various stages of the translation
process. In this webinar, Tom Shaw (Capita) explained how
quality evaluation data from even a small sample can predict ROI
and support business decisions when this type of data is recorded
and interpreted correctly. Business Intelligence is starting to catch
on in the translation industry, and with good reason: using
smart ways to transform data of any type into actionable
information yields business benefits and helps stakeholders
make informed decisions.
Once the KPIs of a business or project are defined and ways
to measure them are set, data collection and monitoring can
begin. Today, high-quality data is abundant and we have many
ways to harvest it. What's very often missing is the extra step
of analyzing and interpreting this data. One example from our
industry is post-editing productivity data, which is produced
in ever-increasing volumes but is not being linked to post-editor
profiles, quality levels or content difficulty levels and, as a result,
doesn't find its way into the pricing cycle. One could use this
type of data in a dynamic way to adjust prices. Unfortunately,
dynamic pricing is still a neglected concept in the translation
industry and deserves an article of its own.



Let's take a look at some common areas where translation vendors
and buyers can benefit from collecting and analyzing data.
Zooming in on one of the components of the well-known translation
pyramid (quality-speed-cost), we can track translation
quality on an ongoing basis by collecting benchmarking data
available at different stages of the process: from content authoring
through actual translation or post-editing to publishing
and post-publishing.
We can measure source-text quality using readability measures;
we can evaluate the quality of MT output applying relative or
absolute metrics; we can categorize post-editors based on
productivity tests that involve time measurements and edit distance
data. Once this data is linked to user profiles, types of CAT tools,
content profiles, industry domain etc., it can offer powerful
information to vendors and buyers of translation services. All this
data helps to spot trends, improve efficiency, adjust translation
workflows, enhance tools (CAT, MT, etc.) and select, discard or
update resources (both TMs and human resources). A core capability
of any BI environment is to isolate granular subsets of
data. The key is the ability to conduct multidimensional analysis
quickly and to test theories and identify trends.
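As a sketch of what such slicing might look like in practice, the snippet below groups hypothetical post-editing records by language pair and content type; the column names and values are invented for illustration and do not reflect any particular TMS export.

```python
import pandas as pd

# Hypothetical post-editing records; column names and values are invented.
records = pd.DataFrame([
    {"lang_pair": "en-de", "content_type": "UI",        "mt_engine": "A", "wph": 3100},
    {"lang_pair": "en-de", "content_type": "marketing", "mt_engine": "A", "wph": 1900},
    {"lang_pair": "en-ja", "content_type": "UI",        "mt_engine": "B", "wph": 2200},
    {"lang_pair": "en-ja", "content_type": "marketing", "mt_engine": "B", "wph": 1400},
])

# Isolate granular subsets: average post-editing speed per language pair and content type.
print(records.groupby(["lang_pair", "content_type"])["wph"].mean())
```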
BI data also enables us to make adjustments and introduce more
efficiency into our workflow processes. One example is when
low post-editing productivity is attributed to poor MT output when
it is actually the result of wrong choices made further down the
road: missing guidelines, untrained post-editors, or environments
that are unsuitable or inappropriately set up for the job.
While evaluation scores tracked internally can be of enormous
value for safeguarding efficiency, they might become meaningless
without absolute values to compare them with. Is a 30%
productivity increase for English-Japanese statistical machine
translation (SMT) good or bad?
One of the main problems in the translation industry today is
the lack of benchmarking. Translations cannot be compared
to industry averages or standards because these are not yet
available. At the same time, buyers of translation services are
increasingly interested in translated content of different quality
levels and different pricing models. They want to save resources
on some content and invest more in others. As a result,
several vendors today are offering services and products
tailored to various needs. But how can these customers specify
the quality level they need? And how do vendors make sure
the right quality is delivered?
The general consensus is that you shouldn't wait for your business
to have big enough data or perfect data to track BI and
do benchmarking. Because of the elusive nature of data, you
will never have enough of it and it will never be perfect, despite
your best efforts. What's more, the reports and analytics that
BI provides actually help expose the faults in your data. This
being said, it's still important to understand that really good,
actionable business intelligence in the translation industry depends
on complete and accurate data. This is the old "garbage in,
garbage out" axiom, and it's as true now as it ever was.
Here are some tips on how to get started with BI in your
company.
1. Make sure you use the right tools and record the right
data for each and every project in your translation workflow.
2. Once you have decided on collecting data and doing
data analysis, you need to connect the results to a limited number
of KPIs. You can't possibly follow and interpret dozens of
data points. It will get too complicated down the line, so just
keep it simple.
3. KPIs have to be linked to what you are trying to achieve. KPIs
set and data measured in isolation from the bigger business
picture, just for the sake of measuring, are meaningless. What
are your objectives?



4. While it's important to track results and act on new
insights resulting from data, the focus should remain on the
business, not on improving scores.
5. Finally, it's important to train your team and explain in advance
how results will be, and should be, interpreted. The maturity
of the team around a data-centric approach is indispensable.
Clearly, there are myriad ways in which BI can support
your translation/localization business. As BI becomes more
commonplace in our industry, no doubt we will see more and
more vendors and buyers create different use cases for it. TAUS
is committed to sharing new insights, benchmarks and best practices
on Machine Translation, Post-Editing, Data and Evaluation
to help the industry move forward. Are you interested in learning
more? Please visit the different TAUS shared services pages.



Crowdshaping Translations
Is your business...
Improving translation quality without spending a fortune
on manual assessment?
Providing an awesome customer experience within the limits of your budget?
Monitoring translator performance?
Crowdshaping might be of help.
Crowdshaping versus crowdsourcing
Crowdshaping is a recent successor of crowdsourcing and is
increasingly deployed in various settings: at dance events,
in retail stores, in the construction of road systems and in football
stadia. It has much in common with crowdsourcing but differs
from it in one important aspect: user participation. While
crowdsourcing refers to people intentionally and actively sharing
their opinions, preferences or ideas, crowdshaping is relatively
passive, generally using technology that detects people's
preferences and interests based on their actions.
Crowdsourcing is vulnerable to reporting errors, particularly
when it comes to certain kinds of input: what people say about
their preferences, feelings and future behavior doesn't always
align with what they do. Crowdshaping overcomes this limitation
by not asking how people would act or what they think,
but by recording what people actually do, tapping input from
them indirectly.
Of course, you need to own such technology and you also have
to know how to interpret the harvested data and put it to work.
That might be a drawback, as crowdshaping technology is still
evolving and best practices are missing in various industries.
You might also think: "Wait a minute. How about my privacy?"
But is data privacy really an issue today? Evidence suggests
that consumers are growing accustomed to a world in which
data is a shared resource.



A new IBM study found that consumers are willing to share their
personal information with retailers, particularly if they get good
value in exchange. The percentage of consumers willing to share
their current location via GPS nearly doubled year-over-year to
36 percent. Thirty-eight percent of consumers would provide their
mobile number for the purpose of receiving text messages, and 32
percent would share their social handles with retailers.
Adjusting party beats with biometric wristbands
One of my favourite examples of crowdshaping in action is
Lightwave's biometric wristbands. These gadgets have been
used at events where DJs exploited the real-time data to adjust
the music selection. The wristbands have four sensors: an
accelerometer to measure the wearer's movement; a microphone
to detect decibel levels; a gauge to measure both body and room
temperature; and a skin sensor to detect physiological and
psychological arousal through sweat. When the temperature of
the crowd reached a set point, for example, the crowd unlocked
a round of drinks. Leaderboards rated individual dancers for
energy and, during a boys-vs-girls dance-off, both teams competed
to see who could dance the most energetically.
Reshaping customer experience with in-store technology
Sounds far-fetched? The possibilities of tracking user data to
improve user experience are endless. Retailers, for instance,
will soon start to employ in-store technology that can collect
and use data to reshape the experience served to shoppers.



Crowdshaped advertising on digital screens. Crowdshaped
in-store music. In fact, you could soon crowdshape the music at
your next party if a prototype called Chêne (a small, smart jukebox
made by design firm Clearleft) is launched. Chêne uses
technology to pull music playlists from your smartphone
and then plays a music selection that is an aggregation of your
preferences.
Crowdshaping in the translation industry
While user data is omnipresent in the translation industry as
well, it is not aggregated or used in a smart way. An example
of where the translation industry could deploy crowdshaping is
the real-time adjustment of translation quality based on visitor
preferences for published content. Website content that is machine
translated could receive a better quality translation (light
or full post-edit, crowdsourced translation, etc.) based on information
such as page views, bounce rates, engagement and
most popular pages, in order to offer a better user experience.
Evaluating quality using indirect methods is an increasingly
dominant topic in the industry today. How can we harvest user
data to offer the right level of quality? How can we measure usability
and user behavior without spending much effort on manual
assessment? Here are some examples from the translation industry
where crowdshaping could be, or already is, effectively applied:
Tracking user behavior: evaluating the quality of translated
content on multilingual sites by comparing user data (pageviews,
geo-location, sessions, entrances, clicks, purchases
and transactions) across the different languages (see the sketch
after this list).
User-adaptive MT: improving the quality of an MT engine
by constantly feeding post-editing data or user feedback back into
the system.
Interactive MT: the computer software that assists the human
translator attempts to predict the text the user is going
to type, taking into account all the available information.
Whenever a prediction is wrong and the user provides feedback
to the system, a new prediction is made considering the new
information. This process is repeated until the translation provided
matches the user's expectations.
Technology evaluation: evaluating translation technology,
including CAT tools and MT engines, by tracking productivity
data in real time.
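As an illustration of the first example, here is a minimal sketch, assuming per-page analytics exports with invented field names and an arbitrary threshold, that flags translated pages whose engagement lags far behind the source-language version so they can be routed to a higher quality tier:

```python
# Hypothetical analytics records per page and language; names and the
# 0.2 threshold are invented for illustration, not a recommended setting.
pages = [
    {"page": "/pricing", "lang": "en", "bounce_rate": 0.32},
    {"page": "/pricing", "lang": "ja", "bounce_rate": 0.71},
    {"page": "/docs",    "lang": "en", "bounce_rate": 0.28},
    {"page": "/docs",    "lang": "ja", "bounce_rate": 0.33},
]

# Use the source language (here: English) as the engagement baseline.
baseline = {p["page"]: p["bounce_rate"] for p in pages if p["lang"] == "en"}

# Flag translated pages that perform much worse than the source version.
for p in pages:
    if p["lang"] != "en" and p["bounce_rate"] > baseline[p["page"]] + 0.2:
        print(f"{p['page']} ({p['lang']}): escalate to full post-editing")
```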
These are just a couple of examples, but I'm sure there are more
out there and even more to come.
The TAUS Quality Dashboard
There are multiple ways to tap into translation data and user
information. Just like other crowdshaping technologies, the
TAUS Quality Dashboard will bring many benefits to users who
are willing to share their translation data. But the Dashboard is
also the first vendor-independent application that supports the
benchmarking and analysis of translation efficiency on a large
scale, and it's all done without additional effort on your side.
The TAUS Quality Dashboard will tell users:
Which translator to choose for a certain job
What the origin is of translated segments
How efficient a vendor or a technology is in a given project
How to adjust pricing based on performance
How the quality delivered by a vendor compares to the industry average
... and the list can become almost endless.
The data is there and we make so little use of it. The good news
is: all you want to know will soon be available at your fingertips
through an intelligent solution for aggregating data, the TAUS
Quality Dashboard.
So, where will your first journey take you in a crowdshaped
universe? Where will you encounter crowdshaping for the first
time? At a dance event, in a hospital, in a football stadium or
in the translation marketplace? With different translation quality
levels offered by different vendors on the market, crowdshaping
is already reshaping the way we approach translation quality.



FURTHER READING & REFERENCE
This list of reading material is far from complete. It can serve as
a first step towards understanding the main topics in translation
quality evaluation.

Attila Görög & Pilar Sánchez-Gijón (eds.): Revista Tradumàtica
special issue on Translation and Quality, No. 12 (2014)
http://revistes.uab.cat/tradumatica/issue/view/5

Attila Görög: TAUS Dynamic Quality Framework Report 2015
https://www.taus.net/think-tank/reports/evaluate-reports/dynamic-quality-framework-report-2015

Lena Marg, Sharon O'Brien, Attila Görög, Miguel Gonzalez:
TAUS Best Practices on Community Evaluation
https://evaluation.taus.net/resources-c/guidelines-c/community-evaluation-best-practices

Luigi Muzii: Quality Assessment and Economic Sustainability of Translation
http://www.openstarts.units.it/dspace/bitstream/10077/2891/1/ritt9_05muzii.pdf

Sharon O'Brien, Rahzeb Choudhury, Jaap van der Meer, Nora Aranberri Monasterio:
TAUS Dynamic Quality Evaluation Framework: TAUS Labs report
https://www.taus.net/reports/translation-quality-evaluation-is-catching-up-with-the-times

Sharon O'Brien: Towards a Dynamic Quality Evaluation Model for Translation
http://www.jostrans.org/issue17/art_obrien.pdf

Sharon O'Brien: Translation Quality - It's time that we agree
https://taus.net/taus-ilf-dublin-3-june-2014-translation-qualityit-s-time-that-we-agree

TAUS Dynamic Quality Framework Report
TAUS Best Practices: Adequacy & Fluency



ABOUT
Attila Görög
Attila Görög is Director of Enterprise Member Services at TAUS.
He previously held a position in the product development team
for the TAUS Dynamic Quality Framework. Currently he is the
key account holder for enterprise member companies. Attila has
been involved in various national and international projects on
language technology for more than ten years. He has a solid
background in quality evaluation, post-editing and terminology
management. One of his key tasks is to encourage and facilitate
the discussion around translation quality by organizing international
workshops and conferences on the topic. Participants are vendors
and buyers of translation services, governmental organizations
and academic institutions. Attila is also the host of the annual
QE Summits, which have translation technology and quality
evaluation as their main focus. He has published various articles
on quality evaluation, terminology management and computational
linguistics in academic and industry journals.



TAUS
TAUS is a resource center for the global language and translation industries. Founded in 2004, TAUS provides insights, tools,
metrics, benchmarking, data and knowledge for the translation
industry through its Academy, Data Cloud and Quality Dashboard.
Working with partners and representatives globally, TAUS supports
all translation operators (translation buyers, language service
providers, individual translators and government agencies)
with a comprehensive suite of online services, software and
knowledge that help them grow and innovate their business.
Through sharing translation data and quality evaluation metrics,
promoting innovation and encouraging positive change, TAUS
extends the reach and growth of the translation industry.
To find out how we translate our mission into services, please
write to memberservices@taus.net to schedule an introductory
call.



TAUS Signature Editions


