
Crowdsourced Translation Practices from the

Process Flow Perspective

A Thesis Submitted for the Degree of


Doctor of Philosophy
By

Aram Morera Mesa


Department of Computer Science and Information Systems,
University of Limerick

Supervisors: J.J. Collins, Dr. David Filip


Co-Supervisor: Reinhard Schäler

Submitted to the University of Limerick, October 2014

Table of Contents
Abstract ....................................................................................................................................vii
Declaration ............................................................................................................................. viii
Acknowledgments..................................................................................................................... ix
Publications and Presentations from this Research Project ....................................................... x
List of Figures ........................................................................................................................... xi
List of Tables ...........................................................................................................................xii
Chapter 1 Introduction ............................................................................................................... 1
1.1 Overview .......................................................................................................................... 1
1.2 Research Question and Objectives................................................................................... 4
1.3 Methodology .................................................................................................................... 7
1.4 Scope ................................................................................................................................ 8
1.5 Thesis Structure ............................................................................................................... 8
Chapter 2 Literature Review .................................................................................................... 10
2.1 Introduction .................................................................................................................... 10
2.2 Localisation .................................................................................................................... 10
2.2.1 Criticism of Localisation......................................................................................... 13
2.2.2 The localisation process .......................................................................................... 13
2.2.3 Localisation technologies........................................................................................ 18
2.2.4 Localisation Levels ................................................................................................. 23
2.2.5 Summary of the literature review for localisation .................................................. 25
2.3 Crowdsourcing ............................................................................................................... 25
2.3.1 Introduction to Crowdsourcing ............................................................................... 25
2.3.2 Crowdsourcing in Action ........................................................................................ 29
2.3.3 Crowdsourcing Classifications ............................................................................... 32
2.3.4 Matching the Taxonomy to the Definition.............................................................. 39
2.3.5 Crowdsourcing in Localisation ............................................................................... 40


2.3.6 Taxonomies within Localisation ............................................................................. 43


2.3.7 Benefits of Crowdsourcing in the Context of Localisation .................................... 46
2.3.8 Criticism in the Context of Localisation ................................................................. 48
2.3.9 Other Collections of Practices ................................................................................ 51
2.3.10 Summary of the Literature Review for Crowdsourcing ....................................... 51
2.4 Workflows...................................................................................................................... 51
2.4.1 Workflow Models ................................................................................................... 52
2.4.2 Workflow Patterns .................................................................................................. 53
2.4.3 Workflows in the Language Industry ..................................................................... 54
2.4.4 Industry workflows for crowdsourced translation .................................................. 55
2.4.5 Models of crowdsourced translation workflows ..................................................... 56
2.4.6 Modelling practices ................................................................................................. 57
2.4.7 Summary of the Literature Review for Workflows ................................................ 67
Chapter 3 Taxonomy of crowdsourcing .................................................................................. 69
3.1 Data Collection .............................................................................................................. 70
3.1.1 Data from Models ................................................................................................... 70
3.1.2 Online Questionnaire and Survey considerations ................................................... 71
3.1.3 Survey Administration ............................................................................................ 72
3.1.4 Survey Design ......................................................................................................... 73
3.2 Clustering ....................................................................................................................... 77
3.2.1 TwoStep Clustering ............................................................................... 77
3.2.2 Other Approaches to Clustering.............................................................................. 83
3.2.2.2 Hierarchical clustering ........................................................................................ 90
3.3 Conclusions .................................................................................................................... 91
Chapter 4 Workflow Models ................................................................................................... 93
4.1 Static models vs simulable models ................................................................................ 93
4.2 The models ..................................................................................................................... 93

4.2.1 Crowdin................................................................................................................... 94
4.2.2 Asia Online ............................................................................................................. 98
4.2.3 Facebook ................................................................................................................. 99
4.2.4 Pootle .................................................................................................................... 101
4.2.5 Launchpad's Translations ..................................................................... 104
4.2.6 DotSub .................................................................................................................. 107
4.2.7 Amara .................................................................................................................... 110
4.2.8 Kiva ....................................................................................................................... 112
4.3 The practices ................................................................................................................ 114
4.4 Summary ...................................................................................................................... 118
Chapter 5 Refinement of the Practices................................................................................... 119
5.1 Introduction .................................................................................................................. 119
5.2 The Choice of Semi Structured Interview ................................................................... 119
5.3 The Selection of Interviewees ...................................................................................... 119
5.4 The interviews .............................................................................................................. 121
5.5 The questions ............................................................................................................... 122
5.5.1 Question Sequence ................................................................................................ 123
5.5.2 Question list .......................................................................................................... 123
5.6 Approach to Data Analysis .......................................................................................... 129
5.7 Analysis Outcome ........................................................................................................ 133
Practice 1: Content Selection ......................................................................................... 133
Practice 2: TU Granularity Selection ............................................................................. 142
Practice 3: Leveraging Translation Memory ................................................................. 149
Practice 4: Leveraging MT ............................................................................................ 152
Practice 5: Leveraging Terminology ............................................................................. 155
Practice 6: Translation without Redundancy ................................................................. 157
Practice 7: Open Alternative Translations ..................................................................... 160

Practice 8: Hidden Alternative Translations .................................................................. 163


Practice 9: Super Iterative Translation .......................................................................... 166
Practice 10: Freeze ......................................................................................................... 171
Practice 11: Version Rollback ........................................................................................ 175
Practice 12: Deadlines ................................................................................................... 176
Practice 13: Open Assessment ........................................................................................ 180
Practice 14: Hidden Assessment..................................................................................... 185
Practice 15: Expert Selection and Edition ..................................................................... 186
Practice 16: Metadata Based Selection .......................................................................... 192
5.8 Discussion of Practices ................................................................................................ 196
5.9 Summary ...................................................................................................................... 207
Chapter 6 Practices and scenarios .......................................................................................... 208
6.1 Scenario 1 Translation for Engagement ....................................................................... 208
6.1.1 Practices to Support Translators ........................................................................... 209
6.1.2 Practices to Enable Higher Engagement ............................................................... 210
6.1.3. Practices that give the Crowd Ownership ............................................................ 211
6.1.4 Other Practices for Translation for Engagement .................................................. 211
6.1.5 Discussion of the Translation for Engagement Scenario ...................................... 212
6.2 Scenario 2 Crowd TEP................................................................................................. 213
6.2.1 Volunteer translator .............................................................................................. 213
6.2.2 Crowd Post-Edition............................................................................................... 216
6.3 Scenario 3 Colony translation ...................................................................................... 217
6.4 Scenario 4 Wiki Style Translation ............................................................................... 219
6.5 Long tail scenario variations ........................................................................................ 221
6.6 Summary ...................................................................................................................... 222
7 Conclusions ......................................................................................................................... 223
7.1. Summary of results ..................................................................................................... 223

7.3. Impact of the Research Contributions......................................................................... 225


7.4. Limitations and Future Research ................................................................................ 226
7.5 Summary ...................................................................................................................... 232
Bibliography .......................................................................................................................... 233
Appendix 1 Survey Questionnaire ......................................................................................... 245
Appendix 2 Email Template .................................................................................................. 247
Appendix 3 Ethical Clearance Application Form for Survey ................................................ 248
Appendix 4 Ethical Clearance Application Form for Interviews .......................................... 252
Appendix 5 Survey Responses............................................................................................... 256
Appendix 6 Failed Mind Map for Pattern Language Development ...................................... 258


Abstract
This thesis explores a series of crowdsourced translation platforms that includes Asia
Online's Wikipedia Translation Project, DotSub, Amara, Kiva and others. It then creates a
taxonomy of them based on a previously existing one for general crowdsourced processes.
The taxonomy resulted in four approaches to crowdsourced translation: translation for
engagement, colony translation, crowd TEP and wiki style translation.

Having created the taxonomy, the thesis focuses on the different processes enacted by the
platforms and presents workflow models for a selection of the platforms. Through analysis of
the workflow models, the thesis identifies fourteen crowdsourced translation practices. Some
of these, such as the leverage of MT and TM, are common in mainstream localisation
processes, while others, such as the collection of redundant alternative translations and
redundantly iterative translations, appeared only with the emergence of crowdsourcing.

In order to define the practices in a way that emulates design patterns for software, eight
interviews with experts in crowdsourced translation processes were carried out. These
resulted in over 64,000 words discussing the advantages and disadvantages, pre-requisites
and other features of the practices. This information was used to refine the definition of the
practices. The refined practice definitions have been organized in a pattern catalogue that
describes the relationships between the different patterns and different scenarios. This makes
them helpful for organisations interested in implementing crowdsourced translation.


Declaration
I hereby declare that this thesis is entirely my own work, and that it has not been submitted as
an exercise for a degree at any other university.


Acknowledgments
This research is supported by the Science Foundation Ireland (Grant 07/CE/I1142) as part of
the Centre for Next Generation Localisation (www.cngl.ie) at University of Limerick and the
Armando Morera Fumero Foundation.
This thesis started as something completely different, and over the almost five years it took to
write it many people had an impact on it. I would like to thank Lamine Aouad and Eoin
Ó Conchúir, who were briefly my supervisors and during that time made valuable
contributions that got me closer to finishing. Reinhard Schäler set the scene for this research
and without his initial push, this thesis would never have started. J.J. Collins and David Filip
were critical to the completion of this thesis not only because of their guidance, but because
of the great work they did in finding flaws in my research, helping me solve them and
encouraging my work until I finished.
I would also like to thank Karl Kelly and Geraldine Harrahill, who with their mastery of the
internal processes of the University of Limerick solved numerous issues for me, allowing me
to focus on my research. Without them, I would have been more stressed and whined more
about the university's inner workings.
Whining there was, though, and it was Asanka Wasala, Lucía Morado, Naoto Nishio, Rajat
Gupta and Solomon Gizaw, my office colleagues, who had to listen to half of it. Without the
supportive atmosphere that they created, I would have likely given up and that makes their
support as valuable a contribution as any.
The other half of the whining was enjoyed by my parents, friends and partner. Five years is a
very long time, and I have to thank them for not disinheriting, ostracising or dumping me.
Writing a thesis affects both your professional and private life, and having such kind people
around me in the private life was also fundamental for the eventual completion of this
research.
Lastly, I would like to thank the experts and researchers who contributed their knowledge to
the interviews and the survey in this thesis.


Publications and Presentations from this Research


Project
Papers
Aouad, L., O'Keeffe, I., Collins, J.J., Wasala, A., Nishio, N., Morera, A., Morado, L., Ryan,
L., Gupta, R., Schäler, R. (2011) A View of Future Technologies and Challenges for
the Automation of Localisation Processes: Visions and Scenarios, in Lee, G., Howard,
D. and Ślęzak, D., eds., Convergence and Hybrid Information Technology,
Communications in Computer and Information Science, Springer Berlin Heidelberg,
371–382, available: http://dx.doi.org/10.1007/978-3-642-24106-2_48.
Morera, A., Aouad, L., Collins, J. (2012) Assessing Support for Community Workflows in
Localisation, Presented at the Business Process Management Workshops, Springer,
195–206.
Morera-Mesa, A., Collins, J.J., Filip, D. (2013) Selected Crowdsourced Translation
Practices, Presented at the Translating and the Computer Conference 35, London.

Presentations
Morera, A., Aouad, L. (2011) Towards an intelligent localisation workflow management
system, in II Simposio Internacional de Jóvenes Investigadores en Traducción,
Interpretación y Estudios Interculturales, Barcelona.
Morera, A., Aouad, L., Collins J.J. (2011) Assessing Enterprise Support for Community
Workflows in Localization, in SIMPDA 2011, Campione d'Italia.
Morera, A., Aouad, L., Collins J.J. (2011) Integrating the Community in the Workflow:
Mapping Project Attributes to Patterns, in SIMPDA 2011, Campione d'Italia.
Morera, A., Aouad, L., Collins J.J. (2011) Elevator Pitch for Crowdsourced Translation
Workflows, in SFI Digital Content Workshop, Dublin.
Morera, A., Aouad, L., Collins J.J. (2011) Assessing Support for Community Workflows
in Localisation, in BPMS2, Clermont-Ferrand.
Reviews
Mesa, A.M. (2013) Keiran J. Dunne and Elena S. Dunne (eds.): Translation and localization
project management: the art of the possible, Machine Translation, 1–8.

List of Figures
Figure 1 Industrial TEP process. Adapted from Ray and Kelly (2011)..................................... 2
Figure 2 Percentage of Internet users that add content according to Eurostat ......................... 31
Figure 3 An industrial localisation process model on WorldServer ........................................ 54
Figure 4 High level representation of a crowdsourced localisation timeline. Adapted from
DePalma and Kelly (2011)....................................................................................................... 55
Figure 5 Representation of a crowdsourced process. Adapted from Vashee (2009) ............... 56
Figure 6 A BPMN model of a suggested process for bi-text management in crowdsourcing
scenarios. Used with permission (Filip and Ó Conchúir 2011) ............................................... 57
Figure 7 Petri nets elements used in this thesis........................................................................ 59
Figure 8 The same behaviour modelled in Petri nets and BPMN 2.0...................................... 60
Figure 9 Summary of TwoStep clustering with five clusters .................................................. 80
Figure 10 Summary of TwoStep clustering with six clusters .................................................. 83
Figure 11 Crowdin Process Model at the Locale Level ........................................................... 94
Figure 12 Crowdin MT and TM leveraging subworkflow ...................................................... 96
Figure 13 Crowdin translate and vote subworkflow ................................................................ 97
Figure 14 Model for Asia Online's Wikipedia translation project .......................................... 98
Figure 15 Facebook's process at the locale level................................................................... 100
Figure 16 Facebook's process at the string level. .................................................................. 101
Figure 17 Model of a Pootle Process at the locale level ........................................................ 102
Figure 18 Model of a Pootle process at the string level ......................................................... 103
Figure 19 Model of Launchpad's translation platform at the locale level ............................. 105
Figure 20 A model of the Launchpad Translation process at the string level ....................... 106
Figure 21 A suggested model for Launchpad Translation process at the string level ........... 107
Figure 22 A model of DotSub's process at the video level .................................................... 108
Figure 23 A model of the Amara process at the video level .................................................. 110
Figure 24 A model of Amara's process at the subtitle level .................................................. 111
Figure 25 A suggested model for Amara at the video level .................................................. 112
Figure 26 A model of Kiva's process from the point of view of a volunteer......................... 113
Figure 27 References for each practice .................................................................................. 133


List of Tables
Table 1 Comparison of the localisation process as described by Esselink (2000), Pym (2004)
and Schäler (2008a) ................................................................................................................. 15
Table 2 Localisation process stages and related practices ....................................................... 17
Table 3 Levels of localisation adapted from Carey (1998)...................................................... 24
Table 4 Google Scholar hits per year for crowdsourcing and some crowdsourcing related
terms ......................................................................................................................................... 26
Table 5 Preselection of Contributors from Geiger, Seedorf et al (2011)................................. 38
Table 6 Aggregation of Contributions from Geiger, Seedorf et al (2011)............................... 38
Table 7 Remuneration of Contributions from Geiger, Seedorf et al (2011) ............................ 38
Table 8 Accessibility of peer contributions from Geiger, Seedorf et al (2011) ....................... 38
Table 9 Characteristics of the platforms obtained through the modelling process .................. 70
Table 10 Additional data for the taxonomy ............................................................................. 78
Table 11 Original cluster distribution and distribution after data expansion........................... 82
Table 12 Cluster membership with the same weight assigned to all dimensions .................... 86
Table 13 Conversion values for the characteristics of each dimension ................................... 87
Table 14 Cluster membership after triplicating the weight of the Aggregation dimension..... 87
Table 15 Cluster membership as per hierarchical clustering ................................................... 91
Table 16 Date and duration of interviews for each subject ................................................... 121


Chapter 1 Introduction
1.1 Overview
The objective of this thesis is to create a collection of practices that facilitates the
implementation of crowdsourced translation processes. This chapter introduces the concept of
localisation and the challenge posed by the increased demand for localised content and
services. An overview is presented on crowdsourcing and its connection to localisation; and
on workflows and their relationship to crowdsourcing in localisation. The chapter then
discusses the research questions addressed by this work, the expected contribution to
knowledge, and presents a number of potential answers to the questions posed. The
methodology is presented next in which the methodologies used to explore the research
questions is described, and is followed by the scope and finally, the structure of the thesis.

1.1.1 Localisation
Localisation was defined by LISA (Localization Industry Standards Association) as "the
process of modifying products or services to account for differences in distinct markets"
(Lommel 2003). Localisation is a complex process that involves "project management,
engineering, quality assurance, and human-computer interface design issues" (Schäler
2008a). This complexity results in localisation involving the collaboration of many different
professionals, such as software developers, localisation engineers, project managers and
translators, who are often geographically dispersed throughout different time zones.
Localisation was initially carried out by internal teams in multinational corporations, but as
the market grew, specialized companies known as Language Service Providers (LSPs)
appeared (Esselink 2000).
The interest in localising content is a consequence of globalisation, that is, "the process of
increasing interdependence and global enmeshment which occurs as money, people, images,
values, and ideas flow ever more swiftly and smoothly across national boundaries" (Hurrell
and Woods 1995). This increase in globalisation has also resulted in an increased demand for
localised content, which has become a challenge for the industry both because of the volume
and the diversity of locales (Ryan et al 2009, p.17).

1.1.2 Crowdsourcing
One of the recent developments in localisation is the introduction of crowdsourcing, which is
the practice of leveraging communities to carry out tasks that were traditionally carried out by
professionals under contract (Howe 2006a). The idea of using crowdsourcing to deal with the
increasing volume of content and locales has gained traction within the industry to the point
that it has been argued that with a crowd of motivated, tech-savvy users, crowdsourcing is
actually the best way of localising a product (Kelly 2009; Rickard 2009). It has already been
proven that for some tasks, a crowd may deliver better quality than an expert (Sakamoto et al.
2011). In language related tasks, including translation, the outcome has been of quality
comparable to that of professionals for long tail languages (Callison-Burch 2009; Bloodgood
and Callison-Burch 2010; Zaidan and Callison-Burch 2011).
The potential gain from adoption of this paradigm was illustrated by Facebook, which in
2009 had sixty-five languages available and an additional forty in production (DePalma and
Kelly 2011), in some cases with a quality higher than could be expected of professionals
(Jiménez-Crespo 2013). Although this success has led industry experts to state that
crowdsourcing will become integrated in the content supply chain, most LSPs have not
been able to find a way of integrating crowdsourcing into their processes and see it as a threat
to the traditional industry model that follows a Translate Edit Publish (TEP) process similar
to the one illustrated in Figure 1 (Ray and Kelly 2011).

Figure 1 Industrial TEP process. Adapted from Ray and Kelly (2011)

In this context, not many organisations have methods that allow them to use crowdsourcing
in their processes, leaving them ill prepared to take advantage of the possibilities that
crowdsourcing brings. The fact that research into best practices for working with swarms of
volunteers in crowdsourcing has only begun underlines this problem (ibid.). Furthermore, as
noticed by Zhao and Zhu (2012), the fact that crowdsourcing is such a new phenomenon is
observable in the many publications focused on conceptualization, i.e. defining

crowdsourcing and how it relates to other similar phenomena, instead of focusing on systems
or applications of crowdsourcing.
This thesis explores the crowdsourcing processes used by Non-Governmental Organisations
(NGOs), open source projects like Ubuntu (Mackenzie 2006), LibreOffice and Firefox (Dalvit
et al. 2008), and commercial companies like Facebook (Losse 2008; Mesipuu 2010) and Asia
Online (Morera et al. 2012) that have been able to leverage generally unpaid volunteer
translators. Companies like Amazon with Amazon Mechanical Turk (Kittur et al. 2008),
txteagle (Eagle 2009) and Crowdflower (Munro et al. 2010) that have been able to offer
translation by tapping into generally paid crowds, are also included. All of these
organisations have successfully leveraged the crowd and the bespoke processes that they use
or enable illustrate the current crowdsourced translation practices.
Practices in this thesis are paths of action that are followed in given contexts. These practices
are diverse and depend on the main purpose of the organisation's crowdsourcing effort; be
this increasing customer engagement, lowering cost or some other purpose that has not been
discussed in the literature. These practices contrast with LSP approaches to translation that
generally focus on leveraging technologies such as Machine Translation (MT), Translation
Memories (TM), Translation Management Systems (TMS) and Terminology Databases (TD)
to produce an output with volume and quality that would otherwise require many more
human resources or time (Somers 2003; Bowker 2005; Plitt and Masselot 2010; Straub and
Schmitz 2010).
1.1.3 Workflows
According to the Workflow Management Coalition, a workflow is "The automation of a
business process, in whole or part, during which documents, information or tasks are passed
from one participant to another for action, according to a set of procedural rules" (WfMC
1999).
Business process models are often used to represent workflows. These models are similar to,
but simpler than, that which they represent (Maria 1997), in this case workflows, since they
do not represent information that is not relevant to the purpose of the model. This reduction
of complexity, among other things, makes the models useful for understanding and
communicating the processes they represent (Giaglis 2001). There are a variety of business
process modelling languages, like Business Process Execution Language (BPEL), Yet Another
Workflow Language (YAWL), XML Process Definition Language (XPDL) and many more

(van der Aalst 2004) that can be used to define workflows, and a number of systems able to
enact them, including jBoss, Windows Workflow Foundation and WebSphere Process Server,
among others (Louridas 2008). These business process modelling and execution languages are
not frequently used by LSPs, which instead use Translation Management Systems (TMS).
TMSs often integrate workflow solutions suitable for the Translation, Editing, and
Proofreading (TEP) process (Rinsche and Portera-Zanotti 2009). Although there are
organisations that claim to use crowdsourced translation in their processes, the literature
once more underlines the immaturity of the field in that it contains very few examples of
process models, as will be shown in Chapter 2.
This thesis uses coloured Petri nets, which have been extensively used in the workflow
literature, to create process models that facilitate human understanding of the process, one of
the functions of process models according to Curtis et al (1992).
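To make the modelling notation more tangible, the following toy sketch simulates the token-firing semantics (places, transitions, markings) that Petri net models build on, using an invented translate-then-vote workflow; coloured Petri nets extend this by attaching data values (for example, the string being translated) to each token. The place and transition names are illustrative only, not taken from the models in Chapter 4.

```python
from collections import Counter

# Marking: how many tokens sit in each place. Three source strings
# start in the "untranslated" place.
marking = Counter({"untranslated": 3})

# Each transition consumes tokens from its input places (pre-set)
# and produces tokens in its output places (post-set).
transitions = {
    "translate": ({"untranslated": 1}, {"translated": 1}),
    "vote":      ({"translated": 1},   {"voted": 1}),
}

def enabled(name):
    pre, _ = transitions[name]
    return all(marking[place] >= n for place, n in pre.items())

def fire(name):
    pre, post = transitions[name]
    for place, n in pre.items():
        marking[place] -= n
    for place, n in post.items():
        marking[place] += n

# Fire transitions until the net deadlocks: every string ends up voted.
while enabled("translate") or enabled("vote"):
    fire("translate" if enabled("translate") else "vote")

print(marking)  # all three tokens end in the "voted" place
```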

1.2 Research Question and Objectives


The objective of this research is to create a collection of practices that facilitates the
implementation of crowdsourced translation processes. These practices are based on the
patterns of other disciplines. In this context, a pattern is a formal way of documenting a
common solution to a common problem in a particular field of expertise (Désilets and van
der Meer 2011). According to Buschmann et al (2007), the optimal way to present patterns to
facilitate their implementation is in the form of a pattern language, that is, a collection that
makes explicit the interdependencies, synergies and conflicts that exist between the patterns.
However, organising the practices in a manner similar to that of a pattern language falls
outside the scope of this thesis; instead, a series of recommendations of practice combinations
matching existing scenarios was developed. In order to produce these recommendations, a
series of questions had to be answered.
Q1. What are the existing kinds of crowdsourced translation processes?
Several classifications for crowdsourcing processes in general have been developed and at
least two specific to crowdsourced translation. The existing classifications specific to
crowdsourced translation discussed in Chapter 2 do not meet the requirement of having a set
of dimensions (specific aspects of the objects being classified) with characteristics (values
for each dimension) that are mutually exclusive and collectively exhaustive (Bailey 1994;
Nickerson et al. 2009), which is necessary for a taxonomy to be useful. Therefore a new
taxonomy is hereby proposed.
Hypothesis 1A: All crowdsourced translation processes fit in the same group and the
cluster has a good fit. This would indicate that the processes are too similar to be
divided in taxa. If this is true, a number of random platforms can be selected to review
the processes that they enact and it is likely that most practices appear across all the
platforms.
Hypothesis 1B: All crowdsourced translation processes fit in one or more clusters but
the cluster fit is poor. This would indicate that the processes are too dissimilar to form
distinct groups or if they do, the within-group commonalities are limited. This could
be true or the result of insufficient data. If this is true, a number of random platforms
can be selected to review the processes that they enact, but it is unlikely that any
common practices will emerge. If common practices emerged, it would indicate that
the clustering process needs additional data to work correctly.
Hypothesis 1C: There are different types of crowdsourcing processes and the fit is
good. If this is true at least a representative platform for each class will be selected to
review the processes that it enables. Some practices will appear only within a specific
group, and others will appear across multiple groups.
Q2. What practices appear in the different types of crowdsourced translation processes?
One of the tools used to understand how a process works is the process model. Current
process representations for crowdsourced translation, such as the one presented by DePalma
and Kelly (2011), are very high level and lack the detail necessary to support an
understanding of crowdsourcing in action. To obtain a deeper understanding of the processes
a series of workflow models were created. These models facilitate the comparison of
processes in order to find their similarities and differences which will be useful when
identifying the practices that are common among different types of crowdsourced translation
processes and the practices that fit only in one crowdsourced translation scenario. By
analysing the models and considering practices described in the literature, it is possible to
identify suitable candidates because they appear repeatedly.

Hypothesis 2A: If the answer to Q1 is H1A (one cluster, good fit) and the models are
very similar, there will be similarities that will be a starting point for practices. In this

case the practices extracted from them will be generally applicable in crowdsourced
translation processes.
Hypothesis 2B: If the answer to Q1 is H1B (one or more clusters, poor fit) and the
process models are diverse, it would indicate that no practices can be extracted from
the data in this thesis. This does not imply that there are no useful practices for
crowdsourced translation, since it is possible that by adding more platforms and more
models, clusters with a better fit and similarities among models emerge.
Hypothesis 2C: If the answer to Q1 is H1C (multiple clusters, good fit) and the
process models are diverse across different classes but similar within a class, there
will be practices that appear repeatedly across classes and are applicable to most
crowdsourced translation processes; these would be generally applicable practices.
There will also be practices that appear only within a given class and are applicable
mainly in those processes; these would be practices applicable only within specific
classes of crowdsourced translation.
Q3. What are the forces that shape the candidate practices?
Refining practices to make them more useful for those who may want to apply them requires
a good understanding of the forces, i.e. requirements, consequences and constraints that shape
them. Some of these forces will be evident - e.g. if you only collect one suggested translation
per person, you need more than one person involved in order to have Open Alternative
Translations, the practice of openly collecting multiple translations for a single source TU.
Other forces may not be as transparent and can only be explicated through systematic
research. To identify the latter, a group of experts were interviewed. This is an exploratory
endeavour and too many hypothesis specific to each practice exist to list them here. Those
hypotheses that the researcher considered but were not confirmed by the interviewees appear
in the discussion of each practice in Chapter 5.
Q4. How can the different practices be combined?
Buschmann et al. (2007) talk about patterns, a formal way of defining practices (Désilets
and van der Meer 2011), and how in order to make their implementation less challenging,
they should be organized in a pattern language. They suggested that a pattern language can be
created by organizing the patterns according to the order in which their problem part appears
in the processes. This type of pattern organisation illustrates their relationships of

interdependence, synergies and conflicts. Ideally the practices should have been organized in
a manner analogous to that of a pattern language, but in order to create a pattern language a
more holistic approach to the collection of practices would have been necessary. For
example, resourcing and data management practices would have had to be considered, but
these fell out of scope. For this reason, instead of organizing the practices in a manner similar
to a pattern language, Chapter 6 contains a series of combinations of practices based on
existing scenarios.

1.3 Methodology
A survey was used to develop a taxonomy that allows for better understanding of the
different approaches used in crowdsourced translation. The survey had seven questions, of
which four were used to collect the attributes used by the taxonomy and three to identify the
crowdsourcing platform, organisation and rationale for crowdsourcing. Given the low number
of responses, the researcher selected other platforms and familiarized himself with them in
order to be able to add them to the taxonomy. With this data at hand, the researcher used the
k-means algorithm to identify the different groups within the data. This was done according
to the three level model proposed by Bailey (1994). More details on how the classification
was carried out can be read in Chapter 3.
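As an illustration of this clustering step, the sketch below groups a handful of platforms described along the four dimensions of Geiger, Seedorf et al (2011) using k-means in Python with scikit-learn. The platform rows and attribute values are invented placeholders rather than the survey data, and the thesis analysis itself was not performed with this code.

```python
from sklearn.cluster import KMeans
from sklearn.preprocessing import OneHotEncoder

# One row per platform; the columns loosely follow Geiger, Seedorf et
# al's dimensions: preselection, aggregation, remuneration, accessibility.
# All values are illustrative placeholders, not survey responses.
platforms = ["Platform A", "Platform B", "Platform C", "Platform D"]
attributes = [
    ["no preselection", "integrative", "none",  "assess"],
    ["no preselection", "selective",   "none",  "view"],
    ["qualification",   "integrative", "fixed", "none"],
    ["no preselection", "integrative", "none",  "assess"],
]

# k-means requires numeric input, so the categorical answers are
# one-hot encoded; each platform becomes a binary feature vector.
features = OneHotEncoder().fit_transform(attributes).toarray()

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
for name, label in zip(platforms, kmeans.fit_predict(features)):
    print(f"{name}: cluster {label}")
```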
From the taxonomy a number of platforms were selected for a deeper analysis resulting in the
creation of workflow models. A standard methodology for the creation of workflow models
has yet to be proposed. Van der Aalst et al (2004) observe that modelling a workflow is a
non-trivial task that requires knowledge of the workflow language and lengthy discussions
with the workers and management involved. Curtis et al (1992) stated that In practice, most
process descriptions have employed narrative text and simple diagrams to express process.
The models in this thesis were created either from the narrative presented by representatives
of companies in conferences, or by creating several user accounts in different services in
order to take the role of different stakeholders and directly modelling each step.
The analysis of the workflow models allowed for the identification of a number of practices
that were later on refined using interviews with experts in order to make them easier to
implement by making them more similar to the design patterns proposed by Alexander et al
(1977).

Lastly, combinations of practices were attached to different existing scenarios in order to
make their implementation easier. These collections take cues from pattern languages and, as
is the case with the latter, they emerge from praxis, will evolve with it, and have no
established academic methodology for their development.

1.4 Scope
This research focuses on crowdsourced translation and deals only with the practices that
directly affect payload data (the translations themselves) or metadata for the payload data
generated by the crowd (votes and comments).
There are crowdsourcing practices related to resourcing and motivation that are also
important for a successful crowdsourcing process, but these are outside the scope of this
thesis.

1.5 Thesis Structure


Chapter 1 has provided background information about localisation and its relationship with
crowdsourcing, and has pointed out the high-level nature of the research on processes done so far.
Chapter 2 has three main sections. Section one discusses localisation in more depth, including
more background information and discussion of the tools and technologies used in
mainstream processes. Section two is dedicated to crowdsourcing. It discusses some of the
existing taxonomies for general crowdsourcing and specifically for crowdsourced translation.
Besides that, it addresses the benefits that crowdsourcing offers and criticisms that have been
raised against it. Section three discusses workflows, the role of workflow models, their
presence in the industry and issues related to their creation.
Chapter 3 discusses the methodology for the creation of a new taxonomy specific for
crowdsourced translation processes, presents the data used to create the taxonomy and
presents the new taxonomy.
Chapter 4 discusses the approach used for the development of the workflow models, presents
the workflow models for eight different crowdsourcing platforms and outlines the practices
that will be refined and become part of the collection.

Chapter 5 presents the methodology used to refine the proposed practices through interviews,
and presents the refined practices together with a discussion of features that did not emerge
during the interviews.
Chapter 6 presents a series of scenarios and suggestions of practices based on real scenarios.
Chapter 7 summarises the content of the thesis, discusses how the research questions have
been answered, limitations and future research.

Chapter 2 Literature Review


2.1 Introduction
This chapter reviews the relevant literature on localisation, crowdsourcing, and workflows;
the three fields that are the focus of this thesis. The localisation section examines the
paradigm, the processes involved and their evolution, and identifies crowdsourced translation
as a gap in the literature. The section on crowdsourcing defines the concept, uses illustrative
examples to further elaborate upon it, and discusses classifications. A critique is offered in
the context of localisation and the manner in which some current practices minimise some of
the liabilities referred to in the literature. The following section introduces workflows and the
role of process models in this thesis. Support for workflows in the localisation process is
reviewed as well as models for crowdsourced translation processes. Finally, a review is
presented of existing guidelines proposed to achieve good quality process models and their
influence on the development of the models in Chapter 4 is discussed.

2.2 Localisation
This section presents the results of a literature review for localisation in general. This
includes the localisation process, aspects of which have been affected by crowdsourcing and
the practices in this thesis. When one of those aspects of the process is discussed, the
connection to the relevant practice is pointed out. The literature included was identified via a
search for "localisation translation" in Google Scholar, and the candidate list of articles was
refined by analysing the abstracts to determine relevance. If a work was added to the
selection, both the works in its bibliography and the works that cited it were identified and
submitted to the same selection process. Furthermore, the researcher included papers and
reports that he became aware of through conferences and industry meetings, some of which
were not indexed by Google Scholar at the time.
As observed in Chapter 1, localisation is a consequence of the phenomenon of globalisation,
the increased flow of ideas, money and people across national boundaries (Hurrell and
Woods 1995). In the 1980s this process led to multinational corporations starting their
localisation efforts with internal teams in order to expand their reach in foreign markets
(Schäler 2008a). As the foreign markets grew in importance and, as a result, the need for


localisation increased, specialized vendors known as Language Service Providers (LSPs)


appeared (Esselink 2000).
Within the localisation industry, there is also a specialised meaning for globalisation, that is,
"the action of addressing the business issues associated with taking a product global", which
includes considerations regarding the integration of localisation across a company, product
design, marketing, sales and support (Lommel 2003), as well as having international websites
(Esselink 2000). One of the aspects of the globalisation of a product or service that is of great
importance to localisation is internationalisation.
Internationalisation, often referred to as i18n, has been described as "the process of
generalizing a product so that it can handle multiple languages and cultural conventions
without the need for redesign" (Lommel 2003), which will ensure that the product is
functional and accepted in international markets, and localizable (Esselink 2000). Pym
(2004) noted that some authors use the term globalisation to refer to internationalisation, and
cites Brooks (2000) as an example of an author who uses globalisation when talking about
internationalisation. That DiFranco (2006) still uses the term globalisation to refer to
internationalisation shows that the distinction between the two was not clear until recently.

The internationalisation process is carried out during the development of the product
(Esselink 2000) and, beyond technical aspects such as the separation of text and code, it has
been argued that it is attained by taking culture-specific features out of the object to be
localised (Pym 2004). Internationalisation practices include:

- The separation of text and code, to avoid translators having to deal with source code
(Esselink 2000).
- The usage of abstractions for dates and times that can then be formatted according to the
locale (Pym 2004), as illustrated in the sketch after this list.
- The usage of style guides to create controlled languages, or writing for a global audience
(Esselink 2000; Pym 2004).
- Preparation of a monolingual glossary for the project (Pym 2004).
- The usage of a text encoding that supports international characters (Esselink 2000;
Pym 2004).
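To make the second practice concrete, the snippet below shows one possible realisation of such a date abstraction in Python with the Babel library: the application stores a language-neutral date object and only renders it per locale at display time. The library choice and the locales shown are illustrative, not prescribed by the sources cited above.

```python
from datetime import date

from babel.dates import format_date  # third-party i18n library (Babel)

# The application stores an abstract date, not a pre-formatted string.
release = date(2014, 10, 31)

# Each locale renders the same value with its own conventions, so no
# resource string needs to be rewritten when a new market is added.
for loc in ["en_US", "de_DE", "es_ES", "ja_JP"]:
    print(loc, "->", format_date(release, format="long", locale=loc))
```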

Poor or no internationalisation contributes to increased cost of localisation (Sprung and


Jaroniec 2000), with the result that better internationalised projects are cheaper to localise

(Schäler 2008a). Furthermore, following internationalisation best practices can lead to


localisation becoming focused only on the translation process (Schmitt 1999). This
observation of translation being the focus of a localisation process that follows best practice
is one of the justifications for this thesis focusing on crowdsourced translation processes
instead of crowdsourced localisation processes. Furthermore, the practices discussed in this
thesis are applicable in translation processes that do not have to be part of a localisation
effort.

Localisation was defined by LISA (Localisation Industry Standards Association) as "the


process of modifying products or services to account for differences in distinct markets"
(Lommel 2003). Pym (2004) defined it as the adaptation and translation of a text (like a
software program) to suit a particular reception situation, which is referred to as a locale,
that is, a group of coinciding linguistic and cultural options including language, currencies,
date formats, number separators, sorting order and more. To illustrate this, Microsoft
Windows has been localised into 20 varieties of Spanish, that is, one language and 20 locales
(Pym 2004). In an ideal case, localisation will make a product seem like it has been
developed in the local market. However, this is a challenging goal, given that localisation is a
complex process involving "project management, engineering, quality assurance, and
human-computer interface design issues" (Schäler 2008b). Furthermore, a complete
localisation effort includes the provision of services and technologies for the management of
multilingualism across the digital global information flow (Schäler 2008a). It has also been
noted that the core characteristic of localisation is its relation to digital content (ibid.), which
requires devices able to interpret it in order to represent it (Ryan et al. 2009). Localisation is
also often discussed in relation to translation; in this context, localisation's stronger focus on
tools and technologies stands out as a significant difference (Esselink 2000).
Although translation within the localisation industry is seen mainly as the replacement of
original strings in a language with strings in the target language, Pym (2004) is critical of this
attitude because it addresses translation as a linguistic task instead of a communicative task.
Dunne (2011), who approaches translation from a pragmatic point of view that strongly
considers the impact of translation on the success of a product, also argues that
approaching translation from a linguistic point of view in localisation results in poor quality.


2.2.1 Criticism of Localisation


One of the criticisms raised against localisation is that it serves rich countries, since the
decisions regarding the markets for which localisation is carried out are based on GDP, with
the result that many products are localised into Danish, with five million speakers, and far
fewer into Bengali, with a hundred million speakers (Schäler 2008a). Crowdsourcing has
occasionally been used to alleviate this imbalance by providing translations for underserved
languages (Munro 2010; Scannell 2012).
It has been observed that 90% of localisation is done by American corporations to extend
their reach to other markets (Collins 2002) and is part of a trend away from native-language
content that should at least concern people in the localisation industry (ibid.). However,
localisation and the work required to support it (encoding, fonts, spell checkers, etc.) (Hall
and Schäler 2005) have also enabled the creation of local content in languages that are not of
interest to multinational companies, creating room for these languages to have a digital
presence that will contribute to their conservation (Schäler 2008b). An example of a
community taking advantage of these localisation-enabling technologies can be seen in the
work of communities that have created extensions in order to be able to use Facebook in
languages that are not supported by the company (Scannell 2012).
Pym (2004) also observed that localised products, by aiming to betray no provenance,
eliminate otherness and in the long term do not contribute to the enrichment of intercultural
communication. Considering research carried out by Jiménez-Crespo (2013), this risk of
eliminating otherness is especially real with crowdsourced translations, because they come
closer to texts originally written in the target language than do translations produced using
the TEP approach.

2.2.2 The localisation process


Pym (2004) and Schäler (2008a) refer to Esselink (2000) when they describe the localisation
process; the differences in how they describe the process are mainly of granularity, as is
visible in Table 1. The blank cells in the table indicate:


- A stage that is not discussed. For example, Esselink includes in the process stages like
pre-sales and kick-off meetings, while Pym and Schäler start at the analysis stage and
do not discuss previous related tasks.
- A stage that is considered within a bigger, umbrella stage. For example, Schäler talks
only about a translation stage, without addressing its substages, while Esselink and
Pym break it down into translation of the software and translation of the help and the
documentation.

In spite of the changes in technology, more recent references that do not cite Esselink as the
source of the process they describe present processes that are almost the same. Three
examples of this are the process analysed by Carla DiFranco (2006) to evaluate cost-saving
opportunities, the project for which Zouncourides-Lull (2011) presents a work breakdown
structure, and the process described by Yahaya (2008) when discussing the management of
geographically distributed teams working for the UN. Given that the stages of the process
have remained constant over a span of more than ten years, the changes that have happened
to the process during that time must have happened within those stages, i.e. all the processes
described include a translation stage, but the methods adopted in the translation stage around
2000 are different from those used in 2008. As an example, in 2000 the translation would not
have been undertaken using online tools, something that was possible in 2008.
The practices that were identified in this thesis have different impacts across the stages. In
order to keep a baseline understanding of what those stages entail, a definition for each of
them, together with the practices that affect them, follows; Table 2 offers an overview of the
new practices affecting each stage.
-Analysis: The original material is analysed in order to find out how amenable to localisation
it is, whether there are changes required for specific markets, whether the software supports
the right encoding, whether all the necessary materials (strings, animations, etc.) are available
to the localizers, what tools will be needed for the project, and what the effort is, taking into
account word count, images and UI design issues. This stage may include pseudo-translation,
the process of replacing original strings with strings that have the traits (special characters,
script, extension) of target languages, which can be very useful to identify
internationalisation issues (Schäler 2008a).
At this stage, an organisation planning to use crowdsourced translation will have to select the
contents that are suitable for the types of crowdsourcing that they can use. This would mean

14

applying the Content Selection practice discussed in chapters four and five of this thesis or
the similar Identify Compatible Content pattern suggested by Dsilets (2011a).
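To make the pseudo-translation step more concrete, the following is a minimal sketch in
Python; the function name, character mapping and expansion factor are illustrative
assumptions, not the behaviour of any tool cited above.

    # Minimal pseudo-translation sketch (illustrative assumptions only).
    # It swaps letters for accented look-alikes and pads the string to
    # simulate the text expansion of many target languages, so encoding
    # problems and clipped strings surface before real translation starts.

    ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")

    def pseudo_translate(source, expansion=0.3):
        """Return a pseudo-translated version of a source string."""
        translated = source.translate(ACCENTED)
        padding = "~" * max(1, int(len(source) * expansion))
        # Brackets make truncated strings easy to spot in the localised UI.
        return "[" + translated + padding + "]"

    print(pseudo_translate("Save changes?"))  # [Sávé chángés?~~~]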

Table 1 Comparison of the localisation process as described by Esselink (2000), Pym (2004) and Schäler
(2008a)

Stage | Esselink (2000)                                             | Pym (2004)                                 | Schäler (2008a)
1     | Pre-sales (sending and receiving Requests for Quotations)   |                                            |
2     | Kick-off meeting (introduction of the team to the project)  |                                            |
3     | Analysis of source material                                 | Analysis of received material              | Analysis
4     | Scheduling and budgeting                                    | Scheduling and budgeting                   |
5     | Terminology setup                                           | Glossary translation or terminology setup  |
6     | Preparation of source material                              | Preparation of localisation kit            | Preparation
7     | Translation of software                                     | Translation of the software                | Translation
8     | Translation of online help and documentation                | Translation of help and documentation      |
9     | Processing updates                                          |                                            |
10    | Testing of software                                         | Engineering and testing of software        | Engineering/Testing
11    | Screen captures                                             |                                            |
12    | Help engineering and DTP of documentation                   | Testing of help and documentation          |
13    | Processing updates                                          |                                            | Publishing
14    | Product QA and delivery                                     | Product QA and delivery                    |
15    | Project closure                                             | Post-mortem with client                    | Project review

-Preparation: The localisation kit contains the materials necessary to localise the product
and can include the material itself, tools, style guides, fonts, protocols for reporting issues,
contact details, translation memories and terminology. Guidelines for translation are used by
Facebook (Losse 2008; Lenihan 2014), but they are not discussed in depth in this thesis.
Terminology and TM are used in several of the platforms discussed in Chapter 4, and
special considerations regarding them in the context of crowdsourced translation are
discussed in Chapter 5.
-Translation: This stage is carried out by translators, but it does not only consist of
translating strings. The digital nature of localisation means that during the translation stage
translators may have to carry out file management and may use Computer Assisted
Translation (CAT) tools that can leverage Machine Translation (MT) systems, Translation
Memories (TM) and Terminology Databases (TDB). Although there are localisation tools
like Alchemy and Passolo (Reynolds 2009) that allow translators to see the translations in
context, it is common to work with strings out of context.
As stated before, this thesis focuses on how crowdsourcing has affected this particular stage.
Super Iterative Translation (the practice of letting contributors edit existing translations in
order to improve them or keep the language up to date), Open Alternative Translations (the
practice of collecting multiple translations for a single source TU while letting contributors
see the translations suggested by others), Hidden Alternative Translations (the same
collection of multiple translations, but without letting contributors see the suggestions of
others) and Translation without Redundancy (the practice of collecting a single translation
for each TU) all belong to this stage.
-Testing of Software: Although digital content undergoes testing in its original language,
the localisation process can introduce new issues. Some affect only the appearance, like
strings clipped because of text expansion or special characters not being correctly
represented; others are functional, like keyboard shortcuts no longer working or having been
mapped to the wrong keys. Detecting these issues requires the creation of a localisation
specific test plan. It could be argued that with many systems that use crowdsourcing the
content is in a perpetual beta state (Kazman and Chen 2009) and that, in this state, the
testing of the localised versions is always crowdsourced. Some aspects of the Super Iterative
Translation practice discussed in this thesis and of the similar Publish then Revise pattern
(Désilets 2011b) have an impact on this stage and on the Product QA or Engineering stage.
-Product QA or Engineering: When tests raise issues, localisation engineers are charged
with solving them. Freeze (the practice of preventing changes to the translation of a TU, be
they edits or additions of new alternative translations), Open Assessment (the practice of
openly collecting assessments of the different translations for a TU, often in the form of
votes), Hidden Assessment (the practice of collecting assessments of the different
translations for a TU without letting contributors see how others have voted), Expert
Selection and Edition (the practice of having an expert, often a professional translator or a
vetted member of the community, select and edit the translations that will be published) and
the different approaches to Metadata Based Selection (the practice of using metadata as a
guide to let a computer automatically select the translation that will be published) are QA
processes. Super Iterative Translation and Version Rollback (the practice of reverting
changes in order to undo a quality loss caused by those changes) help to solve the issues
raised during linguistic testing and also fit in this stage.
-Delivery: The localised files are sent to the client so that a localised build can be produced.
Some of the crowdsourced translation platforms analysed in this thesis allow for the
download of translated files that can then be used outside the platform, but platforms like
Amara and Facebook display the translations directly online, without a stage that works like
the traditional delivery stage. This is again linked to the Super Iterative Translation practice,
where delivery happens automatically any time a translation is updated.

Table 2 Localisation process stages and related practices

Stage                     | New practices directly affecting the stage
Analysis                  | Content Selection
Translation               | Super Iterative Translation, Open Alternative Translations, Hidden Alternative Translations
Testing                   | Super Iterative Translation
Product QA or Engineering | Open Assessment, Hidden Assessment, Expert Selection and Edition, Metadata Based Selection, Freeze and Rollback
Delivery                  | Super Iterative Translation

2.2.3 Localisation technologies


One of the core features of localisation is the fact that it deals with digital objects (Ryan et
al. 2009); as a consequence, a series of specific tools and technologies have been developed
in response to the needs of practitioners. A comprehensive list of tools used in localisation is
out of scope, since it would have to include general tools and technologies that are not
exclusive to the field, such as spell checkers, word processors, email, FTP clients and
general electronic dictionaries, but a list of those most relevant to this thesis follows.
2.2.3.1 Translation Memory (TM) tools:
A translation memory is a database of existing translations stored as bitext (bilingual texts)
divided into segments, usually sentences, that can be automatically queried by Computer
Assisted Translation (CAT) tools, which are also referred to as Translation Environment
Tools (TEnT). When confronted with a new sentence in a text in the original language, CAT
tools retrieve translations by using a matching algorithm and present them to the translator.
Where a match does not exist, once the sentence is translated, the new sentence pair is
stored in the TM (Bowker 2002; Somers 2003).
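To illustrate the lookup-and-store cycle just described, the following is a minimal sketch in
Python; the class, the similarity scoring and the threshold are illustrative assumptions, and
commercial tools use far more sophisticated matching algorithms.

    import difflib

    class TranslationMemory:
        """Minimal TM sketch: stores bitext segment pairs and retrieves
        the closest match for a new source segment."""

        def __init__(self, threshold=0.75):
            self.pairs = []             # list of (source, target) segments
            self.threshold = threshold  # minimum similarity for a fuzzy match

        def lookup(self, segment):
            """Return (score, source, target) for the best match, or None."""
            best = None
            for source, target in self.pairs:
                score = difflib.SequenceMatcher(None, segment, source).ratio()
                if score >= self.threshold and (best is None or score > best[0]):
                    best = (score, source, target)
            return best

        def store(self, source, target):
            """Store a newly translated pair, as CAT tools do."""
            self.pairs.append((source, target))

    tm = TranslationMemory()
    tm.store("Save the file", "Guardar el archivo")
    print(tm.lookup("Save the files"))  # fuzzy match just below 100%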
In the context of crowdsourced translation, TM is used in several of the platforms analysed,
like Pootle, Launchpad and Crowdin. Leveraging TM has therefore been added to the list of
practices. Besides leveraging previously existing translations, industrial CAT tools offer
other advanced features that are often not available in crowdsourcing platforms, at least in
those analysed in Chapter 4. Some of the features of industrial CAT tools are listed below;
the feature list was collated from Esselink (2000) and Somers (2003):

- Concordance search: Search for one term within a whole TM.
- Filters: Tools that convert file formats that the CAT tool cannot manipulate natively into
formats that it can.
- Customisable segmentation: There is a higher chance of finding matches if the
segmentation rules used by the CAT tool are the same as those used for the creation of
the TM.
- Alignment: In cases where there is no TM but there are previously existing translations,
the existing original text and translation can be aligned to create a TM.
- Document statistics: Compares a text to the TM and produces a report on matches,
which can be used to estimate budgets and deadlines.
- Machine translation (MT) system or interface: Obtains an automatically generated
translation if there is no TM match.
- Project management module: Simplifies some project management tasks, like reporting
to the client or producing Key Performance Indicators.
- Quality control module: Can include checkers for spelling, grammar, style guide
compliance, terminology compliance and completeness.
- Terminology tools: See below.
- Term extractor: Analyses the text and suggests term candidates.

The literature indicates that the use of CAT tools results in shorter translation times, reduced
costs and increased consistency in the language (Esselink 2000; Webb 2000; Somers 2003;
Brkić et al. 2009; Reynolds 2009; Rinsche and Portera-Zanotti 2009). Bowker's (2005)
article about a pilot study carried out to determine the impact of CAT tools noted that other
researchers have seen productivity increases of between 10% and 70%, with 30% being a
reasonable expectation. Yamada (2011) also saw increases in productivity, but there were
also decreases when the translations stored in the TM did not stay close to the source.
According to Esselink (2000), CAT tools are popular in localisation because software is
updated regularly and most of the text from one release will match the text from previous
releases, and because software documentation tends to be repetitive, which means that the
potential for leveraging previously existing translations is very high.
There are also disadvantages to the use of TM, like the lack of visual context and the file
management overhead (Esselink 2000), issues of TM ownership, the existence of a learning
curve, potentially lower remuneration for translators (Bowker 2002), and the spread of
mistakes contained in the TM (Bowker 2005; Moorkens 2011). The issue of the lack of
visual context has been addressed by the tools that Reynolds (2009) calls localisation
resource editors, which can work with binaries and represent the visual context (Esselink
2000; Fernández Costales 2010).
Chapter 5 presents an evaluation of this technology in the context of crowdsourced
translation using a qualitative technique, and discusses how the advantages and downsides of
TM affect work in crowdsourcing scenarios.
2.2.3.2 Machine Translation

Machine translation has been defined as a methodology and technology used to automate
translations from one human language to another, using terminology, glossaries and
advanced grammatical, syntactic and semantic analysis techniques (Esselink 2000) or, more
simply, as software that takes sentences in one natural language as input and outputs the
corresponding sentences in another natural language. It has been said that TM is used to
support human translators while MT aims to replace them (Esselink 2000; Levitina 2011). It
is common for lower quality solutions to present themselves as alternatives to human
translators, while higher quality systems present themselves as productivity tools (Esselink
2000). As noted above, some TM systems include an MT system and others retrieve MT
translations from web services (ibid.).
By 2002 there was very limited usage of MT (DePalma 2006), and even by 2009 MT was
not widely implemented by translation agencies (Rinsche and Portera-Zanotti 2009). This
lack of penetration may be explained by pure MT output having proven practical only for
translating very controlled input (Esselink 2000). However, as of 2007, and depending on
the language, only between 3% and 30% of the Microsoft knowledge base had been
translated by humans; if an article had not been translated, an MT version was available to
users, and these translations were generally welcomed (Levitina 2011). This signals that MT
has reached a state where its unedited output can be suitable in some contexts.

Within the localisation industry, MT is mostly used in combination with TM and
post-editing by human translators, the usual process being leverage of the TM, application
of MT to the segments without matches or with low matches, and post-editing by humans
(ibid.).
2.2.3.2.1 Types of MT
There are three commonly used types of MT: rule based, corpus based and hybrid.

Rule based MT: These systems use dictionaries and syntactic rules to translate the content.
Since these rules change between languages, new rules must be created for each language
pair. Their output is usually grammatically correct, but it often requires a lot of post-editing
when the system is used without customer specific glossaries (Levitina 2011). These
systems were dominant in the 1980s (Hutchins 2005).
Corpus based MT: The two main corpus based MT approaches are statistical MT and
Example Based MT (EBMT). Through the analysis of bilingual corpora, the system creates
statistical models of equivalences that are used to translate. Thanks to the large collections
of text available on the internet and the increase in computing power, these techniques have
become more viable in recent years (Levitina 2011).
The core difference between the two approaches is the way in which their models are
created: while EBMT searches for analogous examples to create the translation, statistical
MT uses statistical correlations. Statistical MT systems can be improved using corrective
feedback; this has been done, for example, by Asia Online (Baer and Moreno 2009; Vashee
2009), one of the organisations whose process for a crowdsourced translation project is
analysed in Chapter 4.
Hybrid MT: These systems either apply rules to a statistical engine or apply statistics to
correct the output of a rule based model.
There is also a fourth approach, called interlingua, that uses an abstract language model as a
proxy between languages, so that it is only necessary to create conversions from and to that
abstract language (Hiroshi and Meiying 1993), but there is no evidence in the literature of
this approach ever having been used in the localisation industry.
Several of the platforms and projects reviewed for this thesis use MT, and as a result MT
leverage was added to the list of practices. The advantages of MT (faster throughput, no
untranslated strings and potentially higher consistency if the system has been trained with
consistent TMs) still apply, but a new set of issues linked to crowdsourcing appears, and
these are discussed in Chapter 5.

2.2.3.3 Terminology tools


Identifying equivalents for specialised terms is an important part of every localisation
project. These terms may come from the field of knowledge to which the content belongs or
be dictated by the client, as in the case of trademarks (Bowker 2002; Muegge 2007). As
localisation projects grow, the need for a unified terminology that can be used by different
translators becomes apparent (Rinsche and Portera-Zanotti 2009). A terminology setup does
not only improve language quality, reduce costs and increase efficiency in the localisation
project, but also helps during the development and writing stages (Muegge 2007; Rinsche
and Portera-Zanotti 2009; Straub and Schmitz 2010).
The first step in setting up a terminology is term extraction (Bowker 2002), which can be
done manually or with automated tools. Automated term extraction tools can use linguistic
and statistical methods: the linguistic method identifies certain part-of-speech patterns, like
noun+noun, while the statistical one looks for repeated sequences (Rinsche and
Portera-Zanotti 2009). Term extractors can be standalone or part of a CAT tool.
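As a rough illustration of the statistical method, the sketch below (an illustrative assumption
in Python, not code from any cited tool) counts repeated two-word sequences and proposes
the frequent ones as term candidates; real extractors combine this with linguistic filtering
such as the noun+noun patterns mentioned above.

    from collections import Counter
    import re

    STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "before", "new"}

    def candidate_terms(text, min_count=2):
        """Propose term candidates by counting repeated two-word sequences
        that contain no stopwords: a crude form of statistical extraction."""
        words = re.findall(r"[a-z]+", text.lower())
        bigrams = Counter(
            (w1, w2) for w1, w2 in zip(words, words[1:])
            if w1 not in STOPWORDS and w2 not in STOPWORDS
        )
        return [(" ".join(bg), n) for bg, n in bigrams.most_common()
                if n >= min_count]

    sample = ("The translation memory stores segments. "
              "Query the translation memory before translating new segments.")
    print(candidate_terms(sample))  # [('translation memory', 2)]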
Once the terms have been extracted, they must be stored (Bowker 2002). Although some
organisations store their terminology in general purpose databases, spreadsheets, websites
and text documents, in the long term this causes issues such as low efficiency and
interoperability problems (Muegge 2007; Schmitz 2001 cited by Bowker 2002). Besides the
standalone terminology tools, CAT tools and TMSs can also integrate a terminology
management module (Bowker 2002; Rinsche and Portera-Zanotti 2009). These tools store
terms in a diversity of data models, from custom databases to the XML based TermBase
eXchange standard (Melby 2008). Terms can be retrieved via manual queries of the
database or automatically by an integrated tool.
Again, terminology related technology is used in the context of crowdsourcing. Facebook,
for example, has two stages in its translation process: first the community translates the
most important terminology and then the rest of the site (Losse 2008; Mesipuu 2010;
Lenihan 2014). Besides the increased consistency derived from the use of terminology
tools, the experts interviewed for this research noticed other benefits, which are discussed in
Chapter 5.

2.2.3.4 Translation Management Systems


Translation Management Systems (TMS), also referred to as Globalization Management
Systems, are server based tools that partially automate the orchestration of business
functions, project tasks, process workflows and language technologies characteristic of
large translation projects (Sargent and DePalma 2008; Reynolds 2009; Rinsche and
Portera-Zanotti 2009; Freij 2010). This automation helps to reduce the overhead caused by
working with global teams. These tools can include features such as integration with
content and documentation management systems, TM storage, an online translation
environment, a quoting and invoicing module, a workflow system and a terminology
management module (Reynolds 2009; Levitina 2011).
It has been observed that TMSs are difficult to implement and expensive, and that most
agencies probably need a simpler, more agile solution (Reynolds 2009). Since these systems
have been developed to respond to the needs of big vendors and publishers, they do not
cater for the needs of most organisations that intend to use some types of crowdsourcing
(Morera et al. 2012). However, it could be argued that systems such as Pootle or Crowdin,
which are used for crowdsourced translation and discussed in Chapter 4 of this thesis, are
precisely specialised TMSs: although they do not have quoting and invoicing modules or
workflow systems, they do integrate many of the translation tools discussed before and
automate much of the process.

2.2.4 Localisation Levels


Several authors have proposed, or agreed with others' proposals on, levels of localisation
(Carey 1998; Brooks 2000; Pym 2004; Thayer and Kolko 2004; Zhou 2011). Pym (2004)
quotes the levels proposed by Brooks, which are:
1) Complete localisation, or adapted, which entails content and examples adapted to the
new locale.
2) Partial localisation, which entails only part of the product being localised, for
example the UI but not the documentation.
3) Enabled software, which entails the software being able to handle new locales (by
supporting their encodings and fonts).

Zhou deals specifically with video games and embraces the levels proposed by Thayer and
Kolko (2004):
1) Basic: the GUI and icons are kept as in the original and only the text is translated.
2) Complex: the GUI and icons have been adapted.
3) Blending: story and graphics are adapted.

There are some similarities between the different rankings of levels of localisation, and
Thayer and Kolko's (2004) observation that there's no plot behind Microsoft Word
indicates the main reason for the differences between them. If we assume that the
documentation has been localised, their Complex and Blending levels would meet the
Complete level in Brooks' ranking, and their Basic level would match Brooks' partial
localisation level.
Carey's (1998) ranking, depicted in Table 3, contains seven different levels, including a
level of no adaptation at the lowest end. Levels two and four would fit into Brooks' partial
localisation, and levels five to seven would fit into Brooks' complete localisation level. Of
these three models of levels of localisation, this thesis favours Brooks' because it is
applicable to content in general, without requiring a narrative as in Thayer and Kolko
(2004), and because all its levels are clear, unlike Carey's (1998), which contains an
ambiguous level, enable code, that is not explained in the paper.

Table 3 Levels of localisation adapted from Carey (1998)

Level | Carey's levels of localisation
One   | No localisation effort made.
Two   | Translate documentation and packaging only.
Three | Enable code.
Four  | Translate software menus and dialogs.
Five  | Translate online help, tutorials, and sample and readme files.
Six   | Add support for locale-specific hardware.
Seven | Customize features for locale.

Independently of the ranking system used to measure levels of localisation, Carey, Brooks
and Zhou agree that the level of localisation is chosen according to the expected return on
investment (ROI). Pym (2005) adds that tolerance of English in certain niches, such as
software development, and in certain locales, such as French speaking countries, is also a
factor that affects the level of localisation of a product.
In the context of the projects and platforms analysed in this thesis, content translated via
crowdsourcing would generally fit Thayer and Kolko's (2004) basic level and Brooks'
(2000) complete localisation level; but it is worth noting that, at least when it comes to the
localisation of a social network platform, crowdsourced translation may feel more like text
originally written in the target language than like a translation (Jiménez-Crespo 2013). From
a cultural point of view, this is indicative of a more successful localisation effort if the
purpose is to convey the impression of a locally created product.

2.2.5 Summary of the literature review for localisation


In this section we have discussed the concept of localisation; the enterprise process, which
includes translation, the aspect of localisation on which this thesis focuses; the tools
involved in the process; the different localisation levels; and how they all relate to the
practices in the crowdsourced translation paradigm.

2.3 Crowdsourcing
This section presents the results of a literature review of crowdsourcing in general and
specifically in localisation. The papers included were identified via a search for
crowdsourcing, localisation and translation in Google Scholar. Relevant candidates were
selected by title, and the selection was further refined after reading their abstracts. If a paper
was added to the selection, both the papers in its bibliography and the papers that cited it
were identified and submitted to the same selection process. Furthermore, the researcher
included papers and reports that he became aware of through conferences and industry
meetings, some of which were not indexed by Google Scholar at the time.

2.3.1 Introduction to Crowdsourcing


The term crowdsourcing is subject to multiple interpretations because it is a relatively new
concept that is still evolving. This is visible in how much of the literature is dedicated to
conceptualising it (Zhao and Zhu 2012), in how different authors use various names for it,
such as peer production, user-powered systems, community systems and mass
collaboration, among others (Doan et al. 2011), and in how other authors discuss it without
defining it (Alonso et al. 2008; Kittur et al. 2008; Huberman et al. 2009).

Table 4 Google Scholar hits per year for crowdsourcing and some crowdsourcing related terms

Year | Facebook | Twitter | Wikipedia | Open Source | YouTube | Amazon Mechanical Turk | eBay | Digg | InnoCentive
2006 |       93 |      86 |        29 |          22 |      45 |                      4 |   12 |   11 |           7
2007 |      257 |     210 |       145 |         127 |     252 |                      8 |   62 |   59 |          45
2008 |      364 |     260 |       248 |         236 |     357 |                     20 |   90 |  100 |          66
2009 |      533 |     376 |       461 |         411 |     448 |                     55 |  130 |  118 |          99
2010 |      965 |     869 |       868 |         771 |     702 |                    213 |  225 |  175 |         177
2011 |     1590 |    1450 |      1300 |        1170 |     925 |                    418 |  288 |  207 |         261
2012 |     2400 |    2018 |      1750 |        1540 |    1380 |                    655 |  315 |  252 |         296
The popularity of crowdsourcing as a research topic has continually increased since the
publication of Howe's seminal article, as illustrated by Table 4, which shows the number of
hits per year that Google Scholar produces for eight of the most popular platforms brought
up as examples of crowdsourcing, together with the open source approach to software
development, which is also frequently discussed as a type of crowdsourcing. The table is
the result of searching for the keywords intext:[[PlatformName]] intext:crowdsourcing,
which requires both keywords to appear in the text of the documents retrieved. Although
Google search counts are not trustworthy data for information critical research, the table
illustrates the rise in popularity of crowdsourcing and of several platforms as research
topics.
2.3.1.1 A Definition of Crowdsourcing
As stated before, much research on crowdsourcing is dedicated to conceptualising it, and
part of that conceptualisation is defining it. The most popular definition of crowdsourcing is
the one given by Howe in 2006 in an article for Wired magazine (Howe 2006b), which
refers to it as the practice of leveraging communities to carry out tasks that were
traditionally carried out by professionals. Across the literature, this definition has been
paraphrased (Brabham 2008a; Kleemann et al. 2008; Oprea et al. 2009; Horton and Chilton
2010) and slightly modified, for example as the use of an Internet-scale community to
outsource a task (Yang et al. 2008).


However, in contrast with these general definitions, other authors have focused on specific
aspects of the tasks. For example, Heer and Bostock (2010) focused on the size of the task
and defined crowdsourcing as a process where web workers complete one or more small
tasks, often for micro-payments on the order of $0.01 to $0.10 per task. Brabham's (2008b)
definition, which according to Estellés-Arolas and González-Ladrón-de-Guevara (2012) is
part of the most cited academic paper about crowdsourcing, focuses on the problem solving
aspect of crowdsourcing, stating that crowdsourcing is an online, distributed
problem-solving and production model. The impact of this definition can be seen in authors
such as Doan et al. (2011), from whose paper the following definition can be collated:
crowdsourcing is a general-purpose problem-solving method that enlists a crowd of humans
to help solve a given problem.
That crowdsourcing is a general purpose practice is underlined by the diversity of
companies that have used it or are using it and the fields in which their activities fit. In his
article, Howe (2006) talks about how crowdsourcing was being used in fields such as
photography (iStockPhoto), software development (open source software), video (Web
Junk 20), problem solving (InnoCentive), knowledge collection (Wikipedia) and micro-task
execution (Amazon's Mechanical Turk). It has also been used in data analysis (Humangrid),
map creation (OpenStreetMap), to complement Optical Character Recognition (OCR)
(ReCaptcha), in design (Wilogo) (Schenk and Guittard 2009) and in funding (Kickstarter)
(Howe 2008).

However, if problem solving is understood as the process of finding solutions or, as defined
by D'Zurilla and Goldfried (1971), as a process that makes available effective responses to
deal with problematic situations and increases the probability of selecting the most
effective response to those situations, crowdfunding would not fit Brabham's definition.
This is because in crowdfunding the problem is always insufficient funding and the solution
is always more funding; the crowd is not finding a solution, but enabling it.
With the intention of characterising crowdsourcing in order to define it, it has been argued
that its core characteristic is the open call, meaning that it is not limited to preselected
candidates (Howe 2006b; Schenk and Guittard 2009). However, there are organisations that
refer to what they do as crowdsourcing even though they preselect contributors, like Kiva,
which uses volunteers who have undergone training, or IBM, which used its employees
(DePalma and Kelly 2011). In fact, Geiger, Seedorf, et al. (2011) turned this preselection
(or lack thereof) of the contributors into one of the dimensions of their taxonomy.

Schenk and Guittard (2009) did, however, find three elements that always appear in
crowdsourcing independently of the type of activity. These three elements, the first explicit
collection of characteristics of crowdsourcing found in this literature review, are the
collaborators, who carry out the tasks; the organisations, which benefit from the activity of
the collaborators; and the platforms, which enable the collaborators. Later, Estellés-Arolas
and González-Ladrón-de-Guevara (2012) identified eight defining characteristics of
crowdsourcing that must be addressed in a general definition of crowdsourcing. To identify
them, they collected forty definitions from 209 papers from different fields. The
characteristics are as follows:
1) The people who form the crowd.
2) What the people have to do.
3) What the people get from the process.
4) The initiator.
5) What the initiator gets from the process.
6) The type of process.
7) The type of call.
8) The medium in which crowdsourcing happens.
As a result of this collection of defining characteristics, they present the following
definition:

Crowdsourcing is a type of participative online activity in which an individual, an
institution, a non-profit organisation, or company proposes to a group of individuals of
varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary
undertaking of a task. The undertaking of the task, of variable complexity and modularity,
and in which the crowd should participate bringing their work, money, knowledge and/or
experience, always entails mutual benefit. The user will receive the satisfaction of a given
type of need, be it economic, social recognition, self-esteem, or the development of
individual skills, while the crowdsourcer will obtain and utilize to their advantage that what
the user has brought to the venture, whose form will depend on the type of activity
undertaken. (Estellés-Arolas and González-Ladrón-de-Guevara 2012)

To validate the definition, Estellés-Arolas and González-Ladrón-de-Guevara qualified
eleven internet activities using it and found that some, for example Wikipedia, fit the
definition while others, such as Delicious and Flickr, did not. The fact that Delicious and
Flickr do not fit it is noteworthy, since they are commonly cited examples of
crowdsourcing. Because of its fitness for purpose and the methodological rigour behind it,
this thesis follows the definition proposed by Estellés-Arolas and
González-Ladrón-de-Guevara (2012) when discussing crowdsourcing.

2.3.2 Crowdsourcing in Action


The Internet has opened the doors to large scale collaboration between unaffiliated
individuals. This kind of collaboration has happened in projects that do not pursue financial
objectives, like Wikipedia (Anthony et al. 2005) and open source software like Linux, and
also in services that involve a financial reward for collaborators, as is the case with
Threadless, iStockphoto and InnoCentive (Brabham 2008b). This kind of collaboration has
also been used for localisation. Especially notable is the case of Facebook, which localised
its whole user interface into French in less than 48 hours by letting its users do the
translation (Losse 2008), but there have been many other cases of companies leveraging the
community:

- Google used to have a program called Google In Your Language, no longer open, that
allowed volunteers to translate the Google search UI (Scannel 2012).
- Microsoft allows users to suggest translations in one of their MSDN support pages
(Curran et al. 2009).
- Kiva translates the loan applications of non-English speakers by using volunteers who
have passed some tests and undergone Kiva specific training (Meer and Rigbi 2011).
- Twitter has been translated into eleven languages by a volunteer workforce of two
hundred thousand (Twitter 2011).
- TED allows volunteers to translate the talks given at TED events (Curran et al. 2009).
- DotSub allows volunteers to translate subtitles (Clark and Aufderheide 2009).
- Second Life was translated by volunteers, who managed terminology and tested the
localised version (Cronin 2010).

There are also cases of communities spontaneously deciding to carry out translation work
for free:

- ROMhacking, the modification by fans of classic videogames to, among other things,
localise them (Sánchez 2009).
- Scanlation, the scanning and translation by fans of mainly Japanese comics (O'Hagan
2009).
- Fansubs, the addition of subtitles to video products. This started among Japanese
cartoon fans (Hatcher 2005) but has spread to other content.

Many of these spontaneous translations done by the community are in a legal grey area
(O'Hagan 2009), which has led owners of the original content to threaten legal action
against ROMhacking communities (Sánchez 2009). However, this thesis does not deal with
the legal issues surrounding those practices.
2.3.2.1 Drivers for Crowdsourcing in Localisation
The motivation for participating in this type of collaboration varies; it is a complex issue
that can be approached from the point of view of numerous psychological theories (Borst
2010), and discussing them in detail goes beyond the scope of this thesis. In the case of
what Ray and Kelly (2011) call a cause driven model, where volunteers decide to take part
in an unpaid crowdsourcing project, they do so because they have some interest in the
project itself. This means that there is a new criterion that determines whether content will
be localised: the existence of a community with enough interest in localising it. In contrast
with this approach, where content is translated because a community decides to do it,
mainstream localisation mirrors the Web 1.0 model, where companies generate most of the
content and decide what is localised according to their interests. The arrival of Web 2.0 has
resulted in users generating content too, and also deciding to localise some content by
taking the task upon themselves. Hence, it is possible to say that mainstream localisation is
market driven, while the crowdsourcing model can be community driven (Frimannsson
2011).
2.3.2.2 Crowdsourcing and User Generated Content
The Internet has become the main means of distribution of digital content, with a size that
was estimated at approximately 5 million terabytes in 2005 (Zahariadis et al. 2011).
According to Baym (2011), initially most internet content was user generated and only later
did professional content become relevant; web 2.0 technologies have since enabled anyone
to publish content on the Internet with relative ease (Janev and Vranes 2009).

At least in the Eurostat countries, the percentage of people adding content to the internet is
growing (Eurostat 2011), as seen in Figure 2.

Figure 2 Percentage of Internet users that add content according to Eurostat

The content created by these unaffiliated users is known as User Generated Content (UGC).
UGC can be generated by local communities, such as the ones observed by Bruns and
Bahnisch (2009), and have a local slant, as with craigslist posts (Allen 2010) and Flickr
pictures, 47 percent of which are uploaded by people living within 100 km of the place
where they were taken (Hecht and Gergle 2010). UGC may also have a universal appeal
that is independent of where it is created and not bound by geographical parameters; in that
case, the content may not need adaptation. Videos that show cats are an example of UGC
that needs no adaptation (Markoff 2012). Whether its appeal is universal or local, however,
much of this user content is bound by the language and culture in which it has been created.
Without localisation, most content is limited to its language silo, and this is especially true
of UGC, which spreads without the support of traditional marketing strategies.
Generally there is no motivation for paying organisations, be they governments or
companies, to localise UGC, since it is not initially part of their strategy. Furthermore, at
least in the case of YouTube videos, the popularity of UGC is more ephemeral and behaves
much less predictably than that of traditionally generated content (Meeyoung et al. 2007),
and investing in localising it may not bring ROI even when it aligns with the agenda of a
government or company. For these reasons, crowdsourcing, together with MT, has been
suggested as one of the ways to tackle this huge volume of content that cannot be localised
following traditional LSP practices because of the cost (van Genabith 2009).
2.3.2.3 A diversity of Practices
As stated above, the crowdsourcing approach has been successfully used in multiple
contexts. The companies using it have developed practices that work for their content and
their community, but that may not be amenable to other types of content or to communities
with a different idiosyncrasy. As an example, it is not known whether a Facebook user who
has dealt with short UI strings with variables would be comfortable tackling a long TED
talk. In the same manner, a user accustomed to working with long documents may be
uncomfortable dealing with the variables and code snippets that appear in UI strings.
Open source projects have developed their own practices, which overlap with some of the
practices used by companies that use crowdsourced translation mainly as a marketing tool.
One difference observed in this research is that at least Pootle and Launchpad Translations,
two of the systems used in open source localisation, allow community members to
download segmented files to work on them offline, while the systems of companies like
Facebook and Asia Online do not support this.
User communities working on scanlations, fansubs and ROMhacking have also developed
their own tools, which can be as advanced as the professional ones (O'Hagan 2009), and
they have probably developed their own practices, but their work falls out of scope for this
thesis.

2.3.3 Crowdsourcing Classifications


Researchers have tried to conceptualise crowdsourcing by creating taxonomies, given that
taxonomies are tools that can help in understanding the similarities and differences among
the classes they contain (Nickerson et al. 2009). Nickerson et al. (ibid.) define a taxonomy
as a set of dimensions, each consisting of characteristics that are mutually exclusive (if the
value of a dimension is X, it cannot be Y at the same time) and collectively exhaustive (if
the values for a dimension are X and Y, there cannot be a value Z). These features, mutual
exclusivity and exhaustivity, are also mentioned by Bailey (1994, page 3) in his book about
taxonomy development.

It has been observed that many different types of activity fall under the label
crowdsourcing (Howe 2008; Schenk and Guittard 2009): small tasks carried out
individually by collaborators whose work may or may not be pooled, as in Amazon's
Mechanical Turk; bigger tasks that are solved collaboratively by a number of members of
the crowd, as happens with open source development or the writing of a Wikipedia entry;
and bigger tasks that are solved by individuals in the crowd, like the InnoCentive
challenges (Schenk and Guittard 2009). Looking for parallels in crowdsourced translation:
Facebook pools many translations of small TUs done by its users in order to obtain a full
translation; DotSub enables its users to collaborate in a manner that is very similar to the
way Wikipedia editing works; and volunteers working for The Rosetta Foundation often
work as individuals. The diversity of these scenarios means that not all the practices that
this thesis collects are applicable to all types of crowdsourced translation processes.
There are numerous taxonomies of crowdsourcing, as stated in an article by Geiger,
Seedorf, et al. (2011) that evaluates 13 of them. The existing taxonomies use differing
dimensions to formalise classes, ranging from who initiates the crowdsourcing effort to the
kind of activity carried out, passing through remuneration and the pooling or selection of
contributions (Howe 2008; Doan et al. 2011; Schenk and Guittard 2009). These criteria
reflect the point of view of the scholars designing each classification. Geiger, Seedorf, et al.
(2011) do not explain how they identified the dimensions and characteristics that they
attribute to the taxonomies they evaluate whose original papers do not explicitly name
dimensions and characteristics; however, they refer to the work of Nickerson et al. (2009)
and Bailey (1994).
Nickerson et al. (2009) point out that to create a taxonomy one must select a
meta-characteristic that will be the basis of the classification, but that there is no tried and
proven method for selecting this meta-characteristic. Bailey (1994) presents a similar view
when he states that theoretical guidance and prior knowledge are required to identify the
right characteristic. Nickerson et al. (2009) further elaborate by saying that in order to
identify dimensions and characteristics, the researcher must observe a collection of the
objects that are going to be classified and, with the meta-characteristic in mind, identify
their similarities and differences. These similarities and differences are then grouped using
the deductive conceptualisation of the researcher.
With these thoughts in mind, Geiger, Seedorf et al. (2011) may have looked at the existing
taxonomies and reverse engineered them for their paper. They do not discuss the usefulness
of the taxonomies that they break down; however, Nickerson et al. (2009) presented criteria
for judging whether a taxonomy is useful. According to them, a useful taxonomy is one that
is:

- Concise, meaning that it has a limited number of dimensions and of characteristics in
each dimension.
- Inclusive, meaning that it contains enough dimensions and characteristics to be of
interest. To illustrate this point, they state that a taxonomy with one dimension and two
characteristics is not of interest.
- Comprehensive, meaning that it can be used to classify all the objects in the domain of
the taxonomy.
- Extendible, meaning that it allows for new dimensions and characteristics to be added
when new objects appear.

This thesis discusses three classifications. Two of them were not covered by Geiger et al.
(2011); they are submitted to the same reverse engineering and evaluated using the criteria
for usefulness suggested by Nickerson et al. (2009). The third is the taxonomy developed
by Geiger et al. (2011), which was built following the method suggested by Nickerson et al.
(2009) and upon which the taxonomy in Chapter 3 is constructed in order to gain a deeper
understanding of crowdsourced translation processes and of how types of processes differ
from one another.
2.3.3.1 Howe's classification
In his classification, Howe (2008) discussed four categories of crowdsourcing:

- Crowd wisdom or collective intelligence: taking advantage of diversity by collecting
ideas from the crowd.
- Crowd creation or user-generated content: collecting the crowd's creative output, like
videos, photos, designs, etc.
- Crowd voting: collecting votes from the crowd.
- Crowdfunding: collecting money from the crowd.

According to Nickerson et al.'s (2009) description, Howe's four categories are not a formal
taxonomy, given that there are activities where voting is combined with creation and, as a
result, the characteristics are not mutually exclusive. Furthermore, even if the dimensions
and characteristics were made explicit, Howe's classification would not fit Nickerson's
definition of useful, since it would have only one dimension, what the initiator obtains from
the crowd, and four characteristics:
what the initiator obtains from the crowd, and four characteristics:

- Ideas, for crowd wisdom.
- Artistic content, for crowd creation.
- Evaluations, for crowd voting.
- Money, for crowdfunding.

Having a single dimension prevents the classification from meeting the inclusiveness
criterion and, by not considering activities like translation that do not fit in any of the four
categories, it also fails to meet the comprehensiveness criterion.
2.3.3.2 Classification from the information system perspective
Geiger, Rosemann, et al. (2011) classified crowdsourcing processes from the information
systems perspective. They observe that information systems have evolved over time, with
older systems being intra-organisational and having very little flexibility compared to
newer systems, such as workflow systems, which are inter-organisational and were created
to be more adaptive and responsive. According to them, the older systems focus on
transactions and predictable behaviour and are restrictive in how they interact with the
environment. Social media and crowdsourcing create a new paradigm, where interactions
are less focused on transactions and prediction, and where the external actors are unknown
and can scale up greatly.
According to Geiger, Rosemann et al. (2011), systems can be organised according to the
type of inputs they take in and the way those inputs add value. Considering the type of
inputs, on the one hand there are systems that treat all external inputs as homogeneous: all
inputs are considered equal in value by the system, which analyses them using quantitative
methods. These systems often pool and count the inputs. An example of such a system is
reCaptcha, which uses humans to recognise characters that an automatic Optical Character
Recognition (OCR) system cannot recognise. On the other hand, there are systems that treat
inputs as heterogeneous: inputs are given values according to given criteria, and the system
uses qualitative methods to analyse them. In this case the inputs are often selected, as can
be seen in InnoCentive and design platforms. Bearing this in mind, it is possible to claim
that in Geiger, Rosemann et al.'s taxonomy the type of input can be considered a dimension
whose characteristics are the dichotomy homogeneous/heterogeneous.

Considering how inputs add value, on the one hand there are systems where contributions
add value individually. This happens when the system can use pre-existing criteria to judge
whether an input is right; for example, the Netflix challenge, which collected algorithms
that performed 10% better than an existing one, used such a system. On the other hand,
there are systems where contributions add value collectively; these systems analyse the
emerging properties of all the contributions, which is useful when there is no single correct
answer, for example when crowdsourcing is used for evaluation. Bearing this in mind, it is
possible to claim that in this taxonomy the way in which contributions add value is a
dimension whose characteristics are individually and collectively.
Using these two dimensions with two characteristics each, Geiger, Rosemann et al. define
four types of crowdsourcing information systems, which are summarised below together
with examples:
2.3.3.2.1 Crowd processing (homogeneous and individual)
These systems use a large quantity of homogeneous inputs without aiming for an emerging
property, i.e. a system characteristic that goes beyond the sum of all elements (Geiger,
Rosemann, et al. 2011). The contributions are independent and independently evaluated.
Example: reCaptcha. These systems exploit human brainpower to carry out tasks that
computers cannot easily do; the combination of human and computer activity is
fundamental to them.
2.3.3.2.2 Crowd rating (homogeneous and collective)
The system uses homogeneous inputs, analysing them to extract the wisdom of the crowd,
which is an emerging property. Lévy (1997) defines collective intelligence, another of the
names used for the wisdom of the crowd, as a form of universally distributed intelligence,
constantly enhanced, coordinated in real time, and resulting in the effective mobilization of
skills. The reviews used by eBay's reputation system are an example of this kind of system.
2.3.3.2.3 Crowd solving (heterogeneous and individual)
The system uses heterogeneous inputs as parts of the solution to a problem. Given that the
properties of the solution are known, the contributions can be assessed for correctness. The
system used for the Netflix Prize is an example of this kind of system: Netflix requested
recommendation algorithms that would improve the performance of its current algorithm
by ten percent.
2.3.3.2.4 Crowd creation (heterogeneous and collective)
These systems use heterogeneous inputs whose value can only be assessed in relation to
each other. Crowd creation systems do not aim to solve any specific problem. According to
Geiger, Rosemann et al., Wikipedia is an example of this kind of system.
Besides the four types of system above, the authors say that it is possible for a system to
offer a combination, as in the case of Threadless, which combines crowd creation and
crowd rating. Another difference between these systems that the authors note is that
systems using homogeneous inputs will integrate all inputs in order to obtain a solution, in
what the authors call integrative aggregation, while those using heterogeneous inputs can
use both integrative aggregation, when the different contributions complement each other,
and selective aggregation, when only inputs meeting certain criteria become part of the
solution.
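Because the four classes are defined by exactly two dimensions with two characteristics
each, they can be rendered as a simple lookup structure. The sketch below (an illustrative
assumption in Python, not part of the cited paper) maps the (type of input, how
contributions add value) pairs to the class names and examples given above:

    # The 2x2 classification of Geiger, Rosemann, et al (2011), keyed by
    # (type of input, how contributions add value), with the examples above.
    CLASSES = {
        ("homogeneous", "individual"):   ("crowd processing", "reCaptcha"),
        ("homogeneous", "collective"):   ("crowd rating", "eBay reviews"),
        ("heterogeneous", "individual"): ("crowd solving", "Netflix Prize"),
        ("heterogeneous", "collective"): ("crowd creation", "Wikipedia"),
    }

    def classify(input_type, value_mode):
        name, example = CLASSES[(input_type, value_mode)]
        return name + " (e.g. " + example + ")"

    print(classify("heterogeneous", "individual"))  # crowd solving (e.g. Netflix Prize)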

This classification does not fit the definition of a useful taxonomy given by Nickerson et al.
(2009), because a system can be both a crowd creation and a crowd solving system, or use
both integrative and selective aggregation, which implies that the classes as stated are not
mutually exclusive. It would be possible to create a derivative taxonomy by combining all
the existing possibilities, but even then this taxonomy would fail the usefulness principle
by not being comprehensive enough. Activities such as crowdfunding, where the inputs are
homogeneous and pooled without aiming to extract an emergent quality, to do any kind of
processing or to assess anything, would not fit in any category that this taxonomy can
produce.
2.3.3.3 Taxonomy Based on Parameters that can be Affected by the Initiator
Geiger, Seedorf et al. (2011) developed a taxonomy that follows the definition given by
Nickerson et al. (2009). The taxonomy was developed after analysing previously existing
classifications of crowdsourcing. They wanted their taxonomy to be useful to someone who
wishes to apply crowdsourcing to a process but needs to decide on specifics, and who also
needs to be able to distinguish between crowdsourcing processes. With this in mind, they
determined that the meta-characteristic for their crowdsourcing taxonomy would be the
mechanisms that impact the process and can be controlled by the crowdsourcer.


The headers of Tables 5 to 8 below give the dimensions of the taxonomy; each table lists
the characteristics of that dimension and their meanings.

Table 5 Preselection of Contributors from Geiger, Seedorf et al (2011)

Preselection of Contributors
- Qualification-based: the contributors prove their skill beforehand.
- Context-specific: the contributors meet certain context conditions (being a customer,
being an employee).
- Both: context and qualifications are taken into account.
- None: anyone can contribute.

Table 6 Aggregation of Contributions from Geiger, Seedorf et al (2011)

Aggregation of Contributions
- Integrative: the complementary input from the crowd is pooled.
- Selective: individual contributions are compared and some are selected.

Table 7 Remuneration of Contributions from Geiger, Seedorf et al (2011)

Remuneration for Contributions
- Fixed: all contributions receive a payment.
- Success-based: only selected contributions receive a payment.
- No remuneration: no payment is given.

Table 8 Accessibility of peer contributions from Geiger, Seedorf et al (2011)

Accessibility of Peer Contributions
- None: contributors are isolated and cannot view, reuse, complement or interact with the
contributions of others. This could be for privacy reasons, to ensure diversity (avoiding
groupthink) or because it is not necessary.
- View: contributions are visible to any potential contributor.
- Assess: there are systems for contributors to express their opinion on individual
contributions.
- Modify: contributors in such processes can alter or even delete each other's
contributions.

There are 96 theoretically possible combinations of dimensions and characteristics.
However, some combinations, although theoretically possible, will not appear in real cases.
For example, a system where contributions are selected, rewarded only in case of success,
and where collaborators can modify the work of other collaborators is theoretically
possible, but issues with collaborators profiting from the work of others make it unlikely
ever to be implemented.
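The figure of 96 follows directly from the four dimensions summarised in Tables 5 to 8:
four preselection options, two aggregation options, three remuneration options and four
accessibility options, i.e. 4 x 2 x 3 x 4 = 96. A quick enumeration sketch in Python
(illustrative only) confirms the count and can flag combinations such as the unlikely one
described above:

    from itertools import product

    # Dimensions and characteristics from Geiger, Seedorf et al (2011),
    # as summarised in Tables 5 to 8.
    preselection  = ["qualification-based", "context-specific", "both", "none"]
    aggregation   = ["integrative", "selective"]
    remuneration  = ["fixed", "success-based", "none"]
    accessibility = ["none", "view", "assess", "modify"]

    combinations = list(product(preselection, aggregation,
                                remuneration, accessibility))
    print(len(combinations))  # 96 theoretically possible process types

    # The unlikely example from the text: selective aggregation, success-based
    # remuneration and contributors able to modify each other's work.
    print(("none", "selective", "success-based", "modify") in combinations)  # True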

2.3.4 Matching the Taxonomy to the Definition


It is possible to map the dimensions of a taxonomy to the defining characteristics of a
definition. In this case, the four dimensions discussed by Geiger, Seedorf et al. (2011) are
covered by the definition given by Estellés-Arolas and González-Ladrón-de-Guevara
(2012) that was presented in section 2.3.1.1:
1) The preselection of contributors is part of the type of call and of who forms the
crowd.
2) The accessibility, remuneration and aggregation of contributions are affected by the
type of process.
3) The remuneration for contributions addresses one of the aspects of the compensation
for the contributors.
However, not all the defining characteristics identified by Estellés-Arolas and
González-Ladrón-de-Guevara are fully reflected in the taxonomy:
1) The task. Since any non-trivial task is amenable to crowdsourcing (Estellés-Arolas
and González-Ladrón-de-Guevara 2012), making this a dimension would make the
taxonomy too complex, since a potentially infinite number of classes would emerge,
and would limit its usefulness, since the taxonomy would no longer meet the
conciseness criterion proposed by Nickerson et al. (2009).
2) The benefit for the collaborators. Geiger, Seedorf et al. (2011) only address
monetary compensation. The potential diversity of benefits could result in too
complex a taxonomy. Furthermore, many of these benefits can be psychological,
which falls out of scope for this thesis.
3) The crowdsourcer. Geiger, Seedorf et al.'s (ibid.) taxonomy does not address this.
Considering the options proposed by Estellés-Arolas and
González-Ladrón-de-Guevara (2012), an individual, an institution, a non-profit
organisation, or a company could become the characteristics of a new dimension,
the crowdsourcer or initiator. This dimension would be useful, since the type of
processes a for-profit company uses is likely to differ from the type of processes an
NGO would use.
4) The benefit for the crowdsourcer. Geiger, Seedorf et al.'s (2011) taxonomy does not
address this. As with the benefit for the collaborators, a dimension that captures all
the possible benefits for crowdsourcers would potentially be too diverse and result
in a taxonomy that is too complex.
5) The medium. Since it is always the internet, it does not make a useful dimension for
the taxonomy.
On the one hand, the taxonomy developed by Geiger et al. covers three of the defining
characteristics identified by Estellés-Arolas and González-Ladrón-de-Guevara (2012): the
people who form the crowd, the type of call and the type of process. On the other hand, it
does not address the initiator, which could be a useful dimension. Finally, it does not
address four characteristics that would either not add value (the medium) or would add too
much complexity (the benefit for the collaborators, the benefit for the crowdsourcer and the
task).

2.3.5 Crowdsourcing in Localisation


As is the case for crowdsourcing outside the localisation field, crowdsourcing in the localisation field has been approached by different authors in different ways. For example, Curran et al (2009) and O'Brien and Schäler (2010) talk about crowdsourcing without defining it. Other authors, like Exton et al (2009) and O'Brien (2011), use the general definition provided in Wired Magazine (Howe 2006b) and provide additional details according to the process they discuss. Some authors focus on one aspect, such as the translation being free, with statements such as "from a corporate or NGO point of view, Crowd-Sourcing is attractive as the crowd works for free" (van Genabith 2009), or the discarding of the term crowdsourcing in favour of the term volunteer translation because, from their point of view, the fundamental difference at stake is the monetary payment (Pym 2011a). Other authors focus on the community aspect, like Lewis et al (2009), who state that in crowdsourced translation the content is distributed for translation to a group of bilingual individuals engaged in a community around the product or service being localised. Anastasiou and Gupta (2011) include in their definition two actors, the outsourcer and a large group of people, as well as the medium, the Internet, and, without explicitly naming it, make a reference to the call for volunteers. They do not discuss payment but bring up the "willingness of the crowd", which in context indicates the assumption that the work will be done without payment. The richness of features of their definition brings them closer to covering the eight defining characteristics identified by Estellés-Arolas and González-Ladrón-de-Guevara (2012).

DePalma and Kelly (2011) observed that crowdsourcing in localisation


opens a translation project to teams comprised of any mix of volunteer translators,
employees, contractors, or language service providers. It leverages the power of the
swarm [...]. [It] requires both technology and a business process to induct ad hoc
resources and then manage them as users within a collaborative online environment. In
some cases, it refers to the business strategy of eliciting volunteer or paid labor from
external, commercial resources

An evaluation of this definition using Estellés-Arolas and González-Ladrón-de-Guevara's (2012) eight defining characteristics reveals that DePalma and Kelly address the people who form the crowd, the medium and, to some degree, what the collaborators get. Their definition does not address the initiator, what the initiator gets from the process, the type of process, the type of call or what the crowd has to do.
All the definitions referred to above have issues that fall into two categories:

1) Lack of accuracy: When one or more of the defining characteristics of crowdsourcing


is not addressed, for example, payment in Anastasiou and Gupta (2011).
2) Excessive accuracy: When a defining characteristic becomes overconstraining, for
example, the activity being of a collaborative nature (DePalma and Kelly 2011),
which would indicate that crowdsourced projects where translators do not collaborate
are not crowdsourcing.
In some cases the excessive accuracy is justified because the authors are not using the term crowdsourced translation, but one of its alternatives. The existence of these alternatives, to the extent that they can be used interchangeably with the term crowdsourced translation, is characteristic of a new field, as was the case with crowdsourcing in general. Some terms that are used as synonyms for crowdsourced translation and localisation are community translation, user-generated translation, volunteer translation and collaborative translation (O'Hagan 2011).

O'Hagan (ibid.), who has carried out research on fansubs and ROM hacking localisation (Mangiron and O'Hagan 2006; O'Hagan 2009), prefers the term community translation, which was already used for these fan translations and FLOSS translations at the time when the term crowdsourcing started to gain traction (ibid.). Specifically, the fan translation phenomenon started in the 1990s (Leonard 2004), giving it a relatively long history before the emergence of the term crowdsourced translation. Community translation is particularly appropriate for processes where social interaction among contributors is an integral part of the process (O'Hagan 2011). However, it has been noticed that there are community processes where individuals can work together without ever communicating with each other (Désilets 2007). There are also communities where members may not know and trust each other (Désilets and van der Meer 2011). Furthermore, there are even processes where the lack of communication among individuals is an intended feature, as in TXTEagle (Eagle 2009) and AsiaOnline's Wikipedia project (Vashee 2009). It is for these reasons that this thesis considers community translation under the umbrella term crowdsourced translation.

Pym (2011b) favours the term volunteer translation and underlines the pro bono aspect of the translation. Although this applies to initiatives like Kiva or The Rosetta Foundation, which have a political agenda, contributors in projects using AMT, TXTEagle, Ryugakusei Network or Crowdflower are paid. This makes the term, and the emphasis on pro bono translations, applicable only to specific types of crowdsourced translation. It is for these reasons that this thesis considers volunteer translation to be covered by the term crowdsourced translation.
Cronin (2010) argues that with a new medium, a new type of translation will appear: before writing there was only interpretation; once there were written texts, translation appeared; later on, the inclusion of recorded dialogue in film eventually created the need for audiovisual translation. A similar argument is proposed by Désilets (2007) when he states that massive collaboration must have an impact not only on how content is created, but also on how the same content is translated. These views establish a parallelism between user-generated content and user-generated translation; however, once more, projects using platforms that pay for contributions do not match the self-initiated aspect of translation that is part of user-generated content. It is for this reason that this thesis considers user-generated translation to be covered by crowdsourced translation.
Some examples of organisations using or enabling crowdsourced translation are Second Life, where volunteers translated and also managed the terminology (Cronin 2010), Facebook (Losse 2008), Kiva (Baer and Moreno 2009), dotSub (Désilets 2007), and Yeeyan, Minna no honyaku and Ryugakusei Network (Kageura et al. 2011), where collaborators contribute mainly translations. There are also organisations focusing on the creation of resources such as TMs, for example VLTM and MyMemory, or TDs, in the case of Wiktionary, the ProZ glossaries and OmegaWiki, where the collaborations take the shape of shared TMs or additions to glossaries (Désilets 2007). Although the crowdsourcing of linguistic resources is also an interesting avenue for research, as stated in Chapter 1 the focus of this thesis is on crowdsourced translation.

2.3.6 Taxonomies within Localisation


During the planning stages of this research it was decided that a taxonomy would have to be developed in order to gain a deeper understanding of existing crowdsourced translation processes; for this reason, the literature review contains an overview of the relevant pre-existing taxonomies. Only three were found: one proposed by Bey et al (2006) and two by Ray and Kelly (2011).


Bey et al (2006) talk about mission-oriented translator communities and subject-oriented translator network communities.

Mission-oriented communities: These communities are strongly coordinated and translate specific documents related to their mission. According to Bey et al, many of these communities translate technical documentation. They illustrate this type of community using the translation of documentation for Linux, W3C and Mozilla as examples.

Subject-oriented communities: In these communities, individual translators translate news, analyses and reports and publish them on personal or group web pages. According to Bey et al, these translators do not have a specific mission, but they share opinions that convey a political stance. They bring up TeanotWar as an example but, at the date of this research, that organisation could no longer be found. However, Yeeyan would be a suitable example of such a community.

The first classification proposed by Ray and Kelly describes three models:
Cause driven: The crowds involved in this model are not paid and are often connected to humanitarian causes (Kiva), but also to personal interests like Japanese animation (fansubbing) or international news in China (Yeeyan).
Product driven: This model involves a for-profit company approaching and managing a crowd. Often the members will have to meet specific requirements before they are allowed to participate. Adobe has used this model to localise its site dedicated to developers (Ray and Kelly 2011), and Symantec has used volunteers to translate, review and make terminology suggestions (Symantec 2011). The most famous case is Facebook, whose volunteers do not have to match a specific profile and do not receive free products or merchandise (Losse 2008), but do appear on leader boards and unlock badge-style awards (Kwan 2009). Facebook's approach is similar to the approach used in Launchpad, a platform used for Free and Open Source Software (FOSS) that contains projects with and without commercial support (Cedeno 2010).
Outsourcing driven: This approach is also called micro working (Fort et al. 2011). This model uses outsourcing portals, and the people carrying out the translation are paid. There are a number of start-ups that offer facilities for companies willing to use this model. In the localisation field, companies like CrowdCloud/Serv.io (Ipeirotis and Horton 2011) and OneHourTranslations (Ray and Kelly 2011) offer translation services using this model. In academia this approach has been used to create MT evaluation sets (Bloodgood and Callison-Burch 2010).
As the motivation is money, the contributors are satisfying the need for safety, which is the second level of Maslow's hierarchy (Jordan 2002). Since money is a factor, budget becomes a defining attribute of these projects.
This classification has two dimensions: the initiator and payment. The characteristics for the initiator would be crowdsourced translation provider, publisher or non-commercial entity. The characteristics for payment would be none, goods or money. Having two dimensions with three characteristics each allows for potentially nine different classes, but the authors only describe three. Cause driven would have non-commercial as the initiator and no payment. Product driven would have publisher as the initiator and goods or no payment. Outsourcing driven would have crowdsourced translation provider as the initiator and money as payment. Considering the approach proposed by Nickerson et al (2009), product driven is a problematic category, since goods and no payment are both included. This means that the classification does not meet Nickerson et al's requirements to be a taxonomy.
Dolmaya (2011) used this taxonomy, but modified the meaning of the classes: cause-driven initiatives are focused on humanitarian projects, with no reference to personal interests; outsourcing-driven initiatives are those run by for-profit companies; and product-driven initiatives are those related to open source software.
Ray and Kelly (2011) also talk about four types of platforms used for crowdsourced translation. This second classification is as follows:
Wiki: Uses wiki technology. It is the most basic type, difficult to scale beyond a few languages and a limited number of products or services. It is simple to implement because it only requires moderators and wiki software.
Database-lite: The publisher provides a UI where the source strings appear, with a field for the participants to write their translations. According to Ray and Kelly (ibid.), the lack of context is not an issue because participants are expected to know the product in depth. Publishers using this approach usually have someone in-house revise the translations or send them to an LSP. This model makes tracking projects much easier and is still easy to build and maintain.
Custom built dashboard: In this model the companies provide a UI that often offers some manner of context and sometimes a link to the part of the source where the string appears. Companies using this model often display previously used translations for the string, but there is no proper TM, terminology or MT integration and no workflow.
Collaborative translation tool: This can be implemented in-house or using off-the-shelf solutions. These often include terminology features, TM, MT and an API for the integration of other components, including CMSs. The disadvantage is the cost, whether in fees or in upfront cost plus maintenance. The main advantage of these systems is scalability.

Although all three localisation-related classifications address relevant points, none of them was suitable as a base for a taxonomy that takes into account the criteria of usefulness proposed by Nickerson et al (2009).

2.3.7 Benefits of Crowdsourcing in the Context of Localisation


Many benefits have been attributed to using crowdsourced translation. For example, by using crowdsourcing in their localisation projects companies increase their reach by adding languages (Tsang n.d.), especially long tail languages (Désilets and van der Meer 2011). In some cases making these additional languages available would not be viable through traditional processes, because the market forces that they generate are not strong enough (Filip 2012) or the product addresses a niche too small (Désilets and van der Meer 2011) to justify the investment. In her paper, Dolmaya (2011) also talks about long tail languages being supported by the crowd, but notices that, even in crowdsourcing initiatives, there are language tiers. To illustrate this she points out how the crowdsourced translation project of Hootsuite has resulted in an almost complete translation into French, Spanish and Japanese, among others, while the translation into languages such as Polish and Chinese has seen very little progress. Scannell (2012) goes further and notices that Facebook has not accepted any requests to add new languages since 2009, leaving many language communities without a server-side localised version of Facebook. This indicates that, even when the translation can be obtained at a reduced cost, companies may not be interested in supporting very minor locales.
Another benefit quoted by Adobe's Francis Tsang is that crowdsourcing benefits from the specialist knowledge of the users (Tsang n.d.), who in some cases can be professionals working on a project they consider worthwhile even if there is no remuneration (Kageura et al. 2011; O'Hagan 2011), or paid professionals that have been integrated into the crowd (Kelly et al. 2011a). The interaction of this community of expert users can result in doubts being resolved more quickly, which can in turn help achieve a faster turnaround (ibid.). Another way in which some types of crowdsourced translation can achieve higher throughput is by letting a number of contributors work simultaneously on different parts of the project, or even on the same one, thereby achieving high parallelism (ibid.). This high parallelism is taken advantage of with practices such as Open Alternative Translations, Hidden Alternative Translations, Open Assessment (the practice of openly collecting assessments for the different translations of a TU, often in the form of votes) and Hidden Assessment (the practice of collecting assessments for the different translations of a TU without letting contributors see how others have voted), all of which are discussed in Chapter 5.
Although they do not explain how it is achieved, other authors also claim that crowdsourced translation processes can result in faster turnaround (Lommel and Ray 2009; Désilets and van der Meer 2011), and it has been suggested that this faster turnaround can make crowdsourced translation suitable for the translation of user generated content that is highly transient (Désilets and van der Meer 2011). Besides the higher throughput, if contributors are allowed to select the material that will be translated, the crowdsourced translation project becomes effectively smaller and no effort is spent on translating material that is not relevant. According to Kelly et al. (2011a), Adobe in China successfully takes advantage of this by letting its users translate only the content that they find relevant.
Many authors bring up quality related benefits. For example, according to Kelly et al. (2011a), if the community shares linguistic resources, those resources could help to improve the quality of the translation and, given that according to them communities are not broken down by language, these benefits of sharing should spread more easily. However, in the experience of the author of this thesis, it is common for communities such as the Ubuntu translation teams and Facebook users to be effectively broken down by language.
Another quality related benefit that has been brought up in the literature is that community assessment can result in higher quality translations, first by ensuring that the translation meets the expectations of the users, since the users are often the translators, and secondly by replacing subjectivity with consensus (Jiménez-Crespo 2011). The argument that crowdsourced translation achieves higher quality, if quality is understood as the acceptability of the translation among its final users, is supported by the claims of some authors that crowdsourced translation produces more native-sounding translations (Baer 2010; Meyer 2011; Désilets and van der Meer 2011; Jiménez-Crespo 2013). However, as Jiménez-Crespo observes, quality is understood in different ways in different contexts, and acceptability of the translation among its final users may not be a suitable criterion in some of them. Furthermore, where the process is open, judgments emitted by the crowd might be subject to the social influence effects (Lorenz et al. 2011; Muchnik et al. 2013) that might hinder quality if judged by other criteria. The practices Hidden Alternative Translations and Hidden Assessment that are discussed in Chapter 5 can be leveraged to counter the effects of social influence.
Involving professional reviewers as gatekeepers and overseers of the process can help further improve the quality of the translation, thanks to these professionals having a general view of the process that enables them to correct macrotextual errors like inconsistent terminology (Jiménez-Crespo 2011). This practice is discussed in Chapter 5 under the name Expert Selection and Edition. If these experts give feedback to the community, the feedback can be delivered more quickly and result in the prevention of potential errors, which also results in higher quality (Kelly et al. 2011a).
Besides quality and speed related benefits, some authors point out that crowdsourced translation is a form of community involvement (Lommel and Ray 2009; Désilets and van der Meer 2011) and, as such, it can be argued that it has the benefit of potentially increasing brand loyalty. This idea is supported by the perception that practices like Open Alternative Translations, Open Assessment and Most Voted Metadata Based Selection, introduced in Chapter 4 and further discussed in Chapter 5, are useful in creating community engagement. Finally, from the point of view of the contributors, crowdsourcing initiatives can work as the learning playgrounds for inexperienced translators that Dolmaya (2011) talks about. This idea was supported by one of the interviewees, who talked about the diversity of lengths and complexities of strings in similar terms when discussing the Unit Granularity Selection practice in Chapter 5.

2.3.8 Criticism in the Context of Localisation


Dolmaya (2011) argues that marketing strategies can prevent the crowd from becoming aware of their exploitation in cases where they translate pro bono while benefitting mainly for-profit organisations.


Dolmaya (ibid.) talks about the increased visibility that comes with some of the rewarding mechanisms in crowdsourcing initiatives as potentially having a positive impact for translators. However, the paper also addresses the fact that sometimes the visibility is gained by amateurs whose work is actually improved by professionals, a situation that is more likely to have a negative effect on the public's perception of the value of translation. The paper also questions the claims with respect to future career opportunities for contributors who have gained visibility within crowdsourcing initiatives.
Although some professionals feared that crowdsourced translation would destroy the translation market, Kageura et al (2011) noticed that crowdsourced translation had not, as of 2011, had an impact as significant as expected. This view is supported by Désilets' (2007) observation that, although crowdsourced translation has been seen as a threat by professionals, it will not destroy the market, in the same way that Open Source has not destroyed the proprietary software market. Kelly et al's (2011) observation that collaborative translation is not a replacement for the TEP (translate, edit, publish) model but, as argued by Cronin (2010), a consequence of the new technologies further supports the idea that crowdsourced translation and traditional translation are not mutually exclusive practices. Désilets (2007) goes further and suggests that organisations with proprietary processes and data should look for opportunities to combine both; for example, he proposed that TERMIUM and IATE should open a section that can be edited by the crowd.
It has been observed that the management of crowdsourced translation can be more challenging and costly than using traditional LSPs (Yahaya 2008b; Désilets and van der Meer 2011; Filip 2012).
Quality control is considered an issue (Ellis 2009; Désilets and van der Meer 2011). However, it has also been shown that, with the right process, even a non-professional crowd can produce professional level quality (Zaidan and Callison-Burch 2011) and, if the criteria for quality are acceptability by the user and the translation appearing to have been originally written in the target language, crowdsourcing can produce better results (Jiménez-Crespo 2013).
It is also possible to obtain poor translations from well-meaning contributors with poor
linguistic skills, malicious users and users that are motivated by material rewards. An
example of this is AMT, where contributors will often try to exploit the system to earn more
money and produce low quality translations (Zaidan and Callison-Burch 2011).


According to Jiménez-Crespo (2011), the approach used by Facebook, that is, using multiple translations and letting the users vote for the best one, is more suitable for short isolated strings than for longer pieces of text. This and other issues regarding the size of the translation unit are discussed in Chapter 5 under the Select TU Granularity practice.
Motivation is an issue, since organisations cannot always count on the emotional attachment that brings people to help in cases of disasters (Munro 2010), humanitarian goals (Kiva, Translators without Borders), a bond to the service or product (Facebook) or pride in their native tongue.
Decontextualisation, that is, TUs appearing as single sentences out of context, can be an issue and hinder quality, as observed by Désilets and van der Meer (2011). Munro et al (2010) recommended the implementation of features that allow reviewing the content as a whole in order to minimise its impact.
The parallelism that appeared as an advantage can also be a disadvantage. As observed by Yahaya (2008), the more translators are involved in a project, the higher the chance of inconsistencies in the translation. Munro et al (2010) recommended the implementation of fora and chat rooms to enable communication among translators and reduce the negative impact of having a high number of translators involved. The experts interviewed for Chapter 5 talked about leveraging TM and terminology to help deal with inconsistency issues.
Désilets (2007) points out that it is not possible to set up deadlines. However, there are crowdsourcing projects where deadlines are used, even though the lack of a contract makes enforcing them impossible. In Chapter 5, the observations made by experts with respect to the usefulness of deadlines in crowdsourcing are presented under the Deadlines practice. He also points out that it is not possible to predict whether content will propagate to specific languages or, if it does, when it will do so. Another issue he notices is that waiting for the entire translation to be finished before publishing is not always going to be possible. The practice Super Iterative Translation discussed in Chapter 5 points out some of the pros and cons of publishing incomplete translations.
It has also been noticed that the usual pricing models per word, sentence or other text-measuring units may not be suitable for a process where value can be added by social actions (Kelly et al. 2011a). Another issue of crowdsourced translation is the risk of approaching it in the wrong way and generating a backlash from the community of translators, as was the case with LinkedIn (Kelly 2009). Finally, although they do not point it out, the fact that Ray and Kelly (2011) have noticed that some organisations using crowdsourcing do not advertise that they do so indicates that there are marketing issues around the practice.

2.3.9 Other Collections of Practices


Désilets and van der Meer (2011) have created a collection of patterns of their own. They collected 57 patterns, which they organised into six groups:

Planning and Scoping

Community Motivation

Quality

Tools and Processes

Right-Sizing

Contributor Career Path

Some of their patterns are very similar to practices presented in this thesis. For example, Publish then Review is a pattern that reflects worries similar to those that gave shape to the Super Iterative Translation practice. There are many noteworthy contributions in this collection, but the amount of information dedicated to each pattern is small, making each of them less valuable as an individual contribution, though potentially an excellent starting point for research.

2.3.10 Summary of the Literature Review for Crowdsourcing


This section of the literature review has introduced crowdsourcing, covering definitions of the term; presented three taxonomies, including the one upon which the taxonomy in Chapter 3 is based; compared the dimensions of that taxonomy to the characteristics appearing in the definitions; and discussed crowdsourcing in the context of localisation.

2.4 Workflows
The term workflow is very fuzzy and used in many different contexts (Alonso et al. 1997). Two general definitions of the term are "the process (or steps) by which all activities of a work are achieved" (Liang et al. 1993) and "a collection of tasks organized to accomplish some business process" (Georgakopoulos et al. 1993). These contrast with the specialised concept of workflow used in process automation. The Workflow Management Coalition (WfMC, 1999) deals with process automation and defines a workflow as "the automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules".

This thesis focuses on the modelling aspect of workflows and for that purpose it will use the
definition provided by Mendling (2008) whereby a business process model, which in this
thesis is equated to a workflow model, is the result of mapping a business process.

Workflow practices belong to the study and implementation of Business Process Management, which commenced in the early 20th century with the first efforts in mass production (Mendling 2008); however, standards for digital data exchange only started being developed in the 1970s, and it was not until the mid-1990s that the development of workflow standards started (Zur Muehlen et al 2005). Over time, the development of standards has resulted in a richness of abstractions for representing workflows, to the point that, according to Recker (2006), a PhD student was able to collect a list of 3000 workflow modelling techniques before stopping, and van der Aalst (2013) talks about a tower of Babel for process languages. According to van der Aalst (2004), some stand-out business process modelling languages are Business Process Model and Notation (BPMN), Yet Another Workflow Language (YAWL) and XML Process Definition Language (XPDL). These languages can be used to define workflows that can be enacted by a number of systems, including jBoss, Windows Workflow Foundation and WebSphere Process Server, among others (Louridas 2008). Some criteria that have to be taken into account when choosing a workflow system for industrial implementation are its reliability, performance, interoperability and scalability (Georgakopoulos et al. 1995), but this thesis focuses on understanding workflows, not implementing them, and those criteria are thus not relevant here.
The possibility of using workflows is interesting in the context of crowdsourcing because
some approaches to crowdsourcing require the coordination of numerous people. This sort of
coordination is challenging for humans, but feasible for machines.

2.4.1 Workflow Models


As discussed in section 2.3 of this thesis, crowdsourcing is a relatively new phenomenon that is still being conceptualised. As discussed in the introduction, by virtue of eliminating less relevant information, process models are less complex than the processes they represent (Maria 1997). This reduced complexity makes models a useful tool for understanding what they represent (Curtis et al. 1992b; Giaglis 2001; Latva-Koivisto 2001). Furthermore, this thesis presents the practices encountered in crowdsourcing processes in a manner analogous to that of patterns in architecture (Alexander et al. 1977) and software design (Buschmann et al. 2007). In the context of computer science, design patterns are often accompanied by code, pseudo-code excerpts or UML diagrams that help in understanding them. The workflow models perform a function similar to that of the code, pseudo-code and UML diagrams in design patterns: a tool to better understand crowdsourced translation processes and practices.

2.4.2 Workflow Patterns


Since the publication of Workflow Patterns (van der Aalst et al. 2003), workflow patterns have become a tool for assessing the flexibility of a workflow system and have clarified concepts, thereby allowing a more standardised discussion of workflows (Russell, Hofstede, et al. 2006). Workflow patterns have been defined as a series of constructs that address business requirements in an imperative workflow style expression but are removed from specific workflow languages (van der Aalst et al. 2003), and as a series of constructs that are embodied in existing offerings in response to actual modelling requirements (Russell, Hofstede, et al. 2006).
Workflow patterns have been approached from different perspectives that should be addressed in a holistic workflow model:
1) Control flow patterns address which tasks are carried out and their invocation order (Sarshar and Loos 2005; Russell, Hofstede, et al. 2006).
2) Resource patterns address which resources, be these human or automatic, carry out which tasks (Russell, van der Aalst, et al. 2005).
3) Data patterns address the visibility of the data, the interaction possibilities, the transfer of data and data-based routing, which in turn affects the control flow (Russell, ter Hofstede, et al. 2005).
4) Exception handling patterns address how deviations from normal execution are dealt with (Russell, van der Aalst, et al. 2006).
This thesis focuses only on the control flow aspect of the workflows, leaving resource, data
and exception handling patterns out of the scope. It is, however, worth noticing that some of
the crowdsourced translation practices suggested in the following chapters would benefit
from an in-depth analysis that incorporates these other workflow perspectives.
When talking exclusively about the control flow aspect, the researcher prefers to think of workflow patterns as a collection of features with standardised names that can be used to describe control flow in workflows. Since it is necessary to better understand crowdsourcing processes and to be able to discuss them in order to answer the research question posed in Chapter 1, this standardisation of names makes workflow patterns useful in the context of this thesis.

Figure 3 An industrial localisation process model on WorldServer

2.4.3 Workflows in the Language Industry


In the language industry it is common to use TMSs for workflow management. These are often ad hoc, in-house developed solutions, but there are off-the-shelf offerings too (Rinsche and Portera-Zanotti 2009). From a workflow patterns perspective, industry workflows can be as simple as the linear representation in Figure 1 on page 6, which would only require support for the Sequence pattern, i.e. the ability of a system to start a task once the previous one has finished. They can also be more complex, as seen in Figure 3, which shows a real industry workflow on the WorldServer platform.
The specific names of the different tasks in Figure 3 have been deleted for privacy reasons, but this does not prevent anyone familiar with WorldServer from noticing that only the Sequence, Exclusive Choice, Structured Loop and Simple Merge patterns are used in the model.
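How these four patterns combine can be illustrated with a minimal sketch. The following Python fragment illustrates the control flow only, not WorldServer or any other TMS; all function names are invented.

    def run_localisation_job(translate, review, publish, max_passes=3):
        text = translate()                 # Sequence: each task starts only
        for _ in range(max_passes):        # after the previous one finishes.
            ok, text = review(text)        # Structured Loop: review repeats.
            if ok:                         # Exclusive Choice: exactly one of
                break                      # "accept" or "another pass" runs.
        return publish(text)               # Simple Merge: both paths converge
                                           # here without synchronisation.

    print(run_localisation_job(
        translate=lambda: "draft",
        review=lambda t: (True, t + ", reviewed"),
        publish=lambda t: t + ", published"))  # -> draft, reviewed, published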

2.4.4 Industry workflows for crowdsourced translation


At the time of the writing of this thesis, and to the best of the knowledge of the researcher, there are no industry workflow management solutions suitable for crowdsourced translation processes other than crowd TEP. The reason for this observation is that TMSs do not support the workflow patterns necessary to realise highly iterative or redundant translation (Morera et al. 2012). This makes sense because, in a scenario where translators are being paid for their work, redundant translations or constant refinement would add to the cost.
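What redundant translation means in control flow terms can be sketched as follows. This is an illustration with invented names, not any vendor's implementation: crowd TEP needs only the Sequence pattern, whereas redundant translation requires multiple instances of the same task followed by a selection step.

    def crowd_tep(tu, translate, edit):
        return edit(translate(tu))                  # Sequence pattern only

    def redundant_translation(tu, translators, select):
        candidates = [t(tu) for t in translators]   # same TU translated N times
        return select(candidates)                   # e.g. selection by votes

    votes = {"hallo": 3, "hallo!": 1}
    print(redundant_translation("hello",
                                [lambda s: "hallo", lambda s: "hallo!"],
                                select=lambda cs: max(cs, key=votes.get)))  # hallo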

Figure 4 High level representation of a crowdsourced localisation timeline. Adapted from DePalma and
Kelly (2011)

There are currently some established LSPs that claim to offer crowdsourced translation, but the researcher was unable to gain access to their platforms in order to study their processes. There are also newer companies whose marketing material shows tools with UIs that look like web-based CAT tools but, again, the researcher was unable to gain access to them in order to find out whether they implement iterative or redundant translation practices. Lingotek was a company brought up during the interviews as a crowdsourcing-style industry tool but, although its marketing material presents screenshots of a web-based CAT tool, the researcher was unable to find out whether it allows iterative or redundant translation practices.

2.4.5 Models of crowdsourced translation workflows


Only three representations of crowdsourced translation processes were found during this literature review. The first one, presented in Figure 4 (DePalma and Kelly 2011), is a general view of the process in the form of a timeline that does not use a standardised notation. By trying to accommodate many different approaches to crowdsourcing, this model does not provide information at the level of granularity required to help those interested in implementing crowdsourced translation practices.

Figure 5 Representation of a crowdsourced process. Adapted from Vashee (2009)

The second one, presented in Figure 5, represents a specific process used by Asia Online in their Wikipedia translation project (Vashee 2009). Although the notation is not standard, the level of detail is useful for anyone who may want to imitate their process. Several of the practices identified in this thesis were implemented in this project, and Chapter 4 includes a coloured Petri net model of this process that was used for the identification of the practices.

Only one article in the literature included a standardised process model related to crowdsourced translation. The model shown in Figure 6 uses BPMN notation to represent a process for the management of bi-text for NGOs (Filip and Ó Conchúir 2011); the process represented is related to the TU Granularity Selection practice.

Figure 6 A BPMN model of a suggested process for bi-text management in crowdsourcing scenarios. Used with permission (Filip and Ó Conchúir 2011)

2.4.6 Modelling practices


As noticed by van der Aalst et al (2004), creating a workflow model is not a trivial task. It requires deep knowledge of the business process at hand (i.e., lengthy discussions with the workers and management are needed) and of the workflow language being used. They go further by saying that models created that way are influenced by perception and tend to be subjective. They suggest doing data mining instead of interviews to create more objective models, which should then be compared to the human-made models. Data mining for workflow extraction has produced a number of papers, from its application in hospitals (Mans et al. 2009), software development (Poncin et al. 2011) and industrial processes (van der Aalst et al. 2007) to the development of tools (van Dongen et al. 2005). The first intention in this research was to pursue the mining approach, but the organisations approached were very wary of sharing their data and the mining approach had to be abandoned. As a result of this, the models in this thesis were created by the researcher using, where possible, multiple accounts to simulate the roles of different stakeholders. The manual approach to model creation is still the most used approach to process modelling, but it is often done through interviews with the stakeholders (van der Aalst 2013).
Processes have been studied by different people at different levels. Many computer scientists and mathematicians consider graphical representations of processes informal and largely a matter of aesthetics (Moody 2009), and have opted for low-level formal languages like Petri nets for their models (van der Aalst 2013). Practitioners use higher-level conceptual languages such as BPMN and UML, and those who have to implement processes use execution languages such as BPEL (ibid.).
The models in this thesis are descriptive (van der Aalst 2013), neither normative nor executable, since their intention is to represent existing processes, not to impose a process or to facilitate enactment.
When considering the modelling language to be used in this thesis, the researcher considered three different modelling languages and several modelling platforms.
1) Petri nets: Petri nets are a low-level language, but they are extremely popular within academia, with over 150,000 hits on Google Scholar. There is extensive research dedicated to their characteristics and the techniques that derive from them (Murata 1989), and most contemporary BPM systems are based on them (van der Aalst 2013). For the creation of models in Petri nets, the researcher was able to access Yasper, which uses coloured Petri nets (van Hee et al. 2006) and has a simulation module that, among other features, supports the automated verification of the correctness of the models created with it.
2) YAWL: The YAWL (Yet Another Workflow Language) language is at a higher level of abstraction than Petri nets. This makes it potentially easier to understand but, with close to 3000 hits on Google Scholar, YAWL is not as widely known in academia and, to the best of the knowledge of the researcher, it is not known in the localisation industry. The YAWL platform does, however, offer a modelling module with syntax verification (van der Aalst and ter Hofstede 2005; Russell et al. 2007).
3) BPMN: Business Process Model and Notation (BPMN) is the de facto standard in business process modelling (Chinosi and Trombetta 2012). There are multiple tools that support modelling in BPMN (Yan et al. 2011). However, none of the tools available to the researcher had simulation modules that enabled the verification of the correctness of the models.
Considering their widespread usage in academia and the availability of a tool that guarantees the correctness of the models created with it, the researcher opted to use Petri nets in this thesis. The correctness of the models refers to what Krogstie et al. (2006) call syntactic quality, which, according to them, is the only criterion for the quality of a model that has any hope of being objectively measured. However, this does not mean that Petri nets are perfect.
As noticed by Moody, language designers have considered the visual aspect to be secondary and mainly a matter of aesthetics (Moody 2009); as a result, Petri nets, one of the older languages, are not necessarily the best looking representation.
Petri nets' concrete syntax, that is, "the representational aspects such as symbols, colors and position, of the various types of nodes in a process model (e.g. tasks, events, gateways, roles)" (La Rosa, ter Hofstede, et al. 2011), is relatively simple, with few elements that need to be remembered. Not all elements are used in the models in this thesis; the ones used in Chapter 4 can be seen in Figure 7. The complete Yasper implementation of coloured Petri nets has more elements, which can be found in Yasper's user guide (Yasper User Guide 2005).

Figure 7 Petri net elements used in this thesis


The following statements should be borne in mind in order to understand the models in Chapter 4; a small executable sketch of these semantics is given after the list.
1) All elements that are not arcs are known as nodes.
2) Emitors produce tokens. This is the action of initiating a process.
3) Collectors consume tokens but do not produce them. This is the action of ending a process.
4) Transitions consume and produce tokens. They represent the actions that constitute the process.
5) Places store tokens. They represent the state of the process.
6) XOR splits and joins consume and produce tokens. They represent choices in how the tokens are routed.
7) Arcs connect nodes.
8) Reset arcs empty all the tokens in the place they are linked to when the transition that they are linked to is carried out.
9) Inhibitor arcs prevent a transition from happening if there is a token in the place of origin.
10) Subnets consume and produce tokens. They contain processes and are used for convenience, to avoid models becoming too big to analyse visually.
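The token game described in items 2) to 9) can be expressed in a few lines of Python. This is a minimal illustration written for this list, not Yasper's implementation; all place and transition names are invented.

    # A transition fires only if every input place holds a token and every
    # place watched by an inhibitor arc is empty; firing consumes and produces
    # tokens, and reset arcs empty their place entirely.
    marking = {"ready": 1, "done": 0, "blocked": 0}   # places and their tokens

    transitions = {  # name: (input places, output places, reset places, inhibitors)
        "translate": (["ready"], ["done"], [], ["blocked"]),
        "withdraw":  (["blocked"], [], ["done"], []),
    }

    def fire(name):
        inputs, outputs, resets, inhibitors = transitions[name]
        if any(marking[p] < 1 for p in inputs):      # a required token is missing
            return False
        if any(marking[p] > 0 for p in inhibitors):  # an inhibitor arc blocks firing
            return False
        for p in inputs:
            marking[p] -= 1                          # consume tokens
        for p in outputs:
            marking[p] += 1                          # produce tokens
        for p in resets:
            marking[p] = 0                           # reset arcs empty the place
        return True

    print(fire("translate"), marking)  # True {'ready': 0, 'done': 1, 'blocked': 0}
    marking.update(ready=1, blocked=1)
    print(fire("translate"))           # False: the token in 'blocked' inhibits firing
    print(fire("withdraw"), marking)   # True; the reset arc has emptied 'done'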
The combination of a visual vocabulary with a limited number of symbols means that Petri nets are cognitively economical (Moody 2009). However, the lack of support for higher-order abstractions results in models that can be cumbersome when compared to the same processes modelled in other languages. Figure 8 below illustrates this for a multiple instance task that has to be carried out ten times; a small sketch of the underlying counter idiom follows the figure. Some behaviours, like multiple instance tasks where the number of instances is unknown, can be even more complicated to model.

Figure 8 The same behaviour modelled in Petri nets and BPMN 2.0
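The Petri net encoding in Figure 8 boils down to a counter place seeded with ten tokens that the task transition consumes one at a time. The following sketch, with invented place names, shows the idiom in the same token-game terms used above.

    # Sketch of the counter-place idiom behind Figure 8: a place seeded with
    # ten tokens drives a structured loop; the task fires once per token left.
    marking = {"instances_left": 10, "completed": 0}

    while marking["instances_left"] > 0:   # an XOR split tests the counter place
        marking["instances_left"] -= 1     # the task transition consumes a token
        marking["completed"] += 1          # and produces one in the output place

    print(marking["completed"])  # 10, at the cost of extra net elements where
                                 # BPMN 2.0 uses a single multi-instance marker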


2.4.6.1 Guidelines for modelling languages.


Moody (2009) proposed a series of principles for the creation of notations (that is, the languages, not the models) that are a useful tool for understanding some of the limitations of Petri nets for the purpose of this thesis.
Petri nets do not follow the principle of semiotic clarity, which states that there should be a 1-to-1 correspondence between semantic constructs and graphical symbols. This could only be achieved in a very specialised modelling language, since having a graphical symbol for each possible task does not make sense in a language that should be able to represent any kind of process.
Petri nets as they appear on Yasper generally follow the principle of perceptual discriminability, which states that different symbols should be clearly distinguishable from each other. This is visible in the selection of symbols above, with case-insensitive places (not used in this thesis) being the only exception, since the colour difference may not be intense enough to make them clearly distinguishable from case-sensitive places.
Petri nets as they appear on Yasper do not follow the principle of semantic transparency, which dictates that one should use visual representations whose appearance suggests their meaning. As with the principle of semiotic clarity, this principle only makes sense for a special-purpose modelling language.
Petri nets as they appear on Yasper follow the principle of complexity management, which states that modelling languages should include explicit mechanisms for dealing with complexity. Moody suggests modularisation and hierarchies, which can both be implemented by using subnets.
Petri nets as they appear on Yasper do not follow the principle of cognitive integration, which suggests that modelling languages should include explicit mechanisms to support the integration of information from different diagrams. Following this principle would be useful, for example, when combining process models with architectural diagrams, but that is beyond the scope of this thesis.
Petri nets as they appear on Yasper do not follow the principle of visual expressiveness, which suggests using the full range and capacities of visual variables (shapes, colours, visual textures, etc.). Compared to the approach used in cartography, the discipline that Moody brings up as a good example of a field following this principle, Petri nets are very simple.
Petri nets as they appear on Yasper follow the principle of dual coding, which recommends
using text to complement graphics. Because Petri nets are general purpose, using text is
actually a requirement to understand what happens in any transition.
Petri nets as they appear on Yasper follow the principle of graphic economy, which suggests that the number of different graphical symbols should be cognitively manageable. This principle has to be balanced against the principles of semiotic clarity and semantic transparency. Petri nets in general follow this principle to a fault, as a result of being a low-abstraction-level language that works with primitives. A consequence of this is that a process that can be modelled with a single symbol in other languages requires the combination of multiple symbols in Petri nets.
Petri nets as they appear on Yasper do not follow the principle of cognitive fit, which suggests using different visual dialects for different tasks and audiences. Petri nets are a general purpose language and, as such, have not been adjusted to create models dedicated to processes belonging to specific fields. From the point of view of the audience, academics may be familiar with Petri nets, but industry experts are more likely to be familiar with BPMN. Overall, Petri nets follow the principles that are most relevant for this research.
2.4.6.2 Guidelines to increase the quality of models
According to Bandara et al (2005), there have been few empirical studies on process modelling and no studies that identify and describe what elements should exist in a process modelling project or how to evaluate the success of such a project. However, guidelines for the creation of successful workflow models have been proposed both before and after Bandara's research.
La Rosa, ter Hofstede, et al. (2011) developed a series of patterns to improve process models by making changes to their concrete syntax, that is, the visual representation of the model. It is this researcher's view that most of these patterns are desirable features in modelling languages and tools that could be used to improve models, rather than actual modelling practices.


The patterns developed by La Rosa, ter Hofstede, et al. (2011) are listed below, accompanied by observations on their applicability to the models in this thesis and on how they relate to Moody's (2009) principles.
Layout Guidance: Suggests that one should make the model orderly by avoiding crossing arcs, keeping one general direction, placing incoming and outgoing arcs on opposite sides of tasks and keeping related items close to each other. This pattern was compromised in many of the models in Chapter 4. Crossing of normal arcs could be avoided in most cases, but reset arcs almost invariably had to cross other arcs. The direction of the flow was changed in order to allow the models to fit on a page. Finally, Yasper automatically distributes the positions of incoming and outgoing arcs, which occasionally resulted in them not being on opposite sides of the node.
Enclosure Highlight: Suggests that one should be able to visually enclose sets of elements that are related and add comments to characterise them. Although Yasper does not support this, it was done via image editing on the model for Kiva in order to identify a hypothetical part of it.
Graphical Highlight: Suggests that there should be features available to change the visual appearance of model elements. This pattern would help a model follow Moody's (2009) principles of semiotic clarity and visual expressiveness, but could potentially go against the principle of graphic economy. Yasper does not support graphical highlighting.
Pictorial Annotation: Suggests that one should be able to modify icons to make them convey more information. This helps a model follow Moody's (2009) principles of semiotic clarity, semantic transparency and visual expressiveness, but it can potentially go against the principle of graphic economy. This pattern is not supported by Yasper.
Textual Annotation: Suggests that one should be able to visually represent free-form text
in the model. Yasper does not allow this.
Explicit Representation: Suggests that one should be able to capture process modelling concepts via a dedicated graphical notation. This helps a model follow Moody's (2009) principles of semiotic clarity, semantic transparency and visual expressiveness, and enables the principle of cognitive fit. Yasper does not allow this.


Alternative Representation: Suggests that one should be able to capture process modelling concepts without the use of their primary graphical notation. Specifically, they suggest being able to represent complex concepts that require multiple elements with a single element, in order to simplify models. This can potentially go against Moody's principle of graphic economy. Yasper does not support it.
Naming Guidance: Suggests that there should be naming conventions or guidelines for the creation of element labels. These can be syntactic (e.g. using a verb-object style) or semantic (e.g. using a domain-specific vocabulary). Mendling et al (2010) proposed a use verb-object labels convention that can be considered an implementation of this pattern. Yasper does not enforce it, but allows it.
La Rosa, Wohed, et al. (2011) also developed a series of patterns to improve process models by addressing their abstract syntax, that is, the structure created by model elements and the way they are interrelated. These patterns are defined in a manner more similar to design patterns. Summarised descriptions of these patterns follow, accompanied by explanations of how they were leveraged when developing the models in Chapter 4.
Block-Structuring: Organise elements in structured blocks, that is, sections of a model that have matching splits and joins (Mendling et al. 2010). Although this pattern is very appealing, the need to model loops in order to support the generation of multiple instances in many of the processes modelled in Chapter 4 prevented most of the processes from being highly structured.
Duplication: Repeat model elements where doing so helps make the model clearer. This practice was taken into account, but the researcher was not able to find a model in Chapter 4 that would benefit from it.
Compacting: Delete elements that are redundant. As with duplication, an opportunity to
apply this pattern did not present itself when developing the process models in Chapter 4.
Vertical Modularization: Decompose the model into hierarchical subprocesses. Mendling et al (2010) recommend decomposing models in general when they contain over fifty elements. This pattern was applied in the models for Crowdin, Facebook, Pootle, Launchpad, DotSub and Amara. Even within those models where the potential to modularise further existed, the inability to model messages sent from outside a module affecting it constrained the use of this pattern. For example, the translating and voting parts of the models could have been represented as modules, but the need to send messages from other parts of the process to reset places, in order to close the loops within those parts of the process, often prevented the further application of the pattern.
Horizontal Modularization: Decompose the model into modules that are at the same level, often using as a guide the criterion of who is in charge of enacting each module. This is another way of implementing the recommendation by Mendling et al (2010) to decompose big models. This pattern was not applied in the process models of Chapter 4.
Orthogonal Modularization: Decompose the model into modules that are at the same level, often using the concern of each module as a guide; two examples of concerns are security and privacy. This is another way of implementing the recommendation by Mendling et al (2010) to decompose big models. This pattern was not applied in the process models of Chapter 4.
Composition: Consolidate modules into a single unified model. This pattern was not applied
in the process models of Chapter 4.
Merging: Combine similar process models based on their commonalities. This pattern was
not applied in the process models of Chapter 4.
Omission: Remove one or more elements from a process model and reconnect the remaining ones. This pattern was used in the models for Amara and DotSub, where the generation and timing of subtitles is represented by a subnet that omits all the detail of the process.
Collapse: Synthesise multiple model elements into a single one of a more abstract nature, where the distinction among the constituent elements is no longer relevant. This pattern is a way of implementing Mendling et al's (2010) guideline to use as few elements as possible. This pattern was not applied during the creation of the models in Chapter 4.
Restriction: Use only selected elements from the modelling language in the model. This pattern was used in the models in Chapter 4: although Yasper can explicitly model data storage and data manipulation in processes, these features were not used.
Extension: Extend the syntax and semantics of a process modelling language by adding new modelling concepts. It could be argued that the highlight box in Kiva's model is an implementation of this pattern but, otherwise, this pattern was not applied during the creation of the models in Chapter 4.
Mendling et al (2010) also developed a series of guidelines. Mendling, van der Aalst and Reijers are co-authors of the papers by La Rosa, Wohed, et al. (2011) and La Rosa, ter Hofstede, et al. (2011), and the overlap in the guidelines that they proposed has been addressed already. However, three of the guidelines proposed by Mendling et al (2010) did not appear in the latter papers. These guidelines were:
Use one start and one end event: The guideline of having single start and end events was followed in the models in Chapter 4.
Minimize the incoming and outgoing arcs per element: The guideline of minimising the incoming and outgoing arcs per element was taken into account, but found not to be applicable, since any change to the number of arcs would imply a change in the behaviour of the model.
Avoid OR splits and joins: Petri nets as implemented by Yasper do not support OR splits and joins. Where that behaviour appeared in a process, it was modelled by combining AND and XOR splits and joins, as illustrated in the sketch below.
2.4.6.3 Immeasurable quality
Although the literature often addresses quality (Becker et al. 2000; Krogstie et al. 2006; Mendling et al. 2010; La Rosa, ter Hofstede, et al. 2011; La Rosa, Wohed, et al. 2011), it generally focuses on practices to improve it and not on ways of measuring it. This is underlined by Krogstie et al.'s (2006) statement that the only quality that can be objectively measured is the syntactic quality of the model. Instead of proposing ways to assess the quality of existing models, the literature generally provides guidelines, practices or tools that, if used, are expected to help produce better models. By using Yasper's tools, the syntactic quality of the models in Chapter 4 is guaranteed.
Becker et al. (2000) point out that pre-existing quality frameworks for models focus on either such specific aspects or such high-level aspects that deriving useful guidelines to improve quality is difficult. But this is not the only issue: in studying the guidelines discussed in the previous section, the researcher noticed that there are opposing forces that have to be balanced. For example, all the modularization practices proposed by La Rosa, Wohed, et al. (2011) have to be balanced with the composition practice that they themselves propose in the same article. Mendling et al. (2010) propose a threshold of fifty elements beyond which modularization should be used, because in their own previous research (Mendling et al. 2007) they found the number of errors in models above that size to be higher. However, it is the researcher's opinion that modularization should also be done considering other factors, as La Rosa, Wohed, et al. (2011) imply by suggesting three approaches to modularization.
Becker et al. (2000) suggest that as process models are used by more people, the importance of their understandability grows. They introduce a guideline of clarity stating that models that are understandable are better. Interestingly, that same paper considers it an optional guideline. In contrast with the guideline of clarity being optional, the idea that a more understandable model is better plays a central role in the guidelines proposed by other authors (Krogstie et al. 2006; Mendling et al. 2010; La Rosa, ter Hofstede, et al. 2011; La Rosa, Wohed, et al. 2011). As such, every effort was made to make the models in Chapter 4 intelligible, but it is the researcher's view that the complexity of some of the processes resulted in models that are still challenging to understand.
Becker et al. (2000) also discuss semantic correctness, the level of correspondence between the structure and the behaviour of the model and the real-world process. No other paper in the literature about the creation of models brought this up explicitly, but research about automated methods for checking or adapting existing models for semantic correctness exists (Ly et al. 2006; Dijkman et al. 2008; van der Aalst et al. 2010). Nonetheless, Mendling (2008) observes that currently the only way to identify inconsistencies between the model and the process is to talk to the stakeholders. With the exception of the ones for Asia Online, Facebook and Kiva, all models in Chapter 4 were created with the researcher performing the role of the stakeholders in order to keep the highest possible semantic correctness. In the case of the model for Facebook's translation platform, the model is based on a talk by Losse (2008) and the researcher's own experience with the platform. The model for Asia Online's process is based on Vashee's (2009) own model. The model for Kiva's process is based on the researcher's own experience as a translator and contains a purely hypothetical part that is marked as such.
Finally, Becker et al. (2000) also propose guidelines of economic efficiency, relevance, comparability and systematic design, but none of these had an impact on the models in Chapter 4.
2.4.7 Summary of the Literature Review for Workflows

This section of the literature review introduces the concept of workflow and discusses the function of workflow models as tools for understanding and supporting practices. It also reviews some process representations from industry and presents some representations of processes for crowdsourced translation. It offers an overview of modelling practices and the selected notation, Petri nets. This is followed by a review of guidelines for improving the quality of process models, pointing out how these were taken into account for the models in Chapter 4.

68

Chapter 3 Taxonomy of Crowdsourcing


Crowdsourcing was discussed in depth in Chapter 2. In the context of that discussion, a definition for crowdsourcing was selected from the ones provided in the literature. This definition is valid, but also very broad, and the resultant level of generality lacks the detail necessary for practitioners who are burdened with the responsibility of designing workflows for crowdsourced translation scenarios. In order to enable discussion of different types of crowdsourcing, Chapter 2 presented an overview of different classifications for crowdsourcing. Among these, the taxonomy produced by Geiger, Seedorf et al. (2011) was found to be useful by satisfying the criteria of conciseness, inclusiveness, comprehensiveness and extendibility proposed by Nickerson et al. (2009). Considering the observation that taxonomies can be moving targets and that there is a need to focus on a useful taxonomy instead of an optimal one (ibid.), this chapter builds upon Geiger, Seedorf et al.'s (2011) work by adding twelve new platforms and a new characteristic to the Aggregation of Contributions dimension.
The additional data was obtained through two approaches. For the eight platforms appearing in Chapter 4, the data was generated in the process of creating the process models. For the remaining five platforms, the data was obtained through an online survey.
The new characteristic for the Aggregation of Contributions dimension, which first appeared in Table 6 on page 38, is Select and Integrate. After carrying out the survey and studying different process models, the researcher observed that it is common for crowdsourced translation platforms to select a translation from multiple alternatives that the crowd has suggested. These selected translations are then integrated in the final output of the system. Geiger et al.'s data for Facebook's translation platform was updated to reflect this change in the taxonomy.
When the Geiger, Seedorf et al. (2011) taxonomy was extended, the resulting clusters were found to no longer be meaningful. This chapter provides an overview of why these clusters were meaningless. Following this, there is a second attempt to keep Geiger, Seedorf et al.'s clustering approach by modifying the number of clusters. This attempt likewise resulted in meaningless clusters. Lastly, two clustering algorithms, k-means and hierarchical clustering using between-groups linkage and squared Euclidean distance, were selected to carry out another clustering attempt. Both approaches produced the same clusters, and those clusters were meaningful and different from Geiger, Seedorf et al.'s (2011), as will be seen in section 3.2.2.1.
3.1 Data Collection


The intention of the researcher was to collect data from a limited number of platforms in order to create a taxonomy that could be generalised to platforms that were not included in the initial data. Two approaches were used to collect the data:

- Data generated during the creation of the process models that appear in Chapter 4, presented in section 3.1.1 of this chapter.
- Data collected by means of an online questionnaire, discussed in section 3.1.2 of this chapter.
Table 9 Characteristics of the platforms obtained through the modelling process

Name | Aggregation of Contributions | Accessibility of Contributions | Remuneration of Contributions | Preselection of Contributors
Amara (Universal Subtitles) | Integrative | Modify | No | No
Asia Online | Select and Integrate | None | No | Qualification-based
Crowdin | Select and Integrate | Assess | No | No
DotSub | Integrative | Modify | No | No
Facebook | Select and Integrate | Assess | No | No
Kiva | Integrative | None | No | Qualification-based
Launchpad | Select and Integrate | Assess | No | No
Pootle (closed project) | Select and Integrate | Assess | No | Qualification-based

3.1.1 Data from Models


By 2011, the researcher had created models for processes enabled by Crowdin, Asia Online, Facebook and Pootle in order to compare them to the processes supported by GlobalSight and Idiom WorldServer (Morera et al. 2012). Later, models for Amara, DotSub, Kiva and Launchpad were created, this time with the intention of better understanding existing crowdsourced translation practices and identifying the practices that appear in them. With the knowledge acquired during the creation of the models, the researcher was able to determine the characteristics of the framework developed by Geiger, Seedorf et al. (2011) that each platform displayed. The data obtained through this approach can be seen in Table 9. A detailed methodology for the creation of the models is described in Chapter 4.

3.1.2 Online Questionnaire and Survey considerations


The data obtained through the creation of process models was too scarce to develop a taxonomy. This led to an online survey being carried out with the intention of increasing the amount of data collected. According to Oates (2005), there are six different activities that need to be carried out when using surveys for research:
Data requirements: The intention was to collect the data necessary to expand Geiger et al.'s taxonomy or, as it turned out, create a new one exclusive to crowdsourced translation. Bearing this in mind, the survey had to generate data that identified the platforms, the organisations using the platforms and all the characteristics of the dimensions identified by the conceptual framework of Geiger et al.'s taxonomy.
Data generation method: The survey was carried out by means of an online questionnaire.
Sampling frame: According to Oates (2005), the sampling frame is a collection of all the eligible people that could be included in the survey. Saris and Gallhofer (2007) call this the population. In this case, the population for the survey was people who either are involved in the development of crowdsourced translation platforms or are part of organisations that use crowdsourced translation or offer crowdsourced translation facilities. Unfortunately, no such list exists, and creating one is not possible for several reasons. First, crowdsourced translation initiatives disappear when they fail (Mesipuu 2010), making them difficult to track. Second, as stated in Chapter 2, some organisations that use crowdsourcing hide this fact (Ray and Kelly 2011). Third, other organisations use the term for marketing purposes without clearly describing their process, which may or may not fit the definition of crowdsourcing.
Sampling technique: The reasons that inhibit a proper sampling frame for crowdsourced translation platforms are the reasons why such platforms fall into the 'hard to identify' and 'specific group' categories that justify using non-probability samples (Fink 2003a).
This survey's sample is a convenience sample that consists of all those who actually answered the survey (Oates 2005). There was an attempt at snowballing the sample by requesting those who answered to pass the survey along to representatives of other platforms. Two of the six respondents stated that they had passed the survey along, but the only answers eventually collected came from people who had been directly contacted by the researcher.
Response rate: The response rate was 35.7%, which is relatively high for an unsolicited survey compared to the 10% that Oates (2005) suggests and the 20% that Fink (2003a) suggests frequently occurs. Three of the answers were provided only after follow-up emails; without these, the response rate would have been below 18%.
Non-responses: Out of eight non-respondents, five were companies that charge for crowdsourced translation as a service, as opposed to charging for access to their crowdsourced translation platform. Representatives of two such companies did answer the survey, which means that such companies are represented. However, it is the view of the researcher that responses from more such companies would be necessary to get a holistic overview of the processes and create a comprehensive taxonomy. It is possible that these companies were averse to responding because they prefer not to share details about processes that they think give them an edge over competitors.
Sample size: Oates (2005) recommends a minimum sample size of 30 if the intention is to carry out statistical analysis. The sample size when using both the platforms collected by Geiger et al. (2011) and the platforms collected by the researcher is 58. Two attempts at TwoStep clustering were carried out but, as seen below, the resulting clusters resisted interpretation. A third and a fourth attempt, using k-means and hierarchical clustering on exclusively crowdsourced translation platforms, resulted in clusters that could be interpreted.

3.1.3 Survey Administration


The survey took the form of a web-based self-administered questionnaire. Self-administered questionnaires are questionnaires that are given directly to potential respondents to answer without supervision (Brace 2004; Fink 2009), and the delivery of questionnaires via a web page is one of the approaches to doing surveys suggested by Oates (2005).
The questionnaire can be seen in Appendix 1 Survey Questionnaire.
Following the recommendation to contact potential respondents before sending them the actual survey (Saris and Gallhofer 2007; Fink 2009), emails were sent in which the background of the researcher and the purpose of the survey were explained. The emails were customized for each potential respondent, but the general template can be seen in Appendix 2. Besides the text of the email itself, a consent and information sheet was attached. The information sheet addressed issues such as anonymity, risks and the voluntary nature of participation, as advised in the literature (Fink 2009). The application for ethical clearance for this survey can be found in Appendix 3.
The email also contained a link to the survey. Reminders were sent to those who did not respond within a week, and one follow-up conversation was held to clarify one of the responses, in the case of Mozilla's usage of Pootle.
The responses were collected between May 2013 and September 2013. Since platforms evolve, the same survey carried out during an earlier or later period might have collected different responses. Bearing this in mind, the survey should be considered cross-sectional, because it represents part of the crowdsourced translation platform ecosystem in that time period (Fink 2009).

3.1.4 Survey Design


The main objective of the survey was to collect the values of the characteristics for each dimension of the taxonomy, and the questions in this survey were designed accordingly (Fink 2003b; Saris and Gallhofer 2007). Operationalization, the translation of concepts into questions (Saris and Gallhofer 2007), was only partially an issue, because the main questions (four to seven) and their possible answers were pre-determined by the existing conceptual framework developed by Geiger et al. (2011). The pre-existing framework guided the decision on subject and dimension and was also taken into account when including requests for additional comments regarding the answer to each question. Although avoiding open questions at the beginning of a questionnaire has been recommended (Brace 2004), the researcher ordered the questions according to their perceived level of difficulty, which resulted in the first question being an open question.
Questions one to three had a different rationale, not affected by Geiger et al.'s (2011) framework, which is discussed below.
The only question perceived to be potentially ideologically loaded is question three, which asks about the rationale for using crowdsourced translation. Question three is the first question that is not of an administrative nature, and as a result there is no reason to worry about priming and consistency effects affecting the response to this specific question (Brace 2004).
The survey was piloted, as suggested by Saris and Gallhofer (2007), by asking two programmers involved in the development of a crowdsourced translation platform to fill it out and provide feedback on the design. The impact of this pilot is discussed where relevant below.
3.1.4.1 Survey Questions
The questionnaire contains seven questions that are discussed below. Because of limitations of the tool used (Google Forms), questions three to seven were presented as a headline followed by additional information that contained the request for an answer.
Question one asked the name of the organisation or platform with which the person answering is involved. This was done in the form of the direct instruction (Saris and Gallhofer 2007) 'Please, write below the name of the organisation/product with which you are involved.' Since different organisations can use the same platform in different ways (illustrated by the usage of Pootle by this researcher and by Mozilla), it was necessary to know not only to which platform the answers referred, but also from which organisation's point of view the answers were provided. The respondent was presented with a free text box, since this was an open request, given that no collection of answers could be foreseen. Two of the people who answered were confused by the word 'platform': one asked if it was related to development frameworks and the other asked if it was related to CAT tools.
Question two asked the name of the platform and the type of project the answers refer to, if the platform allowed more than one type of project. This was done through the direct instruction (Saris and Gallhofer 2007) 'Platform. Please, write below the name of the platform that you use.', which was accompanied by the additional specification 'Please, if you use multiple platforms, answer only about one. If the platform has multiple configurations (for example, "closed projects" and "open projects"), indicate the configuration below and, if possible, answer the survey for the other configurations too.' When the person answering works in the development of the platform, asking for the name of the platform is redundant, but it is still necessary when an organisation uses a third-party platform. As a result of the different configurations, a single platform could be able to support different types of crowdsourcing. The respondent was presented with a free text box, since this was an open request, given that no collection of answers could be foreseen.
Question three asked for the rationale behind using crowdsourcing to translate. The rationale, which was not included in the original taxonomy by Geiger, Seedorf et al. (2011), could become a valuable dimension to add. However, because of controversies around crowdsourced labour, it is possible that answers regarding the rationale are not useful, because the respondents may have tried to avoid controversy. The question was posed as a direct request with a WH word (Saris and Gallhofer 2007), 'Why does your organisation or the organisations using your platform use crowdsourcing?'. This is a common type of request when the researcher wants to determine the cause of or motive for something (ibid.). The response alternatives were three non-exclusive multiple-choice options in the form of tickable boxes containing the two motivations that are brought up most often: increasing user engagement and saving costs. The third option, 'other', with a free text field, allowed respondents to explain any reasons they or the organisations using their platform may have to use crowdsourced translation. When the respondents ticked the third box, some of the answers involved the philosophy of their organisations.
Question four asked about remuneration of contributions. This was done through a direct instruction that is implicit after presenting the additional information 'This refers to compensations with market value, such as money, t-shirts, phone call credit. Badges on your profile, karma points and other forms of recognition without a market value are not considered remuneration.'. Remuneration of Contributions is one of the dimensions in the original taxonomy by Geiger, Seedorf et al. (2011). The answer was in the form of an exclusive multiple choice presented as radio buttons covering the characteristics outlined by Geiger, Seedorf et al., with an 'other' field added in case the platform did not fit the characteristics already considered.
Question five asked about preselection of contributors. This was done through the direct request 'Does your platform or organisation select translators before letting them contribute?'. Preselection of Contributors is one of the dimensions in the original taxonomy by Geiger, Seedorf et al. The answer was in the form of an exclusive multiple choice presented as radio buttons covering the characteristics outlined by Geiger, Seedorf et al., with an 'other' field added in case the platform did not fit the characteristics already considered.
Question six asked about accessibility of peer contributions. This was done through the indirect request 'To what extent can contributors access each other's translations'. This is one of the dimensions in the original taxonomy by Geiger, Seedorf et al. The initial answer was in the form of a single exclusive multiple choice. This required the respondent to provide additional information and, according to the feedback from the pilot, was deemed difficult to understand. The exclusive multiple choice was therefore replaced with three dichotomous answers in a choice matrix with radio buttons, which the pilot respondents found easier to understand.
Question seven asked about the Aggregation of Contributions. This was done through the direct instruction 'If for each piece of source text a single translation is selected and published, please write "selective". If your organisation combines several translations for a single piece of text to produce the translation that will be published (for example at the sentence level by combining subsegment matches), please write "integrative" and a brief explanation of the process used to combine the different translations.'. This is one of the dimensions in the original taxonomy by Geiger, Seedorf et al. Although this dimension has only two characteristics and an exclusive multiple choice was the natural answer format, the researcher foresaw that the answer would be conditioned by the point of view of the surveyed person. For example, the researcher considers that Facebook's system is selective at the TU level, since for any TU in a given locale only one translation is used at any given time. But, besides that, Facebook's system is also integrative, because the translations of all the TUs are integrated in order to create the current version of the locale. Bearing this in mind, the answer to this question was free text and requested an explanation of the process used to integrate the translations. After the survey concluded, and considering the processes analysed in Chapter 4, the researcher considered it necessary to expand the categories for this dimension, since it is common in crowdsourced translation to have a combination of selection at the TU level and integration at the project level. This also resulted in Geiger, Seedorf et al.'s (2011) data for Facebook being updated accordingly.
Finally, the survey contained a free text box for the surveyed people to add any comments they considered necessary and another to ask for their email address in case they were interested in being contacted in the future.
Nobody used the comments box; however, several respondents provided additional information in other free text answers that may have belonged there. Please refer to Appendix 5 to see the answers to the survey. Bear in mind that they have been slightly edited to preserve the anonymity of the respondents.

3.2 Clustering
As recommended in the literature, the development of a taxonomy has three stages: a conceptual stage where the dimensions and characteristics are determined, an empirical stage where clustering techniques are used to determine the taxa, and a final conceptual stage where the taxa are interpreted (Bailey 1994; Nickerson et al. 2009). This section discusses the second stage, after having collected the data and expanded the conceptual framework accordingly.

3.2.1 TwoStep Clustering


After developing their conceptual taxonomy, Geiger, Seedorf et al. (2011) clustered 46 instances of crowdsourcing processes. In order to have what Bailey (1994) calls a three-level model, they first developed the conceptual model, i.e. a set of dimensions with their characteristics, which was discussed in Chapter 2. For the second level, they created an empirical model by submitting the data to a clustering process. Finally, they submitted the clusters to a conceptual process in order to attach meaning to them. The selected TwoStep clustering mechanism allows clustering based on quantitative and qualitative variables simultaneously, while also being able to produce a set number of clusters or automatically determine the optimal number of clusters. This method first goes through the individual elements and uses a distance criterion to decide if they belong to a previously existing cluster or if they are the first element of a new cluster. The distance is measured as Euclidean distance if the variables are continuous, which is not the case here, or as log-likelihood distance (the logarithm is used because the likelihood distance is often a very small number that is inconvenient to manipulate), which is probability-based and can be used for both categorical and continuous variables. The algorithm then uses a CF (clustering feature) tree to refine the clusters.
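Since TwoStep is proprietary to SPSS, readers without it may find the following minimal sketch useful. It is not the TwoStep algorithm: it merely approximates the idea of distance-based clustering of purely categorical platform attributes by one-hot encoding the dimensions and applying agglomerative clustering; the platform rows are illustrative stand-ins, not the actual dataset.

```python
# A rough stand-in for distance-based clustering of categorical attributes,
# NOT the proprietary SPSS TwoStep algorithm: the nominal dimensions are
# one-hot encoded and clustered agglomeratively. Rows are illustrative only.
import pandas as pd
from sklearn.cluster import AgglomerativeClustering

platforms = pd.DataFrame(
    [
        ("Wikipedia", "integrative", "modify", "none", "none"),
        ("Mechanical Turk", "integrative", "none", "fixed", "none"),
        ("Threadless", "selective", "assess", "success", "none"),
        ("InnoCentive", "selective", "none", "success", "none"),
    ],
    columns=["name", "aggregation", "accessibility", "remuneration", "preselection"],
)

# One-hot encode the four nominal dimensions so a distance can be computed.
features = pd.get_dummies(platforms.drop(columns="name")).astype(float)

# Agglomerative clustering into a fixed number of clusters (default Ward linkage).
model = AgglomerativeClustering(n_clusters=2)
platforms["cluster"] = model.fit_predict(features)
print(platforms[["name", "cluster"]])
```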
This process resulted in five clusters that were then subjected to interpretation by Geiger, Seedorf et al. to derive the final conceptual model that follows:
Integrative sourcing without remuneration: often appearing in activities such as wikis and free user-generated content.
Selective sourcing without crowd assessment: the example being InnoCentive.
Selective sourcing with crowd assessment: the example being Threadless.
Integrative sourcing with success-based remuneration: the example being iStockphoto.
Integrative sourcing with fixed remuneration: the example being Mechanical Turk.
Table 10 Additional data for the taxonomy

Name | Aggregation of Contributions | Accessibility of Contributions | Remuneration of Contributions | Preselection of Contributors | Motivation for using Crowdsourcing
Amara (Universal Subtitles) | Integrative | Modify | No | No |
Asia Online | Select and Integrate | None | No | Qualification-based |
Crowdin | Select and Integrate | Assess | No | No |
DotSub | Integrative | Modify | No | No |
GetLocalization | Selective | Assess | No | Contextual | Increase engagement and save costs
Kiva | Integrative | None | No | Qualification-based |
Launchpad | Select and Integrate | Assess | No | No |
Pootle (closed project) | Select and Integrate | Assess | No | Qualification-based |
Pootle (Mozilla usage) | Select and Integrate | Modify | Success (based on contribution metrics) | No | Increase engagement and fitting the open source philosophy
Transifex | Integrative | Modify | Always (project dependent) | No (project dependent) | Increase engagement and save costs
VerbalizeIt | Integrative | Assess | Success | Both (context and qualification) | Business opportunity
Zanata | Select and Integrate | Modify | No | No | Increase engagement and expect quality from community
Geiger, Seedorf et al.'s experiment was reproduced with the same data and produced identical results. After this, the researcher extended the conceptual framework of the taxonomy by adding one characteristic to the aggregation dimension. The new characteristic is selection and integration, because several crowdsourced translation platforms collect multiple translations for a single TU and then select one that is integrated in the published translation. The extension of the framework results in it theoretically having 144 possible combinations (3 × 4 × 3 × 4 characteristics across the four dimensions). Taking this information into account, the researcher modified Facebook's entry for aggregation to selection and integration. The researcher also updated the preselection of contributors for Facebook to contextual, given that only users who have had an account for a given amount of time can join the translation effort (Mesipuu 2010).
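As a quick sanity check on that figure, the combination count is simply the product of the number of characteristics per dimension; the throwaway sketch below uses the dimension and characteristic labels as they appear in this chapter.

```python
# Sanity check on the combination count: the extended framework allows
# 3 * 4 * 3 * 4 = 144 possible combinations of characteristics.
from itertools import product

dimensions = {
    "aggregation": ["select", "select and integrate", "integrative"],
    "accessibility": ["none", "view", "assess", "modify"],
    "remuneration": ["no", "success", "always"],
    "preselection": ["no", "context", "test", "both"],
}
print(len(list(product(*dimensions.values()))))  # 144
```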

Besides extending the conceptual framework, the researcher added data for Amara (Universal Subtitles), Asia Online, Crowdin, DotSub, GetLocalization (both open and closed projects), Kiva, Launchpad, Pootle as used by the researcher to produce the process model, Pootle as used by Mozilla, Transifex, VerbalizeIt and Zanata. The values for the different dimensions of these platforms appear in Table 10.
The original taxonomy had 96 potential combinations in its framework, of which 19 were represented in its 46 data points. With the additional 12 data points, 29 out of the 144 possible combinations were represented. The results of the addition of data and the modification of the number of clusters, in an attempt to obtain clusters that could be interpreted, can be seen in Table 11 on page 82, which also shows the original clusters resulting from Geiger, Seedorf et al.'s (2011) work for comparison purposes.

According to the SPSS documentation (IBM 2013), SPSS determines three possible valuations for the fitness of a group of clusters. This fitness is called the 'Silhouette measure of cohesion and separation' in SPSS terminology. The measure goes from -1 to +1 and, in SPSS graphics, is represented as a continuous bar divided into three colours, each of them representing a valuation: pink represents poor fitness, yellow fair fitness and green good fitness.

These valuations are based on the work of Kaufman and Rousseeuw (2009) and are as follows:

- Poor: indicates that there is no significant evidence that the members of a cluster belong together.
- Fair: indicates that the evidence that the members of a cluster belong together is weak.
- Good: indicates that the evidence that the members of a cluster belong together is reasonable or strong.

With the algorithm manually configured to output five clusters, as in Geiger et al. (2011), the fit was fair but close to being good, as shown in Figure 9. However, keeping five clusters caused a change in the clusters whereby the Selective Sourcing with Crowd Assessment cluster (cluster three, in purple in the first column of Table 11) was merged with the Selective Sourcing without Crowd Assessment cluster (cluster five, in orange in the first column of Table 11). It would be possible to call this the selective sourcing cluster, since this is the feature of the cluster that immediately stands out. However, the fact that this is the only overall common feature of the cluster makes it difficult to attach meaning to it.

Besides that, the Integrative Sourcing without Remuneration cluster (cluster one, in pink in
the first column of Table 11) was replaced by two new clusters.

Figure 9 Summary of TwoStep clustering with five clusters

One of the new clusters (cluster one, in red in the second column of Table 11) contains former elements of the Integrative Sourcing without Remuneration cluster plus GetLocalization, Crowdin, Launchpad and Pootle (closed project). This cluster mixes:

- Integrative, Select and Integrate, and selective aggregation (all the possible characteristics)
- Assess accessibility (one out of four)
- No payment (one out of three)
- No and context-based Preselection of Contributors (two out of three)

The defining feature of this cluster is the combination of the assess level and no payment. This cluster could be called 'no payment with crowd assessment'. However, the fact that it contains all the possible aggregation characteristics makes it difficult to attach any semantic utility to it.
Seven of the translation platforms (Amara (Universal Subtitles), Transifex (professional), DotSub (open project), Zanata (open project), Pootle (Mozilla projects), Kiva and Asia Online) appeared in the other new cluster (cluster 2, in pink in the second column). This cluster mixes platforms with:

- Integrative, and Select and Integrate aggregation (two out of three)
- View, none, and modify Accessibility (three out of four)
- None, success-based, and fixed payment (all the possible)
- None, qualification-based, context-based, and both Preselection of Contributors (all the possible)

This cluster is too diverse to select a defining feature, because it contains every platform that did not fit in any of the other clusters that do have defining features.
The resulting clusters, when constrained to five, resisted interpretation. This could be because of insufficient data points, because of the algorithm not being able to integrate background knowledge (field knowledge that can be used to modify the behaviour of the algorithm (Wagstaff et al. 2001)), or because of a combination of both.
If the algorithm is configured to produce six groups, the fit of the clusters is very slightly improved, as depicted in Figure 10, where the purple bar that indicates the quality of the clusters is slightly closer to the threshold value between fair and good quality. This indicates that the clusters obtained when constraining the number of clusters to six contain items that are slightly more similar among themselves and slightly more dissimilar to the items in the other clusters.

Table 11 Original cluster distribution and distribution after data expansion

Platform | Original five clusters | Five clusters after data and taxonomy extension | Six clusters after data and taxonomy extension
Delicious | 1 | 1 | 1
Digg | 1 | 1 | 1
Facebook Translate | 1 | 1 | 1
Fashiolista | 1 | 1 | 1
TripAdvisor | 1 | 1 | 1
YouTube | 1 | 1 | 1
Amazon reviews | 1 | 1 | 1
Camclickr | 1 | 2 | 2
Google Image Labeler | 1 | 2 | 2
ReCaptcha | 1 | 2 | 2
Hollywood Stock Exchange | 1 | 2 | 2
Wikipedia | 1 | 2 | 3
OpenStreetMap | 1 | 2 | 3
Angie's List | 1 | 2 | 2
eBay reputation system | 1 | 2 | 2
Emporis Community | 1 | 2 | 2
Android Market | 2 | 3 | 4
Apple AppStore | 2 | 3 | 4
Yahoo! Contributor Network | 2 | 3 | 4
iStock Photo | 2 | 3 | 4
YouTube Partners | 2 | 3 | 4
99designs ready-made logo design | 2 | 3 | 4
Coolspotters | 2 | 2 | 3
Iowa Electronic Markets | 2 | 2 | 3
Atizo (Atizo Community) | 3 | 4 | 5
Cisco I-prize | 3 | 4 | 5
Threadless | 3 | 4 | 5
Atizo (Own Community) | 3 | 4 | 5
InnoCentive atWork | 3 | 4 | 5
Dell Ideastorm | 3 | 1 | 1
e-Rewards | 4 | 5 | 6
Microtasks | 4 | 5 | 6
LiveOps | 4 | 5 | 6
Castingwords | 4 | 5 | 6
Mechanical Turk | 4 | 5 | 6
Netflix Prize | 5 | 4 | 5
InnoCentive Challenge Center | 5 | 4 | 5
99designs (private contests) | 5 | 4 | 5
Brainrack | 5 | 4 | 5
Calling all Innovators | 5 | 4 | 5
Crowdspring (private contests) | 5 | 4 | 5
Designlassen.de (private contests) | 5 | 4 | 5
idea bounty | 5 | 4 | 5
99designs (public contests) | 5 | 4 | 5
Crowdspring (public contests) | 5 | 4 | 5
Designlassen.de (public contests) | 5 | 4 | 5
Amara (Universal Subtitles) | - | 2 | 3
Transifex (professional) | - | 2 | 3
DotSub (open project) | - | 2 | 3
Zanata (open project) | - | 2 | 3
Pootle (Mozilla projects) | - | 2 | 3
Kiva | - | 2 | 2
Asia Online | - | 2 | 2
GetLocalization | - | 1 | 1
Crowdin | - | 1 | 1
Launchpad | - | 1 | 1
Pootle (closed project) | - | 1 | 1
VerbalizeIt | - | 4 | 4

A new, meaningful cluster appears (cluster 3, in yellow in the third column of Table 11). This cluster contains platforms with:

- Integrative, and Select and Integrate aggregation (two out of three)
- Modify accessibility (one out of four)
- No, success-based and fixed payment (all the possible)
- No Preselection of Contributors (one out of four)

The lack of preselection and the ability of contributors to modify the contributions of others indicate a process that could be similar to that of an open wiki. The fact that all payment possibilities are present is problematic, since paying contributors for modifying existing work can invite abuse of the system.
Interpreting the six clusters is, however, still challenging, given the presence of the 'selective sourcing' and 'no payment with crowd assessment' clusters and of a cluster that contains the platforms that did not fit in the rest.

Figure 10 Summary of TwoStep clustering with six clusters

3.2.2 Other Approaches to Clustering


Given that the TwoStep algorithm did not produce meaningful clusters after extending the conceptual framework by adding an additional characteristic to the Aggregation of Contributions dimension and adding the data for crowdsourced translation platforms, other approaches to clustering were investigated.
Fasulo (1999) observes that there are four criteria to take into account when selecting a clustering algorithm: data and cluster model, scalability, noise, and result presentation.
- Data and cluster model: without a priori knowledge of the shape of the desired clusters to guide the selection of the algorithm (Fasulo 1999; Mythili and Madhiya 2014), the researcher could not use this criterion in this research.
- Scalability: the small number of dimensions, characteristics and data points used meant that scalability was not important for the selection of the algorithm.
- Noise: by limiting the data to the crowdsourced translation platforms, the sample becomes free of noise, so noise handling did not play a role in the selection of the algorithm.
- Result presentation: SPSS offers a selection of options to represent the results of any of the clustering algorithms it can execute. For this reason, result presentation did not influence the selection of the algorithm.

Given that none of the criteria suggested by Fasulo were helpful, the researcher resorted to trial and error, which has been suggested as the best option in cases like this (Mythili and Madhiya 2014).
The researcher decided to use two approaches to clustering, selecting a partitioning method and a hierarchical method.
Partitioning methods construct a number of clusters, referred to as k, that is dictated by the researcher. These methods compare the objects that are being clustered by searching for candidate partitions where the objects are similar among themselves and also different from the objects in other partitions (Kaufman and Rousseeuw 2009).
There are two types of hierarchical methods: agglomerative and divisive. Agglomerative methods start with as many clusters as there are data points and merge two clusters at each pass through the data. This is done iteratively until there is a single cluster that contains all objects. Divisive methods start with a single cluster that contains all objects, and at each pass a cluster is divided in two, until there is a cluster for each single object (ibid.).

3.2.2.1 K-Means clustering


The partitioning approach chosen was k-means clustering. K-means is a well-known and
popular algorithm that was first published in 1955 (Jain 2010).
Data Preparation

K-means requires the data entered to be in the form of numbers. This turned out to be beneficial, because one of the weaknesses of TwoStep as used by Geiger, Seedorf et al. (2011) was that it worked with nominal variables, whereas this researcher sees all the dimensions as ordinal variables in that they represent discrete scales (Kaufman and Rousseeuw 2009).
The Aggregation of Contributions dimension goes from least integrative to most integrative. Bearing this in mind, the researcher attached Select to the lowest end of the scale and Integrative to the highest. Select and Integrate was placed in the middle because it combines both.
Accessibility of Contributions starts at the level None, where there is no access, and each level beyond requires more access. The second level is View. The third level is Assess and, in this context, in order to assess one must be able to view. The fourth level is Modify and, in order to modify, one has to be able to view and to assess the existing translation as worth modifying.
The characteristics of Remuneration of Contributions represent the potential frequency with which contributions receive rewards. At the lowest level there are No material rewards for contribution, which means zero probability of receiving a reward. At the second level, rewards are given if the contribution achieves Success, which implies a variable probability of receiving a reward. Finally, at the third level, a contributor Always receives a compensation, which implies a 100% probability of being rewarded.
Finally, the Preselection of Contributors scale starts with no preselection, moves on to selecting people according to their context, then to a third level requiring them to pass a test, and eventually to the most stringent preselection, where they must meet contextual criteria and pass a test.
As per Kaufman and Rousseeuw (2009), all variables were converted using the following formula, which also standardises them by ensuring that all values fall between zero and one:

z_if = (r_if - 1) / (M_f - 1)

In the formula, M stands for maximum rank, f stands for variable, r stands for rank, and i stands for the index of the object. M_f is the highest rank in a variable; r_if is the rank of object i in variable f; and z_if is the scaled value that is provided to the algorithm. For example, the ranks for Aggregation of Contributions are 1 for Select, 2 for Select and Integrate and 3 for Integrative. The highest rank is thus 3. The value that must be entered for the Select and Integrate characteristic is found by calculating:

z = (2 - 1) / (3 - 1)

The result of that calculation is 0.5.
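A minimal sketch of this conversion, using the rank orderings just described, follows. Note that the formula as stated yields 1.0 for the top rank of a four-level dimension, slightly different from the 0.99 recorded for Modify and Both in Table 13, which appears to use fixed 0.33 increments.

```python
# A minimal sketch of the conversion z_if = (r_if - 1) / (M_f - 1), using the
# rank orderings described above for the four ordinal dimensions.
def scale(rank: int, max_rank: int) -> float:
    """Map an ordinal rank in 1..max_rank onto the interval [0, 1]."""
    return (rank - 1) / (max_rank - 1)

dimensions = {
    "aggregation": ["select", "select and integrate", "integrative"],
    "accessibility": ["none", "view", "assess", "modify"],
    "remuneration": ["no", "success", "always"],
    "preselection": ["no", "context", "test", "both"],
}

for dim, levels in dimensions.items():
    M = len(levels)
    print(dim, {level: round(scale(r, M), 2) for r, level in enumerate(levels, 1)})
# e.g. "select and integrate" -> (2 - 1) / (3 - 1) = 0.5
```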


Initially, all dimensions carried the same weight, but the clusters resulting from this were not amenable to interpretation, as depicted in Table 12, where platforms that are quite similar, like Amara and DotSub, are included in the same cluster as platforms like Pootle (as used by Mozilla) and Crowdin, which have much more in common with Facebook and Pootle (closed projects).
Table 12 Cluster membership with the same weight assigned to all dimensions

Case Number | Name | Cluster | Distance
1 | Amara (Universal Subtitles) | | 0.267
3 | DotSub (open project) | | 0.267
6 | Zanata (open project) | | 0.181
7 | Launchpad | | 0.267
9 | Pootle (Mozilla projects) | | 0.446
10 | Crowdin | | 0.267
2 | Transifex (professional) | | 0.000
4 | Facebook Translate | | 0.183
5 | GetLocalization | | 0.367
8 | Pootle (closed project) | | 0.183
11 | VerbalizeIt | | 0.607
12 | Kiva | | 0.319
13 | Asia Online | | 0.375

An advantage of converting the scales to numbers is that it allowed the researcher to modify the weights of different dimensions. After considering the issues with the first attempt to cluster, the researcher decided to increase the weight of the Aggregation of Contributions dimension by multiplying the converted values by three. This effectively equates to using the aggregation dimension three times (Kaufman and Rousseeuw 2009). The results can be seen in Table 13.
Clusters:
After applying the conversion above, the data was entered into SPSS 22 and submitted to k-means clustering. Table 14 displays the results of the clustering.
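For readers without SPSS, a minimal sketch of the same step with scikit-learn follows. The five rows are scaled values for a subset of the platforms, read off Tables 10 and 13; tripling the first column reproduces the weighting just described, and k is set to four as in the analysis.

```python
# A minimal sketch, using scikit-learn instead of SPSS 22, of the weighting
# and clustering step described above. Rows are a subset of the platforms.
import numpy as np
from sklearn.cluster import KMeans

names = ["Amara", "DotSub", "Kiva", "VerbalizeIt", "Asia Online"]
# Columns: aggregation, accessibility, remuneration, preselection (scaled 0..1).
X = np.array([
    [1.0, 0.99, 0.0, 0.00],  # Amara: Integrative, Modify, No, No
    [1.0, 0.99, 0.0, 0.00],  # DotSub: Integrative, Modify, No, No
    [1.0, 0.00, 0.0, 0.66],  # Kiva: Integrative, None, No, Qualification-based
    [1.0, 0.66, 0.5, 0.99],  # VerbalizeIt: Integrative, Assess, Success, Both
    [0.5, 0.00, 0.0, 0.66],  # Asia Online: Select and Integrate, None, No, Qualification-based
])
X[:, 0] *= 3  # triple the weight of the Aggregation of Contributions dimension

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
for name, label in zip(names, labels):
    print(f"{name}: cluster {label}")
```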
Table 13 Conversion values for the characteristics of each dimension

Aggregation of Contributions | Accessibility of Contributions | Remuneration of Contributions | Preselection of Contributors
Select = 0 | None = 0 | No = 0 | No = 0
Select and Integrate = 1.5 | View = 0.33 | Success = 0.5 | Context = 0.33
Integrative = 3 | Assess = 0.66 | Always = 1 | Test = 0.66
 | Modify = 0.99 | | Both = 0.99

Colony Translation
Cluster one contains Asia Online. Asia Online's Wikipedia project selected and integrated TUs, without letting contributors see each other's translations, using unpaid volunteers who could only contribute after passing a test (Vashee 2009). According to Vashee, by collecting three independent, different translations for each TU and selecting among them via automated matching or with the help of an expert, Asia Online achieved good translations at an extremely high speed. Although the volunteers received no material rewards, they did gain opportunities to participate in draws for different awards. The researcher's view is that having volunteers working in an isolated manner, even when they are working on the same TU, shows that, for people using this approach, creating a community is not an objective. This impression is further reinforced by the fact that volunteers had to pass a test, which would severely reduce the size of the community. Furthermore, although no material reward was given to contributors, participation in draws for prizes was used to motivate them; this indicates that the organisation might perceive the social aspects of the project to be insufficient to motivate the crowd.
Table 14 Cluster membership after triplicating the weight of the Aggregation dimension

Case Number | Name | Cluster | Distance
1 | Amara (Universal Subtitles) | 2 | 0.333
2 | Transifex (professional) | 2 | 0.667
3 | DotSub (open project) | 2 | 0.333
4 | Facebook Translate | 3 | 0.224
5 | GetLocalization | 3 | 0.224
6 | Zanata (open project) | 3 | 0.290
7 | Launchpad | 3 | 0.186
8 | Pootle (closed project) | 3 | 0.224
9 | Pootle (Mozilla projects) | 3 | 0.512
10 | Crowdin | 3 | 0.186
11 | VerbalizeIt | 4 | 0.448
12 | Kiva | 4 | 0.448
13 | Asia Online | 1 | 0.000

Zaidan and Callison-Burch (2011) used a process that was similar in many ways, since they collected multiple translations for each TU. However, in their case there was a second layer in the crowdsourced process: the assessment of the TUs in the form of a ranking. The aggregation was thus Select and Integrate. The ranking process, however, was not done by peers, but by another crowd and an algorithm. In this experiment, people at the translation stage could not see the translations of others, and people at the ranking stage could not see the rankings of others. Effectively, the peer accessibility in this experiment was None. The platform used to carry out the process was Amazon Mechanical Turk, so it could be argued that the preselection was Contextual, since only people with a Mechanical Turk account could participate. However, having a suitable account is a minimum requirement affecting most platforms, and this researcher considers it a No preselection system. Finally, everyone involved was paid for their contributions, meaning that they were Always rewarded. The purpose of this experiment was to research the most cost-effective way to create training resources for MT systems.
Another example of a similar process is the one used by TXTEagle (Eagle 2009), which collected redundant translations provided via SMS by different workers, who in principle did not know the suggestions of other translators. This meant that their access was None. The translation that is eventually used is selected using statistical methods and, although no reference to it is made in the published work, it could be argued that different selected translations could be pooled together to create a text that is longer than an SMS, or strings in a piece of software. This means that the aggregation is Select and Integrate. The users of TXTEagle are always rewarded with phone credit. Unfortunately, the paper does not address the selection process and, for this reason, the platform could not be included in the data used to create the taxonomy.
Bearing all the above in mind, the researcher considers that cost effectiveness and speed are the main reasons to use this type of crowdsourced translation process and that community is not a priority for organisations using it. The proposed name for this type of cluster is Colony Translation, after colonial organisms, which are multiple-cell organisms whose cells, unlike the cells in multicellular organisms, can live independently from the bigger organism. This underlines the fact that although contributors are working towards the same objective, they do their part independently.
Wiki Translation
Cluster two contains Amara, DotSub and Transifex. All three platforms have integrative aggregation and allow contributors to modify previously existing translations. Transifex, as used by the respondent of the survey, Always rewards contributions, while Amara and DotSub, as used by the researcher, provide No material reward for contributions. All of them require an account and nothing else in order to contribute (as configured by the researcher and the respondent of the survey), indicating No preselection of contributors.
Although both Amara and DotSub have messaging functions, these are not tools that would be used to create a community. Instead, as implied by the answers of Subject V in Chapter 5, whose organisation uses DotSub, they are platforms that were leveraged to deploy an existing community.
The fact that the aggregation is integrative and that contributors can modify previous translations pointed the researcher towards wiki-style processes. For this reason the researcher decided to name this cluster Wiki Translation.
Translation for Engagement
Cluster 3 contains Facebook Translate, GetLocalization, Zanata (open project), Launchpad, Pootle (closed project), Pootle (Mozilla) and Crowdin. The defining features of this cluster are Select and Integrate aggregation in almost all the platforms in this cluster, and Assess accessibility.
Facebook has claimed that it uses crowdsourcing because it expects higher quality from its users (Losse 2008). Respondents from Mozilla, Zanata and GetLocalization stated increasing user engagement as a reason to use crowdsourced translation. Mozilla added that crowdsourcing the translation suited the open source ethos of the company. Most of them do not preselect contributors (beyond the need for an account in order to participate). All of these platforms work with strings as TUs.
Considering that access is open for anyone to contribute and assess translations, and that, as stated in the previous paragraph, for several people involved with platforms in this cluster one of the motivations to use crowdsourced translation is that it fosters engagement, the researcher suggests that this cluster should be called Translation for Engagement.
Crowd TEP
Cluster four contains Kiva and VerbalizeIt. Both platforms in this cluster select their contributors, and both organisations use integrative aggregation. In the case of Kiva, known via the experience of the researcher as a translator for the organisation, the integration happens at the site level, when all the independently translated loan applications appear on the site. There is no Aggregation of Contributions at the TU level. Kiva does not materially reward its contributors, but VerbalizeIt does when translations are successful.
The lack of selection or integration at the TU level in Kiva's process is also a feature of the process of the NGO for which Subject S in Chapter 5 works. This and the existence of a review stage are typical of a TEP process. Furthermore, although the deeper details of VerbalizeIt's process are not available for this research, their website shows that they perform a TEP process (VerbalizeIt 2014).
The fact that the process is an implementation of the TEP process made the researcher consider the name Volunteer LSP style. However, the fact that this process can be carried out using paid translators, and that for some organisations, like Kiva, there is a focus on community (Petras 2011), led to the eventual choice of Crowd TEP as the suggested name for this cluster.

3.2.2.2 Hierarchical clustering


The hierarchical method selected was SPSS's hierarchical clustering, which is an agglomerative hierarchical method (Burns and Burns 2008). As with k-means, the software was configured to output four clusters. The clustering algorithm was between-groups linkage, and the measure used to determine whether linkage happens was Euclidean distance. The data was submitted to the same preparation used for k-means, and no standardization was applied, since the only deviation in the scale of the dimensions was purposeful.
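A minimal SciPy sketch of this configuration follows, reusing the weighted matrix from the k-means sketch above. Mapping between-groups linkage to SciPy's 'average' linkage is an assumption about terminology, and the rows remain illustrative.

```python
# A minimal SciPy sketch of the hierarchical configuration described above:
# between-groups (assumed "average") linkage with Euclidean distance,
# cutting the tree at four clusters.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([
    [3.0, 0.99, 0.0, 0.00],  # Amara (aggregation column already tripled)
    [3.0, 0.99, 0.0, 0.00],  # DotSub
    [3.0, 0.00, 0.0, 0.66],  # Kiva
    [3.0, 0.66, 0.5, 0.99],  # VerbalizeIt
    [1.5, 0.00, 0.0, 0.66],  # Asia Online
])

Z = linkage(X, method="average", metric="euclidean")
labels = fcluster(Z, t=4, criterion="maxclust")  # cut the dendrogram at 4 clusters
print(labels)
```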
Table 15 shows the output of the clustering process. Although the numbers given to each cluster are different, it can be seen that the results are the same as when applying k-means, which reinforces the impression of the researcher that these are meaningful clusters.
Table 15 Cluster membership as per hierarchical clustering

Case | Cluster
1: Amara (Universal Subtitles) |
2: Transifex (professional) |
3: DotSub (open project) |
4: Facebook Translate |
5: GetLocalization |
6: Zanata (open project) |
7: Launchpad |
8: Pootle (closed project) |
9: Pootle (Mozilla projects) |
10: Crowdin |
11: VerbalizeIt |
12: Kiva |
13: Asia Online |

3.3 Conclusions
This chapter opens by presenting the rationale for a taxonomy for crowdsourced translation processes and then proposes extensions to the taxonomy developed by Geiger, Seedorf et al. (2011). The extension is the result of adding a new Select and Integrate characteristic to the aggregation dimension and of adding translation-specific platforms. The chapter then discusses the survey used to collect the necessary data elements for the extension, which serve as the input to the clustering process. Having found that the TwoStep clustering algorithm used by Geiger, Seedorf et al. (ibid.) can no longer produce meaningful clusters, the chapter addresses the clustering of just the crowdsourced translation data using k-means. K-means produced meaningful clusters with the suggested names:

- Colony Translation: for processes where selected and aggregated translations are produced by translators working independently.
- Wiki Translation: for processes where anyone can modify the existing translations.
- Translation for Engagement: for processes where selected and aggregated translations are peer assessed at the selection stage.
- Crowd TEP: for processes where translators work individually, with reviews done by individuals as in the traditional TEP process.
Finally, the same data used in the k-means clustering is submitted to a hierarchical clustering process and results in the same clusters, reinforcing the impression that the clusters are valid. This new taxonomy enables discussion about specific types of crowdsourced translation and reduces the miscommunication caused by the one-size-fits-all concepts that have frequently been used before.
Chapter 4 Workflow Models


This chapter builds upon the discussion of Petri nets in Chapter 2 and discusses the approach taken for the creation of the models. Workflow models for eight crowdsourced translation platforms are presented, and from these a number of practices are identified.

4.1 Static models vs simulable models


Petri nets have been used extensively in research and successfully applied in process modelling and analysis (Sarshar and Loos 2005). The literature favours making the models as understandable as possible (Becker et al. 2000; Mendling et al. 2010), and one of the ways of doing so is limiting the number of elements in them. Petri nets have an element that explicitly represents places, which increases the complexity of models that use them by increasing the number of elements. However, the researcher considers that precisely in this context of crowdsourcing, where iterative and multiple-instance tasks are frequently occurring features, having places where the tokens are visible is valuable, because it provides additional detail. This additional detail results in more fine-grained representations of the workflows that enable a deeper understanding of the processes. Furthermore, the place elements allow Petri nets to explicitly represent tokens during model simulation. It is the opinion of the researcher that when the intention is to understand a process, which is a dynamic entity, a simulation is the optimal representation, and Petri nets' ability to represent states by placing tokens in places during simulation makes them especially suitable for this use case.

4.2 The models
This section presents a collection of workflow models for different crowdsourced translation
platforms. During the creation of the models the researcher had to make a choice between
soundness and semantic correctness. Soundness is achieved in a workflow model:
if and only if (a) from any reachable state it is possible to reach a state with a
token in the sink place (option to complete), (b) any reachable state having a
token in the sink place does not have a token in any of the other places (proper
completion), and (c) for any transition there is a reachable state enabling it
(absence of dead parts) (van der Aalst 2013).

Sound models are executable and can be the base for real processes. Semantic correctness, as stated in Chapter 2, is the level of correspondence between the behaviour of the model and the real-world process (Becker et al. 2000).
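The three soundness conditions can be checked mechanically by exploring the reachability graph of a workflow net. The following is a minimal sketch of such a check on a deliberately tiny, invented net; the places, transitions and 1-safe marking representation are illustrative assumptions, not taken from any model in this chapter.

```python
# Minimal soundness check by exhaustive state exploration (toy net, 1-safe,
# so a marking can be represented as a frozenset of marked places).
NET = {  # transition name -> (places consumed, places produced)
    "t_start":  ({"source"}, {"p1"}),
    "t_work":   ({"p1"}, {"p2"}),
    "t_finish": ({"p2"}, {"sink"}),
}
INITIAL, SINK = frozenset({"source"}), "sink"

def enabled(marking):
    return [t for t, (pre, _) in NET.items() if pre <= marking]

def fire(marking, t):
    pre, post = NET[t]
    return frozenset((marking - pre) | post)

def reachable_from(start):
    seen, stack = {start}, [start]
    while stack:
        m = stack.pop()
        for t in enabled(m):
            m2 = fire(m, t)
            if m2 not in seen:
                seen.add(m2)
                stack.append(m2)
    return seen

states = reachable_from(INITIAL)
# (a) option to complete: every reachable state can still reach the sink.
option_to_complete = all(any(SINK in m2 for m2 in reachable_from(m)) for m in states)
# (b) proper completion: a token in the sink implies no other tokens.
proper_completion = all(m == {SINK} for m in states if SINK in m)
# (c) absence of dead parts: every transition is enabled in some reachable state.
no_dead_parts = {t for m in states for t in enabled(m)} == set(NET)
print("sound:", option_to_complete and proper_completion and no_dead_parts)
```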
The researcher favoured soundness because it is a guarantee of syntactic quality which, as stated in Chapter 2, is the only criterion for the quality of a model that has any hope of being objectively measured (Krogstie et al. 2006), but still strove to achieve as much semantic quality as possible.
Yasper can differentiate between tokens belonging to different cases, that is, tokens that were or are derived from tokens produced by the emitter at different points in time. However, it cannot discriminate between tokens within a single case, that is, tokens generated by token splits of multiple-instance tasks. This limitation of Yasper also forces some divergences between the actual processes and the models. In both cases, when the model purposely deviates from the actual process, this is noted in the process description.

4.2.1 Crowdin
Crowdin has been brought up as an example of a tool supporting crowdsourced translation (Garcia 2010) and it has been used in localisation projects.
In order to simulate the different stakeholders, three different user accounts were created. One account was the owner of the project and effective project manager, and the other two were used to simulate the crowd of translators. In Crowdin, projects can be configured as "managed", where collaborators have to be accepted before they can contribute, or "open", where anyone can participate.

Figure 11 Crowdin Process Model at the Locale Level

A project owner can upload a file, select the locales that are taking translations, contribute as a translator in their own project, select translations, download target files with the target strings for each locale, and decide at any time that the locale is finished, which results in the locale being closed. Normal contributors can only suggest translations and vote. Project creators can at any time promote contributors into group leaders. Besides translating and voting, group leaders can also select translations.
Figure 11 represents the process at the locale level. This level of the process requires supporting the Multiple Instances without Synchronization workflow pattern, because each locale is an instance that runs independently from the others. The model in Figure 11 does not show multiple locales as different instances because of Yasper's inability to discriminate between tokens from different instances within one case. In this situation, using multiple instances would make the model unsound and the simulation would not run correctly. If Yasper were able to discriminate tokens by instance, the arc between the "select locales" task and the M2 place would show the number of instances. After creating an instance for each locale, the system divides the source text into TUs, which results in an implementation of Multiple Instances with a priori Run-Time Knowledge, where there is an instance for each TU. Again, the model does not show the number of instances, but if it did, it would be visible on the arc between "Divide Source into TUs" and place M3. The hypothetical instances from the TUs would be synchronized during the "Merge TUs in Target File" task that happens when the project owner requests a download.
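The Multiple Instances without Synchronization pattern that the model cannot show directly is straightforward to express in code. The following is a minimal sketch in Python; the locale names and the worker body are illustrative assumptions.

```python
# Minimal sketch of Multiple Instances without Synchronization: one
# independent instance per locale, started together and never joined.
import threading

def run_locale(locale):
    # Each instance divides the source into TUs and proceeds on its own;
    # nothing downstream waits for the other locales.
    print(f"translating into {locale}")

selected_locales = ["es", "de", "ja"]  # hypothetical choice by the project owner
for locale in selected_locales:
    # Deliberately not join()ed: the defining feature of the pattern is the
    # absence of a synchronization point between the instances.
    threading.Thread(target=run_locale, args=(locale,)).start()
```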
Figure 12 represents the pre-translation process at the string level. At the time when this model was being created, the system collected matches from its own TM and translations from both Google's and Microsoft's MT systems. This meant that the pre-translation stage required a Parallel Split to obtain them.
Once the pre-translations have been carried out, the flow converges through an Acyclic Synchronizing Merge, as there may be no matches from the TM or one of the MT systems may be down. In the model this is simulated by the addition of the "Add Nothing" task, which passes on an empty token, and the "Store Pre-Translations" task, which consumes three tokens, reproducing the behaviour of the system and keeping this stage of the model structured.
Again, it is worth remembering that the model departs from the actual process because Yasper cannot discriminate the origins of a token at the instance level. If multiple instances had been enabled earlier on by the generation of multiple tokens during "Divide Source into TUs", the "Store Pre-Translations" synchronization task could have been enabled by any three tokens coming from instances related to different TUs, instead of by the three tokens corresponding to a single TU, which would be the real-world behaviour. Once the pre-translations are stored, the strings undergo the translate, edit, and vote subworkflow that is represented in Figure 13, where one of the pre-translations could be selected for publishing.
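The combination of a Parallel Split and an Acyclic Synchronizing Merge in the pre-translation stage can be sketched as follows. This is a minimal illustration assuming three stand-in lookup functions rather than real TM or MT client calls; a failed or empty branch contributes the equivalent of the "Add Nothing" empty token.

```python
# Minimal sketch of the pre-translation stage: Parallel Split into three
# branches, then a merge that always consumes three results even when a
# branch yields nothing.
import asyncio

async def tm_lookup(source):
    return None  # stand-in: no TM match found

async def google_mt(source):
    return f"google({source})"  # stand-in for a real MT call

async def microsoft_mt(source):
    return f"microsoft({source})"  # stand-in for a real MT call

async def pretranslate(source):
    # Parallel Split: all three branches start at once.
    results = await asyncio.gather(
        tm_lookup(source), google_mt(source), microsoft_mt(source),
        return_exceptions=True,  # a branch being "down" must not block the merge
    )
    # Acyclic Synchronizing Merge: three tokens are always consumed; empty or
    # failed branches behave like the "Add Nothing" empty token in the model.
    return [r for r in results if r is not None and not isinstance(r, Exception)]

print(asyncio.run(pretranslate("Hello world")))
```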
The first task in Figure 13 enables the system to output untranslated strings. This is because, at the time of the creation of this model, the project owner could request a download of the target file even if no translations were available or none had been selected. This resulted in Crowdin outputting the source text, effectively merging source and translated text in its output. This behaviour changed in later iterations of the software with the introduction of different automated selection methods.

Figure 12 Crowdin MT and TM leveraging subworkflow

Supporting the merging of untranslated and translated strings again requires supporting the Acyclic Synchronizing Merge control flow pattern. In this model this has been modelled by presenting the "Output Untranslated String" task.

Since it is not possible to know beforehand how many people will suggest a translation and whether any of them will suggest more than one, it is necessary to support Multiple Instances without a priori Run-Time Knowledge. Unless there are no matches and both the Bing and Google MT systems failed, after the pre-translation it is also possible to vote for or against the pre-translated strings. Since we cannot know how many people will vote, it is again necessary to support Multiple Instances without a priori Run-Time Knowledge. In this model the creation of instances for both translation and voting tasks is represented by loops. Although the loop representation allows for multiple instances to be created on demand, it does not allow for more than one instance to exist at any given time. This is not how the actual system works: Crowdin allows multiple instances to co-exist, and the creation of instances is not dependent on the previous instance being finished. However, the loop approach is the best possible approximation within the limitations of Yasper's implementation of coloured Petri nets.

Figure 13 Crowdin translate and vote subworkflow

The "select translation" task has a number of flushing links that consume the tokens in all the relevant places in order to close the loops and prevent new downloads of files with untranslated strings. This is an implementation of the Cancel Region pattern that stops the translation and voting loops and the potential downloads where the string remains untranslated. The project owner can also cancel the project at any time, but what happens in case of cancellation is not relevant for this thesis and has not been modelled.
4.2.2 Asia Online
According to Vashee (2009), Asia Online has translated a great part of the content of the English Wikipedia into Thai; to do this they used MT and a selected community of users. Each document is segmented and translated by their MT system before contributors access it. As visible in Figure 14, each segment is post-edited thrice, each time by a different collaborator, in what is an implementation of the Multiple Instances with a priori Design-Time Knowledge pattern.

Figure 14 Model for Asia Online's Wikipedia translation project

Then their corrections are compared, resulting in an implementation of the Synchronization pattern. If two post-editions are the same, the translation is automatically sent to their main translation database; otherwise, the corrections go to an expert for manual selection, which means the Exclusive Choice pattern must be supported. The administrator selects the authoritative version and, if any of the other translations are good enough, they are stored in a database of alternative translations. This means that there are three possible outcomes after the automated matching (a minimal sketch of this selection step follows the list):
a) One translation is selected via automated matching for publishing and the other is discarded.
b) One translation is selected by the administrator for publishing and the others are discarded.
c) One translation is selected by the administrator for publishing, one or two translations are selected for storage in the alternative translation database, and one or no translations are discarded.
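A minimal sketch of this selection step in Python follows. The agreement check and the expert step are illustrative assumptions: expert_select stands in for the manual choice, and all non-chosen post-edits are returned as candidate alternatives, whereas in the real process only those the expert judges good enough would be stored.

```python
# Minimal sketch of Asia Online's selection step: auto-publish on agreement
# between two of the three post-edits, otherwise fall back to expert choice.
from collections import Counter

def select_translation(post_edits, expert_select):
    assert len(post_edits) == 3  # three independent post-edits per segment
    best, freq = Counter(post_edits).most_common(1)[0]
    if freq >= 2:
        # Synchronization + automated matching: two post-editors agreed.
        return best, []
    # Exclusive Choice: no agreement, so an expert picks the authoritative
    # version; the rest are candidates for the alternative-translation database.
    chosen = expert_select(post_edits)
    return chosen, [t for t in post_edits if t != chosen]

# Hypothetical usage: the agreement case publishes "a" automatically.
print(select_translation(["a", "a", "b"], expert_select=lambda xs: xs[0]))
```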
Independently of the choice, storing one or two alternative translations requires supporting the Multiple Instances without a priori Run-Time Knowledge pattern. This was modelled by using an exclusive choice split that can result in no task, which implies no instances, in a "Store One Alternative Translation" task, which implies one instance, or in a "Store Two Alternative Translations" task, which implies two instances. Since it is also possible that no alternative translations are saved, it is necessary to support the Acyclic Synchronizing Merge, which in this model is done by directing the empty token generated at "How many of the remaining translations are good?" directly to place P15. The empty token is consumed at the Simple Merge in Merge 1. This is made visible in the model through the number two that appears on the arc coming out of place P15.

4.2.3 Facebook
This model was created using Losse's (2008) description of the process, Mesipuu's (2010) research, and the researcher's own experience as a user and translator during October 2010. Facebook uses different models for localisation depending on the language. The community supported languages do not have a professional QA stage. The fully supported languages involve a final quality check before any translation goes live.
For all languages Facebook has a first terminology stage that includes some thirty terms. Those terms are translated and their translations fixed before general UI translation happens, although in some cases this may be delayed until later on; the mechanism, however, is the same (Losse 2008; Lenihan 2014).
This model is for what Losse called the community supported languages. In this model users can suggest translations for the strings in the interface or in a string list view. At the locale level, which is modelled in Figure 15, an instance of a translation subworkflow is created for each string, which means that there are Multiple Instances without Synchronization of the translate and vote subworkflow, since each string is translated independently. The instances have not been modelled explicitly because of the tool limitations, as was the case with Crowdin, but if they were, the arrow between the "Generate TUs" task and the following place would display the number of instances.
Figure 15 Facebook's process at the locale level

The model of the subworkflow presented in Figure 16 includes a "present source" task that is a placeholder needed to deal with Yasper not supporting the input place of a subworkflow linking to more than one task at once. In Facebook's process each string can receive translation suggestions from an unknown number of users; this is an implementation of the Multiple Instances without a priori Run-Time Knowledge pattern. Each of those translations can receive votes from an unknown number of users, which again requires implementing the Multiple Instances without a priori Run-Time Knowledge pattern. This has been modelled by creating a loop that automatically re-enables users to look at existing translations and decide whether they want to add their own translation, vote for one of the existing ones, or overwrite their vote if they have already voted for a translation of that string. The suggestion with the best rating (the rating is calculated according to weighted votes) is automatically selected and becomes the translation that appears on the UI. The selected translation can later be superseded by a different, more popular translation as votes come in. Since this process of translations being replaced by more popular translations can happen numerous times and the number of times that it will happen is not known, supporting this behaviour requires supporting the Multiple Instances without a priori Run-Time Knowledge pattern. In this model this is done with the same loop that enables the votes and translation suggestions, which is most visible in the arc between the "Freeze?" exclusive choice split and the place P1. It is worth noting that where all the previous appearances of Multiple Instances without a priori Run-Time Knowledge resulted in more votes or more translations being added, here every time the "Select Translation" task is executed the result is the previous translation being displaced. In order to prevent the "Select Translation with the most votes" task from happening before there are any translations available, inhibitor links have been placed going from P13 and P14 to the "Select Translation with the most votes" task.
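The weighted-vote selection described above can be sketched as follows. The vote weights and data layout are illustrative assumptions; Losse (2008) does not publish the actual weighting scheme.

```python
# Minimal sketch of selecting the suggestion with the best weighted rating.
def select_translation(suggestions):
    """suggestions: {translation: [(vote, weight), ...]} with vote = +1 or -1."""
    def rating(votes):
        return sum(vote * weight for vote, weight in votes)
    return max(suggestions, key=lambda t: rating(suggestions[t]))

current = select_translation({
    "Hola mundo":  [(+1, 2.0), (+1, 1.0)],   # rating 3.0 -> published
    "Hola, mundo": [(+1, 1.0), (-1, 1.0)],   # rating 0.0
})
# Re-running the selection as new votes arrive can displace the currently
# published translation, which is the displacement behaviour described above.
print(current)
```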
Figure 16 Facebook's process at the string level.

Losse (2008) addressed translations being frozen by professional translators in what she called fully supported languages, but she did not address the process of freezing a string in community supported languages, and neither Mesipuu (2010) nor Lenihan et al. (2011) address the process used.

4.2.4 Pootle
Pootle has been used to localise a number of open source products such as Firefox (Razavian and Vogel 2009), LibreOffice, the documentation of the Python programming language, and the Scratch programming language (Arjona Reina 2012). The project simulations run in order to create this model were carried out in November 2010 and the version of Pootle used was 2.1. That version did not have TM functionality yet, but it could get suggestions from MT systems. However, the MT integration was not enabled in the instance that was used for this research and hence does not appear in the model. Three accounts were used to create the model: one administrator for the project and two translators to simulate the crowd. In Pootle a user creates a project by uploading one or more compatible files. The user can then select the locales that will be available for translation. These locales work independently of each other, a behaviour that requires supporting the Multiple Instances without Synchronization pattern. If there are terminology files available, they can be uploaded to Pootle in order to leverage them during the translation project. The terminology can be global, that is, applying to all the projects on the server, or project specific (Translate 2014), but from the point of view of the translation process this does not make any difference. Pootle accepts a variety of file formats. For this research Java properties files were used, and the same project was carried out with and without adding terminology to see the differences. When a file is uploaded, the translatable strings in it become TUs.

Figure 17 Model of a Pootle Process at the locale level

The strings are presented sequentially in the UI, but can be translated individually without interfering with each other. This is another case of the Multiple Instances without Synchronization pattern. Each string undergoes a "Translate, vote, submit" subworkflow before being merged into the target language file at download time. Since the merge can happen before there are submitted translations for some strings, the "Translate, vote, submit" subworkflow includes an "Output untranslated string" task to enable it. Since files with the most recently submitted translations can be downloaded multiple times until the locale is closed, it is necessary to support the Multiple Instances without a priori Run-Time Knowledge pattern in order to enact this behaviour. As in previous models, the Multiple Instances without a priori Run-Time Knowledge is modelled by a loop. In this case the loop goes from the "Download File with Translations" task to place P8, through the "no" branch of the "Is the locale finished?" XOR gate, and back to place P4. This all happens at the locale level, which is modelled in Figure 17.
When users start to translate they see a list of strings; if there is terminology, it has been leveraged before the user sees the strings and the translated term appears next to the string. If there are suggestions already, users can both vote for those suggestions and add their own suggestions. If there are no suggestions, at least one suggestion has to be added before voting is possible. Both suggestions and votes can happen as many times as users decide to vote or add suggestions, which means that support for Multiple Instances without a priori Run-Time Knowledge for each of these actions is required. In this model this is represented by the loops that include the re-enabling tasks visible in Figure 18. In the case of voting, the loop goes from "Vote for Suggestions" to place M8 and the "Re-enable Voting" task. In the case of the addition of new translations, the loop goes from the "Is voting enabled?" "Yes" arc to place M16, to the "Re-enable Adding Suggestions" task.

Figure 18 Model of a Pootle process at the string level

Since it is not necessary for any vote to be cast or even a translation to be suggested before a translation is submitted, it is necessary to support the Acyclic Synchronizing Merge. In Figure 18, this is implemented by a combination of the "Submit Translation" task being immediately enabled after the "Enable Submit Translation, check for Suggestions and Voting" task, which places a token in place M6, and the flushing links coming from the "Close String" task to places M3, M5, M8 and M9. It is worth noticing that the "Close String" task is not representative of the way Pootle actually works, since closing happens at the locale level as visible in Figure 17, but representing that behaviour in Yasper would require integrating the vote, suggest, submit subworkflow into the main workflow, which would result in a much too complex model. The "Close String" task has thus been added in order to have the closest equivalent behaviour while maintaining correct syntax.
If the user has permission to submit, they can directly submit a translation without making it a suggestion or waiting for votes. A submitted translation can also be an already existing translation that is selected by a user with submission permission, or a translation that has been added by such a user. After the translation has been submitted, a set of automatic quality checks (47, as determined by the configuration) is carried out. In the model these are represented as a single automated task for simplicity reasons, but modelling them as 47 individual parallel tasks would be more suitable since the outcomes of the tests are independent. If a test fails, the outcome is visible both on the review page and on the translate page. Users with the right permissions can resolve zero or more of the issues by editing the translation. It is worth considering that some of the issues pointed out by the tests may not be actual problems.
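Since the checks are independent, they can be expressed as parallel tasks, as the text suggests. The following is a minimal sketch with two invented checks standing in for Pootle's configurable battery of 47.

```python
# Minimal sketch of running independent quality checks in parallel; a check
# returns True when it fails, mirroring tests that flag potential issues.
from concurrent.futures import ThreadPoolExecutor

def check_placeholders(source, target):
    return "{0}" in source and "{0}" not in target

def check_not_empty(source, target):
    return target.strip() == ""

CHECKS = [check_placeholders, check_not_empty]  # stand-ins for the real 47

def run_checks(source, target):
    # Each check is an independent parallel task, as the text recommends.
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda c: (c.__name__, c(source, target)), CHECKS)
        return [name for name, failed in results if failed]

print(run_checks("Open {0}", "Abrir"))  # -> ['check_placeholders']
```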
At any time, users with the right permissions can download a file that contains the submitted strings. If a string has no submitted translation, the source is inserted in the target file. That behaviour is visible on the upper part of the model, where the "Enable Outputting Untranslated String" and "Output Untranslated String" tasks are visible.
Over time, users with the right permissions can submit new translations that they themselves have written or that they have selected among the ones suggested by other users. This once again requires supporting Multiple Instances without a priori Run-Time Knowledge, since it is not possible to know how many times submission will happen before a locale is closed. This is modelled with a loop that goes from the "No" branch of the "Locale finished?" XOR gate, to place M17, to the "Re-enable Submissions" task, to place M6.
In Pootle the submission of a string does not prevent the collection of further suggestions or votes for that string. This is a noteworthy difference from Crowdin, where selection locks the addition of votes and suggestions.

4.2.5 Launchpad's Translations
Canonical's Launchpad is a web application for the support of open source development. It has features such as a versioning system for code, bug tracking facilities, and a translation module that was called Rosetta in November 2010, when the work for this model was carried out. Users could create projects in Launchpad and upload POT (Portable Object Template) files for their translation in Rosetta. POT files are bilingual files that contain the original strings, very limited metadata, and an empty string in the place of the translated string (Kagdi and Maletic 2006); an example entry is shown after the list below. Launchpad allows for different ways of managing contributors' permissions:

• Open: Projects where anyone can contribute to any language.
• Structured: Trusted contributors exist for some languages; only they can submit translations, but anyone can suggest translations. If a language does not have trusted contributors, it works like an open project.
• Restricted: Same as structured, but locales without trusted translators do not accept suggestions.
• Closed: Only vetted translators can suggest translations.
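
The following is what a single entry in such a POT file looks like; the string and source-code reference are invented for illustration, and msgstr is left empty until a translation is submitted.

```
#: src/main.c:42
msgid "Save changes before closing?"
msgstr ""
```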

For this research three accounts were created: one as project owner and two to simulate the crowd. The project was restricted and only had one locale besides English, which was Spanish. There was no way to close a locale once opened in 2010, and that was still the case as of May 2012 (Question #196237: Questions: Launchpad itself 2014). Since more than one locale can be selected at runtime, it would be necessary to support Multiple Instances without Synchronization. As visible in Figure 19, this behaviour has not been modelled because of the same limitations that were explained when describing the Crowdin process.

Figure 19 Model of Launchpad's translation platform at the locale level

Strings are sequentially presented as independent TUs in Rosetta's UI, with each of them undergoing the translation subworkflow individually, which is another occurrence of the Multiple Instances without Synchronization pattern being used in crowdsourced translations. Again, there is a merge stage that happens when a user downloads the translated strings, where source strings appear in the place of those strings without a translation. By collecting untranslated strings, the "merge translated strings" task is effectively acting as an Acyclic Synchronizing Merge, where the untranslated strings from the "Translate, vote, submit" subworkflow take the place of the empty token that is frequently used to model this behaviour.
Figure 20 A model of the Launchpad Translation process at the string level

The translate, vote and submit subworkflow is represented in Figure 20. Here, if there is an exact match in Launchpad's internal TM, the match will be automatically suggested. In addition to that, each string can be translated by a number of contributors. If the contributor is not a trusted translator, they can suggest translations and vote for existing translations. Both suggesting and voting can happen as many times as contributors decide to perform those tasks; therefore, supporting the Multiple Instances without a priori Run-Time Knowledge pattern is necessary to support the system's behaviour. The model, as was the case for the model for Pootle, contains loops for both tasks in order to support this behaviour. If the contributor is a trusted translator, they can submit a translation that they have written themselves or submit a translation from among the ones written by other contributors. The currently submitted translation is the one that is sent to the file with the translated strings. In restricted projects, the source string is sent to the translated file if there is no submitted translation, even if there are multiple suggested translations. Since this process of outputting untranslated strings can happen as many times as a user with the right permissions requests the merge, it is necessary to support the Multiple Instances without a priori Run-Time Knowledge pattern to support the system's behaviour. In this model this is represented by the loop between the "download file with translations" task and the "Translate, vote, submit" subworkflow in Figure 20.
As stated above, Launchpad lacks a mechanism to close locales and strings; as a result, the model that represents its process does not execute correctly because tokens become trapped in the infinite loops.

Figure 21 A suggested model for Launchpad Translation process at the string level

Figure 21 shows a suggested model that includes an exclusive choice to close the string, effectively freezing it, which closes the loops and enables the process to finish correctly. By effectively closing locales when all the strings for a given locale have been frozen, an implementation of the model in Figure 21 would prevent resources being directed to strings and locales that no longer need them.

4.2.6 DotSub
DotSub is a crowdsourced subtitling platform that has been used to translate TED talks (Notley et al. 2013) and the videos made available by Adobe through their Adobe TV program (Adobe 2014), and it is cited as an example of a crowdsourced translation platform in the literature (Désilets 2007; O'Hagan 2012; Orrego-Carmona 2012).
Figure 22 A model of DotSub's process at the video level

Three accounts were used to carry out the project used for the creation of this model during March 2012. One account was the project owner and the other two were used to simulate the crowd. The first step in a DotSub project is to create the source subtitles and time them. DotSub projects can be configured so that either only trusted users or any user can contribute, both for the creation of the source subtitles and for the translation, independently. With the right permissions, contributors can also adjust the source subtitles and timing if they so desire; this would happen within the "Create Source and Time" subworkflow, which has not been detailed in the model. Besides that, trusted users can be given "full access" permissions, which allow them to perform all the functions of the project owner. Any user with permission to translate can add new languages. This results in the system having to support an unknown number of languages that can be added at any time, which is another case of the Multiple Instances without a priori Run-Time Knowledge pattern appearing in a crowdsourced translation workflow. In this model this is represented by the loop that allows indefinite addition of locales: it starts at P2, goes to the "Add Locale" task and P3, and from there to the "Add More Locales" XOR split, which can either result in disabling the addition of locales or go back to P2 to reinitiate the loop. This is a compromise between the way the system actually works and the expressive power of Yasper's implementation of Petri nets. In the actual system the project owner can always add new locales and can disable the addition of new locales for other users at any time, not just after new locales have been added.
The source is already segmented because it is made of subtitles. Although the subtitles are generally processed independently, when a user decides to leverage MT this happens at the video level and affects all the subtitles in that video. Subtitles are presented sequentially in the UI and can be translated individually and in any order. If a subtitle has not been translated, nothing is shown in the output. As each subtitle is independent and empty outputs are possible if a translation is published before it is complete, this is a case of Multiple Instances without Synchronization where each subtitle is an instance. In Figure 22 this is not represented, but if it were, the output arc of "Generate Token for each Subtitle" would display the number of tokens.
Since contributors can iteratively edit existing translations until the translation is frozen, the "Translate or Edit" task is another task that is part of a Multiple Instances without a priori Run-Time Knowledge workflow pattern. In this model the loop that enables the Multiple Instances without a priori Run-Time Knowledge starts at P9, but it contains too many tasks to cover completely all at once. The loop eventually reaches the "Freeze?" XOR gate and either closes by going to P15 and the "Freeze" task, which has multiple reset arcs to ensure the model is sound, or goes to P16 and then to "Enable New Translation", which goes back to P9 and reinitiates the loop.

After a translation is carried out or edited, it is automatically published. If the person who edited the translation was not the owner, the owner is notified. After being notified, the owner can decide whether or not to roll the translation back to the previous version. Rollbacks happen at the locale level, meaning that good changes are lost together with pernicious changes when the "Version Rollback" is executed. As was the case with the addition of new locales, the project owner can freeze translations at any time, but the model represents it at the end of the process because it makes for a less complicated model and because the decision to freeze translations would in most circumstances be triggered by considering the translation finished or by malicious edits.
Contributors can, besides working online, download the source file, work offline, and upload their translated file. If they do this, they are effectively adding one iteration to the "translate or edit" loop. Unlike all the previous platforms, where each instance generates a suggested translation that coexists with other suggestions, in DotSub each instance overwrites the previously published translation. For this reason, it is also not possible for two people to edit the same subtitle at once: subtitles can only be open for editing by one person at a time. If a contributor tries to open a subtitle that has already been opened by someone else, they will receive a message letting them know that the subtitle is not currently available for editing.
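This one-editor-at-a-time behaviour amounts to a non-blocking lock per subtitle. The following is a minimal sketch of that idea; the lock registry and function names are illustrative assumptions, not DotSub's actual implementation.

```python
# Minimal sketch of per-subtitle exclusive editing: a second contributor is
# refused immediately instead of being queued.
import threading

subtitle_locks = {}               # subtitle id -> Lock
registry_lock = threading.Lock()  # protects the registry itself

def open_for_editing(subtitle_id):
    with registry_lock:
        lock = subtitle_locks.setdefault(subtitle_id, threading.Lock())
    if lock.acquire(blocking=False):  # fail fast rather than wait
        return True                   # the contributor may edit
    print(f"Subtitle {subtitle_id} is not currently available for editing.")
    return False

def close_editing(subtitle_id):
    subtitle_locks[subtitle_id].release()
```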
Once translations for a locale are frozen, the owner can also unfreeze them, but modelling that would result in the model having an infinite loop that cannot be closed, which would make it unsound.

4.2.7 Amara
Amara is a crowdsourced subtitle translation platform that, as of the writing of this thesis, is used for the translation of TED talks (TED Conferences 2013) and figures among the platforms discussed in the literature (Orrego-Carmona 2012; Notley et al. 2013). During the time of the creation of this model, late March 2012 to mid April 2012, the tool changed its name from Universal Subtitles to Amara. Three accounts were used to work on the project upon which this model is based. When a person adds a video to Amara for subtitling, the person does not become the project owner but a follower of the video. Every follower has the same capabilities. Amara has paid services that offer more control, but these were not explored in this thesis.

Figure 23 A model of the Amara process at the video level

The process represented in Figure 23 starts when a video is added to Amara for subtitling. Its first step is the creation and timing of the subtitles. In the model, this process is represented as a subworkflow, but its details are not part of the model since it is not part of the translation process itself. Once the captions are ready, anyone can add new languages, and each language works independently of the others, showing another case of the Multiple Instances without a priori Run-Time Knowledge pattern in a crowdsourced translation scenario. In the workflow this is represented by a loop that enables the ad-hoc addition of languages. The loop starts in M2, goes to "Add Language", then to M3, and to "Re-enable Adding Languages" before going back to M2 again. When a new language is added, contributors can work on translating the subtitles for that language. The subtitles are presented sequentially, but are independent translation units. This results in another case of Multiple Instances with a priori Run-Time Knowledge appearing in these processes. As was the case in previous models, this has not been represented in the model, but it would be by having the output arc of "Generate a Token for each Subtitle" generate as many tokens as there are subtitles.
At the time when this model was created, contributors could request a Bing MT translation for the subtitle they were working on or start the translation from scratch if there was not a translation already. This is visible in Figure 24, which represents the process at the subtitle level. A single translation can be repeatedly edited by different contributors over time. This is another case of the Multiple Instances without a priori Run-Time Knowledge workflow pattern appearing. This behaviour is represented in the model by a loop that goes from "Save and Publish" to place M4. After editing one or more subtitles, any contributor can save their work, which results in the edits being automatically published. This happens at the video level, not at the subtitle level, as visible in Figure 23. When a new version of the subtitles is published, the video's followers are notified and any of them can request a rollback. This can also happen an unknown number of times, but it is not necessary to create a loop for it, because the tokens generated by the "Save and Publish" task also enable the "Version Rollback" task.

Figure 24 A model of Amara's process at the subtitle level

Using the features available for free, there is no option to freeze captions, timing or translations, or any way to close a video. Because of that, it is always possible for anyone to add new languages and modify existing translations. This leaves the freely accessible version of Amara open to vandalism, i.e. malicious edits, and by representing this, the model in Figure 23 contains two infinite loops: from "Add Language" to M2 and from "Save and Publish" to M4. Although syntactically correct, this behaviour is very impractical because it enables unlimited resources to be dedicated to executing the loops, unless some administrative backend process is used to stop them. Furthermore, the automatic simulation of the model fails because the tokens become trapped in the infinite loops.

Figure 25 A suggested model for Amara at the video level

These issues could be solved by adding mechanisms to close the infinite loops. Figure 25 displays a hypothetical model that includes a task that closes a video and flushes all the relevant places, and a choice after M6 in order to close the infinite loops. This suggested model diverges from the way the system actually works, but it would offer some protection from malicious users by preventing changes after the subtitles have been closed; it can be run automatically and does not contain infinite loops.

4.2.8 Kiva
Kiva is an organisation that connects financiers with people who need loans (Baer and Moreno 2009) and is frequently brought up as an example of an organisation using crowdsourced translation (Désilets 2010; Désilets and van der Meer 2011; Dolmaya 2011; Kelly et al. 2011b). In its process Kiva involves volunteers and professional translators (Baer and Moreno 2009). Kiva was selected in order to have an organisation from the Crowd TEP group. In order to create the model, the researcher carried out translations for them between December 2011 and March 2012.
The model has two parts. The first part covers the process solely from the point of view of a volunteer translator. The second part has a grey overlay in Figure 26 and is hypothetical, since Kiva does not provide information about their review process. There is also an issue with the model not covering what happens when a translation does not meet Kiva's quality requirements, since the researcher had ethical concerns regarding purposely providing the organisation with poor translations in order to find out how the organisation deals with them. Kiva's contributors must undergo training and pass a test before being given access to the actual translation material. This was described by Baer and Moreno (2009) as doing the quality assurance in advance. They also brought up a peer review program where professional translators mentor new translators. This mentoring process matches the experience of the researcher.

Figure 26 A model of Kiva's process from the point of view of a volunteer

Kiva's effective work unit is the loan application. Loan applications are about 200 words each and self-contained in meaning. In Kiva's process, translators claim a number of loan applications at once. At this point three tasks become enabled: one of them is the cancellation of the translation at any stage if the deadline has passed, another is the request for an extension of the deadline, and the third is the review of the request for issues. If the deadline passes before the translation is submitted, the translation is sent back to the translation pool and any work being done is cancelled, even if it has already started. This is represented in the model by flushing links coming from the "Deadline has Passed" node to all the places preceding the "Submission of Translation" task.
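This cancel-on-deadline behaviour can be sketched with a timeout that cancels the work in flight, mirroring the flushing links in the model. The worker body and the timeout value below are illustrative assumptions.

```python
# Minimal sketch of deadline-driven cancellation of a claimed loan application.
import asyncio

async def translate_loan(loan_id):
    await asyncio.sleep(300)  # stands in for the actual translation work
    return f"translation of loan {loan_id}"

async def claim(loan_id, deadline_seconds):
    try:
        return await asyncio.wait_for(translate_loan(loan_id), deadline_seconds)
    except asyncio.TimeoutError:
        # Deadline passed: work in progress is cancelled and the loan
        # application returns to the translation pool.
        return None

print(asyncio.run(claim("L-123", deadline_seconds=0.1)))  # -> None
```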

As long as the deadline has not passed, volunteers review each of the requests for issues. Issues could be things such as the full name of the person applying for the loan appearing in the loan request text, which should not happen for privacy reasons, or the person who appears in the picture that accompanies the request not matching the expected appearance, for example, when the loan is for a young person and the person in the picture appears middle aged. If there is an issue, the volunteer reports it and the loan disappears from their queue. The next step for the volunteer is to indicate whether the language of the application is the expected one. If it is not, the volunteer can mark it so that the application is placed in the right pool of applications. If there are no issues and the language is right, the volunteer carries out the translation. Once they have finished the translation, volunteers carry out a self-review task where they identify possible issues with the translation. If the translation is good enough, they submit the translation; if not, they can edit it again and will have to do the self-review again. This is modelled as a loop that goes back to P13 if the translation fails the self-review. Kiva does not offer integrated language resources of any kind, but they have a terminology wiki that translators can use. Translators also support each other via forums, but none of this activity was integrated in the process at the time of the creation of the model.

4.3 The practices
This section presents 14 practices identified through the creation and analysis of the models.
1 Leveraging MT
The models for Crowdin, Asia Online's Wikipedia translation project, Amara and DotSub show tasks where MT is leveraged. For this reason Leveraging MT has been added to the list of practices.
2 Leveraging TM
The models for Crowdin and Launchpad include tasks where TM is leveraged. Furthermore, Pootle integrated TM leveraging too at a later stage. For this reason Leveraging TM has been added to the list of practices.
3 Leveraging Terminology
The models for Facebook and Pootle included tasks where terminology was leveraged. For this reason Leveraging Terminology has been added to the list of practices.
4 Open Alternative Translations
The models for Facebook, Pootle, Crowdin and Launchpad include loops that allow contributors to add unlimited translations for each source string. Initially this practice was named Unlimited Redundant Alternatives, but in the course of the interviews carried out for Chapter 5, it emerged that the core feature of this practice is not so much that contributors can add unlimited translations, but that they can see the translations that other contributors have suggested, hence the change of name for the practice. "Redundant" was replaced by "alternative" to avoid the connotation that most of those translations are unnecessary.
5 Hidden Alternative Translations
This practice only appears in the model for Asia Online's Wikipedia translation project, but it is also described by Zaidan and Callison-Burch (2011) and Eagle (2009). For these reasons, Hidden Alternative Translations is included in the list of practices. Initially the practice was called Limited Redundant Alternatives, but it emerged during the interviews carried out for Chapter 5 that the core feature of this practice is not that the number of translations collected is limited beforehand, but that contributors cannot see the suggestions made by other contributors.
6 Super Iterative Translation
This practice appears in the models for DotSub, Amara and Facebook. All three models include loops that allow an existing translation to be replaced by a different translation an indeterminate number of times until frozen, if freezing the translation is possible. Two implementations are visible in the models. For DotSub and Amara, the iterations happen when new translations are automatically published on saving. For Facebook's community supported languages in 2008, the iterations happened when a new translation collected more votes than the currently published translation and replaced it. Initially this practice was called Redundantly Refining Translations, but the fact that some iterations could worsen the quality of the translation, and the connotation of "unnecessary" in "redundant", made the researcher consider other names. The next name was Autopublished Iterative Translation, but this did not match situations where the iterations happened only after certain conditions were met, such as a translation gaining more votes than the current one in the case of Facebook. The next name considered was simply Iterative Translation, but mainstream processes where there is a translation and a review of that translation are also iterative. The name Super Iterative Translation was selected in order to express the idea that the process is highly iterative.
7 Translation without Redundancy
This practice, where a source text unit is translated only once and then reviewed, appears only in the model for Kiva. However, this is the most common approach to translation in mainstream processes and it is also used by The Rosetta Foundation (Anastasiou and Schäler 2010; O'Brien and Schäler 2010b), for whom the researcher carried out a translation to become acquainted with their process. For these reasons the practice has been included in the list.
8 Freeze
There are two implementations of this practice in the models. In the case of DotSub, freeze happens when a contributor with the right permissions prevents other contributors from further modifying the existing translations. This is represented in the model by a series of flushing links that empty the relevant places. In the case of Crowdin, freeze happens when a translation is selected and, as a result, contributors can no longer suggest translations for that string. The freeze is represented in the Crowdin model by flushing links too. Although Facebook's process for fully supported languages is not among the models in this thesis, in that process a professional translator can freeze strings and prevent the addition of further suggestions and votes (Losse 2008). There are also instances of soft freezes by some projects using Launchpad, such as Ubuntu, where translations are frozen towards release dates. In this last case, contributors can still add suggestions, but they will not be taken into account until the next release. For these reasons, Freeze was added to the list of practices.
9 Rollback
The practice of rolling back appears in the models for Amara and DotSub. This is why the practice has been included in the list.
10 Deadlines
The practice of using deadlines appears in the model for Kiva's process. Deadlines also appear in the processes of Skype (Mesipuu 2010), TED talks (Wijayanti 2013), and Adobe TV (Adobe 2014), among others. For these reasons, Deadlines have been included in the list of practices.
11 Open Assessment
The practice appears in the models for Crowdin, Facebook, Pootle and Launchpad. The original name for the practice was Unlimited Redundant Assessment. As with Open Alternative Translations, during the interviews it became apparent that the core feature of this practice was not the number of assessments collected, but the fact that other contributors could see the assessments.
12 Hidden Assessment
The practice does not actually appear in any model. However, when first considering the processes, the researcher thought of instances of a given translation within the Asia Online process as votes. A technique similar to the one used by Asia Online was also used by txtEagle (Eagle 2009). There is a single instance of actual Hidden Assessment in the literature, in the form of the ranking used by Zaidan and Callison-Burch (2011). Although a single instance in the literature may not be enough to justify adding Hidden Assessment to the list of practices, the researcher, considering the parallelism with Open Alternative Translations and Hidden Alternative Translations, added Hidden Assessment to the list nonetheless.
13 Expert Selection and Edition
The practice appears in the models for Asia Online, Pootle and Launchpad. Furthermore, the practice is also implemented in the process for fully supported languages in Facebook (Losse 2008) and in Twitter (Arend 2012). The practice was added to the list for those reasons.
14 Metadata Based Selection
The practice appears in the models for Asia Online and Facebook. These are two different implementations: Asia Online uses frequency of appearance as the selection criterion, of which a weighted variation is also used by txtEagle (Eagle 2009), while Facebook uses Open Assessment, which is weighted too, although this is not explicit in the model (Losse 2008). A third implementation appears in the literature, with Zaidan and Callison-Burch (2011) proposing selecting the translation using Hidden Assessment.
Those practices are the ones that emerged from the analysis of the models. However, given that the initial purpose of this thesis was to propose a pattern language following the suggestions of Buschmann et al. (2007), it was necessary to add practices that could perform the function of entry point for the pattern language. Two patterns identified by Désilets and van der Meer (2011) were found suitable for this function. Those patterns are Content Selection, which in Désilets and van der Meer's work is called "Identify Compatible Content", and Unit Granularity Selection, which they called "Contributor-Appropriate Chunk Size". Both additional patterns are discussed in depth in the following chapter.

4.4 Summary
This chapter has discussed issues with the selection of Petri nets for the workflow models and other aspects of the approach to the creation of the models. It also presented models for eight crowdsourced translation platforms and discussed the practices that were identified in the models, as well as the additional practices that were required for the creation of a collection of practices that covers the process from beginning to end.
Chapter 5 Refinement of the Practices
5.1 Introduction
This chapter presents the method selected for the data collection used to refine the practices, the semi-structured interview. The questions are presented, along with the Framework method used to analyse the interviews and the outcome of the analysis, including considerations that affect the coverage of each practice.

5.2 The Choice of Semi-Structured Interview
Having identified candidate practices from the workflow models in Chapter 4, the function of the next data set was to enable the researcher to refine them by clearly identifying their forces, that is, the requirements, consequences and constraints that shape the practices (Buschmann et al. 2007). The semi-structured interview was selected because, according to Oates (2005), it is a valid approach to obtain detailed information about matters that are not completely defined. A survey asking for ratification of the forces identified by the author and requesting other forces was also considered, but discarded because interviews have a higher response rate (Keats 1999, p.4) and produce much richer data that is less limited by the prior knowledge of the researcher. The choice of a semi-structured instead of a structured interview was motivated by the need to maintain flexibility when interviewing people involved in different aspects of crowdsourced translation projects: some interviewees can provide more information on the questions most relevant to their experience, questions that become irrelevant during the interview can be skipped, and the interviewer can remain open to the interviewees providing new information that had not been considered prior to the interview. Probes were used with the intention of obtaining more detail when interviewees provided detail that seemed insufficient to the researcher (Oates 2005).

5.3 The Selection of Interviewees
The interviewees were selected using purposive sampling, that is to say that the interviewees were chosen due to their qualities (Tongco 2007). Using purposive sampling aligns with Keats' suggestion that when using interviews for research, the respondents must be carefully selected according to criteria that meet the needs of the research plan (Keats 1999, p.20). Fourteen experts were identified through the researcher's professional network. Two of them were nominated to be participants in the pilot interviews that were used to refine the questions and gauge the duration of the interviews, in order to provide that information in the consent form for the rest of the interviewee candidates. Five of the remaining twelve agreed to be interviewed and are part of the data used in this research. They are the following:

• Subject P: Former director of a small organisation that offers translation tools. Subject P was selected because of his extensive experience working in long tail languages using tools that support Translation for Engagement types of processes.
• Subject A: Marketing executive for a medium-sized company that uses crowdsourced translation. Subject A was selected because of his experience with a big crowdsourced translation project that used a Colony Translation type of process.
• Subject AC: Program manager for a multinational that uses crowdsourced translation for software localisation. Subject AC was selected because he is the senior manager in charge of a sizeable community translation initiative that uses a Translation for Engagement type of process.
• Subject V: Program manager for a multinational that uses crowdsourced translation for video. Subject V was selected because she manages a project that uses a tool that enables a Wiki Translation type of process. However, it came up in the interview that they run the project using a Crowd TEP approach, where volunteers translate videos individually.
• Subject S: Senior developer for a small organisation that develops a crowdsourced translation platform. Subject S was selected because of his involvement with an organisation that uses a Crowd TEP approach for the translation of documents connected to humanitarian causes.

A further eighteen experts were identified by looking at the personnel of organisations using or offering facilities for crowdsourced translation. Three experts agreed to be interviewed. They are the following:
• Subject C: Platform CEO. Subject C was selected because his company offers a tool that enables a Translation for Engagement type of process. This tool is being used both by communities and by commercial entities.
• Subject R: Lead of an open source translation group. Subject R was selected because he manages a community of translators that work on a famous piece of OSS. His community has a Translation for Engagement approach to the translation process.
• Subject M: Researcher and consultant for a crowdsourced translation platform. Subject M was selected because he is involved with two projects. Both of them use a Crowd TEP approach, but while one of them deals mostly with humanitarian and political documents that are translated completely by independent volunteers, the other deals with commercial content that is translated by a crowd of paid post-editors.

The University of Limerick's Research Ethics Committee makes it mandatory to protect the identity of the interviewees. As a consequence, no other personal details of the interviewees have been included, given that the community around crowdsourced translation is still relatively small, which increases the likelihood that the publication of any further details would compromise their anonymity.

5.4 The interviews
The interviewees were located in different countries, resulting in only two interviews being carried out face to face in the researcher's office. The face-to-face interviews were recorded with a Zoom H1 recorder and a mobile phone as a back-up track in case there was any issue with the recorder. The telephonic interviews were recorded with a Zoom H1 connected via the headphone jack for the voice of the interviewee, through the laptop's internal microphone via Audacity for the voice of the interviewer, and a mobile phone as a back-up track in case there was any issue with the others.

Table 16 Date and duration of interviews for each subject

Interviewee   Date         Duration
Subject P     05/04/2013   49:20
Subject C     11/04/2013   1:11:02
Subject R     17/04/2013   41:41
Subject M     18/04/2013   57:15
Subject AC    02/05/2013   1:11:47
Subject S     17/05/2013   47:31
Subject A     20/05/2013   1:20:59
Subject V     21/05/2013   57:31

The interviews were carried out between the 5th of April 2013 and the 21st of May 2013. The subjects were asked to be available for one hour, based on the duration of the pilot interviews, which lasted 27 and 47 minutes respectively. Three of the interviewees exceeded the estimated time. Transcription was carried out in NVIVO and the time it took was highly variable depending on the speaker. Because the researcher is not a native speaker and the interviewees had a diversity of accents, the transcription took between 5 and 13 hours per hour recorded. The transcriptions are verbatim, except for vocalised pauses, which were left off the transcript; traits of verbal communication such as unfinished sentences and grammar mistakes have been kept. Given that most of the interviewees were not native English speakers, it was decided not to mark the errors, since doing so would hinder the reading flow. As observed by Keats (1999, p.13), carrying out interviews on the phone can result in a loss of non-verbal cues; however, with this research not being focused on attitudes towards or perceptions of the practices, the loss of this information is not critical. Although the default term in the questions was "crowdsourcing", several interviewees expressed discomfort with the term and opted for "community translation". In those cases the questions were adapted accordingly in order not to antagonise them.

5.5 The questions


The questions for the pilot interviews were general open-ended questions of the type "What
do you think of practice X?". This was done in the expectation that the opinions of the
experts would include considerations on the benefits and prerequisites of the practices.
However, since the questions did not explicitly ask for such considerations, those that
emerged were generally implicit and few of the expected observations appeared.
After the pilot interviews, the interview was refined by basing it upon specific open-ended
questions that, according to Keats (1999), would limit the range of expected responses. The
revised questions took the form of "What are the advantages and disadvantages of practice
X?". By asking for advantages and disadvantages, it was expected that the interviewees would
come up with the constraints and benefits of applying the practices, which are the forces that
shape the practices. The data produced by this type of question contained many more
observations that better satisfied the needs of the research.
The practices were explained without adding any information about why things were done
that way. This occasionally caused confusion; for example, Subject R, who is involved in
translation for an open source project, was confused by people hiding translations
from the community in the Hidden Alternative Translations practice, but after clarification
his answer showed that he had understood.
Occasionally the interviewees would focus only on the advantages, or on the way a practice
was implemented in a context known to them. In such cases, probing was used to bring the
interviewee back to the focus on advantages and disadvantages.

5.5.1 Question Sequence


Since, at this stage, the purpose of the research was to create a collection of practices akin to
a pattern language, and taking cognisance of the observation that the connections between the
practices are in some cases apparent, selecting the right order for the questions was critical in
order not to bias the answers. For example, Freeze and Version Rollback are two practices
that are used in combination with Super Iterative Translation in Wiki Style processes to
prevent or solve damages caused by malicious or unskilled collaborators. In order not to bias
questions about the Super Iterative Translation practice, the questions about Version
Rollback and Freeze were placed after the question about the Super Iterative Translation
practice. Likewise, the questions about Metadata Based Selection and Expert Selection and
Edition were asked after the question about how to select a translation when you have used
one of the practices that produce alternative translations.

5.5.2 Question list


This section presents the questions posed to the participant subjects, with a brief discussion
about their relevance. Potential answers are presented and serve as an explanation of the
underlying motivation, i.e. how the question was expected to contribute to the generation of
data.
1) Why does your organisation/organisations using your platform use
crowdsourcing for translation?
The rationale for using crowdsourcing is expected to be linked to the ways the interviewees
think about the practices. For example, an organisation that uses crowdsourcing partly to
create a community as a marketing strategy to create brand loyalty, and in addition, gets all
translations for free, will have a different view on Open Alternative Translations and Open
Assessment when compared to an organisation whose motivation to use crowdsourcing is not
primarily based on increasing brand loyalty.

2) Can you explain the translation process that your organisation uses? / Can you
explain the translation process that your platform enables?
If the process enacted by the organisation to which the interviewee belongs includes any of
the practices, they will be able to provide more information about those practices and their
views on those practices will carry more weight at the analysis stage.
The relevance of the following three questions is presented in the subsequent paragraph.
3) What kinds of content do you think are suitable for crowdsourcing?
4) What kind of content does your organisation or organisations using your
platform translate using crowdsourcing?
5) Are there specific kinds of content that you think are not suitable for
crowdsourcing?
Questions three to five aim to obtain information about the content selection practice. As
stated before, this practice did not emerge from the analysis of the workflows carried out in
Chapter 4, but was instead identified by Désilets and van der Meer (2011). When considering
an entry point for a pattern language as recommended by Buschmann et al. (2007, p.340), all
of the practices that emerged from the different platforms were only applicable after the
content had been selected. Therefore, content selection was added to the collection of
practices with the intention of supporting an entry point.

6) What are the advantages and disadvantages of using linguistic resources like
TM, MT or TDBs in the context of crowdsourcing?
7) If your organisation/platform uses them, which ones does it use?
8) Why those, why not the others?
As in the case of some TEP processes, some crowdsourced translation processes take
advantage of linguistic resources. The responses to these questions will enable the researcher
to characterise prerequisites such as the need to have such resources available and also elicit
negative effects arising from their deployment, such as the spread of errors through TM
(Bowker 2005).


9) What is the size of the minimum unit (sentence, paragraph, self-contained
document) that you present to collaborators?
10) Why that size, why not the others?
11) What are the advantages and disadvantages of using each unit size?
It was observed when analysing the platforms that the work units presented had varying
sizes, from one-word strings to self-contained documents. The size of the work unit affects
how the process works in later stages. For example, using Open Alternative Translations with
long TUs would be controversial because of the amount of labour wasted through the
duplication of effort. The intention of these questions was to characterise the criteria for the
TU Granularity Selection practice and the links between this and other practices.

12) What are the advantages and disadvantages of preselecting collaborators by having
them pass some kind of test, or meet some requirement (like being a native speaker, or
working for a given organisation) before letting them contribute?
This question is directly linked to the Entry Exam pattern suggested by Désilets and
van der Meer (2011), which is not a member of the collection of practices in this research.
However, Kiva uses it, and Kiva is also the only organisation using the Translation without
Redundancy practice covered in this thesis. With both patterns appearing in the same process,
it is possible that the practices are inter-related, and this question brings up the forces that
govern Translation without Redundancy. Furthermore, the question is also linked to the
preselection criteria used in the taxonomy and was expected to provide relevant information
about why preselection is so rarely used.
In hindsight, the interview would have benefited from the inclusion of the following question:
"What are the advantages and disadvantages of using the TEP model in crowdsourcing
scenarios?"
13) If you are/were using a process that produces only one translation per
content unit with deadlines and someone does/did not finish the translation in
time, is/would their work (be) collected and shown to the following person for reuse?


This question is linked to the Translation without Redundancy, Super Iterative Translation
and Deadlines practices. The intention is to find out which forces are preventing
organisations that use Translation without Redundancy from using a Super Iterative approach
and also how deadlines are implemented in the context of Super Iterative or Translation
without Redundancy processes.

14) What are the advantages and disadvantages of using deadlines or release
dates in the context of crowdsourced translation?
This question is linked to the Deadlines practice. Answers such as "communities don't like
deadlines because they make the process feel business-like" or "communities don't like
deadlines because they give them the impression that they are working for free" were
expected as examples of disadvantages. Answers such as "sometimes deadlines are necessary
for the information to be relevant" or "soft deadlines can motivate the community to work
harder" were expected as examples of advantages.

15) What are the advantages or disadvantages of letting contributors modify the
contributions of others (the way that Wikipedia allows you to modify articles)?
16) How would you decide that a translation should no longer be modified?
17) What do/would you do to prevent flame wars and malicious edits?
18) What do/would you do to minimize the impact of flame wars and malicious
edits when they have already happened?
These questions are all linked to the Super Iterative Translation practice. "This approach
allows you to make content available very quickly", "You would need to decide when to
freeze the translation" or "This approach may result in malicious translations being
published" are the type of answers expected for question 15.
"When the translation has not been edited for a set amount of time or views" is an example of
a hypothetical answer to question 16.
"Freezing the translation" and "Blacklisting the problematic users" are examples of
hypothetical answers to question 17.
"Rolling back the version" is an example of a hypothetical answer to question 18.

19) What are the advantages and disadvantages of content rollbacks in the
context of minimizing the impact of flame wars and malicious contributors?
This question pertains to the Version Rollback practice. Answers such as "It can cause the
loss of good changes", "It requires a versioning system" and "It allows you to go back to a
point where things were acceptable" are foreseeable.
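The expected answer "It requires a versioning system" can be made concrete with a minimal
sketch. The class and method names below are illustrative only and are not taken from any
platform discussed in this research:

    # Minimal illustration of why Version Rollback presupposes versioning:
    # every edit is appended to a history, so any revision can be restored.
    class TranslationUnit:
        def __init__(self, source):
            self.source = source
            self.history = []                # all target-text revisions

        def edit(self, target):
            self.history.append(target)

        def current(self):
            return self.history[-1] if self.history else None

        def rollback(self, revision):
            # Discard everything after the given revision index. Note that
            # good edits made after that point are lost too, which is one
            # of the disadvantages the question probes for.
            self.history = self.history[:revision + 1]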

20) What do you think about translation freezes in the same context?
This question pertains to the Freeze practice. Some hypothetical answers are "It can prevent
improvements", "It can be used to keep an acceptable translation stable" and "It can help to
direct efforts to the unfrozen TUs".

21) What are the advantages and disadvantages of letting contributors suggest
more than one translation for a content unit?
This question pertains to the Hidden Alternative Translation and Open Alternative
Translation practices. The expected answers were "There is a higher chance that a good
translation will be suggested", "You have to devise a protocol to select the one that will be
published" and "You are wasting resources that could be better used elsewhere".

22) If you are/were using multiple translations for a content unit and you wanted
to limit the number of alternatives, how do/would you decide the number that
you want to limit it to?
This question pertains to the Hidden Alternative Translations practice. The expected answers
were "Based on human capacity to decide between alternatives, which is X", "Based on
statistics of how many translations were entered before the one that was published appeared"
or "Dynamically, based on metadata about the people who suggest them, with few
suggestions if the contributors have a good track record and many otherwise or if no
historical data about the contributors exists".
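The "dynamic" answer above amounts to a small rule over contributor metadata. A possible
sketch, in which the thresholds and the fall-back value are assumptions for illustration rather
than values drawn from any platform studied here, is:

    # Hypothetical cap on the number of alternative translations per TU,
    # derived from the suggesting contributors' track record, as in the
    # "dynamic" hypothetical answer. All thresholds are assumed values.
    def max_alternatives(track_record):
        if track_record is None:     # no historical data: allow many
            return 7
        if track_record >= 0.8:     # proven contributors: few suggestions
            return 2
        return 5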


23) If you are/were using a method that produces several translations per content
unit, how do/would you select the translation that will be published?
This question intends to determine whether there are selection practices other than Expert
Selection and Edition and the variations of Metadata Based Selection. Expected answers
were "Depending on votes", "Depending on reputation of the translator" or "I would let an
expert do the selection".

24) What are the advantages and disadvantages of letting contributors see/vote
on the translations suggested by others?
This question pertains to the Open and Hidden Assessment practices. Some foreseeable
answers were "Voting provides an opportunity to contribute for those who cannot translate"
and "Voting is a way of finding out what option the crowd prefers".

25) What are the advantages and disadvantages of selecting a translation for
publication according to the number of votes it gets (or some other kind of crowd
assessment)?
This question pertains to the Open Assessment, Hidden Assessment and Metadata Based
Selection practices. The expected answers were "The selection will please the majority of the
crowd", "The crowd will feel that it has power over decisions", "The crowd may not select
the best translation according to academic criteria" and "Malicious users could exploit that
mechanism".

26) If you decided to publish the first translation to meet a certain condition,
what would be the condition?
This question attempts to investigate how the Metadata Based Selection practice could be
realised. Some predictable answers could be "I would publish the first to be up-voted by a
respected (high-performing) member of the community", "I would publish the first to be
suggested by a high-performing member of the community" or "I would publish the first that
gathers X up-votes".
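These hypothetical answers can all be read as predicates over translation metadata. A sketch
with assumed field names and thresholds, intended only to make the shape of such a
condition explicit, follows:

    # One way to encode the hypothetical publish conditions of question 26.
    # Field names and thresholds are assumptions for illustration only.
    from dataclasses import dataclass

    @dataclass
    class Suggestion:
        text: str
        contributor_score: float       # track record of the suggester
        upvotes: int = 0
        upvoted_by_trusted: bool = False

    def ready_to_publish(s, min_votes=5, trusted_score=0.9):
        return (s.upvoted_by_trusted
                or s.contributor_score >= trusted_score
                or s.upvotes >= min_votes)

    # The first suggestion for a TU satisfying ready_to_publish() would be
    # published, realising a Metadata Based Selection process.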


27) What are the advantages and disadvantages of letting an expert select the
translation that will be published?
This question pertains to the Expert Selection and Edition practice. The answers expected
were "It prevents malicious translations getting through", "It centralizes decision-making for
translations", "It allows the person choosing the translation to keep terminology in check",
"If there's many translations it slows down the process" and "If the choice goes against the
community votes, it may irritate the crowd".

28) What are the advantages and disadvantages of letting a computer select the
translation that will be published?
This question pertains to the Metadata Based Selection practice. One of the outcomes of the
interviews was changing the name of the practice from Automatic Selection to Metadata
Based Selection. The expected answers were "Automated selection is faster" and
"Automated selection can be tricked and let bad translations go through".

5.6 Approach to Data Analysis


As previously stated, interviews were chosen as the data collection method. There are
different options for the analysis of the data generated by interviews; the two most
established approaches are analytic induction and grounded theory (Bryman 2012, p.539).
Grounded theory was not fit for the purpose of this research because the intent of the research
is not "the discovery of theory from data" (Glaser and Strauss 2009, p.2), and its bottom-up
approach (Urquhart 2001, p.105) did not suit the purpose of collecting expert judgements in
order to identify the forces affecting the practices, an aim that requires a more top-down
approach.
Analytic induction, which has been defined as the process of deriving laws from a deep
analysis of experimentally isolated instances, has a more top-down approach, whereby the
researcher presents a hypothesis and iteratively refines it through the study of cases. This
would have been more appropriate, but its focus on finding universal laws, besides leaning
towards positivism (Miller and Brewer 2003, p.155), does not fit with the idea of practices
that are not universal and that, furthermore, evolve as they develop.


Thematic analysis, in spite of the criticism levelled at it for its lack of tradition (Bryman 2012,
p.554), emerged as an appropriate analysis method. Thematic analysis enables the combined
use of top-down, deductive themes that emerge from prior research and themes generated
inductively from the data (Boyatzis 1998, p.4). By combining views from both the
interpretivist and positivist paradigms (Guest et al. 2011, p.15), thematic analysis matched the
pragmatic approach that the researcher wanted to follow.
Ritchie and Lewis's (2003) Framework method, one of the implementations of thematic
analysis, allows for the top-down creation of an index where practices can be mapped to
themes and the different forces mapped to subthemes as they emerge during the analysis.
Instead of using the spreadsheet approach suggested by Ritchie and Lewis (ibid., p.220),
NVIVO was used, since the method's theme-subtheme organisation maps well to the
node-subnode approach to coding that NVIVO supports.
5.6.1 Coding
Coding, labelling or tagging, depending on the author, is the process of identifying which
theme a part of the data refers to (Bryman 2012). The first coding iterations were done
following the structural coding approach, i.e. identifying which parts of the interviews
referred to which main themes (Guest et al. 2011, p.55). This resulted in nodes that match the
structure of the interview, with the occasional jump ahead or back as interviewees made
observations relevant to practices that had already been discussed or were yet to be discussed.
During this process, three additional nodes were also defined, "reasons to crowdsource",
"requirements to crowdsource" and "other practices", but ultimately they were not used for
the analysis. The contents of the "requirements to crowdsource" node overlapped with the
contents of the Content Selection node. The contents of "reasons to crowdsource" included
much information about attitudes that could not be suitably analysed using Thematic
Analysis. The "other practices" node did not contain enough data about any single suggested
practice to justify the addition of new practices that had not been found in processes before
the interviews.
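Although the coding itself was carried out in NVIVO, the node/sub-node shape it produces
can be illustrated with a small Python sketch. This only mirrors the data organisation
described above; it is not how NVIVO is implemented:

    # Theme -> subtheme -> list of coded utterances (references), mirroring
    # NVIVO's node/sub-node hierarchy as used in this analysis.
    from collections import defaultdict

    codes = defaultdict(lambda: defaultdict(list))

    def code(theme, subtheme, utterance):
        codes[theme][subtheme].append(utterance)

    # First coding iteration: structural coding attaches utterances to the
    # main themes (the practices); later iterations refine the subthemes.
    code("Content Selection", "uncategorised",
         "I think most content is actually suitable for crowdsourcing ...")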
Commencing the analysis with structural coding contradicts Ritchie and Lewis's (2003,
p.221) recommendation to choose a fraction of the data collected and analyse it to determine
the main themes and subthemes. However, as stated above, the analysis of the use cases and
resulting workflow models in Chapter 4 had already resulted in the synthesis of the practices
that constitute the initial themes for this part of the research.


After the selected data had been attached to the main themes, it was observed that the
resultant themes were unwieldy. This is visible in Figure 27, which shows most practices
having over 20 utterances attached to them. These utterances addressed different aspects of
each practice and had to be organised at a finer granularity. In order to reduce the complexity
of each node, a second iteration of coding was carried out in which the references were coded
as advantages, disadvantages and neutral features for each practice. These nodes are not
represented in this chapter, but were submitted to another iteration of coding that was carried
out in parallel with the writing. Each of the nodes resulting from this last coding iteration
became a subsection header in this chapter, with the node structure effectively becoming the
guide for the structure of the chapter. It is at this stage that the subthemes that were not part
of the initial framework emerged.
With the deeper understanding of the practices gained through this process, some of them
were renamed and issues with the questions were identified. These name changes and
question issues are addressed later in this chapter in the discussion section of each practice.
The total word count of the interviews is 64,456 words, counted using Microsoft Word's
word count feature. The first two iterations of coding took approximately 81 hours. Further
coding was done in parallel with the writing of this chapter over the course of two months,
starting in June 2013, and was refined for another two months. In total, 165 themes were
identified. Eighty-eight themes emerged through the repetition criterion (Guest et al. 2011,
p.66), i.e. interviewees brought a matter up more than once during the interview. The
remaining 77 had a single reference at their lowest granularity. These were unique pieces of
information that the researcher found of interest and tagged following the advice of Guest et
al. (ibid., p.68). The researcher expects that these single-reference themes would form proper
themes if the data collected came from more sources with direct experience of more of the
practices.
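The split between repeated and single-reference themes follows mechanically from the
repetition criterion; a trivial sketch of the rule, for illustration only:

    # The repetition criterion: a candidate theme counts as a repeated theme
    # only if it was referenced more than once; single-reference codes are
    # kept separately. Mirrors the 88 + 77 = 165 split reported above.
    def split_by_repetition(refs_per_theme):
        repeated = [t for t, n in refs_per_theme.items() if n > 1]
        singles = [t for t, n in refs_per_theme.items() if n == 1]
        return repeated, singles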
The coding was not validated by a second analyst, as suggested by La Pelle (2004), in order
to obtain intercoder agreement, since an analyst would have to be familiar with the practices
before being able to assess the coding, and no such analyst was available. It was also not
validated with the interviewees as suggested by Saldaña (2012). The only validation of the
coding came from the supervisors, who have a conflict of interest, because the failure or
success of this research will also reflect on them. However, the researcher integrated the
coded quotes into the text of this research, which according to Guest et al. (2011, p.95) is a
way to increase the validity of qualitative research. By increasing transparency through
the use of quotes, the researcher expects to reach face validity, i.e. consensus with other
researchers, as suggested by Guest et al. (ibid., pp.80-81). This face validity will be further
enhanced by submitting the research to external review (ibid., p.92) in pursuit of a doctoral
degree.
5.6.2 Coverage
There was at least one relevant utterance for each of the practices from each of the sources,
except for:

• Translation without Redundancy: There are no references to this practice by Subjects
R and P because the question, as discussed before, was misaligned with the practice,
since it asked about submitting contributors to a test instead of asking about taking
only one translation per TU.

• Open Alternative Translations: There are no utterances from Subject V because she
does not implement the practice and did not feel comfortable addressing it without
more knowledge.

• Rollback: There are no references from Subjects AC, M and R. Subject R commented
that, because of the way their process works, they would never be in a situation
where rollbacks would be necessary, and he had no experience of the practice. Subject
M also had no experience with the technique and did not have an opinion on it.
Subject AC's project does not use rollback either, and he did not initially understand
the practice. When the practice was explained more clearly he understood it and
considered it to be a potentially good practice, but did not go deeper.

• Deadlines: There are no references from Subject M for this practice. Although
Subject M answered the question, he focused on deadlines being related to the type of
project, and no further information relevant to a crowdsourcing context emerged.

• Metadata Based Selection: There are no references from Subject V because the
interview ran out of time before reaching the relevant question.

The chart in Figure 27 shows the references per practice. The only practice that has a very
low number of references is Version Rollback. As seen above, the practice had not been
considered by three of the interviewees, which resulted in the lower number of references.
However, the interviews and their analysis still addressed all the points that the researcher
had considered beforehand, as shown in the corresponding practice section of this chapter.
[Bar chart omitted: number of coded references per practice; the vertical axis runs from 0 to 70.]

Figure 27 References for each practice

5.6.3 Sorting
The sorting stage, i.e. putting all the data together according to the theme where it fits, was
automated by NVIVO. The purpose of this stage is to reduce noise when analysing the data.

5.7 Analysis Outcome


Having coded and sorted all the relevant data, the researcher presents in this section the
conclusions of the analysis, supported by the relevant quotes. Finally, each practice has a key
findings section that summarizes all the preceding information.

Practice 1: Content Selection


This practice did not emerge from the models in Chapter 4, but a similar pattern was
suggested by Désilets and van der Meer (2011), and it was necessary to support the concept
of an entry point for the collection of practices as recommended by Buschmann et al. (2007).
Any content is suitable if the process is right
The first consideration that emerged for the Content Selection practice concerns the existence
of limits to the content that can be translated using crowdsourcing. Subjects P, R, AC and S
considered that any content could be crowdsourced. Subject R saw no issue in using a
crowdsourced translation platform for any kind of content, expressing this view by saying
"[…] I can't imagine what kind of content would not be interested to be translated in [[the
platform]] […]", to which he later added "[…] Of course, people want to translate only open
source and software, really, […] but I suppose there's no problem to translate any problem
[meant content] in [[the platform]] […]". Subject AC added nuance by bringing up the need
to match the process to the content in crowdsourcing, saying "[…] if today we say
something is not suitable, then we probably haven't found the right solution for that […]".
This matches well with Subject S's thoughts, "[…] I think most content is actually suitable
for crowdsourcing. There's some caveats on who should be able to claim it, but I think it can
be crowdsourced […]", and Subject P's opinion "I think any content is suitable […]", which
is clarified in the following subsection.
Content with special requirements
Subjects C, P, V, M and S bring up examples of content types that require further
consideration. Subject C notes that "[…] it is not possible to translate via crowdsourcing a
legal document. As well as the as some kind of medical documentation […]". The issues
with legal and medical translations are also brought up by Subject P who, after saying that
any content is suitable, states "it's the legal ramifications of the translated content [rather]
than the process, so I would list things like medical translations that require full legislative or
reasons or whatever, even contractual stuff that require... things that technically require
some kind of certification [[from]] the translator". Subject M also brings up legal texts as an
issue and points out why: "It's related to the division of responsibilities and the social
approach doesn't cover these responsibilities in the socially acceptable way, so the legal
documents, well, at least for legal documents, even if you just assume social translation as a
draft, somebody who have the name has to take responsibility" (emphasis mine; "somebody
who have the name" refers to a singular person that can be named, as opposed to a crowd).
Subject S also brings up legal and medical translations as problematic.
Besides legal and medical translations, Subject V brings up marketing material, and Subjects
C and S bring up private and confidential documents too.
Subject P, when talking about legal documents, argues that even in those cases "you could
probably have the same kind of community processes to do like professional translators
doing the work, using the same tools and you could use the same, even if it were like... three
translators operating in a community and looking at those kind of models to, to distribute the
work". And Subject S proposes a similar idea for legal texts by saying that "you would want
to restrict who can claim it to people who have shown that they have that ability or that
expertise". In the case of confidentiality issues he adds "you'd have to restrict it to
people who've agreed to maintain the confidentiality or people who have been verified as
trustworthy to view sensitive information".
Having considered all this, it can be said that all content is amenable to crowdsourcing as
long as the process includes measures to deal with its specific issues. In the case of legal
and medical translations that are bound by regulations, restricting the crowd to qualified
translators would solve the issue. In the case of confidential information, restricting the
crowd to translators who are legally bound to respect that confidentiality would solve the
issue. When Subject A brought up marketing texts, it was to express the need for specially
vetted translators to do that kind of job, which again does not exclude those vetted translators
from using crowdsourcing-style tools to carry out their task.
Crowd-Content Relationships
The second consideration that emerged is the appeal of the content. Legal, privacy or
contractual limitations can be solved within the process, but it is possible to have content that
no crowd is willing to translate for free. Subjects who considered crowdsourcing to be an
activity where the crowd does not get paid brought up the importance of the relationship
between the crowd and the content.
Subject P underlined the importance of the connection of the people to the content when he
said that "In terms of looking at the content, actually, the problem becomes half, you know,
what is the connection of the people to the content that they are translating". Subject A
points out that "It [crowdsourcing] works where you can find a community that cares" and
further stressed this by stating that "that's how people decide, it's stuff that they have personal
interest".
Content that indicates competence
Subject A brought up the example of tutorials and how they can serve as proof of expertise:
"You will get people to assist you on, say something like how to install, let's say an
Exchange server, because there's value in the guy, in the contribution and in the guy who
makes the contribution, being recognized as an expert". The suitability of tutorials is
supported by Subject V when she says "I think for crowdsourcing for video [[project]]
90% of the content is actually perfect because it's mostly tutorials, featurettes or demos on
how to use our product". In this case, it can be said that the attraction for the crowd is the
recognition of expertise and the potential benefits that this recognition can generate,
something that Subject A brought up: "the guys that do the Microsoft stuff, they get expertise
recognition and they get consulting business out of that or Microsoft flies them down to meet
with the product team […] And that kind of access is worth gold to these guys".
Content that has an embedded community
Several subjects brought up examples of content that works because the crowd has a
pre-existing relationship with it. For example, Subject R singled out the connection of some
crowd translators with FOSS by saying "Of course, people want to translate only open source
and software, really, I don't know to explain, really connected with [[the open source
product]] in my case". Subject A, when talking about Twitter's and Facebook's crowdsourced
translation, points out another example of pre-existing relationships being very helpful by
saying that "where you have large communities that are online already, it's just a matter of
showing them a few strings and saying 'Can you translate this?'. That's going to work".

Content that is familiar to large groups of people


Subject M communicates that familiarity and smaller work units make the translation easier:
"So, people get sort of easy interest in deal with small amount of text and work on that
without concentrating on translation, that was the content of the translation, because they are
so kind of used to these expressions in their daily lives, rather than requiring the focused
really". The idea of familiarity with the content being linked to success, as introduced by
Subject M, is also supported by Subject A when he discusses why their project was successful
when dealing with popular topics, but not very successful when translating scientific material:
"we wanted to do physics, chemistry and all the other stuff too, but you know, we never really
got that part done very well because for that you need relationships with schools and
colleges, you know, some way to engage students".

Content that is attached to cultural values


Another factor related to content that emerged as an indicator of success is the values linked
to the content. Subject A makes this connection between content and values when he says
that there is content from which you get benefits that are not monetary or professional, but
that are satisfying at a personal level.
Even though many supporters of FOSS present it as politically agnostic, it has been argued
that FOSS is politically charged (Coleman 2004), and FOSS is the main example of success
brought up by Subject R. Subject A brings up another example linked to political values
when saying that Yeeyan volunteers are "people who want Chinese people to know what's
going on in the world and get a real outside perspective and not just the government
perspective". Although Subject M says that the more "socially kind of heavy document
like war, women's agenda issues, peace issues, law, economy" require more dedication, and
to translate them the contributors "need to at least cover much longer length of the
document", one of the platforms that Subject M is connected to is used by NGOs precisely to
translate documents connected to war issues, women's rights, law, etc. Subject S says that
"not for profits, charitable documents are definitely more suitable for crowdsourcing", which
again hints at political values playing a role in characterising the suitability of content for
crowdsourced translation.
Subject A also comments that "you can get a really engaged crowd around religious stuff"
and that it is easy to form communities "around health care, around spiritual care, around
child care". To underline this he brings up the example of the Mormon church, which
according to him "has 130 languages, and nobody knows about it, but they are doing stuff on
a scale that no corporation is". Another example provided is the Osho Foundation, which has
talks translated into up to 60 languages on oshotalks.com. According to him, such
organizations have volunteers lined up because "they just wanna be associated with this
work. Like 'I did, I was involved with the Osho foundation, I was involved with the Bible, I
was involved with the Christian worthy stuff'".

Content that requires material rewards


At the opposite end of the spectrum from all the above-mentioned types of content is content
for which there does not seem to be a community. Subject A talks about how HP has "a hard
time getting people to come and translate their manuals about printers and laptops". Subject
S comments that "vouchers would not be as suitable to crowdsource... they could be, but a lot
of people will tend to... if they are doing it for free, they will tend to prefer to work on
non-commercial projects". Subject A states that for such kinds of content "where you get no
support you have to either do it by paying someone or you have to create special incentives".

Content Format
Another consideration that emerged in the interviews was the way in which content is
presented. The following categories of content are analysed:

• Flowing text documents
• UI strings
• Videos
• Commercial websites

Flowing text documents


Subject C talks about long text in the form of HTML being crowdsourced on his platform,
examples being "user guides, manuals and all of that materials". This is echoed by Subject P,
who says that a tool he is involved with is used for the translation of help content that
involves "larger text and flowing text but still software centric". Subject P points out an
advantage of longer-form documents in that "with bigger crowd or community […] you can
get many people contributing".
However, the subjects point out that long-form text has special requirements. For example,
Subject C comments that crowdsourcing of documents also works, but it requires "[…]
editing and this editing should be made by probably a professional or semi-professional,
professional translators that you make sure that the consistency of the terminology is ok and
the formatting and several more, several more things are checked and there are no issues
with them".
Subject P observes that the platform he is involved with "may not be ideal for some types of
content that you would want to translate" because "you cannot add a paragraph or merge
two paragraphs", forcing the translators to follow the exact layout of the source text.
Subject AC says that his company uses crowdsourcing for document translation, video
translation and UI translation, but they are three different projects that use different systems.
He also thinks that "everything that is a longer continuous text or medium, like videos or like
core documents" is more difficult to translate for the community.


From the above, the researcher concludes that flowing text documents are suitable for
crowdsourcing, but one should use tools specifically designed for the translation of flowing
text and/or add an extra review step to the process to improve the text's cohesion.

UI strings
User interfaces seem to be a good candidate for crowdsourced translation. Subjects P and AC
bring them up explicitly, and Subjects C and R talk about translation of software, which
includes the UI.
Subject AC, who works specifically on crowdsourced translation of UIs, comments that UI
strings are a "very good candidate for community translation, for collaborative translation"
because, according to him, more people will translate shorter strings "because it's faster,
because you get faster results and you get a faster reward in terms of seeing your translations
published and you do not have to be so much of a more linguistic based language expert".
Subject C agrees when he states that the success of crowdsourcing translations for
localisation is linked to the fact that "the strings usually are short, are shorter and the context
actually depends on the position of the string, on the appearance of the string in that
application, but not the appearance of the string in the sentence of paragraphs or the page in
document projects". From the second part of the statement, where he addresses visual context
being more important than linguistic context for UIs, it can be interpreted that he thinks that
translating UIs requires less linguistic expertise than translating flowing text. Subject P also
commented on people using the platform he is involved with: "Mostly what people are using
it for is software user interfaces. That would be the primary use of it". It has also been
mentioned previously that some stakeholders use Subject P's platform for flowing text, but
that it is less suitable because it does not allow for splitting and merging strings.
Three specific types of software that were brought up were mobile apps, social networking
sites and open source software in general. Subject C considers mobile apps "a perfect scene
that appeared recently"; if he were to sort content types according to how suitable they are
for crowdsourced translation, "mobile apps localisation should be in the first place". Subject
R commented that "people want to translate only open source" and Subject A brought up the
success of Facebook and Twitter.

Videos
Although Subject AC brought up videos as a type of challenging content, because they
constitute a type of "longer, continuous text", Subject V works on a project that is dedicated
to the translation of videos, and Subject A brought up the successful examples of TED and
the Osho International Foundation. The fact that there are successful efforts dedicated to the
crowdsourced translation of videos leads the researcher to consider videos suitable for
crowdsourced translation.

Commercial sites
Subject M, who is involved with a platform where users are paid for post-editing, commented
that "commercial site, like, vending site and Amazon and things like that, the fashion site...
they are probably the more suitable one. For several reasons, partly, the description of the
goods. Description of the goods are highly sort of repetitive, which means that the tuning MT
engines can be effective, more effective than the normal sentences, which means that MT+P
[MT plus post-editing] model have much higher opportunity to succeed". Among commercial
sites he singled out cosmetics and fashion sites. It is also noteworthy that he is involved with
a platform that is used by NGOs to obtain volunteer translations of their material, and he goes
as far as to say that "from the point of view of the familiarity and from the point of view of the
unit of translation, the commercial sites and most of the, yeah, commercial sites and things
like that have, I think, more success, chance of success for crowdsourced translation than the
journalistic documents or NGO documents and things like that".
In contrast, Subject S, who works for a platform that is used exclusively by NGOs, makes the
point about money: "if they are doing it for free, they will tend to prefer to work on
non-commercial projects".

Terminology
Subjects A and P commented that they use crowdsourcing for terminology development.
Subject P explains how they do it: "we have a community that could add value, but
then they are not actually very good at developing terminology […] they don't have any
linguistic or terminology creation experience and we also want to try formulise and
standardise terminology. So we want an expert at the end to vet everything. So the advantage
then of the suggestion model, so the community is participating in the suggestion model, is
that it's always easier to coins terms when you are coining from something, instead of from
nothing […]". Subject A just says they "would ask for assistance in building terminology"
without explaining the process further.

Content prioritisation

Two subjects, P and R, brought up content prioritisation. Subject P said that "there's this
false target that is 100% translated, which creates an illusion that everything is as important,
which we try to avoid and try to get people to avoid and trying to get people to identify what
is that that, you know, that kind of like, Pareto principle, 80-20 rule, like, what is the 20% of
most important stuff. Cause ideally people should be as much translating as rechecking that
20% making sure that's really good translations". Subject R added that in "every release have
to be translated at least the most important package or user interfaces", thus further
supporting the concept of content prioritisation. Subject R's group addresses this by
presenting the packages to be translated ordered "by the most important to the less
important".
Key findings
The key findings with respect to Content Selection are listed as follows:
1. All types of content are amenable to crowdsourcing if the process has been adjusted
accordingly.
2. Crowd-content relationships are very important.
a. FOSS, NGO and content with attached prestige or political value are good
candidates because there are crowds with strong connections to the ideas that they
promote.
b. Spiritual and religious content also attracts crowds that are heavily involved.
c. Technology brands with big embedded communities have crowds available for
crowdsourcing the translation of UIs.
d. Crowds are not willing to translate documentation and sales-related material for
commercial companies for free.
e. If you are paying for the crowd's work and the crowd is familiar with the things
that you sell, translation for a commercial site (product descriptions) is a good
candidate for crowdsourcing.
f. Tutorials are a good candidate because volunteers either prove or acquire
expertise by translating them.
3. Content formats also affect suitability for crowdsourcing.
a. Formats have an impact on the willingness of the crowd to get involved.
b. User interfaces are amongst the most popular and successful types of content
translated via crowdsourcing. Experts consider that the fact that UI text has to
work only in the UI context, and not in the context of flowing text, makes UI text
easier to translate for non-linguists.
c. Videos are suitable for crowd translation. The organizations that successfully
leverage crowdsourcing for their video translations use platforms that have been
developed specifically for crowdsourced video translation. Crowds that translate
videos are motivated either by cultural values attached to the project or because
their involvement indicates that they have valuable skills related to the content of
the video.
d. Crowds can be good as a first step in terminology creation and translation, but
because terminologies must be self-consistent, a person with an overview of the
knowledge space needs to make the final decision.
e. It is also important to match the tool that is used to the content format. Tools
designed to support the translation of flowing text may not be suitable for strings
or videos, and tools designed to support the translation of strings may not be
suitable for the translation of subtitles or flowing text, etc.
4. Since not all content is equally important, it is a good idea not only to select the content,
but also to prioritize it.

Practice 2: TU Granularity Selection


This practice did not emerge from the models in Chapter 4, but a similar pattern was
suggested by Désilets and van der Meer (2011), and it was necessary in order to link the entry
point practice, Content Selection, to the rest of the collection of practices.
Factors that condition the unit granularity
One of the considerations that emerged about this topic was that you are not always fully in
control of the size of the TU that will be used.


The format of the content is a factor that limits the choice of TU. For example, if you are
working on crowdsourced translation for localisation, the size of the work unit may be
partially determined by the programmers, as Subject R observed: "for programmers it's easy
to cut the translation in not so big phrases. So, I think it's a really the source is not translator,
the source is programmer". That is to say, if you are working in localisation, the size of
your work units will also be partially determined by the work done by the developers.

The combination of languages may also affect the size of the TU, as noticed by Subject A
when he brought up that when they were going from [[language 2]] to [[language 1]] it was
possible to do it "at a sentence level", but the nature of the language pair could be a limiting
issue, since "if you are going from [[language 1]] to [[language 2]] then you maybe have to
work with paragraphs" because there is no way of matching those languages at a more
fine-grained level.
Subject A also brought up another consideration to take into account when selecting the work
unit granularity: "you have to kind of think 'what will my contributor base bear?' will they be
willing to accept a full article, do I need to give them a paragraph or do I need to stay at the
sentence level. I don't think there's one clear rule. You need to match it to what is possible".
If shorter work units are preferred in a project, they can be derived from longer work units
via segmentation. This was illustrated by Subject C, who commented that when people
upload flowing text to his platform "they try to segment the long portion of text to the
smaller. In our case it's called, as I said segments, and the segments are usually, is a
sentence". Subject P's platform can also receive flowing text inside PO files, and the platform
will present that content at a paragraph level, but when they are working on "our own stuff
we will segment paragraphs down to the sentence level".
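A naive illustration of such segmentation is sketched below. Real platforms rely on proper
segmentation rules (for example, SRX-based ones), so this regular expression is only a
stand-in for the idea of deriving sentence-sized segments from flowing text:

    import re

    # Crude sentence segmenter: split after ., ! or ? followed by
    # whitespace. Only illustrates deriving shorter work units (segments)
    # from flowing text, as Subjects C and P describe.
    def segment(paragraph):
        return [s.strip() for s in re.split(r'(?<=[.!?])\s+', paragraph) if s]

    units = segment("The app is free. Download it today! Questions? Ask us.")
    # -> ['The app is free.', 'Download it today!', 'Questions?', 'Ask us.']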
Longer work units
When asked about the size of the unit they present to their collaborators, Subjects S and
V said they presented a whole document or a whole video. Something to take into account
when thinking about what constitutes a long work unit is the time it takes to translate. There
are no hard limits, but considering that it has been claimed that the most popular YouTube
videos average less than minutes (Jarboe 2012), the researcher thinks that any work unit that
will take more than 3 minutes to translate, even if the contributor knows all the words in it,
should be considered long. As an example, Kiva presents descriptions of the applications for
loans that are between 200 and 250 words. Although this may not seem such a long unit,
translating 250 words can easily take an hour, which in the context of crowdsourcing is a
long time.
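The three-minute heuristic can be stated as a simple rule. In the sketch below, the
words-per-minute rate is an assumption for illustration; the interviews do not establish a
fixed rate:

    # A work unit counts as "long" if translating it would take more than
    # three minutes even when the contributor already knows every word.
    LONG_THRESHOLD_MINUTES = 3.0

    def is_long(word_count, words_per_minute=10.0):   # rate is assumed
        return word_count / words_per_minute > LONG_THRESHOLD_MINUTES

    # A Kiva-style 200-250 word loan description is "long" under any
    # plausible rate: at the hour reported above, 250 words work out to
    # roughly 4 words per minute.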

Pros of longer work units

Internal consistency
Subject A said that he thought "it's always better to have someone do a bigger chunk
than a smaller chunk" and Subject C commented that "quality is important and that requires
long text". They did not go into detail about why they thought so, but Subject V made a
point worth bringing up, which is consistency in style. In Subject V's project they use a
single translator per video because they "wanted to make sure that for one video we had the
same tone throughout the translation following the speaker". This matches Subject S's
observation that if one person translates an entire document, "you can be fairly sure that the
document will be self-consistent".
Simple credit assignment
Subject V also brought up credit assignment being easier with big work units, and she felt
that credit assignment was important: "if somebody was kind enough and interested
enough to translate the video, they should get the credit for translating the video".
Cons of longer work units
Increased probability of tasks being abandoned by volunteers
Subject C claimed that a problem of using long work units is that translators "will just not
finish their work. They will just translate a part and just, you know, left the project". This is
supported by Subject V commenting that "sometimes some of my translators may get tired of
a video. I think some of them once in a while I have to go and close a... not close, we reclaim
a video because it's been assigned too long to somebody who may have not time to finish".
This indicates that, at least in volunteer crowdsourcing, longer work units are less likely to be
finished.


Increased probability of tasks not being claimed by volunteers


Subject S commented that something they will have to do in the future is that "if we see
that an organisation is constantly uploading 10000 word chunks and they are not being
claimed, one of the suggestions about is to say to them that 'perhaps, subdivide this and see if
it's picked up that way' and if they are smaller chunks we think they'll be picked up". Subject
A also noted that "you don't want to make it [the work unit] such a big chunk that they don't
want to do it". This indicates that, at least in volunteer crowdsourcing, longer work units are
less likely to be picked up by volunteers.
Increased probability of tasks not being finished by casual workers
Subject M, who is involved with a crowdsourced translation platform where translators are
expected to work casually and are paid, also points out that a long work unit does not work
well in that context: "if the unit that they are supposed to deal with becomes too long, then
it's a burden for them".
Not suitable for the collection of alternative translations
Subject S noted that working with long work units means that collecting alternative
translations with unpaid contributors is less feasible, since "it's a big disincentive to the
community if you have 3000 words translation that's been claimed by three people and only
one of the three translations is used". Subject C added that "if ten people translated the same
document, they did it in several manners, they did it in several ways and there should be an
authority which decides which translation is better", and Subject P, when considering the
advantages of limiting the number of alternative translations, commented that they are
limiting them to a number that humans can cope with, "which is probably nothing more than
seven, but even seven seven terms maybe, but probably three sentences and even that would
be difficult", which points towards the possibility that the more options there are and the more
complex those options, the harder it becomes to choose one.

Limits the opportunities for parallelisation


Subject S noted that when you are doing document-level translation "it's very hard to allow
multiple users to claim a document at the same time or to have multiple volunteers working
on the same document".


Short work units


If long work units are those that would take more than three minutes to translate even when
the contributor already knows all the words in them, short work units are those that would
take less than three minutes.
Pros of short work units
Increased chance of being translated
Subject C noted that "from translator's perspective, the segment should be as short as
possible". This idea is reinforced by Subject AC's comment that "Content with short unit to
translate is probably more successful to be accepted by the community" and Subject M's
statement that "people get sort of easy interest in deal with small amount of text".

The reasons for this are diverse. Subjects C and AC talked about shorter strings being easier
to translate. Specifically, Subject C stated that "for translators, specially, not professional
translators, it's really hard to deal with long version of text", something that is supported by
Subject AC's opinion that in order to translate shorter units "you do not have to be so much of
a more linguistic based language expert, I would say, compared to somebody who translates
documents for example".

Another reason is the willingness of crowd members to tackle those units. In Subject
AC's experience, with "strings that are, I don't know, less than long paragraphs or maybe
sometimes not even the length of a full complete sentence as in the UI, I believe that more
people will jump onto that opportunity". Affecting this willingness is the time factor, which
was brought up by Subjects A and M. Subject M observed that with short work units "they
can do that whenever they have time". Subject A commented that having short work units
means that "people can translate maybe even on a phone and they just do three sentences here
and three sentences there when they are waiting at a bus stop or when they're idling for a few
minutes". The argument about working on a phone was also presented by Subject S, who said
that "if you are translating directly on a mobile phone, it's easy enough to translate a single
string cos it's only a small piece of text".


Subject AC said that people will also tackle shorter work units because "you get faster results
and you get a faster rewards in terms of seeing your translations published".

This increased willingness to tackle the task and the possibility of doing it at leisure may be
why, in Subject C's experience, "when we split the text into more portions, it's easier to
crowdsource, actually. It's easy to operate with small parts than operate with bigger ones",
and why Subject A said that when choosing a work unit size "you want it to be small enough
that they may make three now and then come back later and do 10 later".
Cons of short work units
Can require context to make sense
In Subject M's project, users were initially presented with isolated sentences that were used
to train their MT system, but according to him that "was unpopular" and "post editors cannot
have motivation".
Increased chance of language inconsistencies across units
Subject S noted that a problem of working with a smaller chunk is that "there's a greater
chance that inconsistencies will arise in documents, not just with terminology issues but also
keeping a style or tone", because of how different translators have translated different
sentences in the same paragraph, and even grammatical aspects, since "verb tense may not be
maintained"; these inconsistencies would require a greater review or post-editing effort.

Strings as work units


Subjects C, P, R and AC are involved with platforms that display strings to the user.
Strings have variable sizes
Strings, as observed by Subject AC, "can be a word, a single word. This can be an 'OK' or
'cancel'. This can be the content of an info dialogue that pops up, or an error dialogue. So the
length varies greatly, the length of those strings. It can be anything from a word, to a
sentence, to a paragraph".
The variable size of strings can be an advantage
Subject AC, when asked why they did not segment the longer strings, pointed out that,
although they do not, "that doesn't mean that it [segmenting longer strings] shouldn't be
done". There is an advantage of mixing sizes that they have noticed in the project: users
"start with very short strings because they are easier to translate and that helps
them get a feel for the tool, for the internet application that runs in the browser, and slowly,
when there are no untranslated short strings they move on to the somewhat longer strings
and at the same time, they become a little bit more experts, maybe for that language for that
product, or for the tool. So it almost feels like an organic process to offer to them strings of
different lengths, starting with a single word and up to a sentence".
Strings management is simple
Subject P, when asked why people present strings to their users, answered "No reason other
than the ease of creating the data", which brings out another advantage of using strings: the
simplicity of creating the data.

Segmented longer work units


Subjects A and M used combined granularities by presenting work units that were longer than
a sentence, but were segmented.
For example, Subject A said that they presented a paragraph's sentences "all aligned in a
source-target sentence by sentence basis". In Subject M's project they provide "the minimum
of the sort of meaning unit of these sentences, consisting of several sentences which was
sort of like a paragraph [...] and in the case of the description of the commercial products,
the [...] complete description for one product".
The reasons for doing this were that, according to Subject A, presenting the paragraphs in
sentences "allows for continuity". Subject M has experience working without continuity and
comments that they "provided sentences without context, one by one, but it was not really
popular".
Key findings
There are factors, such as the different ways in which languages structure text or the ways in
which programmers store strings, that will affect the choices available.
If you have long units, you can divide them into shorter work units by segmenting them (a
minimal segmentation sketch follows these findings).
Regarding long work units, on the positive side, they help achieve consistently written pieces
of text and the credit for the work is more easily attributed. On the negative side, longer
work units are more likely not to be picked up, or to be left unfinished. This affects even
translators who are paid but are not under a contractual obligation to finish the task.
Furthermore, working with long units makes using alternative redundant translations less
useful, because of the difficulty of choosing the right translation and the effect on the morale
of those members whose translation is not chosen. Finally, longer units mean not being able
to have people work in parallel, and hence result in reduced throughput.
Regarding shorter work units, on the positive side, crowds are more willing to work on them.
This may be because short work units:
-Are perceived as easier.
-Take less time and, as a consequence:
-Can be translated even using phones.
-Make the reward of having done a translation more immediate.
On the negative side, working with shorter work units risks higher inconsistency across units
and, if consistency is a requirement, a bigger effort to achieve it.
Strings, with their variable lengths, can work as a natural learning environment for crowd
members, and from a data management point of view they are easier to manage than
segmented strings.
It is possible to present segmented, longer work units that result in longer, consistently written
chunks of text while keeping some of the advantages of shorter work units; however,
this must always be done in such a manner that the work unit presented has global meaning
and is not just a random collection of segments.
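As referenced in the findings above, the following is a minimal sketch of sentence-level
segmentation (in Python; the regular expression is a naive stand-in for a proper segmenter,
such as one driven by SRX rules, and the example strings are invented):

    import re

    def segment(work_unit):
        # Naively split a long work unit at sentence-final punctuation.
        # A production platform would use language-aware segmentation rules.
        sentences = re.split(r'(?<=[.!?])\s+', work_unit.strip())
        return [s for s in sentences if s]

    paragraph = ("Click OK to continue. The file will be saved. "
                 "Do you want to overwrite the existing copy?")
    for index, sentence in enumerate(segment(paragraph)):
        # The index preserves order so the translated sentences can be
        # reassembled into a paragraph with continuity.
        print(index, sentence)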

Practice 3: Leveraging Translation Memory


As presented in Chapter 4, several crowdsourced translation platforms leverage TM.
Perceived advantages

Increased consistency
When asked about the benefits of leveraging TM in a crowdsourcing scenario, Subjects C, P,
AC and S brought up increased consistency. Subject C said that TM helped to make the
content translated by different people "have the same terminology". Subject P said that "in
terms of the community, it helps foster consistency". Subject AC said that "translation
consistency is our main reason to use it", and Subject S said that if TM is integrated in the
site, "you can easily share the resources so that you get greater consistency between
different translators".
Can help internationalization by replacing pseudolocalisation
Subject C also pointed out that TM can be a replacement for pseudolocalisation, which can
help detect issues with internationalization.
Can increase throughput
Subject AC also observed that TM "is extremely helpful" for people to see how something has
been translated before, which can help when doing other translations, thereby resulting in
less time spent translating.
Subject C brought up that when they have TM suggestions for a certain string, "it's, it's
usually, it's suitable enough", which implies requiring less time to finish, since no further
edits are required.
Perceived disadvantages
Adds complexity to the UI
Subject A commented that "this kind of stuff is great if it's relatively simple to interact with",
and Subject C noticed that in their platform "translators are not professionals, they don't know
what is actually TM and how it works". Subject P commented that in their platform they had
hidden away some of the technicalities from users because "people are often obsessed with
the percentage match and to most community translators we see they do not understand what
that means so by removing it we thought we did not remove any kind of value". This indicates
that TM potentially makes the UI more complex, and Subject A noticed that non-professionals
"are not going to use stuff that is really hard to interact with".
Can result in misuse of existing translations
Subjects C and M noticed that even a good match can be a bad translation. Subject C brought
this up by stating that the context of different strings may be different, and for the same
translation to work "the context of those string has to be identical", which is related to
Subject M's observation that the "same string which appears in one domain should be
translated in other domain differently". As Subject C noticed, this can result in contributors
"choosing the suggestion direct from TM" and submitting it to the wrong string.

Can contain poor translations


Another issue is the quality of the translations in the TM. Subject R noticed that "some of this
memory is not good memory, because not all the memory is really approved by real
translators. So you can found some phrases and some translation not, a little bit badly",
which is echoed by Subject P's observation that TM did not help languages without a
translation tradition, because "with new languages we found that there's a lot that they need to
learn in how to translate in their language".
Needs maintenance and management
Subject M commented that for "people who are working at a company, who are dealing with
high quality translation, managing TM, if they use TM, is one of the expertise they have",
which brings up another issue: for TM to be useful in the long term, it has to be
maintained and managed.
Measures to control the disadvantages
Flagging automatically leveraged strings
Regarding bad matches, be they caused by low quality TM or by changes in the context, Subject
C said that in their system project managers "can get a list of the strings that have been
translated without people checking", and that extra check helps prevent some issues
potentially introduced by automatic translation via TM or MT.
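The following is a minimal sketch of such flagging (the data layout and names are
illustrative assumptions, not Subject C's actual system):

    from dataclasses import dataclass

    @dataclass
    class TranslatedString:
        source: str
        target: str
        provenance: str      # "human", "tm" or "mt"
        checked: bool = False

    def needs_review(strings):
        # List the strings that were filled in automatically (from TM or MT)
        # and that no human has confirmed yet, for the project manager.
        return [s for s in strings
                if s.provenance in ("tm", "mt") and not s.checked]

    catalogue = [
        TranslatedString("Save", "Guardar", "human", checked=True),
        TranslatedString("Cancel", "Cancelar", "tm"),
        TranslatedString("Open file", "Abrir archivo", "mt"),
    ]
    for s in needs_review(catalogue):
        print(s.source, "->", s.target, "(" + s.provenance + ")")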
Letting contributors choose UI modes
Regarding the UI complexity issue, Subject A says that "you have to make your UI
configurable. If you want people to help, give them some info that says 'We can provide
dictionaries, we can provide glossaries, we can provide TM, we can provide MT. Which ones
do you want?' and let them look at it for a while and then turn off what they don't want". He
brought up an interface from an off-the-shelf tool that he thought worked well. The interface
says "Here's the source, here's TM and here's MT, and then there's a third block of empty
space and you pick what you want".
Key findings
In the context of crowdsourcing, where many contributors can work on a piece of text, TM can
help reduce inconsistencies. By shortening the time it takes to write a new translation or to
meet a minimum quality requirement, TM helps to increase the throughput. For apps, TM
can be used to pseudolocalise the app and identify internationalization issues.
Adding TM can add complexity to the UI, which can be off-putting for some crowd members,
so integration must be done carefully.
It is possible that the translations in the TM are not of optimal quality, especially in the case
of languages without a translation tradition, so TM management to help the TM evolve will
be required.
Crowd members may use TM matches that are not good enough, so special care should be
taken with TM matches that have been accepted without any editing.

Practice 4: Leveraging MT
Machine translation was used in several of the processes analysed in Chapter 4.
Considerations

It may be unnecessary
If the crowd is performing very well, you may want to skip MT completely. Subject A
commented that Adobe has their some success with their TV program and some stuff in
China, you know, I think there were able to bypass trying to do stuff with MT, because the
crowd contributions were so strong that they didn't need to.
Domain specific MT in crowdsourcing
Subject M said that in one of the platforms he is involved with clients can use it [MT],
because of domain is limited. The idea that domain specific MT systems perform better
(Hutchins 1998) seems to keep its validity in a crowdsourcing context.
Confidence scores for MT in crowdsourcing
Subject M also observed that if we can incorporate, self-judgment or self-evaluation to MT
engine, like the the MT engine after translating is the MT engine can show 'look this part is
reliable', 'this part may not be'. Then it can become useful. His suggestion is that adding a
confidence score to the MT output would be helpful.
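A minimal sketch of what Subject M's suggestion could look like in practice (the scores,
threshold and segment texts are invented for illustration; real engines expose confidence
in engine-specific ways, if at all):

    # Hypothetical MT output annotated with per-segment confidence scores.
    mt_output = [
        ("Haga clic en Aceptar", 0.92),
        ("para continuar el proceso", 0.41),
    ]

    CONFIDENCE_THRESHOLD = 0.7   # assumed cut-off, to be tuned per engine

    for segment, confidence in mt_output:
        if confidence >= CONFIDENCE_THRESHOLD:
            print("reliable:  ", segment)
        else:
            # Low-confidence parts are flagged so post-editors know
            # where to focus their attention.
            print("check this:", segment)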

Perceived benefits of MT
Increased throughput


Subject S said that MT can be used to "speed up translation and reduce the work load" and
increase the throughput that a single translator can achieve. Subject V conveyed a similar
thought when she said that "if people can start the translation like that and then spend some
time... a lot less time just fixing it or making sure it's good, then yeah, I think that's great".
Subject R also hinted at MT as a way of saving time when he said it "can be useful to begin to
construct the phrase and after that you can depurate and you can tune the phrase and put it
in your translation", which correlates with Subject P's statement that someone who
"understands what MT is and its limitations can use it quite effectively".
MT can be a vector for terminology
Subject M noted that in one of the projects he is involved with, the terminology was "sort of
run by MT running and tuning".
Guarantees no untranslated text
Subject M noticed that MT systems, unlike humans, never accidentally omit some important
information.

Perceived disadvantages of MT

Low quality
Subjects C, R, AC, A and M pointed out the possibility of quality issues. Subject C stated that
the quality they get from MT "is often much worse than what we get from TM". Subject R
said "I can read phrases really, really bad translated because you know that's it's translated
directly from Google or from Bing", indicating that he can identify MT output by its low
quality. Subject AC noticed that if the quality is too low it may not help the community
translators improve the translation quality, and that in some cases they even hear the
community laughing about the output. Subject M noticed that using a general MT system for
certain language pairs "doesn't work, it doesn't work at all", which echoes what Subject A
said when he stated that he had heard people say that "if you are using Google for Slovenian
or for Thai or Hindi, it's so bad that you may as well start from scratch, that you are better
off with a blank sheet".


Misuse
In Subject R's context, where people get rewarded by the system for suggesting translations,
he has seen people "that use [MT] without any correction".
Negative influence on novice translators
Subject P noticed that "new translators [...] are just gonna follow it blindly and will follow
it stylistically", and solving this issue requires "quite a bit of education".
Cost
Subject P brought up that "most of the free services have become not free", which means that
there may be a cost associated with using MT.
Negative reception from the crowd
Subject AC commented that sometimes they hear the community laughing about the output
of their MT system, which sounds unproblematic, but Subject S noticed that "a lot of
proofreaders are not happy to be proofreading machine translated translations", while they
were happy to take human translations and proofread those.
Key findings
If the crowd is performing well enough, skipping MT altogether should be considered.
As in the traditional paradigm, MT can accelerate the translation process.
Although the quality is generally perceived to be low, MT guarantees that there is a
translation available because it does not skip any text.
If the MT system is trained with domain specific TM or uses a domain specific glossary, the
right terminology will be embedded in the MT suggestions, which means that MT can
enhance consistency.
If you are rewarding users for making suggestions, there is a high chance that MT will be
used to exploit the reward system.
Novice translators may end up imitating the style of the MT system, and that may have a
long-term negative impact on quality.
Some members of the crowd may react negatively to MT.


Practice 5: Leveraging Terminology


Several of the platforms discussed in Chapter 4 leveraged terminology in their process.
Perceived benefits of using terminology
Terminology increases consistency
Terminology helping with consistency was implicitly brought up by Subject S when he said
that "One disadvantage of working with a smaller chunk is that there's a greater chance that
inconsistencies will arise in documents, especially if you don't have terminology assistance
or glossary assistance, so a particular term could be mistranslated or the capitalization of a
term could be ignored in one sentence but not on the other" (emphasis mine). Subject P was
more explicit when he said "the most useful tool we find is terminology, just because that
allows consistent use of terms". Subject AC brought up terminology as a tool that helps to
achieve consistency when discussing TM, by saying that "On our side, it's [TM] of course
preferred because we achieve consistency of the translation as they are coming in. The same
is true also for terminology databases".

Terminology simplifies and therefore speeds up the translation process


Subject P noticed that besides helping with consistency, terminology makes the task easier,
since their collaborators "can string a sentence together but it's the kind of domain specific
words or more complex words that they need some help with", which is supported by Subject
A's observation that providing terminology "was very helpful" to people because "they didn't
have to look up those words". Subject A expanded his view of why terminology was helpful
by saying that terminology "accelerates [...] and also, [makes it] easier for the contributor".

Terminology mitigates terminology related arguments


Subjects R and V do not have terminology integrated in their systems. Subject R commented:
"I didn't use terminology databases because, I think, in our case, in the [[open source
translation team]] we have some wikis for this problematic". From the researcher's
experience following the group's distribution list, it can be stated that for this team the wikis
are a way to solve arguments about terminology, not a way of increasing productivity or
quality. Terminology and TM as ways of solving issues in the community were also brought
up by Subject S when he said, regarding continuous edits caused by disagreements on
terminology, that a "TM system or a glossary of terms provided by an organisation would help
to mitigate that as well, because you have an authoritative version of the term".

Integrated terminology helps keep the terminology in the translations up to date


Subject V said that without integrated terminology, they send their translators elsewhere:
"from our site from [[project]] there's a link to [[site]] UI database. I don't know that it's the
best for me to have my committee translations on [[project]] in that is definitely one aspect
that I would like to improve, to give them a more direct access to our UI terms glossary",
since she thinks that with the current approach they may be pushing live translations that
"may not have the correct translation as far as UI terms go".

Terminology can also be implicit in MT


Subject M commented that one of their platforms does not use terminology in an explicit
manner because "It's sort of run by MT running and tuning".

Displaying Terminology Definitions


Subject C outlined how they present terminology in a simple manner: "it appears as a hint,
it's just as the title to the source string, it just says 'Hey, you see this word, it should be
translated like that'", or, even when "they are not translation", "it just helps with the
description".
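A minimal sketch of generating such hints (the glossary entries and the idea of surfacing
the hint as the HTML title of the source string are assumptions based on Subject C's
description, not his platform's actual code):

    # Hypothetical glossary: term -> preferred translation or description.
    glossary = {
        "commit": "confirmar (in the version-control sense)",
        "string": "cadena",
    }

    def hint_for(source_string):
        # Build the hint text that could be shown, for example, as the
        # HTML 'title' attribute of the source string.
        matches = [term + " -> " + gloss
                   for term, gloss in glossary.items()
                   if term in source_string.lower()]
        return "; ".join(matches)

    print(hint_for("Commit the string to the repository"))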

Perceived disadvantages of leveraging terminology


None of the subjects expressed any negative opinions on using terminology in a
crowdsourcing context. Subjects C, P and A were especially positive. Subject C said that
using terminology "only have positives influence on the project". Subject P said that "the
most useful tool we find is terminology". Subject A said that they provided "lots of
terminology" and "that was very helpful".
Key findings
Having terminology tools integrated in the platform fosters consistency.


Having terminology tools integrated in the platform fosters the usage of the most up-to-date
terminology.
Having terminology tools integrated in the platform makes translation easier and faster.
Having terminology can help mitigate terminology-related arguments.
Even if a translation has not been decided upon yet, including a definition in the terminology
is helpful.
Terminology can be integrated in the MT suggestions if the MT system has been trained with
the right material.

Practice 6: Translation without Redundancy


Translation without Redundancy is used in the Kiva process analysed in Chapter 4 and also
used by the project that Subject V manages.
Three approaches
During the interviews it emerged that this practice is implemented in three different ways:
1) Translations are written by translators tested a priori.
2) Translations are tested a posteriori by the addition of a review stage.
3) Translations are written by translators who have proven their skills in the current or
previous projects.
Translations from tested translator
A translator can become a trusted translator by passing a test, as happens with Kiva
translators. Subject M is involved with a platform that uses Translation without Redundancy
and pays for contributions. In this platform, in order to get registered for the service, translators
"need to go through the first exam" and then get their competence checked by expert
translators. In this context Subject M says that the test, "from the point of view of managing
the system, it's a minimum quality control", while "from the point of view of the translators,
[it] is sort of the motivation and pride".
Perceived disadvantage of the tested translator approach
Reduced crowd

The issue with this approach is that, in the case of unpaid collaborators, it can turn people off.
This was illustrated by Subject C's statement that a "really big problem of the approach of
having some kind of preselection, some kind of moderation of participants is that [the] project
cannot be really big", which is supported by Subject A saying "We used to actually have a
test for translators to see that they were good enough to do the translation and that turned so
many people off that we stopped doing the test".
Reduced scalability

Subject C noticed that if you are working with a smaller crowd "this is not real
crowdsourcing and it has all of the problems I mentioned: scalability".
Reduced value from a marketing perspective

Subject AC thought that community translation "always becomes a marketing effort of a
company, which is smaller than the intended translation target, but it's important, so if you
start testing the people, or have them write applications, then you get a much, much smaller
list of participants", which means that the project loses value as a marketing effort.

Adding a review stage


In Subject V's project, where no redundancy is used, an extra validation stage is performed by
a trusted person, in this case a paid professional. According to Subject A, TED Talks also
works in this way: they have a subject matter expert "that just cleans up a little bit and
then the feedback is given back to the original guy saying that 'We made these corrections not
because you made a bad translation, but because this guy has expertise in this area and he
thought that this was maybe not quite correct and you need to say it like this'".
This approach was also suggested by Subject C as an alternative to testing translators
beforehand, when he said that "it's easier, and this is the right approach, to have additional
stage of validation instead of having, instead of dropping a lot of community being involved
in the project".
Data mined trusted translator
Translators can become trusted translators after having collaborated for a while and delivered
consistently good work, as in Subject A's project, where they combined Hidden Alternative
Translations and Translation without Redundancy. They used the data about the performance
of the translators to identify trusted translators whose contributions "just go through", even if
there is only one of them.


Perceived advantages of Translation without Redundancy


One-task-one-person distribution ratio
This approach makes the most of the available crowd by avoiding scenarios where people
"translate the same stuff over and over again", which Subject A identified as "a waste". The
main advantage of this is the optimization of your resources, since each translation unit goes
to one translator, who produces a single translation.
Simplified credit assignment
Another advantage of this approach is the credit being clearly assigned to an individual.
Subject A brought up the example of "someone who is doing an article on how to install an
Exchange server... he wants to do the whole thing. He doesn't want to do one sentence,
because he wants the association with... I did this and I know about how to do exchange
servers". Subject V also saw value in someone being able to claim a whole translation as
their work, and commented that "if somebody was kind enough and interested enough to
translate the video, they should get the credit for translating the video on their own".
Perceived disadvantages of Translation without Redundancy
If combined with longer TUs, reduced throughput through limited parallelization
The systems with which Subject V and Subject S are involved use no redundancy and long
TUs. In these systems, when a user claims a task, the task stops being available to anyone else,
since not locking the task would mean risking another user tackling it and creating
redundancy. A single translator working on a task will take more time than several translators
working on the same partitioned task, as illustrated by Subject C commenting that "in case of
having a sentence to be sent to each translator, they can work on a small part of this, of the
whole project", which gets results faster and gets them committed to the project.
If using longer TUs, higher chance of tasks not being finished
When combined with longer translation units, a disadvantage of this approach is that you
depend on one person for a bigger amount of text, and that person may not complete the task.
Subject V brought up an example when she said that "if they grab the translation
and don't finish it, a lot of time it's because they just want to try it out a little bit, or they are
not really a translator, they just wanna have fun with it".
Potentially wasted effort if translations cannot be retrieved when reassigning tasks


In the project managed by Subject V, when a task is not finished "they have to start from
scratch" because all work is lost when they reassign it. Thus, if a system does not allow
reassigning tasks while keeping the data or retrieving it easily, unfinished work goes to
waste. The issue of work being wasted may be minor, since in the experience of Subject V,
when translators do not finish a task "there won't be a lot of it before they give up on it",
and she does not think that she has ever deleted a translation that was 90% complete.
Key findings
Translation without Redundancy works much like the TEP approach. It is possible when
there is trust in the translations.
There are three ways in which the translations become trusted:
1) The translation comes from a translator who has been tested.
A consequence of testing translators is that the crowd will be much smaller.
A small crowd weakens the value of crowdsourcing translation as a
marketing strategy.
A small crowd does not allow the scalability of a bigger crowd.
2) The translation is submitted to an additional review.
3) The translation was written by a translator who has already proven their skill in the project.
Translation without Redundancy is the least wasteful approach, which makes it suitable to
increase the impact of smaller crowds.
Credit assignment is very simple and can be used for motivation.
If used with longer TUs, the risk that a task will not be finished is higher.
If used with longer TUs, low parallelization further decreases the throughput.
When a task is reassigned, it can be wasteful if there is not a mechanism to recover the work
that has already been carried out.

Practice 7: Open Alternative Translations


This is the practice of openly collecting multiple translations for a single source TU. It
appears in Facebook's, Crowdin's, Launchpad's and Pootle's processes.
Crowd suggestions as starting point for experts
Subject P does not think collecting multiple alternatives and letting the crowd choose one is a
good idea, but notices that there is value in using the crowd suggestions as a starting point,
when he says that after looking at suggestions, "As a linguist at last you tend to see 'ah,
there's a pattern here'. They actually mean this type of execute and they've given these three
suggestions and I don't like any of them but I know now what they mean and this is a better
term that we have in our language". Subject R expressed an opinion that aligned with
Subject P's when he said that he thought that "it is very useful to have different
contributions for the same chain, because you could elect this way or maybe debate in the
mailing list and say 'OK, there are two possibilities. Guys, what do you think about...' and so
on and so on".
Multiple alternatives as MT training material
Subjects A and M brought up the value of the redundancy when training MT systems. Subject
M stated that "From the point of view of training MT or kind of developing the writing
system, automatically writing system, we need to... It's good to have different versions", and
Subject A explained that there is value in seeing slight variations "because that teaches
computers also how the same sentence could be translated in 7 different ways".
Enables crowd assessment in the form of voting
Subject AC said that an advantage of redundant alternative translations would be "the voting
mechanism [...] that's of course an advantage that will lead to higher quality".
Fosters more engagement from the crowd
Subject P commented that one of the advantages is that the community is participating, but it
was Subject AC who made the most observations about this, saying that the advantage of
letting contributors make more than one suggestion is that "there's this community dynamic
because as soon as you have more than one translation for any given string, people can start
discussing 'why is this the better translation, why that...'"; this links to the observation made
by Subject R that the multiple options were discussed in their mailing list. Subject AC also
said that "as a result of being able to see the different options people get more involved, it's
more fun for them to discuss and make sure that what they think the best quality is can be
achieved". The fun factor was also brought up by Subject M, who was not keen on this
approach but said "I think when the aim is to enjoy, the content, or to enjoy the text itself,
it's fine".
Selecting the right translations is a challenge
Subject M brought up the need to select as a disadvantage: "we need to have the sort of
evaluation judgment for it. Look, this translation is better than this". Subject S went into
more detail: "again the one thing is how do you select to take as your final translations. You
need to have some mechanism for that, either automated or you need to actually have
someone physically sit down and then review the document and put that in and if you are
selecting from multiple sources again you have to make sure that the tone is maintained, that
the grammar is maintained as well in the document so that it's self-consistent in the final
output". So there is the issue of selecting, but also the issue that that selection has to be
consistent, which can be a challenge when collecting translations from different volunteers,
as was discussed previously when addressing the disadvantages of short TUs.
Creates redundant data
Subject AC said that "if you have ten strings, as a matter of fact people will only look at one,
two or three strings, at the top anyway and the rest just becomes a dead body in the
database".
Creates noise
Subjects R and AC brought up that the number of translations can be an issue. Subject R
expressed mild concern about information noise when he said that there can be "a lot of quite
similar translation and, ok, maybe, you could say it is a lot of noise in the translation".
Not suitable for longer translation units
Subject S noticed that using this kind of redundancy "would generally prevent you from using
larger translation unit, like a document because it's a big disincentive to the community if
you have 3000 word translation that's been claimed by three people and only one of the three
translations is used".
One-task-multiple-person distribution ratio
By using this approach, people "translate the same stuff over and over again", which Subject
A identified as "a waste".
Key findings
Crowd suggestions can be used as a starting point for the work of professionals or trusted
members of the community, hence enabling the Expert Selection and Edition practice.
The creation of multiple translations for a single source is useful for training statistical MT
systems.


Enables crowd assessment, both open and hidden, to be used for the selection of the
published translation.
Fosters crowd engagement.
Selecting the right translation for publishing is challenging.
The existence of multiple alternatives generates noise that affects the decision-making
process, and redundant data that weighs down the databases supporting them.
It is better suited for working with smaller TUs.
It wastes effort in the creation of translations that will not be used.

Practice 8: Hidden Alternative Translations


Initially this practice was called limited redundant alternative translations, but during the
interviews it became clear that the core feature of this practice was not so much that a limited
number of alternative translations were collected, but that those translations were not visible
to the members of the crowd. On the industry side, txtEagle and CrowdFlower collect
multiple redundant translations, but do not display them to the users (Eagle 2009; Bentivogli
et al. 2011); in academia this approach has also been successfully used through Amazon
Mechanical Turk (AMT) (Zaidan and Callison-Burch 2011). In all three cases, translators
obtain material rewards for their contributions. This practice was also enacted in Subject A's
project, where new, untrusted volunteers did not see the suggestions of other volunteers.
Many of the features of unlimited redundant alternatives apply, and the differences will be
covered in the discussion part for this practice. Some of the features below have to do with
the practice being used to collect a limited number of alternatives, which is the case in all the
real-life implementations at the time of this writing.
Features resulting from the collection of alternatives being limited in number
Time saving on the selection process
If this technique were used in combination with Expert Selection and Edition, Subject R
noticed an advantage in that "no-one can make a decision when there are ten suggestions but
they can if they've got three", so there may be marginal time improvements in the selection
process. Asia Online's Wikipedia translation project did exactly that, by collecting three
hidden alternatives from the crowd and letting a professional choose one when none of them
matched each other (Vashee 2009).

Cost saving on the selection and review process


Subject C observed that if you pay for someone to do the selection or the proofreading, "all
of the additional suggestions will be charged by the agency, by the freelancer, by anyone who
will read those translations".
Reduced redundancy results in more optimized task-to-resource ratio
Subject A's statement that "It's kind of a waste if you have people translate the same stuff
over and over again" can be used to claim that this practice, by virtue of reducing the
amount of redundancy, reduces the amount of effort wasted.
Optimal translation may not appear
Subject R noticed that by limiting the number of translations "you can lose some good
translation because the number is not the five or the six or maybe the position of the
translation", meaning that a better translation may have emerged later on.
Features resulting from the alternatives being hidden from other translators
Enables frequency of translation based selection
This approach also enables a Metadata Based Selection method, based on the frequency with
which a translation is suggested, that was used in Subject A's project. He brought it up by
saying that "we ended up getting multiple versions of the same thing and the ones that had the
more people do it exactly the same way automatically got accepted", and explained it further
by saying that "if there's say 10 translations of a segment, we will take and say 4 of them are
identical and the others are different, we'll choose the one that has four identical and say
'This is the one we use'".
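A minimal sketch of this frequency-based selection (a simple majority count, assuming exact
string identity as the matching criterion; Subject A's system is not described at this level of
detail):

    from collections import Counter

    def select_by_frequency(suggestions, min_agreement=4):
        # Accept the most frequent identical translation, but only if
        # enough contributors agreed on it independently.
        if not suggestions:
            return None
        translation, count = Counter(suggestions).most_common(1)[0]
        return translation if count >= min_agreement else None

    suggestions = ["Guardar", "Guardar", "Salvar", "Guardar",
                   "Guardar cambios", "Guardar"]
    print(select_by_frequency(suggestions))   # Guardar (4 identical votes)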
Increased throughput by automatically directing collaborators to strings that lack translations
meeting the automated selection quality criteria
By combining hidden redundancy with frequency or provenance Metadata Based Selection,
as in Subject A's project, "you have some intelligence in your process that says 'OK, this
sentence has been translated, let's make sure that no-one else tries to translate it'", which
makes your process more efficient: "you focus on new work".
Volunteers are not aware of existing good translations
The other side of the coin of this practice is that the first translation may already be good. As
Subject A noticed, "it is probably good to show people [what has] been done already and they
can vote on it, so that they... so, it's more efficient too. [...] You want them to move on to new
content".
Potentially more variations for MT training
Subject A also suggested that by hiding the suggestions "You get more variations, for MT this
is good because you get more variations".
May hinder community-related motivation
When asked about this practice, Subject AC's answer included the following: "If it's a one way
street, where users only submit their stuff and do not see what others have submitted, only the
organizer sees this, I'm wondering what happens to my motivation".
A communication strategy to address the lack of transparency must be adopted
When asked about hiding the suggestions of others, Subject V brought up the issue of
valuing your translators, saying that she "would hate to introduce any kind of potentially
negative, or potentially people not being happy, because they feel like somebody else may be
taking on their translation on, I would feel bad from a human point of view and I am trying to
put myself in the person's skin, on the other side of the computer, translating and spending
time on, you know, let's say video, but it could be something else and then, feeling that maybe
they are being not 100% respected". In Subject A's project, where this practice is used, the
volunteers understood that they would make a contribution and then the contribution would
be accepted or rejected. They addressed the motivation issues emerging from this by using
material incentives, as seen below.
Material motivation strategies must be used to replace the community motivation
Subject A also commented that for the users "their contribution earned them some points,
earned them lottery tickets, whatever it was and that's all they cared about", which indicates
a need to replace the motivation generated by the community with some kind of material
reward, or potential material reward, for this practice to work.
Enables using redundancy with longer translation units
Subject S noticed that by using this practice "you could use redundancy even if your
translation unit size is larger documents, [because] you don't really get the same negative
impact that you have in the community that their work may not be used... because that's
hidden from them".


Key findings
By having limited alternatives, it saves time in the selection process.
The limited alternatives also save cost if the person doing the selection is being paid.
By limiting the redundancy, the resource-to-task ratio is better than in Open Alternative
Translations.
The optimal translation may not be suggested, so the quality can be lower than when using
other approaches.
Enables frequency-based automated selection of translations.
By combining it with Metadata Based Selection, it increases the throughput by directing the
effort towards those TUs that are yet to be translated.
By hiding existing translations that may be good, you may be creating redundancy that
does not add value, because a better translation already exists.
Can produce more variants that are useful for MT training.
By hiding the translations, community activities such as open voting and commenting
disappear, and this weakens the motivation related to those activities.
By hiding the process from the translator, you run the risk that they feel they are not being
appreciated; this must be addressed in your communication strategy.
Material incentives are required in order to replace the community incentives.
If your community is not aware of the approach, it is possible to combine it with longer TUs,
but this contradicts having a communication strategy that addresses the method and is a
suboptimal approach because of the issues of using long TUs.

Practice 9: Super Iterative Translation


This practice appears in the processes enabled by DotSub, Amara and Facebook in Chapter 4.
In the case of DotSub and Amara, it is very similar to the way that Wikipedia works, letting
users modify the existing articles. In the case of Facebook, where the current translation is
replaced by a new one when the new one has more votes, the process is more akin to
multiple-branch open source development, where branches are merged in when they add
value.


Enables iterative quality improvements


Subject S thought that the advantage of it is that "over time the translation should improve
and then over time certain sections of it would stabilize and you could freeze out". This idea
of improvement was also brought up by Subject V when she said that "People could either fix
it or improve on previous translations". Also Subject C, whose platform also enables this
practice, suggested that it can help with quality when he said that letting contributors "work
on the same text and the same translations and modify them a bit, improve, is actually what
crowdsourcing do".
The first iterations may be of low quality
Subject P explained the importance of iterating on translations in general by saying: "When
we are doing community translations or any translations actually, I think the first translation
is always gonna be rubbish. Even when people translate Microsoft and they get the best
people and they talk about how amazing they were. [...] But the first translation of Microsoft
in [[Subject's native language]] was rubbish and it needed an iterative process to become
better".

When published automatically, enables the quick correction of translations after seeing
them in context
Subject P also illustrates how this process works in the context of localisation: "often when
people are translating user interface stuff they are not seeing stuff in context, so when they
see it in context they are able to see 'er well, this isn't working. Let me see and find that
string and fix it' and that kind of iterative thing works well in new languages where people
are discovering words that could be better or didn't work".
Leverages existing translations without using MT or TM
Subject S noted that with this approach it is easier to leverage old translations than with the
traditional TM approach: "If we allow the same document to live for the lifetime
of the software you will not have to start over each time, in each language. Granted that if
you've got TM matches and an MT system you could actually reuse some of the previous
translations, but, you could do this without having to put all that infrastructure in place as
well".
A poor or incomplete translation can kick-start the process

Subject A indicated that a poor or incomplete translation can serve to kick-start more
contributions by saying: "If having something helps draw more people in, so for example,
you may have a guy who starts, something really out of the way like Zulu and he does the first
10 minutes of a video and then he gets busy and he can't do it anymore and then maybe some
other one sees the first ten minutes of the video and says 'Damn, I wish I could see the rest of
it, so I'll find other people to finish it'. There's some value to that".
Presenting poor or incomplete translations can have a negative impact
However, both Subjects A and V brought up that a poor or incomplete translation can have a
negative effect too. Subject A noticed that "you don't want an article that really leaves people
hanging. I don't know, tech support related, you wait 'til it's done before you put it up. There's
some things that you want to make sure there's management before they go up", and Subject
V said that "If the end product is high visibility and important at that, you should never have
such a wrong, let's use the word wrong although it's not a pretty word, wrong translation.
Then you should have moderation in place in some form".

Identifying Term Candidates


The fact that people can continuously change a translation can be used to identify
controversial terms, as Subject S noticed when he said that "if you found that one particular
sentence or a group of sentences is changing very, very often and nothing else was changing,
you could use that as an indicator that that may be happening, that it could be two people
fighting over a particular term, but again, using a TM system or a glossary of terms provided
by an organisation would help to mitigate that as well, because you have an authoritative
version of the term. This can actually be used to help identify items that need to be made into
terms and add it to the glossary as well".
Keeping the translation alive along with language
Subject P addressed the value of iterating indefinitely by saying: "I actually don't think
translations should ever not be modified. [...] my feeling is that, well, languages are in such
flux that you need to improve and change" [my emphasis]. When asked about choosing a
time for freezing the translation, Subject R supported the idea of constant flux when he
answered: "Every time you could modify the translation, because the group could change his
mind about how to translate some kind of words, some kind of using and there's no finalizing
translations at any moment [my emphasis]. I can, I can rewrite every template, every
software in every moment. Of course, following the rules of the group, but they are no closed
translation".
Sensitive translators can become an issue
Subject R noticed that translators working within this iterative paradigm need to be "a little
bit, I know to explain, to open mind to be corrected by other people and not to be afraid
about, because there are people who want to debate about 'why do you correct my phrases.
It's really good translated' and so on".
Storage approach considerations
From the point of view of the software architecture, a decision has to be made regarding how
to store the translation. Subject AC brought this up when he said: "the question that comes to
mind right away: if you modify a string what happens to that string? Do you save it in the
modified form or do you save the new variant of the string as a separate string? That's
something that is relevant for the database". Subject S went deeper by addressing the two
approaches he considered: "you have to figure out exactly how you gonna store this, so you
could save the deltas, the change between each document. The issue with that is that
over time it's going to take longer and longer to compose the actual current version of the
document. So, if you have just the original document and a series of deltas it will take time to
render it and you are going to see, even when you are changing it, it's going to take longer
and longer for it to save, but you will save space, whereas if you constantly do snapshots
approach, where you have this is the current state, that's the next state, and then the next
state... The issue of changes means that you will actually have to save a lot of physical space
to store this, so that can be one of the disadvantages".
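A minimal sketch contrasting the two storage approaches Subject S describes (hand-rolled
change records rather than any platform's real storage layer; the example texts are invented):

    # Snapshot approach: every full version is stored; reading the current
    # text is a single lookup, but each edit costs a whole copy.
    snapshots = ["Hello wrld", "Hello world", "Hello world!"]
    current = snapshots[-1]

    # Delta approach: the base version plus one small change record per
    # edit; compact, but recomposing the current text means replaying
    # every delta, which takes longer as the history grows.
    base = "Hello wrld"
    deltas = [("wrld", "world"), ("world", "world!")]   # (old, replacement)

    def recompose(base_text, change_records):
        text = base_text
        for old, new in change_records:
            text = text.replace(old, new, 1)
        return text

    assert recompose(base, deltas) == current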

Metadata rich version control


Although he did not bring it up when asked about the iterative auto-published translation
practice, Subject A, when asked about methods for automatically selecting translations for
publication when using alternative translations, talked about how they use metadata to
approve some translations, but also said that they have version control for it, given that "The
data gathering, the data management exercise needs to be multidimensional. It needs to be
version, segment, contributor, admin... because one admin may say it's good and one may say
it's bad, or maybe your trusted translator was having an off day and did really bad
translations on this one day".

Iterations as MT training material


Storing different versions of the same translation unit as they evolve over time can also
provide material for MT training, as noticed by Subject M, who said that although they do
not collect multiple redundant alternatives and actually use a TEP approach, they are
"actually collecting the translation log, but from draft to modified, improved versions and
things like that. That we find very advantageous for developing the MT system or developing
the kind of system that automatically suggests dubious parts of the training translators'
[translations]".

If not frozen, some translations may lose quality


Subject P identified one of the issues that happened in projects where he has been involved,
because they never froze the translation: "sometimes you change a translation to work within
an interface concept, an interface structure that is working in English, but is not working in
[[native language of the subject]] so changing the structure, so it is not actually a direct
translation. The problem we've seen with that is that it always gets reverted to the 'correct'
translation, which is incorrect in its actual usage".
Some translations need to be stable
This practice results in translations that can change at any stage and there are reasons why
you may not want translations to change. These are discussed in the following practice:
Freeze.
Key findings
Over time quality should improve.
The first iterations are most likely going to be of low quality.
Corrections are visible as soon as they happen, or as soon as a new release comes out.
Leverages existing translations more easily than MT and TM by addressing only the changes
in the source.
A poor or incomplete translation can trigger more community involvement.
A poor or incomplete translation can trigger backlash.
Helps to identify term candidates that are controversial.
Enables the translation to evolve along with the language.

Measures must be put in place to manage the reaction of translators who are not comfortable
with their work being edited by others.
Two approaches to storage exist: deltas and snapshots. If you decide to use deltas and many
changes are made, the system will take a long time to compose the final version. If you
decide to use snapshots, you will need high storage capacity.
The version control systems should be rich in metadata.
The iterations can be used as MT training material.
Translations may occasionally lose quality.
Some translations need to be stable.

Practice 10: Freeze


Although it is not a translation platform, Wikipedia is an iconic crowdsourcing platform, and
entry freezes are one of its defence mechanisms (Warren et al. 2008). In crowdsourced
translation, Facebook uses freezes as a way to avoid vandalism (Losse 2008). As seen in
Chapter 4, DotSub allows the owner of a project to freeze translations too, and Crowdin
freezes translations when one of them is selected.
Although in the interviews freezing was considered in the context of Super Iterative
Translation, Subjects C and AC also considered freezing as the point when you stop
collecting alternatives in the context of Open Alternative Translations, which is the
implementation that appears in Crowdin's process.
As Subject S noticed, there are challenges to using freezes, and one is selecting when to
freeze, "and that in itself can be a tricky problem". As said above, vandalism and quality
degradation are reasons for freezing, but during the interviews other reasons to freeze
translations, and points when it could be done, were brought up by the interviewees. These
are listed below.
Prevention of quality degradation
One of the reasons for freezing is the fact that text that cannot be changed cannot be replaced
by malicious or poorer translations. Subject C brought this up by saying that freezing "is also
related to the vandalism". Subject P did not talk about vandalism, but thought that in certain
situations there is a risk "when the community is so big that people can do more damage
than good", and that some of the thinking behind freezing translations is influenced by
preventing damage. When asked about ways to minimize the impact of vandalism, Subject S
answered: "The only measure I can actually think of to minimize, it would be to freeze the
actual document so that no modifications could be made".
Some translations need to be stable for usability reasons
Another reason for freezing was identified by Subject P when he said the following: "I think
it's more based on the theory that users don't like changes in their translations".
Some translations need to be stable for legal reasons
Subject M noticed that some documents have to be stable, when he said: "Take for instance, if
you want to collect a petition, right? And signature from people for the petition? If the next
day the translation changes, they lose sort of trust. You know, people start, you know, getting
suspicious, right? So they need to fix it, right? Yeah. So, for this kind of document, the sort of
document that NGOs are dealing with, at certain point they need to fix it and they cannot just
keep revising".
Finishing
Another reason to freeze is making clear progress towards finishing the project, as Subject C
indicated when he said that a "translation freeze is absolutely needed stage. At some point,
project should be finished, one project should be finished". In some contexts, having a target
such as finishing can help to motivate contributors.
Top-down freeze in order to finish
Subject C puts the responsibility of choosing the freezing point on the people who manage the
project when he says that participants "have to have the ability to modify pieces of translated
text but at some point where the translation, when the authority said ok, this has been
translated. This possibility disappears". He further elaborates by saying that the moment to
freeze comes when "a project manager or translator granted or promoted to proofreader
validates one of the previously added translations variants". Subject S expressed a similar
opinion, suggesting that "a trusted user or an admin from the organisation or the person
who actually published it in the first place, once you had the work done, can say 'OK, this
document has met our quality level'" and then manually freeze it. This approach is also used
in Subject AC's project, where "Once that string is approved, as translation, the translation is
frozen".
Freeze towards a release
Subject P brought up freezes linked to releases with the observation that "there are natural
freezes that happen in terms of software when something is released". Although he talks
about "natural freezes", it seems more suitable to consider those scheduled freezes, and to
think of them as points when the product is finished for the time being, that is, until the
product is updated.
Freeze to reinforce a natural freeze
Subject S commented that "over time certain sections of it would stabilize and you could
freeze out"; this can be considered a natural freeze that moves the project towards being
finished. His suggestion was that freezing could be applied "if a section hasn't changed in 5
or 6 situations, so it's probably good enough", though he notices that "that could be a naive
implementation".
Freezes can increase speed
Subject C said that "if you have the feel [need] to finish it faster you should do it": in his
platform, once a translation has been approved, the system can be configured so that that part
of the text will not be available to the translators, and this way "the project gets smaller and
it will be handled the rest of the strings easier, faster". This works by directing the effort to
parts that need it, instead of allowing infinite refinements to parts that already meet the
desired quality level.
Unfreezing mechanism becomes a requirement
Subject C observed that it may be possible that the "authority or maybe community passed
[pass] vandalism to this frozen text" or, as Subject S said, that after freezing "you find that at
later date that something that has been frozen actually has allowed a mistake to be frozen",
and that is why Subject S states that "you have to have a mechanism for unfreezing as well as
freezing", which adds complexity.
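A minimal sketch of a per-string freeze/unfreeze mechanism (the flag-based design and the
auto-freeze rule, loosely following Subject S's "hasn't changed in 5 or 6 situations"
heuristic, are illustrative assumptions):

    class TranslationUnit:
        def __init__(self, source):
            self.source = source
            self.target = None
            self.frozen = False
            self.stable_rounds = 0   # consecutive review rounds without edits

        def edit(self, new_target):
            if self.frozen:
                raise PermissionError("Unit is frozen; unfreeze it first.")
            self.stable_rounds = 0
            self.target = new_target

        def tick_review_round(self):
            # Stable units auto-freeze after five unchanged rounds, which
            # Subject S himself notes may be a naive implementation.
            self.stable_rounds += 1
            if self.stable_rounds >= 5:
                self.frozen = True

        def unfreeze(self):
            # The unfreezing mechanism Subject S calls for, e.g. when a
            # mistake turns out to have been frozen in.
            self.frozen = False

    unit = TranslationUnit("Save")
    unit.edit("Guardar")
    for _ in range(5):
        unit.tick_review_round()
    # unit.frozen is now True; unit.unfreeze() reopens it if needed.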
Potential demotivator
Subject AC noticed that "it may be somewhat demotivating for contributors if they see
something is translated already". Along the same lines, Subject A noted that you have to
manage the reaction of the community to the frozen text when he said that if someone cares
enough to tell you that a translation is not good, "that contribution should be respected and
you either say that 'OK, it will be taking into consideration and at a later point we will revise
it', so, they need to be told what's gonna be done with their feedback".

Pseudo freezing
Subject C's platform can allow users to add suggestions after freezing, but their work "does
not influence the resulting text, the resulting translated text, without opening issues".
Something similar happens with Subject AC's project, where translators can keep making
suggestions indefinitely and their approvers "will be notified if a better, well, if another
translation comes in so that they can check again". This avoids the demotivating factor of not
being allowed to suggest better translations, but requires someone to take the responsibility of
responding to requests for an update, and prevents the contributions from being directed to
parts that need more work and have not been frozen yet.
While frozen, translations do not evolve
Subject P was against freezing translations and stated that he feels like a translation "should
be evolving as the language evolves and as it evolves stylistically". Subject R's view was that
at any time "the group could change his mind about how to translate some kind of words,
some kind of using and there's no finalizing translations at any moment", which echoes a
similar feeling to that of Subject P.
Key findings
The reason to freeze must be considered:
• Prevent the degradation of quality, be this well or ill intentioned.
• For usability reasons.
• For legal reasons.
• In order to close the project.
The points at which to freeze:
• When the desired quality has been achieved, as decided by a project authority.
• Decided according to a release plan.
• When a translation has become stable on its own.
Freezing can increase speed by forcing contributors to direct their efforts towards unfrozen
text.
Enabling freezing creates the requirement of being able to unfreeze in case of issues.
By preventing collaborators from contributing on parts of the text that appeal to them,
freezing may negatively impact their motivation.
Pseudo freezing: by letting contributors add suggestions to a frozen translation you prevent
the lockout demotivation effect, but some of the speed-up effect is lost and it is necessary to
have an expert that manages the updates to the translations.
Frozen translations do not evolve, and over time translations may sound dated.
Practice 11: Version Rollback
Rollback, reverting to a previously existing version of the data, is another mechanism used in
Wikipedia to support damage control (Wagner 2004); it is also enabled by Amara and
DotSub, as shown in Chapter 4.
Versioning system should include user metadata
Subject A commented that they had "versioning in all the translations that we had done […]
it was possible to Version Rollback if you needed to, if it made sense". In their project the
versions used to roll back translations were also associated with the contributor. In
Subject C's platform it is also possible to roll back changes, and again the versioning system
attaches the agent of the change to the changes themselves, allowing you to "roll back some
contributions done by several people if they get negative votes from other participants or
negative confirmation from the authorities". Subject P's system does not have a rollback
mechanism, but he knows about systems that do and comments that those systems "know
who did what and they are able to revert everything that they've done". He also suggested
letting users "look for strings that someone has reverted or changed, specifically your own,
so if you overwrote one of mine I could go and look at what you did and that gives you some
protection in that kind of churning situation".
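The following Python sketch is the researcher's illustration of such a metadata-rich
versioning store, supporting both contributor-focused rollbacks and selective rollbacks to a
chosen point; none of the names correspond to an actual platform.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Version:
    text: str
    contributor: str
    timestamp: datetime

class VersionedString:
    def __init__(self, source: str):
        self.source = source
        self.versions = []  # oldest first

    def record(self, text: str, contributor: str) -> None:
        self.versions.append(Version(text, contributor, datetime.now(timezone.utc)))

    def current(self) -> str:
        return self.versions[-1].text if self.versions else ""

    def rollback_contributor(self, contributor: str) -> None:
        # Drop every change by one agent, e.g. after negative votes.
        self.versions = [v for v in self.versions if v.contributor != contributor]

    def rollback_to(self, index: int) -> None:
        # Selective rollback to a chosen version point.
        self.versions = self.versions[:index + 1]
```

Keeping the granularity at the string level, as here, is what makes the selective rollbacks
discussed next possible without discarding unrelated progress.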
Potential loss of progress
Subject S, whose project works at a file-level granularity, noticed that there could be "some
general improvement [[unintelligible]] to the document, or some additional translations that
have been done" before the Version Rollback was done, which could be lost.
Rollbacks should be selective to avoid loss
Subject S suggested doing "selective Version Rollback of that particular section, then I think
that would be a very good approach".
Rollbacks are synergistic with Super Iterative Translation and freezing
Subject S suggested combining iterative translation, Freeze and rollbacks as a full process
when he said "you could every time a change was done, you could have that as a Version
Rollback point and say 'this is the current state' and Version Rollback to it, so, you combine
the two, so you could have certain parts of the document that are frozen and still allow
iterative changes and rollbacks with the freezing".
Contributor reactions must be managed
Subject V noticed that when a Version Rollback is done to correct a well-intentioned change,
"it's unfortunate that effort go into it, and you know, time and people may get frustrated".
Key findings
Besides the versions themselves, the versioning system should attach user information to the
changes. This enables:
• Contributor-focused rollbacks.
• Contributor awareness of the activity of other contributors.
A Version Rollback could result in the loss of positive changes in order to solve newly
created issues.
Fine-granularity rollbacks can minimize the loss of good changes when rolling back.
Rollbacks are synergistic with freezes in iterative processes.
Contributors' reactions to rollbacks have to be considered.
Practice 12: Deadlines
Through the statements of the subjects it emerged that the approach to deadlines should vary
depending on whether the project crowd is a paid community, an unpaid community or
people working individually.
Paid crowd accepts deadlines and the tools used have features to deal with them
Subject C makes a connection between deadlines and business when he says that "time is
also critical for businesses". He goes on to comment that his platform is used in translation
agencies and there "the deadlines are really important", and states that for those clients the
tool "deals with the deadlines, with all of the things related to the time management and
deadlines management". This echoes Subject V's statement that "now that we have all that
process online with translation workflow with [platform name] implemented, we actually also
use it for our professional translators when we want them... so when we have it, it's with
deadline for a video".
Volunteers may decide to use all the time until the deadline
Subject S observed that in his project, which deals with individuals, "the main problem is
selecting when the deadline is, not whether you use them or not, because I think people will
try and meet the deadline if you set one, but I also think by setting a deadline, if you give let's
say a ten day deadline for a task, people say, well, internally say this will take me two days to
do" and may actually decide to wait until two or three days before the deadline instead of
starting immediately. "So, if there was no deadline they may have uploaded in two days, but
because they have ten days to do it, they may have waited a few days after they claimed it
and then upload it just before the deadline."
Distant deadlines can result in volunteers not picking up the task
Subject S noticed that with tasks with long-running deadlines, "people may not claim them
either".
The crowd may neglect the deadline
Subject P, when asked about deadlines and making the community hit a target, said "We can't
on one level, because you are dealing with volunteers" and that sending out a deadline
announcement to 30 translators "will get very low result". Subject C believes deadlines won't
work in a context where the contributors' motivation "is not money, is not having some kind
of reward, reward for their work".
Deadlines can motivate
Subject P noticed that deadlines can provide motivation in the form of an objective, with
deadlines conveying that "there's some target that you are heading for".
Lack of deadlines or deadlines without meaning can demotivate
Subject P noticed that "without those dates, it is as if you are carrying on and carrying on
and not seeing any results". And in one project deadlines "come every six weeks, so six
weeks come quite quickly, even though it is not a lot of work. But there's no sense like wow,
we did this and after six months we can enjoy it. It keeps coming back."
Direct requests to individuals within a crowd can work
Subject P said that with a community manager you can send a message requesting individual
contributions, and that asking "the same translator, a translator previously, almost certainly
gets, gets us close to 100%. People feel an obligation, erm, that they are responsible and if
you ask them personally. So as soon as we ask people personally, we get much better
response from the community to complete."
If working with individuals, sanctions can be a motivator
Subject A noticed that in cases where prestige is attached to the organisation, punishment, or
the threat of it, can help when trying to get people to meet deadlines. This was illustrated by
his statement about the TED program: they tell you "If you want to be part of this program
you do it by this date. Otherwise we won't ask you again."
Rewards can help in implementing them
Subject A noticed that for deadlines to work "you have to offer something if you want people
to do it within a deadline", which meant some kind of financial reward or recognition that
meant something to them.
The community can react negatively to deadlines
Subject AC commented that he "would really stay away from giving deadlines to community
translators" because "with deadlines you are getting into the area of business" and "if you
ask a community, any community, to translate software into a new language that you are then
selling to a customer later, then things become all of a sudden very dicey", and "we have seen
examples in the history of community translation where communities get very upset as soon
as they started to feel exploited by a money making capitalistic company".
Deadlines and freezes as displays of appreciation for the translators
Subject P noticed that if there are release dates and string freezes, these communicate to
translators a sense that their work is important: "So, I think programmers don't really
understand the consequences of breaking strings [explanation of why breaking strings is an
issue] So I think that has a real advantage of communicating to translators that they have an
important part to play in the process", when programmers are told to step down and revert
changes because strings are being broken, "to communicate to communities that they are
important". In fewer words, it could be said that by presenting deadlines and freezes that
protect the work of the translators, an organisation shows its appreciation for it, something
that may help with the motivation of the community.
Soft deadlines
In Subject S's project contributors "are given a deadline to submit online" and they are
"notified if they have exceeded the deadline, but they are still allowed to upload, even if they
have missed the deadline". Subject R, whose project does periodical releases with their own
deadlines, said "you can translate in [[the platform]] every day independent of the release";
on the other hand, he noticed that meeting certain targets by the release date is important,
"because you want to know that every release have to be translated at least the most
important package or user interfaces and so on".
Unmet deadlines
In the previous statement Subject S commented that volunteers who missed deadlines were
notified. Also, the organisation that submitted the task is notified "if someone claimed the
task but did not finish it", and if the task has not been claimed by the time it reaches the
deadline, "we automatically unpublish it", in case they have no use for the translation beyond
the deadline. Subject V observed that "because it is totally volunteer work, you know, if a
translation hasn't been wrapped for a year it should be flagged to me", by whatever means
that would be checked, "and then, maybe, reassign, remake it available for translation".
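As the researcher's illustration only, the following Python sketch combines the soft-deadline
and unmet-deadline behaviours described above into a single check; the field names, the
returned labels and the one-year staleness window are assumptions, not any subject's actual
implementation.

```python
from datetime import datetime, timedelta, timezone

def check_task(task, now=None, stale_after=timedelta(days=365)):
    # task: dict with 'deadline', 'claimed_by', 'claimed_at', 'submitted'
    now = now or datetime.now(timezone.utc)
    if task["submitted"]:
        # Soft deadline: late uploads are accepted but flagged.
        return "notify_late" if now > task["deadline"] else "ok"
    if now > task["deadline"]:
        # Claimed but unfinished -> notify; never claimed -> unpublish.
        return "notify_claimant" if task["claimed_by"] else "unpublish"
    if task["claimed_by"] and now - task["claimed_at"] > stale_after:
        return "reassign"  # 'remake it available for translation'
    return "ok"
```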
Key findings
If unpaid and not under contract, individuals react to deadlines and requests to meet them
differently than crowds do.
An organisation that is paying for the crowd members' labour should use a tool that supports
deadlines.
When deciding deadlines, bear in mind that some volunteers will wait until the deadline date
to deliver something they could have done earlier and/or will ignore tasks with distant
deadlines.
Deadlines can be motivating if they come with a sense of achievement, which may be
missing when working without deadlines or with overly frequent deadlines.
Without a way to enforce them, crowds may neglect the deadline.
Addressing individuals personally has a better success ratio than broad requests.
Rewards can help push people towards meeting deadlines.
If your organisation has sufficient attached prestige and works with individuals, some kind of
sanction for people who do not meet the deadline can work.
An unpaid community can react negatively to deadlines proposed by for-profit companies.
In the context of localisation, deadlines that prevent programmers from breaking strings can
be used to show translators that their contributions are valuable.
With unpaid contributors, soft deadlines that express targets can work.
Unmet deadlines have to be managed via messages to the involved parties and a
reassignment or discarding mechanism, depending on whether the task is still relevant after
the deadline.
Practice 13: Open Assessment
This practice appeared in the processes for Facebook, Crowdin and Launchpad, and it is
usually enacted in the form of up- or down-votes.
Crowd assessment makes the process more self-managed by enabling vote-based
translation selection
Subject A said "I think voting is very cool, useful and teaches the process and makes it
self-administering"; however, he does not put full trust in the crowd and observes that some
monitoring of quality needs to be done. Subject S said something similar when he said that
"The advantage of it is that you don't have to have a dedicated person within your
organisation to choose the correct translation."
Crowd assessment can be used to help handle unlimited alternative translations
Subject AC noticed that in his project they do not mind if there are 15 proposals for the
translation of a single individual string because "that's a good opportunity how you can
leverage voting, or commenting, or even ranking of different translations by the users".
Crowd-assessed pre-selection to simplify Expert Selection and Edition
Subject R suggested that "there could be advantages to have some scoring in translation in
suggestion because you can order the suggestion depending on the scoring. Maybe it's an
advantage and you can choice, you are not duty to elect the first one, but you can watch the
suggestion in order to the voting" [meaning according to the number of votes]. Subject S
expressed a similar idea when he said that "you may want to potentially moderate it […],
review the top two votes or the top three candidates afterwards".
Human pre-selection to avoid bad translations being published
As an alternative to having an expert select one among the translations with the most votes,
Subject S proposed having someone "preselect maybe 5 or 6 translations" and having the
community only vote on that subset of them, which would still prevent poor translations
being up-voted while giving the crowd the last word on which translation is used.
Crowd assessment is open to vandalism
That two interviewees suggested combining voting with experts to avoid quality issues is an
indication of one of the issues of crowd assessment, i.e. vandalism. When asked about the
disadvantages of letting the crowd assess the translations, Subject C singled out vandalism as
the only negative aspect when he said that "if we exclude the vandalism from that, I don't see
any disadvantage". Subject S went into some more detail regarding how vandalism can work
when he said that "if you've got a small community, or if you have a large community of
people who have malicious intent they can, they can overwhelm general people and up-vote
inappropriate translations".
Assessments must carry provenance metadata
Also related to controlling vandalism is Subject A's suggestion that "There should be some
authentication of the voter. So that people don't make frivolous votes. In the same way that
you don't want in a political election you want the voters' name and address and you want to
make sure that they only vote on stuff that they should vote on. In the same way, you want
accountable, responsible votes, so some care needs to be given to that."
Weighting assessments according to the contributor
Subject A went further and proposed having a system with "some intelligence in the software
that says 'OK, this votes is voting consistently with many other voters. This one is like
completely off-scale in terms of consistency. This one should be removed from having any
weight in the vote'".
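A minimal Python sketch of Subject A's idea follows, under the researcher's own
assumptions: each voter carries a consistency score in [0, 1] (for example, the share of past
votes that agreed with the eventual outcome), new voters receive a neutral prior, and clearly
off-scale voters contribute no weight at all.

```python
from collections import defaultdict

def weighted_tally(votes, consistency, cutoff=0.3):
    # votes: list of (voter, candidate); consistency: voter -> score in [0, 1]
    totals = defaultdict(float)
    for voter, candidate in votes:
        weight = consistency.get(voter, 0.5)  # neutral prior for new voters
        if weight >= cutoff:  # off-scale voters are excluded entirely
            totals[candidate] += weight
    return max(totals, key=totals.get) if totals else None
```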
Open Assessment works with larger communities
Subject AC talked about how community-generated and community-selected translations
met the quality level of professionals, but he observed that "you will come to that if the
community is large enough", which no doubt it is in the case of Facebook. Subject C hinted
at the need for a large community to get a certain quality when he said that they "recommend
our customers to have a team of proofreaders and, if it's not possible, to have a bigger
community involved in the voting".
Open Assessment increases community engagement
As Subject P observed, voting can create consensus, and that is a way in which, as observed
by Subject AC, "people get more involved, it's more fun for them to discuss and make sure
that what they think the best quality is can be achieved", to which he added "you can create a
much more lively community interaction which I think is what you want because it is
motivating and rewarding to people if they can talk to others about what it is that they are
doing".
Potential controversies must be managed
In the same way that engagement through assessment can create consensus, it also provides
room for dissent. About this, Subject AC said that "you can have flamewars, discussions,
things may get out of control and it becomes, it can become very labour intense to get things
right again"; however, he also said that "if you design the process the right way and you
make sure that contributors treat each other with respect and driven by a common goal,
which is find the best translation, do not harm each other, then I think that, the advantages
outweigh the potential of opening the process up almost completely to the users with votes
and comments and alternative translations and multiple options".
Open Assessment can generate a Yule process
Subject AC noticed that "if you have ten strings, as a matter of fact people will only look at
one, two or three strings, at the top anyway and the rest just becomes a dead body in the
database, basically and that's it". This leads to a Yule process, also known as preferential
attachment and the rich-get-richer principle (Cha et al. 2009), whereby the translations with
the most votes, by virtue of being more visible, keep receiving more votes, and good
translations that appear later do not bubble up. Subject P also talked about this when he said
that he has seen situations in Facebook where "we've had brilliant translators who work with
us, who really have literally translated absolutely everything that's available in their
languages […] and their voice is never heard because they'll never... they didn't arrive early
enough in the process, the words are set on stone by a community who essentially have
voted on something but they have no experience: A [as in letter a] of translating software in
their language or B of translation". However, Subject AC also noted that in a spot check done
by Facebook to check the performance of the community and of paid translators "There was
not very much, if anything at all, that the translations of professional translators appeared
better than community translators".
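The dynamic is easy to reproduce in a toy simulation; the following Python sketch, with
purely illustrative parameters, shows how early translations keep accumulating votes simply
because voters only look at the top-ranked entries.

```python
import random

def simulate(n_translations=10, n_voters=500, top_k=3):
    votes = [0] * n_translations
    for _ in range(n_voters):
        # Voters see translations ranked by votes and only consider the
        # top few, so early leaders keep receiving votes.
        ranked = sorted(range(n_translations), key=lambda i: -votes[i])
        votes[random.choice(ranked[:top_k])] += 1
    return sorted(votes, reverse=True)

print(simulate())  # typically a few entries hold nearly all the votes
```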
A specific UI for assessment is viable
Subject C commented that Open Assessment is actually used in [[the platform]], which even
has "a separate window, a separate workbench where translators just vote". The interface is
as follows: "they don't do translation, they actually just see a list of source, translation
source and translation and just vote for the added variants".
Unfiltered-crowd assessment less suitable for specialized documents
Subject M noticed that "for instance legal documents, you cannot rely on fan voting".
Subject S elaborated further by talking about selecting the translation according to the votes,
but his observation that voting "may not be suitable for all documents, but if it's a
non-technical document then allowing the community to select the most appropriate
translation from the ones that are there would be an option" indicates that he thinks voting is
not a good idea for the localisation of technical documents.
Translators who grow up in an Open Assessment paradigm might not develop their own
style
Subject M thinks that "a good translator, to be a good translator they need to have a style. If
they care too much about the voting and the reaction by the users, I mean, you know, you
need to take into account the actions by the users, but it's not in the form of voting, it's more
in the form of informed input from them, informed feedback. Right? So, in the end,
translators who grew up in the context of voting, I don't think that they probably, they have
trouble getting their own style, fixing their own style."
Open Assessment better suited for languages with a translation tradition
Subject P noticed that voting "for people who are used to using software in their mother
tongue it's a useful approach, but for people who have never used software in their mother
tongue and understood its interaction or any of those decisions, I have seen stuff voted up
that's just bad translation so those the voting model doesn't really work that well when
people don't understand their language very well".
Open Assessment for longer TUs
Subject V said that Open Assessment was fitting for short strings, but she did not consider it
suitable for the videos "because people spend more time on it", implying that she would not
feel comfortable submitting such an amount of work to crowd assessment.
Letting contributors assess can accelerate the process
Subject A noticed that by letting contributors see the translations already in place, their
efforts could be directed to untranslated content: "they would say 'OK, this is done. I'll go
and do something else'". He illustrated it with the following metaphor: "If you see two people
carrying a guy bleeding over, you don't go and 'I'll also help' when there's three other lying
there. You are going to pick up the guys still bleeding and trying to help those."
Key findings
Open crowd assessment enables vote-based selection, which results in a more
crowd-managed process.
By making the process more self-managed, working with unlimited alternatives becomes less
challenging.
If the project manager is not happy with letting the crowd choose the final translation, crowd
assessment can be used to pre-select translations before an expert chooses the published one;
or an expert could pre-select translations before letting the crowd choose the published one.
Open Assessment is open to vandalism.
To reduce the impact of vandalism, votes should have provenance metadata and be weighted
according to the voter.
Open Assessment produces better results with larger crowds.
Open Assessment opens another avenue for crowd engagement, which is valuable for
marketing purposes.
There must be a plan in place to handle the controversies that may arise with Open
Assessment.
If the criterion for quality is the acceptability of a translation, crowd assessment can work
well; however:
• If no measures are taken to prevent it, voting can cause a Yule process that hides better
translations than the ones currently used, though that does not mean the ones selected based
on crowd assessment are not good enough.
• The Yule effect can be exacerbated by the social influence bias effect, which creates
bubbles of positive feedback.
Having a specific UI for assessment is desirable.
Open Assessment for specialized documents is only suitable if the crowd is made of
specialists too.
Open Assessment is not suitable if the intention is to have translators develop a personal style.
Open Assessment is not suitable for languages that do not have a translation tradition.
Open Assessment is not suitable for longer TUs.
Open Assessment adds a level of transparency that can accelerate the process by allowing
translation efforts to be directed to untranslated text.
Practice 14: Hidden Assessment
This was used in academia by Zaidan and Callison-Burch (2011) in the form of ranking
limited redundant translations. Many of the features of the Open Assessment practice apply to
this practice too. The differences will be covered in the discussion section of this practice in
section 5.8.14.
Hidden Assessment enables vote-threshold metadata-based approval
When asked about the practice, Subject AC said that maybe they could "rely on a certain
number of up-votes, of positive votes for a translation, so, if a translation gets 5+ votes it gets
approved automatically".
Hidden Assessment enables population-ratio-threshold metadata-based approval
Subject S suggested doing metadata-based auto-publication when "a certain percentage of
the community would have to agree". Subject AC also considered that auto-publication is a
safe bet "if you have a certain number of positive votes, out of the community, if the
community is big enough, if you are fine with the sample size".
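A short Python sketch of the researcher's own making combines the two variants; the
function name and the default numbers (the '5+ votes' rule and a 5% ratio) are illustrative
assumptions rather than any subject's configuration.

```python
def auto_approve(up_votes, community_size, min_votes=5, min_ratio=0.05):
    # Approve once both a fixed vote count and a share of the community
    # have endorsed the translation.
    if community_size <= 0:
        return False
    return up_votes >= min_votes and up_votes / community_size >= min_ratio
```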
Hidden Assessment is more easily exploited by malicious users
Subject C's platform implements unlimited redundant assessment, but when asked about
limited redundant assessment he said that they did not implement it in their platform because
"not nice guys will find a way to vote enough for the wrong translation, so just having the
threshold for them will not work".
Key findings
Many of the key findings regarding Hidden Assessment are features shared with Open
Assessment; when that is the case, the finding appears marked with an asterisk in this
subsection.
Hidden Assessment enables Metadata Based Selection methods that make the process more
self-managed.*
By making the process more self-managed, working with unlimited alternatives becomes less
challenging.*
Hidden Assessment enables vote-number-threshold metadata-based selection.
Hidden Assessment enables population-ratio-threshold metadata-based selection.
When a limited number of Hidden Assessments is collected, the system can be more easily
exploited by malicious users who are aware of this. However, the fact that the process is
hidden provides some limited protection against malicious users.
If the project manager is not happy with letting the crowd choose the final translation,
Hidden Assessments can be used to pre-select translations before an expert chooses the
published one; or an expert could pre-select translations before letting the crowd choose the
published one.
To reduce the impact of vandalism, votes should have provenance metadata and be weighted
according to the voter.*
Hidden Assessment produces better results with larger crowds.*
If the criterion for quality is the acceptability of a translation, Hidden Assessment can work
well.*
Having a specific UI for assessment is desirable.*
Crowd assessment for specialized documents is only suitable if the crowd is made of
specialists too.*
Hidden Assessment is not suitable for languages that do not have a translation tradition.*
Practice 15: Expert Selection and Edition
Expert Selection and Edition appears in the processes of Pootle, Launchpad, Asia Online and
Crowdin as modelled in Chapter 4. Besides that, it is used in the fully supported languages in
Facebook (Losse 2008).
Expert Selection and Edition centres responsibility
Subject S said that having an expert select the translation that will be published "centralizes
the actual responsibility for the final publication to one person"; later on he added that "if
you find that they do select inappropriate content, you have one person who is responsible
for it, you know who it was".
In Subject AC's project they use Expert Selection and Edition because their legal
organisation "requires us to use" it "in order to avoid malicious content appearing on our
software, because that would be kind of the biggest perceived catastrophe if you let that
happen". Later he stated it again when he said that the main reason for using Expert
Selection and Edition was "Legal reasons in our case". It is noteworthy that although they
use Expert Selection and Edition for legal reasons, they do it "with the help of trusted users
or with the help of company employees or with the help of localisation vendors".
Expert Selection and Edition can fill in expertise gaps in the crowd
Subject A pointed out that an expert with specialized knowledge can fill knowledge gaps
when he said that "For a lot of business content or water filtering to prevent diarrhoea, you
need some kind of expert approval process because maybe the translator is not a medical
professional and so they may say something that may be wrong. If you are trying to prevent
disease, you want that information to be relatively accurate, so you want an expert there."
Expert Selection and Edition can be used to complement automated selection processes
Subject A's project used frequency-based and trusted-user-based automatic publishing
methods, but when those methods did not reach their desired confidence levels, "the ones
that were unclear is an expert made decision": they had to go through "a QA test, a person
who said 'Yeah, this is a good translation and that went through, but leave that one out there
because it's not good enough'". As seen before, Subject S suggested using variable limited
assessments for publication, but "if there were something that really had to be an exact
translation, you may have to set it higher or even flag it that it had to be manually reviewed
before that was done".
Publication of untranslated content if no translation has been selected
In Subject C's project there is also the possibility for project managers "to disable export of
not approved" translations, or "to choose options in which cases sometimes strings will
preferred to stay untranslated, if they were to be translated by not validated translation, not
validated string"; when he says validated he means validated by a person.
Expert Selection and Edition risks crowd rejection and decrease in engagement
Subject A noticed that "in some forums where you are expecting real democratic engagement
in conversation and you get someone who always overrules, then people will stop engaging".
In situations where gaining popularity is the main aim the crowd should prevail over any
expert
Subject A commented that he had run into situations where experts should not have been
involved, "because it was really supposed to be a community collaboration and the
community should have been... the knowledge of the crowd, the wisdom of the crowd should
have dominated over any expert decision", and suggested the following scenario as an
example: "Like Facebook, say people say 'Post on wall' and 1000 people vote this way and
then the community guy says 'No, I'm gonna use this, because I like that better'. The 1000
guys will say 'What the hell did we vote for then?' There are scenarios where experts make
sense. When you want popularity or community selected […] then you should let the
community stuff rule, unless it's really rude or bad."
Have a communication plan for situations where you may be superseding a crowd that
is not malicious
Subject A said that "when your expert supersedes the community you need some reason that
you can explain to the community 'This is the one that came at the top, but we decided not to
go with it because it offended these people and we cannot afford the legal risk, or whatever,
the liability and all that'".
Choosing someone for the selector role can be challenging
Subject AC said that "if you don't do automatic approval, you have to rely on the expertise of
an approver and that's probably the biggest problem. To figure out, who you let be an
approver". He went on to explain that they use specially engaged users who suggest
consistently high-quality translations as selectors, but "don't ask me for how to find them. I
can only say you know one when you see one". Subject P said "The disadvantage is, of
course, who is the expert?" and went on to point out how this is especially challenging with
long-tail languages by saying that "the problem is around where, where a language sits in the
long tail. The further it's moving down that tail, the less availability of expertise they have
and I think that becomes more of an issue around select, around the expert role, because your
expert is gonna become less and less qualified the further down".
Expert Selection and Edition can be used to freeze a translation
Subject AC said that in their project "freezing a translation is the approval of the
translation", which is carried out by an expert. Subject C pointed at a similar implementation
when he said that the freezing point comes when "a project manager or translator granted or
promoted to proofreader validates one of the previously added translations variants". Finally,
Subject S also suggested this when he said that "You could also have a case where a trusted
user or an admin from the organisation or the person who actually published it in the first
place, once you had the work done, can say 'OK, this document has met our quality level'
and then manually freeze it that way".
Expert Selection and Edition can help consistency
Subject AC noticed that "In some cases consistency may be impacted in a beneficial way,
especially if we cannot display TM or TDB. So, I believe that there's always something that a
human can see, when that human has the big picture and maybe this big picture cannot be
visible to the community members." Subject P pointed in the same direction by saying that
"one advantage would be, should hopefully have stylistically the same, there should be a
stylistic flow, cos there should be adapting or suiting stuff that is mostly consistent".
In Subject R's project a group of experts selects the translations, and as a result they "not
only increase the quality […], increase the normalization, the normalization... the
standardization, sorry! The standardization of the translation because it is possible to
translate phrases in different ways."
Subject C's platform allows for what he called managed projects, where, when the
community is finished with the translations, "there's an authority that makes it, erm, that
takes and effort to confirm, confirm actually that the translations are ok". He suggested that
using this is especially recommendable in long-form documents, where there is an editing of
the community's work that should be made by "probably a professional or semi professional,
professional translators that you make sure that the consistency of the terminology is ok and
the formatting and several more, several more things are checked and there are no issues
with them". With this he points out the need not only for linguistic consistency, but also for
formatting consistency.
Expert Selection and Edition introduces a bottleneck
Subject AC noticed that in a more open system [as opposed to the one used in his project]
"there may be a disadvantage because you slow down the process". Subject S also said that
"if the experts have a lot of tasks, it may take them some time to do it".
Experts can iterate over and improve the crowd output
In Subject V's project experts refine the crowd output: "If they are some basic grammatical
errors, the [[professionals]] fix them themselves, if there are some questions on the UI term
they may fix them themselves".
Paying experts may be necessary and it adds cost
Subject C noticed that "using an expert will cost money. If you have an expert, his time costs
money." And Subject S suggested the same when he said that "it may cause additional
expense if you have to pay them".
Experts can be supported by metadata
Subject P, when asked how he would select translations, said that his preferred method of
selection is "to provide metadata for the people who are making the final decision". He
specifically suggested translator metadata when he said that "you can know how many
suggestions this person has made before, how many have been accepted, how many you've
accepted. Those kinds of numbers would be helpful and that would just really prioritize a
person that in your experience has been better." When asked about crowd assessment,
Subject R suggested "there are advantages to have some scoring in translation in suggestion
because you can order the suggestion depending on the scoring. Maybe it's an advantage and
you can choice, you are not duty to elect the first one, but you can watch the suggestion in
order to the voting".
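The following Python sketch illustrates how such metadata could be put in front of an
expert; the field names and the combination of vote score with a translator's historical
acceptance rate are the researcher's assumptions, not any platform's actual schema.

```python
def order_for_expert(suggestions, translator_stats):
    # suggestions: list of dicts with 'text', 'translator' and 'votes';
    # translator_stats: translator -> (accepted, submitted) counts.
    def acceptance(translator):
        accepted, submitted = translator_stats.get(translator, (0, 0))
        return accepted / submitted if submitted else 0.0
    # Sort by vote score first, then by the suggester's track record, so
    # the expert reviews the most promising candidates first.
    return sorted(suggestions,
                  key=lambda s: (s["votes"], acceptance(s["translator"])),
                  reverse=True)
```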
Experts can be gatekeepers against malicious translations
Subject R said that in the project he is involved in, publishing malicious translations is not
possible because "people that has permission to put the correct translation are the official
translators". Subject S also said that having an expert select the published translation means
"even an extra layer of security". Subject V's project uses paid professionals who "make sure
that the video was entirely translated, that there's no lines left in English, then they check for
if anybody put a bad word or anything disparaging against the company or our competitors
or anybody within the industry for that matter".
Diverse expert committee selection is an option when aiming for the highest quality
Subject R raised the issue that experts could have certain biases in the way they translate
when he said: "If only, if they only people from [[open source project]], or people from
[[open source project]] or people from programmers, maybe the group will have some kind
of deviation about the standard." To avoid this he suggested that "the group must be, must
have different types of people. Maybe professional translators, maybe because they know
[[language A]] and [[language B]], [[language A]] and [[language C]] and so on, engineers,
developers, maybe users, why not? And this kind of mixed groups I think is the best way to
have a really, really good translation."
Experts can inform the community
Subject A said they have "a subject matter expert that just cleans up a little bit and then the
feedback is given back to the original guy saying that 'We made these corrections not
because you made a bad translation, but because this guy has expertise in this area and he
thought that this was maybe not quite correct and you need to say it like this'".
Experts reduce the risk of having no pre-selection of contributors
Subject V commented that they decided "not to bother too much with putting a test, just put
a simple questionnaire because we knew that we were going to have a moderation process at
the end of the translation".
Key findings
Having an expert select the translations that are published centres responsibility. This is
useful for legal reasons and in order to take measures if low quality translations are
published.
For crowdsourcing of a specialized text, an expert can fill knowledge gaps in the crowd.
Experts can be used to complement automated selection processes when those do not reach
the minimum trust requirements.
Expert Selection and Edition superseding crowd votes can result in a loss of crowd
engagement.
If the purpose is not linguistic quality but acceptability by the crowd, Expert Selection and
Edition should only be used to prevent malicious translations.
If an expert supported by crowd assessment ignores the crowd generated data, there must be a
way for the expert to justify the decision to the crowd.
Finding an expert can be challenging, especially in long tail languages.
Expert Selection and Edition can be used to freeze translations.
Experts have an overview of the project that can help with consistency.
In fast-moving projects, experts can become a bottleneck.
If your experts have to be paid, their remuneration is added to your costs.
Experts can use assessment or provenance metadata to increase their efficiency.
By superseding the crowd, experts can prevent malicious translations being published.
A diverse committee of experts can be used for even higher quality.
If a mechanism for it is put in place, experts can help the community get better.
By placing an expert at the end of the process, pre-selecting the members of the crowd
becomes less necessary.
Practice 16: Metadata Based Selection
Metadata Based Selection appeared in the processes for Facebook and Asia Online in
Chapter 4. Other implementations of the practice appear in the process of TxtEagle (Eagle
2009) and the research of Zaidan and Callison-Burch (2011).
Trusted translator
Subject A's project collected data about the contributions of the translators, and when a
contribution was made by "someone who has established, that has made a thousand
contributions and they've been viewed as someone if a high status, their contributions just go
through, even if they are only one". Subject AC's project does not do automatic approval, but
he also suggested that this could be done "Maybe based on the fact that the users is known or
has constantly submitted good translations".
Trusted translators may occasionally perform poorly
Subject A noticed that sometimes they found that "your trusted translator was having an off
day and did really bad translations on this one day and you remove those".
Frequency of occurrence with Hidden Alternative Translations
Subject A's project used a combination of practices, among them hidden alternatives, which
they combined with frequency of occurrence for automatic approval, as he explained when
he said "maybe it's two or three I don't know exactly what the algorithms were, but you
know, when you see it repeated exactly the same way. Frequency of occurrence is a check
mark that 'Yes, go ahead. No human approval needed'."
Selection of highest ranked translation
If the criterion for quality is user acceptability, voting helps, according to Subject C, because
"it's really clear which [translation] is better" by considering the number of votes. Similarly,
Subject AC said that voting and vote-based selection "would eventually provide you with the
best possible translation as it is viewed by the people who do the translation", who in this
case are the users. Subject A's project did not use votes, but he suggested publishing
"according to voting, so when you see lots of votes on something". Finally, Subject A also
stated that "voting seems to work better when there's... when you have a user string for GUI
and stuff like that, where you really want to know what people prefer. That's why Facebook
and twitter really are voting schemes." However, you may not consider that user
acceptability is the right criterion and agree with Subject P's observation that voting
"doesn't actually create quality, it just creates consensus around how badly you understand
your own language".
Fixed threshold selection with Hidden Assessment
Subject AC's project uses expert approval with Open Assessment in the form of votes, but he
also suggested that, in the context of having used Hidden Assessment, "you could maybe rely
on a certain number of up-votes, of positive votes for a translation, so, if a translation gets 5+
votes it gets approved automatically".
Population ratio threshold selection with Hidden Assessment
In the context of having used Hidden Assessment, Subject S suggested enabling automatic
publication when "a certain percentage of the community would have to agree". Subject AC
also considered that auto-publication is a safe bet "if you have a certain number of positive
votes, out of the community, if the community is big enough, if you are fine with the sample
size".
Both population ratio and vote number thresholds can be adjusted according to the
importance of the content
Subject S suggested an implementation where "if the document was a low priority document
you could set the threshold to be lower; if there were something that really had to be an exact
translation, you may have to set it higher or even flag it that it had to be manually reviewed
before that was done".
Timestamp
In Subject C's project votes are used in some projects, and votes plus Expert Selection and
Edition in others, but occasionally there may be multiple alternatives with no votes; if that is
the case, "the last added one [translation] will be used".
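As a purely illustrative consolidation of the criteria above, the following Python sketch
dispatches among trusted-translator, frequency, vote and timestamp rules; the ordering of the
rules and the frequency threshold of three are the researcher's guesses, not a prescription.

```python
from collections import Counter

def select_translation(candidates, trusted=frozenset()):
    # candidates: list of dicts with 'text', 'translator', 'votes', 'time'
    if not candidates:
        return None
    # 1. Trusted translator: their contribution goes straight through.
    for c in candidates:
        if c["translator"] in trusted:
            return c["text"]
    # 2. Frequency of occurrence among hidden alternatives.
    text, freq = Counter(c["text"] for c in candidates).most_common(1)[0]
    if freq >= 3:  # "maybe it's two or three"; the exact value is a guess
        return text
    # 3. Highest ranked by votes, with the timestamp as a tie-breaker.
    return max(candidates, key=lambda c: (c["votes"], c["time"]))["text"]
```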
Extremely high volume projects can only be handled by automatic methods
Subject A pointed out that "if you have 100 page document that you are translating through
crowd, maybe computer selection is not the best way to do it. If you've got a million page
document, maybe the computer selection is the only way that you are gonna do it."
Automated selection is fast
Subject AC said that "an advantage can be speed, expediteness of the process, or the
translation", and although he expressed strong opposition to the idea of letting a computer
select a translation, Subject R admitted that "the work will be done, fast". Subject S agreed
that automatic selection "will be done in a timely manner. It'll have great throughput".
Automated selection is consistent through being objective
Subject P noticed that an automatic system would be "consistent, cause it is applying rules
consistently, has no emotional attachment". Subject S also said that an automated selection
system "will provide consistent result".
Automated selection may let malicious translations slip through
Subject AC noticed that with automatic selection you run the risk that "things that you do not
want to see in your translation may be in there". He suggests word blacklists to reduce this
risk.
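A minimal Python sketch of such a guard follows; the tokenisation and the placeholder list
entries are the researcher's assumptions.

```python
import re

BLACKLIST = {"badword1", "badword2"}  # placeholder entries

def passes_blacklist(translation):
    # Reject any candidate containing a blacklisted token before it
    # reaches the automated selection step.
    tokens = re.findall(r"\w+", translation.lower())
    return BLACKLIST.isdisjoint(tokens)
```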
Do not use it for controversial or sensitive content
Having discussed before that in an iterative process problematic content is continuously
being edited, Subject S noticed that "if you are monitoring how it was translated, whether
people are constantly trying to retranslate a particular section of the file, you may want to not
autopublish that document". He went on to say that "If it's sensitive documents, it's
something you want to have a human being though".
Use it only if you trust the output will be fit for purpose
Subject M said that "any kind of automatic selection is related to the fit for use. So, as long
as people who decide to use, who decide to let computer select, right? Limit the range of
application to whatever it's useful and workable for computers to decide, it's ok."
Key findings
Automated Metadata Based Selection can be used with:
• Trusted translators, if you are logging the performance of your translators. However, it is
possible for trusted translators to occasionally perform poorly, and a system to deal with
those situations must be in place.
• Frequency of translations, if hidden alternatives have been used.
• Highest ranked, using either Open or Hidden Assessment.
• First to reach a fixed number-of-votes threshold, if hidden alternatives have been used.
• First to reach a population threshold, if hidden alternatives have been used.
• If thresholds are used, these should be adjusted according to the sensitivity of the content.
• Timestamp, under the assumption that the latest translation will be the best available.
Automated selection can be the only alternative in extremely big projects.
Automated selection is fast.
Automated selection has consistent criteria.
Automated selection can be tricked, but some mechanisms to minimize this risk can be put in
place.
Automated selection should not be used for controversial or sensitive content.
Automated selection should only be used where there is trust in its fitness for the purpose.
5.8 Discussion of Practices
This section elaborates upon the coverage of each practice, and it addresses any features of
the practices that the experts did not discuss but that the researcher has observed or seen in
other research. The unsubstantiated claims appear in italics, to differentiate them from the
claims based on statements by the subjects.
5.8.1 Content Selection Discussion
The coverage for this practice was excellent and went well beyond the aspects anticipated by
the researcher. This may be partly because the answers stem from one of the first questions
that was posed to the interviewees, but also because all interviewees had very clear ideas
with regard to it.
5.8.2 Discussion of Unit Granularity Selection
The coverage for this practice went beyond the aspects foreseen by the researcher. This is
probably because all the interviewees had had to consider the size of their work unit ahead of
starting their projects.
5.8.3 Leveraging Translation Memory Discussion
Again the coverage for the practice was good; however, the researcher expected someone to
bring up the possibility of TM not just spreading errors, as shown by Bowker (2005), but also
the risk that contributors would learn the errors from the TM and incorporate them in their
new translations. Subject P worried about this in the context of languages without a
translation tradition, but the researcher's opinion is that this is a risk with new translators in
general. A related issue is actually brought up by Subject P when talking about new
translators copying the style of MT. Bearing this in mind, the researcher would add to the
key findings that there is a risk of low-quality TM being imitated by inexperienced
translators and becoming a vector for the spread of linguistic issues.
5.8.4 Leveraging Machine Translation Discussion
Again the coverage for the practice is thorough, with all the points that the researcher
expected being covered.
The researcher also agrees with Subject M that confidence scores are desirable when
leveraging MT, but would like to point out that this addition could lead to a UI that is too
convoluted and may not help many users, as was the case for fuzzy matches, as observed by
Subject P. This leads again to Subject A's observation that the UI for the translation should
be configurable. This observation came up when talking about TM, but is equally valid here.
Although the researcher has not found any papers claiming this, it is the researcher's view
that MT output biases translators towards a specific translation. This bias may reduce the
diversity of translations for a single TU, a feature that can be beneficial when collecting
Hidden Alternative Translations and using frequency as the criterion for Metadata Based
Selection. Bearing this in mind, the researcher would suggest adding to the key findings that
leveraging MT reduces the number of alternatives suggested in scenarios where hidden
alternatives are collected and as a result makes frequency-based Metadata Based Selection
more effective.
5.8.5 Leveraging Terminology Discussion
This practice too had thorough coverage from the interviewees, and all the points the
researcher expected were discussed. A point that stood out as particularly relevant in the
context of crowdsourcing is the function of terminology as a way to reduce arguments in the
community. This is an advantage that is not relevant in other scenarios. An example of
terminology being used to stop community arguments was presented by Lenihan et al.
(2011, p.58), and similar examples have been observed by the researcher in his involvement
with the Spanish Ubuntu localisation community.
5.8.6 Translation without Redundancy discussion
The coverage for the practice is reasonably thorough considering the mismatch between the
question that was asked and the question that should have been asked. Many of the
observations attached to this practice emerged indirectly or in answers to other questions.
Although this practice fits in Howe's definition (2006) and the definition discussed in
Chapter 2 of this thesis, the researcher was expecting more interviewees to argue against the
practice belonging to crowdsourcing, since it is close to the TEP approach and does not
require large numbers of contributors to be implemented. Only Subject C expressed this
view when he said that this practice was "not real crowdsourcing".
An advantage that was not brought up by any of the subjects was that in scenarios where the
content being translated is sensitive, the one-to-one resource-to-task ratio is beneficial
because the number of people with access to the content is limited, and this makes finding
leaks easier.
Subjects C and A stated that tests could result in a reduced crowd when using the
tested-translator approach, but that is not the only way in which Translation without
Redundancy reduces the crowd. By having a one-task-one-person distribution ratio, the
number of people that can directly contribute to the community at any time is directly linked
to and limited by the number of tasks. If an organisation has a wealth of translators in a given
language pair and uses Translation without Redundancy, it is possible that many of those
translators are unable to find tasks to carry out.
There is another disadvantage that was not brought up by any of the subjects, but it can be
derived from the fact that translations are locked when someone claims a task. The time
from when a task is claimed until the task is reassigned is effectively a delay that would not
have happened if there were several translations happening in parallel.
Given the previous observations, the researcher would suggest adding the following
considerations to the key findings:
• Translation without Redundancy limits the number of people with access to sensitive
content, which is an advantage by itself and also results in easier identification of leakers.
• Locking TUs to avoid redundancy can prevent some volunteers from finding unclaimed
tasks.
• Locking TUs to avoid redundancy can cause further delays.
5.8.7 Open Alternative Translations discussion
The coverage for this practice was good. Again, the interviewees had clear ideas about this
practice and covered all the expected points. A possible factor contributing to this is that this
practice is implemented by Facebook, one of the best known examples of crowdsourced
translation.
There were two features of this practice that the researcher expected someone to bring up but
that were not discussed by any of the interviewees.
First is the effect of using Open Alternative Translations if contributors are being paid for
their translations. In this scenario contributors have a material motivation to suggest pointless
translations that will add to the cost.
The second feature of this practice has to do with the effect that seeing existing translations
has on contributors. The researcher thinks that existing translations can work to some degree
as TM and MT do, given that contributors can copy and paste existing translations and
modify them according to their own criteria. This would accelerate the process for the
creation of individual new suggestions.
This analogy to leveraging TM and MT extends to their potentially negative impacts. In the
same way that Subject P worries about novice translators imitating the style of an MT
system, the researcher thinks that existing translations will impact later translations. For
example, if all existing translations use a given term, even without an authoritative
translation for it, it is unlikely that a new alternative will use a different term, and erroneous
terms could become commonplace through this mechanism. Although in a different field,
this happened in an experiment where people were asked to transcribe poorly rendered text:
when subjects were able to see previous transcriptions that contained mistakes, they
reproduced those mistakes in their own transcriptions (Little et al. 2010).
This reproduction of mistakes is an example of the convergence processes discussed by
Lorenz et al. (2011). The convergence processes are:
• The social influence effect, which says that social influence diminishes diversity in groups
without improving their accuracy.
• The range reduction effect, which says that the position of the truth, in this case that would
be the optimal translation, over time moves to peripheral regions, that is, becomes less
similar to the cluster of translations that are the most popular.
• The confidence effect, through which opinion convergence boosts individuals' confidence
in their estimates despite a lack of collective improvement in accuracy.
Because of the observations above, the researcher would suggest adding the following
considerations to the key findings:
• It can accelerate the process by allowing contributors to copy and modify existing
translations.
• It can be a vector for the spread of poor translations.
5.8.8 Hidden Alternative Translations discussion
Most points considered by the researcher were discussed by the interviewees. However, there
were some features that none of the interviewees brought up.
First, by collecting a limited number of translations the generation of translations itself would
take less time since, for example, it takes less time to collect five translations than to collect
thirty.
Second, by setting limits on the cost, this practice enables the use of redundancy even when
material rewards are being offered for the translations. This is illustrated by the processes
used by TxtEagle (Eagle 2009), Crowdflower (Bentivogli et al. 2011) and Zaidan and
Callison-Burch (2011).
Third, by hiding existing translations the contributors are freed from their influence, hence
avoiding the social influence effects (Lorenz et al. 2011). Subject A's suggestion that this
method produces more variants could be seen as a result of avoiding the range reduction
effect.
Also, as discussed in the previous practice, the researcher thinks that novice translators may
imitate existing translations in a project using the open alternative translation, but this effect
would be avoided using hidden alternatives.
Using hidden redundant translations with limited redundancy means that fewer people will have a chance to contribute, and the practice hence loses some of its appeal as a marketing strategy.
Lastly, although the researcher agrees with Subject S that using Hidden Alternative
Translations with longer TUs is feasible, the researcher thinks that using redundancy with
longer work units is a suboptimal approach because of the issues pointed out in the section
about the Translation Unit Granularity Selection practice.
Considering the preceding observations, the researcher would suggest adding the following considerations to the key findings:
• If the number of alternatives is limited, the time required to collect them should be shorter.
• If the number of alternatives is limited, it is possible to budget for translations and use material rewards to motivate the crowd.
• If used with a limited number of instances, fewer people will be able to contribute. This lowers the value of crowdsourcing translation as a marketing effort.
5.8.9 Super Iterative Translations discussion
The coverage for the practice contained most of the features foreseen by the researcher even though none of the interviewees has worked in an environment where it is used.
At the time of the interviews the researcher referred to the practice as "autopublished iterative translation". In the meantime the researcher noticed that there are two implementations of this practice:
• Autopublished Super Iterative Translation, as enabled by DotSub and Amara.
• Trigger-published Super Iterative Translation, as enabled by Facebook.
Only Subjects P and V talked about quality issues. Subject P did so by saying that the first iteration would always be bad, and Subject V by talking about not wanting to have wrong translations published, which is more related to vandalism than to poorly performing translators, but is still a quality issue. The researcher thinks that these quality issues are the reasons why TED translations and Subject V's project do not use this practice even though both of them use platforms that enable it. It would have been interesting to talk to representatives of platforms that enable the practice to see what kinds of project actually implement it.
The researcher expected more interviewees to bring up the high risk of malicious translations getting through, since they are auto-published, but only Subject V addressed this indirectly when talking about wrong translations. Since implementing the practice means being open to that risk, the researcher thinks that it is necessary to have a flagging mechanism, so that malicious translations are not only corrected quickly, but measures can also be taken to at least hinder the perpetrators' activity in the future. Wikipedia has two strategies that are practices included in this thesis (Freeze and Version Rollback) and three that fall out of scope since they are related to resource patterns: user banning, user or IP blocking and page protection (Category:Wikipedia enforcement policies 2013). Banning and blocking are ways of preventing users from modifying Wikipedia pages, and page protection is a freeze that does not affect users who meet certain criteria. The metadata-rich versioning system that Subject A talked about would be useful for enforcing these practices, and several of the strategies that such a system enables are actually suggested by the interviewees when talking about how open redundant assessment is open to vandalism.
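To make the interplay of flagging and Version Rollback concrete, the sketch below shows one minimal way they could be wired together (the names and the flag threshold are hypothetical; this is not the implementation of Wikipedia, Facebook or any platform analysed in this thesis):

```python
from dataclasses import dataclass, field

@dataclass
class Version:
    text: str
    author: str
    flags: int = 0  # times this version has been flagged as poor or malicious

@dataclass
class TranslationUnit:
    history: list = field(default_factory=list)  # published versions, newest last

    def publish(self, text: str, author: str) -> None:
        self.history.append(Version(text, author))

    def flag_current(self, threshold: int = 3):
        """Flag the newest version; roll it back once the threshold is reached."""
        current = self.history[-1]
        current.flags += 1
        if current.flags >= threshold and len(self.history) > 1:
            removed = self.history.pop()  # Version Rollback of the flagged text
            return removed.author         # caller can then hinder this user's activity
        return None
```

Keeping the author on each version is a small instance of the metadata-rich versioning mentioned by Subject A: it is what makes measures against the perpetrator possible, rather than just the removal of the translation.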
The researcher also expected that the interviews would bring up the possibility of having a feature that enables debate and planning attached to each document, as is the case with the Wikipedia Talk pages (Viegas et al. 2007). A feature like this, besides allowing contributors to solve issues and coordinate their efforts within the platform, can result in increased engagement.
With regards to trigger-published Super Iterative Translation, it is an implementation that appears only when combined with methods that collect alternative translations and use a metadata-based selection process as the trigger. As such, trigger-published Super Iterative Translation loses the advantage of corrections being published immediately, but offers more protection from poor translations and malicious users, as those translations have to undergo the selection process before being published.
Considering all the previous observations, the researcher would suggest adding the following to the key findings:
• If the translations are published without control from a human gatekeeper, malicious translations can get through.
• A flagging mechanism should be made available in order to enable faster reaction to poor translations.
• A discussion page attached to each document, in a manner similar to the Talk Pages used on Wikipedia, can be helpful in moderating changes and guiding the evolution of the translation.
5.8.10 Freeze discussion
Even though the coverage went beyond what the researcher expected, there was one foreseen observation that was not made by the interviewees. As discussed under the previous practice, Wikipedia uses a special kind of Freeze, called Page Protection, whereby only users that meet certain criteria can modify a page. A similar strategy could be used for translations.
Bearing the previous paragraph in mind, the researcher would add the following observation to the key findings:
• Freezing can also be implemented in a way that affects only specific users.
5.8.11 Version Rollback discussion
The coverage for the practice was good, with the interviewees covering all the aspects expected by the researcher, even though only one of them implemented the practice. This is the practice with the lowest number of references coded in the data and a good candidate for further research.
5.8.12 Deadlines discussion
The coverage for this practice went beyond the researcher's expectations. This is probably due to six out of the eight interviewees actually working with deadlines in their projects.
5.8.13 Open Assessment discussion
The coverage for this practice went beyond the aspects foreseen by the researcher; this may be due to the practice being implemented by Facebook and by several of the interviewees. However, most of the interviewees did not consider forms of assessment other than voting. As discussed before, Zaidan and Callison-Burch (2011) successfully used ranking of translations in their experiment. Ranking requires a higher cognitive effort than voting, so it may not be suitable for many crowdsourcing contexts; however, if the crowd is being paid, as was the case for Zaidan and Callison-Burch (ibid.), ranking is another type of Open or Hidden Assessment that could be implemented.
Regarding the Yule process risk caused by visibility, brought up by Subject AC, the experiment performed by Muchnik et al. (2013) showed that positive votes feed positive votes, creating bubbles, while negative votes tend to be neutralized over time. This effect would further accentuate the Yule process, so it would be advisable to take measures to reduce it if the criterion for quality were not solely the popularity of a translation.
Bearing the previous observations in mind, the researcher would suggest adding the following consideration to the key findings:
• Open Assessment can also take the form of rankings.
5.8.14 Hidden Assessment discussion
The coverage met most of the expectations of the researcher thanks to the overlap in features of Hidden and Open Assessment.
There was an issue with the question that was asked to find out about Hidden Assessment. At the time of the interviews the practice was still called "limited redundant assessment" and the question put the stress on the number of assessments collected being limited. During the coding process it became clear that the core feature of the practice is that the assessments are hidden from the contributors, and the stress should have been on that. For this reason none of the interviewees considered the issues and advantages of the assessments being hidden.
The main advantage, in the view of the researcher, is that social influence has no effect on Hidden Assessment. The lack of social influence in the assessment could result in better translations receiving the highest assessments by virtue of avoiding the conditions that enable a Yule process (Cha et al. 2009), the positive assessment bubble (Muchnik et al. 2013) and social convergence issues (Lorenz et al. 2011).
The main disadvantage of the Hidden Assessment practice is that the assessments are carried out in isolation, without an awareness of the opinion of the rest of the crowd. This eliminates the motivation factors that are connected to social interaction during the assessment process, and it may be necessary to provide some material motivation tools, such as money or entries in a prize draw, to compensate for the absence of social motivation factors.
As is the case with Open Assessment, Hidden Assessment enables crowd decision making and makes the process more self-managed, and it also makes working with unlimited redundant alternatives less challenging. However, the researcher's opinion is that Hidden Assessment will be used in combination with Hidden Alternative Translations. An organisation that uses an Open Alternative Translations process would do so because it values the openness of the process, the increased room for engagement, or both. In contrast, an organisation that uses Hidden Assessment is not likely to be especially invested in having an open process or to be thinking of crowd assessment as a way to increase engagement, and in that context it would use Hidden Alternative Translations with a limited number of alternatives.
Regarding Subject AC's observation that a fixed-number threshold could be used to automate the selection, the researcher has two observations to add, illustrated in the sketch that follows:
a) With a small community, the predetermined number of votes may not be achieved by any translation, or may take a long time to achieve, resulting in the process being slowed down.
b) With a big community, the number of votes will be achieved very quickly and the process will be fast.
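For illustration, the fixed-number threshold reduces to a check of the following kind (a minimal sketch; the threshold value and names are hypothetical):

```python
VOTE_THRESHOLD = 10  # the predetermined number of votes; a hypothetical tuning value

def record_vote(vote_counts: dict, alternative: str):
    """Register one vote; return the alternative if it just crossed the threshold."""
    vote_counts[alternative] = vote_counts.get(alternative, 0) + 1
    if vote_counts[alternative] >= VOTE_THRESHOLD:
        return alternative  # automatically selected for publication
    return None
```

Both observations are visible directly in this sketch: with a dozen active assessors, no alternative may ever accumulate ten votes, while with thousands of assessors the threshold is crossed almost immediately.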
The possibilities of letting the crowd assess translations that have been pre-selected by an expert, or of an expert using the crowd's assessment to pre-select the translations, are also available when implementing this practice.
Although the researcher agrees with Subject C's observation that implementing this practice would result in a system that is more easily exploited by malicious users, the researcher is also of the mind that by hiding the assessments from malicious users (something that he failed to point out in his question), this practice would gain some level of protection from such exploits, since the malicious users would have to figure out the mechanism before they can game it.
Having provenance metadata for the assessments would also help against vandalism and
could also be used to weight votes when implementing this practice.
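One minimal way provenance metadata could be used to weight votes is sketched below (the reputation weighting and its default value are hypothetical illustrations, not a mechanism observed in the data):

```python
def weighted_score(assessments, reputation):
    """assessments: list of (voter_id, value) pairs, where value is +1 or -1.
    reputation: mapping from voter_id to a weight; unknown voters get a low default.
    """
    return sum(value * reputation.get(voter_id, 0.1)
               for voter_id, value in assessments)

# Two established assessors outweigh three newly created accounts:
score = weighted_score(
    [("veteran1", 1), ("veteran2", 1), ("new1", -1), ("new2", -1), ("new3", -1)],
    {"veteran1": 1.0, "veteran2": 1.0})  # score = 1.7
```

Because each assessment carries the assessor's identity, the same record that weights the votes also provides the audit trail needed to investigate vandalism.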
The researcher thinks that the size of the crowd becomes less important when using Hidden Assessment, since an organisation using it is probably not using crowd acceptability as the criterion for quality.
Although Hidden Assessment provides room for high crowd engagement, the fact that
members of the crowd are participating as isolated individuals and not as a collective means
that its value for marketing purposes is reduced.
Because the contributors are isolated, it seems unlikely that any controversies would arise. Also, when doing Hidden Assessment, having a specific UI for it can be useful.
Just as in Open Assessment, the assessment of specialized documents should only be carried out by a crowd made up of specialists.
Unless a mechanism is put in place for translators to gather the feedback that has been given by the other contributors, using Hidden Assessment should not hinder the development of a personal style.
Hidden Assessment, like Open Assessment, is also less suitable for languages that do not have a translation tradition.
Using Hidden Assessment with longer TUs is viable because the translators are not aware of the judgements and of their work potentially not being used. However, using Hidden Assessment would imply using redundant translations, and using redundant translations with longer TUs is a very inefficient way of spreading resources.
Considering the observations made above, the researcher would suggest adding the following to the key findings for this practice:
• Hidden Assessment loses the transparency that can accelerate the process in Open Assessment; however, if combined with Open Alternative Translations and Metadata Based Selection, this loss of speed can be compensated.
• Hidden Assessment aligns better with limited Hidden Alternative Translations.
• Hidden Assessment produces better results with larger crowds. However, the size of the crowd is less relevant since the number of assessments is usually limited.
• Because contributors are isolated, controversies cannot arise.
• Hidden Assessment prevents the exacerbation of the Yule effect by the social influence bias effect that creates bubbles of positive feedback.
• Hidden Assessment, unless a feedback mechanism is created, would not condition the development of a personal style.
• Hidden Assessment could be used with longer TUs, but Hidden Assessment fits in a process that uses alternative translations, and there are too many disadvantages to using longer TUs with alternative translations for Hidden Assessment to be beneficial.
• Hidden Assessment combined with Metadata Based Selection can accelerate the process by allowing translation efforts to be directed to untranslated text.
• Hidden Assessment can also take the form of rankings.
5.8.15 Expert Selection and Edition discussion
The coverage for this practice went beyond what the researcher expected. This is probably due to six of the eight interviewees using Expert Selection and Edition in their processes.
Discussing resource-related practices is out of scope for this thesis, but since Subject P observed as a disadvantage that finding an expert is challenging, it is worth bringing up that Subject C suggested that "Participants that would like to commit, have to pass some kind of test, some kind of examination". This examination could be a way of finding experts. This idea was also implied in Subject S's statement that "It also, kind of contradicts the idea of crowdsourcing the translation, but, not necessarily in a negative way, because you could potentially have people form, an exam, or, you know, people who've worked in the system for a certain amount of time, given this privilege and then it's at least crowdsourced to a smaller, trusted community".
5.8.16 Metadata Based Selection discussion
At the time of the interviews the practice was called "automated selection" and the question asked was "What are the advantages and disadvantages of letting a computer select the translation that will be published?". This led many interviewees to think about computers choosing the translation using some Natural Language Processing related method. Fortunately, many of the metadata aspects of the selection had emerged in the answers to questions related to Open and Hidden Assessment, and the coverage went beyond what the researcher expected.
The researcher failed to ask Subject C about the reasoning behind publishing the newest translation in situations where multiple alternatives have been proposed but there is a draw or no positive votes for any of them. The researcher's assumption is that the newest translation is expected to be better than the rest, because if a suitable translation is already in place contributors are more likely to vote for one of the existing alternatives (a low-effort task) than to write their own alternative (a higher-effort task).
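That rule could be expressed as in the following sketch (the data layout is hypothetical; this is a reconstruction of the behaviour described by Subject C, not that platform's actual code):

```python
def select_for_publication(alternatives):
    """alternatives: list of dicts with 'text', 'votes' and 'submitted_at',
    where a larger 'submitted_at' timestamp means a newer suggestion."""
    best = max(alternatives, key=lambda a: a["votes"])
    tied = [a for a in alternatives if a["votes"] == best["votes"]]
    if best["votes"] <= 0 or len(tied) > 1:
        # a draw, or no positive votes: fall back to the newest suggestion
        return max(alternatives, key=lambda a: a["submitted_at"])["text"]
    return best["text"]
```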
5.9 Summary
This chapter discusses the choice of data collection strategy, the selection of interviewees, the interviews, including the questions and their order, and the method used for the analysis of the data.
The analysis outcome section contains the observations made by the interviewees directly about the practices or in relation to them. The discussion section contains observations about the strengths and weaknesses of the data, as well as observations by the researcher that were not supported by the data.
Chapter 6 Practices and scenarios
Considering the advantages and disadvantages of the different practices, some combinations of practices will be more appropriate than others in varying scenarios. This chapter presents a collection of scenarios based on the four taxa:
• Colony Translation: for processes where selected and aggregated translations are produced by translators working independently.
• Wiki Style Translation: for processes where anyone can modify the existing translations.
• Translation for Engagement: for processes where selected and aggregated translations are peer assessed at the selection stage.
• Crowd TEP: for processes where translators work individually, with reviews done by individuals as in the traditional TEP process.
Although there are four taxa, five scenarios were identified after considering the platforms analysed in Chapter 4 and the processes discussed by the interviewees in Chapter 5. Bearing in mind the forces affecting the practices that were refined in Chapter 5, a collection of suggested practices is presented for each scenario.
6.1 Scenario 1 Translation for Engagement
These organizations have an invested community; their content is software or software-related, and much of the value of using crowdsourced translation lies in the way that the community's emotional attachment to the product increases when they translate it and in how the translated product increases the reach of the company by expanding the number of markets it can access. Two real examples of this scenario are Facebook and the translation of Ubuntu. Many open source software products and other social networking sites like Twitter would also belong in this category, but their data was not available during the writing of this thesis.
Content Selection
This approach works best for high-visibility content and content that the contributors do not perceive as difficult. Low-visibility content and difficult content, such as legal documents, can be made available for contributors to translate, but the community may not engage with it, and there should be a fallback strategy for it. In the case of content with legal ramifications, Expert Selection and Edition must always be used to centralize the responsibility.
According to the information collected during this thesis, the types of content that should be translated with this approach are terminology, GUI and, in the case of open source software, documentation.
6.1.1 Practices to Support Translators
In a scenario where emotional attachment is one of the core objectives of a crowdsourced translation initiative, the community must be allowed to take ownership of the translation. Bearing this in mind, in this scenario it is particularly important to offer a positive experience to translators. According to the data collected, the practices below reinforce a positive experience for the translators.
Unit Granularity Selection: shorter TUs
The organizations using this approach mostly translate software and software-related content, and therefore use the software strings as TUs. If possible, segmenting longer strings, as Subject P's organization does, is recommended because shorter strings are perceived as being easier to translate, with a quicker turnaround time, thereby affording a timely sense of satisfaction to the contributors.
Leverage Terminology
From the point of view of giving translators a better experience, terminology pre-emptively clarifies the meaning of some of the words with which they are most likely to have difficulties, and it also prevents terminology-related arguments that may hurt the community. Furthermore, from a project perspective, presenting a well-developed terminology will help with consistency, the lack of which is one of the potential disadvantages of using shorter TUs.
Leverage TM
From the point of view of supporting the translators, leveraging TM results in a process that takes a shorter amount of time and is perceived as easier, since many difficult words will be included in the TM even if they are not terms.
From a project perspective, if the TM is consistent, it will help with consistency and minimize the impact of having used short TUs. However, as some of the interviewees commented, in crowdsourcing scenarios it is likely that some contributors will approve a fuzzy match without editing it, so some manner of check should be implemented to prevent unedited fuzzy matches from being approved.
Leverage MT
From the point of view of supporting a good experience for the translators, good MT output also makes the process shorter in duration and perceived as easier, since many difficult words will be included in the output even if they are not terms. However, poor MT output can degrade the experience of the translators.
From the perspective of the project, if the MT system has been trained with texts that contain the right terminology, MT can also help alleviate consistency issues caused by the usage of short TUs.
As was the case with TM, translators may approve MT output without making the necessary corrections. An automated check to prevent this from happening should be implemented.
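One form such an automated check could take is sketched below (a simplification with hypothetical names and threshold; a production system would likely tune the required amount of editing per language):

```python
import difflib

def accept_submission(seed: str, seed_is_exact: bool, submission: str,
                      min_edit_ratio: float = 0.02) -> bool:
    """seed: the TM fuzzy match or MT output that was shown to the contributor.
    seed_is_exact: True only for 100% TM matches, which may need no editing."""
    if seed_is_exact:
        return True
    similarity = difflib.SequenceMatcher(None, seed, submission).ratio()
    return (1.0 - similarity) >= min_edit_ratio  # require at least minimal editing
```

The same check covers the TM case discussed above: the platform only needs to remember which seed it offered as the starting point.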
Caveats when Leveraging Terminology, TM and MT
The inclusion of all these tools can cause confusion for translators without previous experience in their use. In all cases, these tools should be implemented in a manner that does not get in the way. In order to do that, Subject A proposed either a configurable UI or the presentation of the TM matches and MT output in text boxes together with an empty text box, so that contributors can choose their preferred starting point.
6.1.2 Practices to Enable Higher Engagement
Because translation in this scenario is part of a marketing effort, it is desirable to have as many people as possible involved and for those people to have as many possibilities of interaction as possible. The practices below facilitate both objectives.
Open Alternative Translations
The collection of open, unlimited alternatives allows anyone who so desires to suggest their own translation. If combined with short TUs and linguistic resources, as recommended, the effort required to carry out a translation is minimized, which further facilitates participation by potential contributors.
Open Assessment
Again, the collection of a potentially unlimited number of assessments is desirable because it allows everyone to express their preference. If, in addition to voting, which is the default form of assessment, contributors are allowed to make comments, the room for engagement further increases. However, measures to prevent community in-fighting arising from controversies must be put in place.
Super Iterative Translation
Instead of selecting a translation from the alternatives proposed once, it is desirable to have a super iterative process where the most popular translation over time is published, similar to the approach used in Facebook's process. This way, there are motives for the community to continue offering translations and assessing them even after one translation has been published.
6.1.3 Practices that Give the Crowd Ownership
Because of the importance of translation as a marketing tool to increase engagement in this scenario, practices that convey that the crowd is in control are desirable.
Metadata Based Selection: Maximum Votes
If the crowd is large enough, crowd assessment will work well, but determining what a "large enough" crowd is is a research problem in itself. With a smaller crowd, however, it will be relatively easy for groups of malicious users to exploit the automated selection mechanism. Blacklisting words can prevent some malicious translations from ever being selected, but malicious users will eventually be able to get around word blacklists.
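As a minimal sketch, maximum-votes selection with a word blacklist as a first line of defence could look like the following (the blacklist entries and names are hypothetical placeholders; as noted, determined vandals will eventually circumvent such a filter):

```python
BLACKLIST = {"offensiveword", "spamterm"}  # hypothetical placeholder entries

def select_by_votes(alternatives):
    """alternatives: list of (text, vote_count) pairs.
    A blacklisted text is never eligible, whatever its vote count."""
    eligible = [(text, votes) for text, votes in alternatives
                if not set(text.lower().split()) & BLACKLIST]
    if not eligible:
        return None  # fall back, for example to Expert Selection and Edition
    return max(eligible, key=lambda pair: pair[1])[0]
```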
6.1.4 Other Practices for Translation for Engagement
The practices discussed above focus on the engagement aspect, but there are other issues that should be taken into account, and the practices necessary to address them are discussed below.
Expert Selection and Edition
If the group of people contributing to the project is small, automatic selection based on the number of up-votes can be easily exploited by malicious users. Besides word blacklists, using an expert or group of experts as gatekeepers to select and edit crowd suggestions can solve this issue. Two ways of doing this have been suggested:
1) Let the expert select the best option from those that the crowd has assessed to be the best. The risk is that some crowd members may feel disempowered if the expert selects a translation that is not amongst the most popular.
2) Have two rounds of assessment. First comes the open assessment, then the expert prunes the list, and finally a selection is made based on the crowd's votes for the pruned list of alternatives. This creates more work for the expert and a redundant round of assessment, but the approach allows the crowd to have the final say on the selection of the translation.
In both cases, one needs a process to identify willing experts, and one then has to get them involved. The system should allow the experts to justify their choices.
Freeze
Although using a super iterative process without freezes allows the translations to evolve over time and means that there is always room for new contributors to get involved, there are also quality concerns and usability considerations that can be addressed by using freezes. Pseudo-freezing, that is, keeping the published translation stable but allowing contributors to make suggestions, can strike a balance between allowing the language to evolve and keeping the translations stable until there is a reason to change. In order to pseudo-freeze translations, it is necessary to have an expert with the right permissions to unfreeze and select the translation that will replace the currently published one. However, pseudo-freezing, by relying on an expert to manage the process, can also be perceived as a measure that takes power from the community.
Deadlines
Hard deadlines in this scenario are not recommended, since the community may find them exploitative. However, soft deadlines combined with rewards, be they material or symbolic (such as badges), can help motivation as long as the deadlines have some significance, such as a new release.
6.1.5 Discussion of the Translation for Engagement Scenario
Although this scenario, up to the writing of this thesis, has mainly been used by organizations that manage open source software or social networking platforms, it is the view of the researcher that it could also be used for the translation of tutorials and of spiritual, political, and philosophical essays, if the community around them has a sufficient contributor count.
6.2 Scenario 2 Crowd TEP
There are actually two processes used by organizations in this scenario: the volunteer translator scenario and the crowd post-edition scenario.
6.2.1 Volunteer translator
The organizations discussed in this thesis that use this approach were NGOs, had spiritual or political objectives or, in one case, were a publisher of proprietary software. In all cases there were invested communities around their content. Except for the software publisher, the organizations using this approach are those with a "higher mission". They would like to have their information available to as many people as possible, and they aim to optimize their impact per contributor.
Content selection
News, tutorials, and spiritual, political, and philosophical essays were discussed as content types that are being translated using this approach, as stated in the interviews reported in Chapter 5.
Unit Granularity Selection: Long TUs
The types of content appearing in this scenario were flowing texts or video. Independent of the TU size chosen, there is a requirement for consistency and cohesion in a good translation of flowing text, and therefore the whole text has to be made available to the contributors. The organizations discussed in this thesis send files or assign complete videos. The files are long TUs. Likewise, the videos are also long TUs: although the subtitles are short TUs, assigning complete videos prevents several people from collaborating in parallel.
6.2.1.1 Practices that Increase Impact per Contributor
Leverage Terminology
From the point of view of optimizing the work of translators, terminology pre-emptively clarifies the meaning of some of the words with which they are most likely to have difficulties. Although in this scenario the use of longer TUs makes inconsistent translations less likely, a unified terminology will also support consistency across the different documents of an organization.
Leverage TM
From the point of view of optimizing the work of translators, leveraging TM makes the process shorter in duration. Once more, if there is a unified TM that is consistent, it will help with consistency within and across documents.
As was the case in the Translation for Engagement scenario, it is possible that some contributors will approve fuzzy matches without editing them, so some manner of check should be implemented to prevent unedited fuzzy matches from being approved when they are not suitable.
The benefits of TM in this scenario may be limited, because the types of texts translated here are not as repetitive as the GUIs and documentation appearing in Translation for Engagement scenarios.
Leverage MT
From the point of view of optimizing the work of translators, good MT output also makes the process take a shorter amount of time. However, poor MT output may not help and may irritate the translators.
Again, as in the Translation for Engagement scenario, an MT system that has been trained with texts that contain the right terminology can also help achieve consistency within and across documents.
The issue of translators potentially approving unedited MT output without making the necessary corrections is also present in this scenario.
Caveats when Leveraging Terminology, TM and MT
As was the case in Translation for Engagement, the inclusion of all these tools can cause confusion for translators who lack the prior experience necessary to leverage them effectively and efficiently, so their implementation and deployment should be done in a way that takes cognisance of these constraints.
Translation without Redundancy
With Translation without Redundancy, the organization optimizes the work of each contributor by not allowing several of them to translate the same content in different ways.
As discussed in Chapter 5, some organizations using this approach test translators before letting them contribute. Those same organizations also add a review stage when using this approach. Since the review stage becomes almost compulsory, the researcher's view is that the benefits of testing translators beforehand are not enough to counter the negative effects of the practice, and it should be avoided. Subjects A and C were also against the idea of using preselection in crowdsourcing scenarios.
Besides the optimization of the resource-to-task ratio, the main advantage of this approach is the simplified credit assignment. The fact that the credit is not shared may have a positive impact on the effort made by the translator, and prevent issues with TM and MT output being published without having been submitted to the necessary post-edition. Furthermore, if the content is sensitive, Translation without Redundancy may be the only suitable option in order to reduce the risk of leaks.
Another advantage of Translation without Redundancy is that, when combined with long TUs, the management overhead is reduced, since a few bigger TUs are easier to manage than many small TUs, and the credit assignment is also simplified.
Deadlines
The discussion in Chapter 5 pointed out that deadlines work better with individuals and that they can help to motivate them. In the case of NGOs and political organizations, deadlines can play an important role, since for them some projects must be done by a given day in order to be effective. Given that there is no contractual obligation to meet the deadline, the deadline should be early, since volunteers may choose to wait until the deadline is near to turn the translation in. The deadline should also be a soft deadline that allows volunteers to turn translations in late, so that their work is not wasted if they do not meet the deadline. There should be a fallback plan for unmet deadlines too.
Discussion of the Volunteer translator scenario
The usage of Translation without Redundancy in this scenario makes sense when dealing with small communities and sensitive content, but it is the view of the researcher that organizations with access to bigger communities would be better served by the Translation for Engagement approach if they have the technology to implement it.
6.2.2 Crowd Post-Edition
Subject M's organization offers this service to commercial companies, and there are companies like Unbabel (Unbabel 2014) that offer it. For organizations using this approach, community creation or the spread of their values is of lower relevance than cost saving.
Content selection
The main motivation factor for the contributors in this scenario is money. Bearing this in mind, this approach is suitable for any kind of content that is not better served by other approaches.
Unit Granularity Selection: Segmented longer work units
Segmented longer work units have most of the advantages of short work units, while retaining enough information to prevent the issues caused by lack of context. If the original texts are long, using several TUs allows the parallelisation of the translation. The issue of inconsistency across different work units remains, but it can be mitigated by the implementation of terminology, consistent TM, MT that has been trained with consistent material, and a final review carried out by an expert.
Leverage terminology
In this scenario terminology pre-emptively clarifies meanings and fosters consistency within and across different TUs.
Leverage TM
As repeatedly stated in this thesis, leveraging TM makes the process take a shorter amount of time. If the TM is consistent, it also fosters consistency within and across documents. In this scenario, where the motivation is economic, the risk of contributors trying to game the system by approving fuzzy matches without editing them becomes very high, so a system to prevent it must be in place.
Leverage MT
MT is the main language technology leveraged by Subject M's organisation and the technology that Unbabel puts the most stress on in its marketing.
As in previous scenarios, if the MT system has been trained with consistent texts, MT can also help achieve consistency within and across documents.
Preventing translators from approving unedited MT output without making the necessary corrections is especially pressing in this scenario, where they receive payment for their edits.
Translation without Redundancy with review
In a scenario where the contributors are being paid, paying for multiple alternatives can become costly. By using Translation without Redundancy with a centralized review stage, consistency issues can be solved and no cost is generated by paying for translations that will not be published. However, the centralized review stage becomes a potential bottleneck in the process.
Discussion of the crowd post-edition scenario
This scenario is based on the process of one of the organisations with which Subject M is involved and is also representative of the process of Unbabel. Considering that both the crowd post-edition and the colony translation approaches perform a similar function, i.e. producing translation at a reduced cost without making community development a priority, the opinion of the researcher is that organisations using crowd post-edition may be better served by colony translation when dealing with high word counts. More information about this is available in Chapter 7.
6.3 Scenario 3 Colony translation
This approach was used by Asia Online's Wikipedia translation project (Vashee 2009) and by Zaidan and Callison-Burch (2011). It is useful when community creation is secondary or not important. Although this approach is suitable when paying for contributions, as in the case of Zaidan and Callison-Burch (2011), potential material rewards, such as entries in prize draws, can generate enough motivation for certain types of content, as was the case in Asia Online's Wikipedia translation project.
Content selection
Contributors in this scenario are mainly motivated by material rewards. Bearing this in mind, this approach is suitable for any kind of content that is not better served by other approaches. However, Subject A's organisation found that especially challenging content was not tackled by their crowd, so special strategies for such content may be necessary.
Unit Granularity Selection: Segmented longer work units
The reasons to use segmented longer work units in this scenario are the same as in the Crowd TEP crowd post-edition scenario.
Leverage terminology, TM and MT
The reasons to leverage these technologies are the same as in the Crowd TEP crowd post-edition scenario.
Hidden Alternative Translations and Translation without Redundancy with data-mined trusted translator
By using Hidden Alternative Translations, multiple translations can be collected in a cost-effective manner. Furthermore, the usage of frequency-based selection is enabled.
If translator performance data has been collected, translations suggested by trusted translators can be approved automatically. This increases efficiency and reduces costs if combined with a freeze of the collection of alternatives.
Freeze
If a data-mined trusted translator makes a suggestion, the collection of alternatives can be frozen, thus preventing the potential extra cost and delay of collecting more alternatives.
Metadata Based Selection: Frequency
Using Metadata Based Selection, it is likely that a high-quality translation will be automatically selected at a negligible cost. If no matching translations appear, this approach does not work.
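The combination of practices just described could be wired together as in the sketch below (thresholds and names are hypothetical; the frequency criterion is broadly in the spirit of Zaidan and Callison-Burch (2011), not their actual implementation):

```python
from collections import Counter

def collect_and_select(suggestions, trusted_ids,
                       min_matches: int = 2, max_alternatives: int = 5):
    """suggestions: iterable of (translator_id, text) pairs in arrival order."""
    collected = []
    for translator_id, text in suggestions:
        if translator_id in trusted_ids:
            return text  # trusted translator: freeze collection, approve directly
        collected.append(text.strip())
        most_common_text, count = Counter(collected).most_common(1)[0]
        if count >= min_matches:
            return most_common_text  # frequency-based Metadata Based Selection
        if len(collected) >= max_alternatives:
            break  # collection limit reached without a match
    return None  # fall back to Hidden Assessment and ranking, as discussed next
```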
Hidden Assessment
For those TUs for which no matching translations are suggested, Hidden Assessment can be used to enable Metadata Based Selection with the highest ranked translation as the criterion. Hidden Assessment, because it is not exposed to social bias effects, might produce a better ranking than Open Assessment.
Metadata Based Selection: highest ranked translation
The usage of Metadata Based Selection instead of Expert Selection and Edition in this scenario avoids potential bottlenecks in the process.
Deadlines
Soft deadlines can be used to push contributors during specific periods and can be reinforced by increasing the rewards during those periods.
Expert Selection and Edition
Exceptionally, if extra quality is needed, having an expert do a last pass over the automatically selected translations will allow the correction of mistakes made by the automated selection systems and the editing of translations in order to achieve higher consistency.
Discussion of the colony translation scenario
This approach is by far the most technologically challenging of all the approaches suggested in this thesis. Its main advantage over the crowd post-edition approach is that, through the increased automation realised by the implementation of Metadata Based Selection practices, there are no bottlenecks in this process.
6.4 Scenario 4 Wiki Style Translation
Although the wiki style process is enabled by platforms such as Transifex, Amara and DotSub, none of the organisations whose processes the researcher knows uses this approach. For example, TED talks use Amara, which enables a wiki style process, but the organisation runs a Crowd TEP volunteer translator process instead (Wijayanti 2013). The choice of Crowd TEP volunteer translation over a wiki approach may be due to organisational aversion to publishing low-quality iterations and unwillingness to handle vandalism.
Given that the researcher is not aware of any actual project using this approach, the suggestions that follow are based exclusively on the view of the researcher that anyone using this approach does so because they do not have a clear idea of who will contribute to their effort.
Content Selection
Any kind of content is amenable to being translated using this approach, as long as there are measures in place to prevent a negative impact from low-quality iterations being published, or as long as the impact of low-quality or malicious edits is not an issue. Content that needs to be stable, such as legally binding documents, can also be translated using this approach, but it needs to be frozen before being published.
Unit Granularity Selection: Long segmented TU
This is the approach followed by Amara and DotSub. A complete segmented document is made available to contributors, who work at the segment level with awareness of the context. This allows multiple contributors to work on different segments at the same time, resulting in higher throughput via parallelisation.
Leverage terminology, TM and MT
MT and TM would be helpful in accelerating the first iteration of the translation and, as with previous scenarios, if these linguistic resources are consistent, they also help to keep the translation consistent. Since the translation process recommended in this scenario is super iterative, the value of MT and TM beyond the first iteration is severely reduced, because the current iteration performs the same function.
Terminology in this scenario, as was the case in the Translation for Engagement scenario, can be used to prevent terminological debates.
Super Iterative Translation
By using this practice, the barrier to contributions being published is at its lowest level. This has two immediate consequences: more people can contribute, and seeing their contributions published can motivate them to continue contributing; and more poor translations and malicious translations are likely to be published. Freezes, rollbacks and other practices can be used to minimize the impact of poor and malicious translations.
Besides the impact on potential participation and the quality of the participation, Super Iterative Translation can be used to enable the translation to evolve over time. This evolution can be guided via a "talk" page dedicated to the document, as happens on Wikipedia pages. Besides helping to guide the translation, the talk page would open a space for contributors to establish social bonds between themselves, which could in the long term result in the development of a community.
Freeze
When malicious users decide to vandalize specific content, or when the content is continuously being reinterpreted because of controversial issues, freezing it prevents further changes and can help to direct the translation effort to content that has not yet been translated.
Freezes can also be used to finish a translation by preventing further changes, even when these could be beneficial.
Rollback
When an edit has caused a translation to lose quality, rolling back to a previously existing version is a low-effort manner of recovering that quality. Since the TU has been segmented, only the affected segments would be rolled back, preventing the loss of beneficial changes to other segments.
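Segment-level rollback might be structured as in the following sketch (a hypothetical structure, not the implementation of Amara, DotSub or any platform discussed here): each segment carries its own version history, so reverting one vandalised segment leaves beneficial edits to the other segments untouched.

```python
class SegmentedDocument:
    def __init__(self, segments):
        # one independent version history per segment; the initial version first
        self.histories = [[text] for text in segments]

    def edit(self, index: int, new_text: str) -> None:
        self.histories[index].append(new_text)

    def rollback(self, index: int) -> None:
        """Revert only the affected segment to its previous version."""
        if len(self.histories[index]) > 1:
            self.histories[index].pop()

    def current(self):
        return [history[-1] for history in self.histories]
```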
Discussion
Although working Wiki Style loses the simple credit attribution, and potentially the motivating effects, of a single user taking ownership of a whole document as in the Crowd TEP Volunteer translator approach, it is the view of the researcher that motivation through credit assignment is still possible if more complex credit attribution systems are used, and that the potential gains in speed, quality and engagement through inter-contributor interaction trump simple credit attribution in the long run.
6.5 Long tail scenario variations
When working on long tail languages with no translation tradition, there are some extra considerations to bear in mind.
TM and MT may not be available, and if available, their quality may not be good. It is very likely that new translators will imitate the style of the TM and MT; this could have pernicious effects on the quality of the translation and even impact the development of the language in the long term. Subject P, who frequently works with long tail languages, said that in his organisation they purposely made using MT difficult in order to prevent people from copying the style of MT systems. This is the only strategy specific to long tail languages that emerged during this research, but it is possible that other experts in long tail languages have more suggestions. It is also possible that the issue of contributors imitating the style of the MT system, as observed by Subject P, does not appear in other projects or among different language pairs.
Subject P also observed that finding experts for long tail languages is challenging, but added that for long tail languages it is especially critical to have experts select and edit crowd suggestions, since a selection based solely on open crowd assessment is very likely to be of low quality. The researcher's view is that Hidden Assessment and frequency-based selection are better options than Open Assessment, but even those approaches can result in poor translations. For example, the researcher would not be surprised by false friends (pairs of words in different languages that resemble each other but have different meanings) appearing in translations selected using frequency or Hidden Assessment.
6.6 Summary
This chapter has presented five scenarios that are associated with the four taxa in the taxonomy from Chapter 3. For each scenario, a selection of practices is suggested according to the information about them found in Chapter 5, Refinement of the Practices. Finally, the chapter discusses some issues regarding long tail languages and the impact of linguistic resources for crowdsourcing in those languages, regardless of the scenario.
Chapter 7 Conclusions
This chapter presents the conclusions of the research by addressing the research questions and discussing the impact of the research, its limitations, and the open and unexplored avenues that have been touched on in the process of writing this thesis.
7.1. Summary of results
Q1. What are the existing kinds of crowdsourced translation processes?
As seen in Chapter 2, the previously existing classifications for crowdsourced translation processes (Bey et al. 2006; Ray and Kelly 2011) did not meet the criteria of usefulness described by Geiger et al. (2011). In Chapter 3, an extension of a previously existing taxonomy was carried out using crowdsourced translation process data, but the outcome was not satisfactory. As a result, another taxonomy was developed using data exclusively from crowdsourced translation platforms. This taxonomy was created using the three-level approach proposed by Bailey (1994) that had already been successfully used by Geiger et al. (2011) and Nickerson et al. (2009).
The resulting taxonomy validated hypothesis C, which stated that different defined types of crowdsourced translation processes exist, and disproved hypotheses A (that all crowdsourced translation processes fit in a single group because they are very similar) and B (that no meaningful groups of crowdsourced processes exist because they are all too different from each other). Four groups were identified, and their proposed names are Colony Translation, Wiki Style Translation, Translation for Engagement and Crowd TEP.
Q2. What practices appear in the different crowdsourced translation processes?
In order to answer this question, eight process models corresponding to different approaches to crowdsourced translation were created as follows:
• Translation for Engagement: Facebook, Pootle, Launchpad and Crowdin.
• Wiki Style Translation: Amara and DotSub.
• Crowd TEP: Kiva.
• Colony Translation: AsiaOnline.
Similarities, such as the leveraging of TM and MT, appeared independently of taxon. All the Translation for Engagement processes had loops that enabled the collection of multiple alternative translations for a single TU. The two Wiki Style Translation processes that were modelled had loops that allowed the overwriting of existing translations.
Although there was only one example of Crowd TEP, features such as the need for a translator to claim a task and the existence of deadlines that result in the task being reallocated to another translator were exclusive to this taxon. In the case of Crowd TEP, the process described by the marketing department of VerbalizeIt (VerbalizeIt 2014) also helped the identification of practices typical of that taxon. These practices were also later discovered to be part of the processes of the organisations of two of the subjects interviewed for Chapter 5.
There was also only one model of a Colony Translation process, but this too had features that did not appear in the other taxa, such as the collection of a fixed number of redundant alternatives. This process feature appeared in the literature too (Zaidan and Callison-Burch 2011), enabling the identification of the practice as characteristic of that taxon.
Fourteen practices were identified and listed in Chapter 4:
1. Leveraging MT.
2. Leveraging TM.
3. Leveraging Terminology.
4. Open Alternative Translations.
5. Hidden Alternative Translations.
6. Super Iterative Translation.
7. Translation without Redundancy.
8. Freeze.
9. Rollback.
10. Deadlines.
11. Open Assessment.
12. Hidden Assessment.
13. Expert Selection and Edition.
14. Metadata Based Selection.
Given that the initial purpose of the thesis was the creation of a pattern language, two patterns found in the literature (Désilets and van der Meer 2011), select work unit and select content, were added to provide an entry point for the language, as suggested by Buschmann et al. (2007). In the process of identifying the forces that affect the practices, the select work unit practice was divided into three variables: Long Work Units, Segmented Long Work Units and Short Work Units.
These results indicate that hypothesis 2C is correct. The practices within a single taxon were similar to each other, and the processes across taxa were different from each other.
Q3. What are the forces that shape the candidate practices?
In order to assess the forces that shape the practices, eight experts were interviewed. Through the course of the interviews the perception of the core characteristics that constitute the practices changed, and the names of the practices were changed to reflect this.
Some of the features that the researcher expected to emerge from the interviews, such as the effect that social influence might have on open assessment processes, were not raised by the subjects, but were addressed, and where possible supported with evidence from the literature, in the corresponding discussion sections.
Because of the exploratory nature of the question and the large number of potential discoveries, no hypotheses were formulated.
Q4. How can the different practices be combined?
An attempt to create a formal pattern language was made in order to answer this question, but it failed. A mind map was created for this purpose, as visible in Appendix 6, but the number of practices and forces involved was too complex to manage without input from other perspectives such as motivation or crowd management. In order to prepare a formal pattern language, many of the questions that appear in the limitations and further research section would have to be answered first.
As an alternative, five scenarios (one for each taxon, except Crowd TEP, which had two) and recommended practices for each of them are presented in Chapter 6 of this thesis.
7.3. Impact of the Research Contributions
There are two main contributions and a minor one in this thesis. The main contributions are the identification of four approaches to crowdsourcing translation via a formal taxonomy, and the identification of 16 practices that are relevant to these scenarios. The minor contribution is the suggestion of practice models for each scenario.
If the taxonomy is embraced by practitioners and academia, it has the potential to solve communication problems. This can happen through the taxonomy enabling individuals to make suggestions and observations and to carry out research in a much more focused manner. The increased accuracy of the discourse can prevent the contradictions and misunderstandings that have resulted from people conceptualising different types of crowdsourced translation as if they were the same thing. Furthermore, it is hoped that this research will also enable the identification of types of crowdsourced translation processes that are not covered here and of the practices specific to them. The researcher is already aware of an unexplored approach to crowdsourced translation (collaboration between dual monoglots with MT mediation) that will be covered in the limitations and further research section.
The practices provide organisations that plan to use crowdsourced translation, or already use it, with options that they may not have considered prior to the existence of this collection. Furthermore, the process models can help in software design if they intend to develop a platform in-house.
The recommendations provide guidance for organisations with regard to the practices they should use depending on their scenario. However, it is possible that an organisation's scenario does not fit cleanly with any of the existing categories. This is discussed below.
7.4. Limitations and Future Research
Although every effort has been made to make this research valid and useful, there are limitations, which are commented upon in this section.
7.4.1 Data scarcity
The new taxonomy is based on only twelve processes. This is below the minimum of thirty data points recommended for statistical analysis (Oates 2005), but the main issue is actually the lack of processes for some of the categories. The Crowd TEP model only has two representatives in the taxonomy, which are Kiva and VerbalizeIt. However, the processes described by Subjects S and AV fall into this category and fully fit the taxon. Furthermore, the TED translations project, as described by volunteers (Wijayanti 2013) and Subject A, falls into this category too. This hints at the taxon being valid.
In the case of Colony Translation, only the data for AsiaOnline's Wikipedia Translation Project was fully available. The only other documented instances of this type of process come from the literature: one using AMT to create training material for MT systems (Zaidan and Callison-Burch 2011), and the other being TxtEagle, which used SMSs for translation (Eagle 2009). It is possible that some companies that offer crowdsourced translation as a service use this type of process, but although several were contacted for the survey, they did not respond.
The Translation for Engagement taxon contained five platforms (Crowdin, Zanata, Facebook, Launchpad and Pootle), but more platforms would have reinforced the validity of the cluster. Although Arend's talk did not contain sufficient detail to add the Twitter Translation platform to the taxonomy, the process described by Arend (2012) would fit into this taxon.
The lack of access to the platforms, or to experts who could facilitate their documentation, also prevented the inclusion of processes that might have resulted in new taxa. For example, Lingotek was brought up by one of the interviewees, but the researcher was unable to obtain data that would have allowed its inclusion in the taxonomy.
Duolingo (Savage 2012) is a platform that may fall into the Colony Translation group, since it was developed by the person who developed reCAPTCHA (Von Ahn et al. 2008), which works in a manner similar to Colony Translation processes. However, the researcher was not able to uncover enough information about its process to include it in the taxonomy.
Collaboration between dual monoglots with MT mediation, such as that used by MonoTrans (Resnik et al. 2010; Hu et al. 2011), has the potential to become its own taxon, but so far this approach has only been used successfully in research contexts and was not included in the taxonomy.
There are also communities dedicated to the translation of media such as comics (O'Hagan 2009), videogames (Sánchez 2009), and films and television series (Hatcher 2005) that have their own processes. These communities may have advanced tools, as illustrated by the tool used by the Spanish group Tales Translations (Ramírez 2010), and their processes are missing from this thesis. Although attempts were made to collect data from these communities, the researcher received no response. This may be related to the fact that the work of these groups is often in a legally grey area (Leonard 2004). One of the fan subtitling processes described in the literature (Carmona and Pym 2011) does, however, point to a crowd TEP approach.
Another well-known platform that could be using the crowd TEP approach is Yeeyan (Stray 2010), but no trustworthy information was available with respect to its process, and attempts to collect data by using the platform with the support of MT failed because the quality of the MT did not allow the researcher to use the platform proficiently. Efforts to contact representatives of the platform also failed.
7.4.2 Slanted Data
As observed in chapter three, the number of companies that charge for crowdsourced translation as a service and answered the survey was insufficient, and it is possible that some taxa that exist in practice are not represented because of this issue.
7.4.3 Cross-sectionality
As stated in chapter three, crowdsourced translation initiatives appear and disappear. The data in this thesis corresponds to initiatives that existed in the years 2008 to 2013. It would be interesting to find out about initiatives that have disappeared, and to carry out the same survey periodically over a longitudinal span, thus depicting the evolution of crowdsourced translation over time. Such modelling work might reveal convergences and divergences.
7.4.4 Incomplete Process Perspective
This thesis deals only with aspects of the process that generate translation data or metadata. In order to have a successful crowdsourced translation project, other aspects must also be taken into account.
Motivation is a very important factor (Brabham 2008a; Yang et al. 2009; Borst 2010; Cedeno 2010; Désilets and van der Meer 2011), and it is only tangentially touched upon in this thesis due to scope. The collection of practices for motivation available in the wiki for collaborative translation patterns (Désilets and van der Meer 2011) is a good starting point, and researching how they relate to the different types of processes would add value to them. For example, all the cases of Colony Translation included either material rewards or, in the case of Asia Online, a potential material reward. Some of the Translation for Engagement systems had point systems that allowed the ranking of contributors. These relationships between approaches to motivation and processes are worth exploring in more depth.
Crowd organisation is another aspect that is outside the scope of this thesis. Several of the Translation for Engagement platforms have Expert Selection and Edition, and in the case of the Crowd TEP processes they have reviewers. In some cases the people performing these tasks are paid professionals, but in others they are trusted members of the community. The exploration of the processes used to decide who can perform these roles, and of what other differentiated roles can constitute the crowd, is another avenue of research that should be pursued in order to have a holistic collection of practices. Although the wiki for crowdsourced translation practices (Désilets and van der Meer 2011) does not have suggestions regarding selection processes, it does have some suggestions regarding potential roles.
Practices for tracking the performance of contributors, which happens in several platforms (Losse 2008; Eagle 2009; Vashee 2009; Arend 2012), and for using this information to manage contributions would also be valuable for a holistic view of the process.

7.4.5 Social Influence in Translation


The existence of Yule processes (Cha et al. 2009), the positive assessment bubble (Muchnik et al. 2013) and social convergence issues (Lorenz et al. 2011) has been researched in other fields, and their applicability to translation processes is discussed in chapter five under the open alternative and hidden redundant translation practices and the open and hidden redundant assessment practices. Subjects AC and P discussed issues related to Yule processes, but there is no formal data proving that Yule processes, positive assessment bubbles and social convergence issues also apply in the context of crowdsourced translation. The exploration of these issues is also an avenue for future research.
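To make the concern concrete, the following is a minimal sketch, written for this discussion rather than drawn from any surveyed platform, of how a Yule-style preferential attachment dynamic could be simulated for redundant assessment; the number of alternatives, the vote count and the seeding are illustrative assumptions.

    import random

    def simulate_votes(n_alternatives=10, n_votes=1000, seed=42):
        """Each new vote picks an alternative with probability proportional
        to the votes it already holds (preferential attachment), so early
        contributions can accumulate a disproportionate share of assessments
        regardless of their intrinsic quality."""
        random.seed(seed)
        votes = [1] * n_alternatives  # every alternative starts with one seed vote
        for _ in range(n_votes):
            winner = random.choices(range(n_alternatives), weights=votes)[0]
            votes[winner] += 1
        return sorted(votes, reverse=True)

    print(simulate_votes())  # typically a heavily skewed distribution

Comparing the resulting distribution against one produced by hiding the running totals (i.e. sampling uniformly) would be one way to quantify the effect that the hidden redundant assessment practice is meant to counter.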

7.4.6 Processes with a Source Language other than English


Asia Online's process for their Wikipedia project included the training of an MT system to translate between Thai and English. This required special stages for the segmentation of Thai text (Vashee 2009). There are other platforms where English is not the only source language, such as Meedan (Hu et al. 2010), where translations go from English into Arabic and vice versa. It is possible that there are practices that apply in these contexts that have not emerged in this study.

7.4.7 Performance Differences between Interchangeable Combinations of Practices


To the best of the researcher's knowledge, no organisation is using the Wiki Style approach to carry out its translations. Organisations like Adobe TV, using DotSub (Adobe 2014), and TED, using Amara (Wijayanti 2013), could use the Wiki Style approach but instead use a Crowd TEP approach. It would be interesting to conduct a pilot study to evaluate the efficacy and effectiveness of both approaches by running sample translation projects in parallel using Crowd TEP and Wiki Style models.
Hidden Alternatives with Metadata Based Selection, Open Alternative Translations with Metadata Based Selection and automatically published Super Iterative Translation all have the potential to speed up the process in different ways. However, there is currently not enough information to determine which one is actually faster, or whether the increase in speed depends on a combination of other factors in conjunction with the approach to translation.
In chapter six the researcher proposed combining frequency-based selection with selection based on the highest-ranked contributor for Hidden Assessment, as an alternative to expert review when selecting translations to publish. Except for the beneficial effect of having a single expert with an overview of the whole translation, these practices are generally interchangeable, but questions regarding the impact of each on cost need to be researched in order for this recommendation to be derived from empirical data.
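As an illustration only, the sketch below shows one way such a combined selection rule could be implemented; the (text, contributor rank) pairs and the tie-breaking order are hypothetical and not taken from any platform studied in this thesis.

    from collections import Counter

    def select_translation(alternatives):
        """Select one translation for a TU from hidden redundant alternatives:
        prefer the most frequent surface form, breaking ties with the highest
        contributor rank observed for that form."""
        frequency = Counter(text for text, _ in alternatives)
        best_rank = {}
        for text, rank in alternatives:
            best_rank[text] = max(best_rank.get(text, 0), rank)
        return max(frequency, key=lambda text: (frequency[text], best_rank[text]))

    # Two contributors agree on one form; a higher-ranked contributor submits another.
    alternatives = [("Save the file", 3), ("Save the file", 2), ("Store the file", 5)]
    print(select_translation(alternatives))  # frequency wins here: "Save the file"

Whether frequency should dominate rank, or the other way around, is precisely the kind of cost and quality question that the proposed empirical research would need to answer.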
7.4.8 Modelling Variation
The process models created for chapter four were validated by one of the supervisors of this thesis. Although the supervisor is an expert in process modelling, there are conflict of interest issues that weaken the validity of those models. Furthermore, while the choice of Petri nets as the language of abstraction is highly appropriate in an academic context, in order to increase the impact of the research it would be advisable to also produce BPMN models, which are more popular within the industry.
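For readers unfamiliar with the abstraction, the sketch below encodes a deliberately simplified translate, review and publish flow as a Petri net token game; it is not one of the chapter four models, and the place and transition names are invented for illustration.

    # Transitions consume tokens from input places and produce tokens in output places.
    net = {
        "translate": {"in": ["source_ready"], "out": ["draft_ready"]},
        "review": {"in": ["draft_ready"], "out": ["approved"]},
        "publish": {"in": ["approved"], "out": ["published"]},
    }
    marking = {"source_ready": 1, "draft_ready": 0, "approved": 0, "published": 0}

    def enabled(transition):
        """A transition is enabled when every input place holds a token."""
        return all(marking[place] >= 1 for place in net[transition]["in"])

    def fire(transition):
        """Fire an enabled transition, moving tokens through the net."""
        assert enabled(transition), transition + " is not enabled"
        for place in net[transition]["in"]:
            marking[place] -= 1
        for place in net[transition]["out"]:
            marking[place] += 1

    for transition in ["translate", "review", "publish"]:
        fire(transition)
    print(marking)  # the single token ends in 'published'

A structure like this could be exported to BPMN with relatively little effort, since each transition maps naturally onto a BPMN task.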
7.4.9 Survey Using Interviews
A posteriori, the researcher noticed that the Aggregation of Contributions should have been discussed in a context that includes the level at which aggregation happens. This would have resulted in the Select and Integrate characteristic emerging more quickly.
Similarly, the concept of 'peer' should have been explained better, since its vagueness can lead respondents to think that the person reviewing a translation in crowd TEP processes is a peer of the translator; as a result, the answer regarding accessibility may become 'assess' or 'modify' where it should have been 'none'.

It is possible that by carrying out another survey using interviews instead of a self-administered questionnaire, researchers could shine a light on these and other aspects of the process that did not emerge when using the questionnaire technique.

7.4.10 Practice Collection Completeness and Validity


The collection of practices is not by any means complete. It is only comprehensive with regard to the contents of the relevant literature and the tools that were analysed. Only eight experts were interviewed for this thesis. The experts were selected with cognisance of the types of crowdsourcing emerging from the new taxonomy. However, none of the experts were involved with wiki style processes and only one was involved in colony translation. Practices like Version Rollback and the hidden redundant translations and assessments had lower coverage than those from the crowd TEP and Translation for Engagement groups, which were better represented. It would be desirable to interview more experts, especially experts involved in colony and wiki translation processes.
The gaps in the collection currently emerge from the different aspects of crowdsourced translation processes that have not been addressed in this thesis. Besides that, there is a chronological factor to be taken into account, given that these practices are representative of current real-life scenarios that are evolving over time. Some of the practices in this thesis may at some point be superseded by technological or cultural developments that make them less relevant; others may evolve in response to those changes, and new ones may emerge.
The temptation exists to extrapolate practices from other venues in order to fill the collection.
The discussion pages from Wikipedia are an example of such a practice. Such pages would
perform a function similar to the notes and comments from reviewers in the TEP process and,
as suggested before, the researcher is of the opinion that they could help develop a feeling of
community in wiki style processes. Furthermore, the content of such pages could be highly
relevant in a research context. However, not a single crowdsourced translation process of
which the researcher is aware used anything similar, which would make it very difficult to
consider it a valid practice.
Regarding the validity of the collection of practices, it has been said that patterns, which is what these practices attempt to emulate, need to appear frequently (La Rosa, Wohed, et al. 2011), but 'frequently' has not been defined in the literature. All of the practices appearing in this thesis do so in fewer than ten platforms known to the researcher. This could be seen as insufficient in some contexts. Conducting research to identify more instances of these practices being used in real scenarios is a necessary step in progressing towards a pattern catalogue.

7.4.11 Size of the Crowd


Several of the experts claimed that certain practices work better with a bigger crowd. Beyond the positive impact of bigger crowds, the researcher thinks that it would be useful to model and determine the relations between the size of a language community and the size of the crowds involved in translation for that language. Also, what is the minimum number of involved people required for a project to be tractable, and how involved should these contributors be?

7.4.12 Impact on Cost and Quality of Increasing the Number of Translations Collected when Using Hidden Assessment
Asia Online collected three alternatives for each TU in their Wikipedia translation project (Vashee 2009), and TXTEagle collected a variable number of alternatives (Eagle 2009). Research is required to investigate the balance between quality and cost, in the form of more alternatives collected, that can be achieved by combining factors beyond the previous performance of the contributors.

7.5 Summary
This chapter addressed how the research questions were answered, and then presented a discussion of the impact of the research, its limitations, and a series of questions that emerged during the writing of this thesis that the researcher considers worth pursuing in future research.


Bibliography
Van der Aalst, W., Weijters, T., Maruster, L. (2004) Workflow mining: Discovering process models from event logs, Knowledge and Data Engineering, IEEE Transactions on, 16(9), 1128–1142.
Van der Aalst, W.M. (2013) Business process management: A comprehensive survey, ISRN
Software Engineering, 2013.
Van der Aalst, W.M., Dumas, M., Gottschalk, F., ter Hofstede, A.H., La Rosa, M., Mendling, J. (2010) Preserving correctness during business process model configuration, Formal Aspects of Computing, 22(3-4), 459–482.
Van der Aalst, W.M., Reijers, H.A., Weijters, A.J., van Dongen, B.F., Alves de Medeiros, A., Song, M., Verbeek, H. (2007) Business process mining: An industrial application, Information Systems, 32(5), 713–732.
Van der Aalst, W.M.P. (2004) Business process management demystified: A tutorial on models, systems and standards for workflow management, Lectures on Concurrency and Petri Nets, 21–58.
Adobe (2014) Adobe TV [online], available: http://tv.adobe.com/translations/translatorfaq/
[accessed 10 Mar 2014].
Von Ahn, L., Maurer, B., McMillen, C., Abraham, D., Blum, M. (2008) reCAPTCHA: Human-based character recognition via web security measures, Science, 321(5895), 1465–1468.
Alexander, C., Ishikawa, S., Silverstein, M. (1977) A Pattern Language: Towns, Buildings, Construction, Oxford University Press, USA.
Allen, J.P. (2010) Knowledge-Sharing Successes in Web 2.0 Communities, Technology and Society Magazine, IEEE, 29(1), 58–64.
Alonso, G., Agrawal, D., El Abbadi, A., Mohan, C. (1997) Functionality and limitations of current workflow management systems, IEEE Expert, 12(5), 68–74.
Alonso, O., Rose, D.E., Stewart, B. (2008) Crowdsourcing for relevance evaluation, Presented at the ACM SIGIR Forum, ACM, 9–15.
Anastasiou, D., Gupta, R. (2011) Comparison of crowdsourcing translation with Machine Translation, Journal of Information Science, 37(6), 637–659.
Anastasiou, D., Schäler, R. (2010) Translating Vital Information: Localisation, Internationalisation, and Globalisation, Syn-thèses journal.
Anthony, D., Smith, S.W., Williamson, T. (2005) Explaining quality in Internet collective
goods: zealots and good samaritans in the case of wikipedia, Hanover: Dartmouth
College, available: http://web.mit.edu/iandeseminar/Papers/Fall2005/anthony.pdf.
Arend, T. (2012) Social Localisation at Twitter: Translating the World in 140 Characters [online], available: https://www.youtube.com/watch?v=eGb5-MLcLr0 [accessed 3 Jun 2014].
Arjona Reina, L. (2012) Translations in Libre Software.
Baer, N., Moreno, M. (2009) New Trends in Crowdsourcing: The Kiva/Idem Case Study
[online], available: http://vimeo.com/8549171 [accessed 3 Apr 2013].
Bailey, K.D. (1994) Typologies and Taxonomies: An Introduction to Classification
Techniques, Sage Publications, Incorporated.
Baym, N. (2011) Social Networks 2.0, The handbook of internet studies, 11, 384.
Becker, J., Rosemann, M., von Uthmann, C. (2000) Guidelines of business process modeling, in Business Process Management, Springer, 30–49.


Bentivogli, L., Federico, M., Moretti, G., Paul, M. (2011) Getting expert quality from the crowd for machine translation evaluation, Proceedings of the MT Summit, 13, 521–528.
Bey, Y., Boitet, C., Kageura, K. (2006) The TRANSBey prototype: an online collaborative wiki-based CAT environment for volunteer translators, Presented at the LREC-2006: Fifth International Conference on Language Resources and Evaluation. Third International Workshop on Language Resources for Translation Work, Research & Training (LR4Trans-III), 49–54.
Bloodgood, M., Callison-Burch, C. (2010) Using Mechanical Turk to build machine translation evaluation sets, Association for Computational Linguistics, 208–211.
Borst, W.A.M. (2010) Understanding Crowdsourcing: Effects of Motivation and Rewards on
Participation and Performance in Voluntary Online Activities.
Bowker, L. (2002) Computer-aided Translation Technology: a Practical Introduction, Univ
of Ottawa Pr.
Bowker, L. (2005) Productivity vs Quality: A pilot study on the impact of translation memory systems, Localisation Focus, 1(4).
Boyatzis, R.E. (1998) Transforming Qualitative Information: Thematic Analysis and Code
Development, Sage.
Brabham, D.C. (2008a) Moving the crowd at iStockphoto: The composition of the crowd and motivations for participation in a crowdsourcing application, First Monday, 13(6-2).
Brabham, D.C. (2008b) Crowdsourcing as a Model for Problem Solving, Convergence, 14(1), 75–90.
Brace, I. (2004) Questionnaire Design: How to Plan, Structure, and Write Survey Material
for Effective Market Research, Kogan Page Ltd.
Brkić, M., Seljan, S., Mikulić, B.B. (2009) Using Translation Memory to Speed up Translation Process, Presented at the International Conference The Future of Information Sciences (2; 2009).
Brooks, D. (2000) What price globalization? Managing costs at Microsoft, Translating into Success: Cutting-edge Strategies for Going Multilingual in a Global Age, Amsterdam: John Benjamins, 43–58.
Bruns, A., Bahnisch, M. (2009) Social media: tools for user-generated content: social drivers
behind growing consumer participation in user-led content generation [Volume 1:
state of the art].
Bryman, A. (2012) Social Research Methods, OUP Oxford.
Burns, R.P., Burns, R. (2008) Business Research Methods and Statistics Using SPSS, Sage.
Buschmann, F., Henney, K., Schmidt, D.C. (2007) Pattern Oriented Software Architecture:
On Patterns and Pattern Languages, Wiley.
Callison-Burch, C. (2009) Fast, cheap, and creative: evaluating translation quality using Amazon's Mechanical Turk, Presented at the Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1, Association for Computational Linguistics, 286–295.
Carey, J.M. (1998) Creating global software: a conspectus and review, Interacting with Computers, 9(4), 449–465.
Carmona, J.D.O., Pym, A. (2011) The Empirical Study of Non-professional Subtitling: a
Descriptive Approach.
Category:Wikipedia enforcement policies (2013) Wikipedia, the free encyclopedia,
available:
http://en.wikipedia.org/w/index.php?title=Category:Wikipedia_enforcement_policies
&oldid=546845494 [accessed 9 Oct 2013].

Cedeno, J.J.D. (2010) Motivating programmers through karma systems.


Cha, M., Kwak, H., Rodriguez, P., Ahn, Y.-Y., Moon, S. (2009) Analyzing the video popularity characteristics of large-scale user generated content systems, IEEE/ACM Transactions on Networking (TON), 17(5), 1357–1370.
Chinosi, M., Trombetta, A. (2012) BPMN: An introduction to the standard, Computer Standards & Interfaces, 34(1), 124–134.
Clark, J., Aufderheide, P. (2009) Public media 2.0: Dynamic, engaged publics, Center for Social Media, available: http://www.centerforsocialmedia.org/resources/publications/public_media_2_0_dynamic_engaged_publics.
Coleman, G. (2004) The political agnosticism of free and open source software and the inadvertent politics of contrast, Anthropological Quarterly, 77(3), 507–519.
Collins, A. (2002) Are the communications leviathans devouring cultures and languages?, Localisation Focus, 1(1), 14–15.
Cronin, M. (2010) The Translation Crowd, Revista Tradumàtica, (8).
Curran, S., Feeney, K., Schäler, R., Lewis, D. (2009) The management of crowdsourcing in business processes, Presented at the Integrated Network Management-Workshops, 2009. IM '09. IFIP/IEEE International Symposium on, IEEE, 77–78.
Curtis, B., Kellner, M.I., Over, J. (1992a) Process modeling, Commun. ACM, 35(9), 75–90.
Curtis, B., Kellner, M.I., Over, J. (1992b) Process modeling, Communications of the ACM, 35(9), 75–90.
D'Zurilla, T.J., Goldfried, M.R. (1971) Problem solving and behavior modification, Journal of Abnormal Psychology, 78(1), 107.
Dalvit, L., Terzoli, A., Wolff, F. (2008) Opensource software and localisation in indigenous
South African languages with Pootle, SATNAC 2008, available:
http://www.satnac.org.za/proceedings/2008/papers/software/Dalvit%20No%2095.pdf.
DePalma, D.A. (2006) Quantifying the return on localization investment, Perspectives on Localization, 15–36.
DePalma, D.A., Kelly, N. (2011) Project management for crowdsourced translation, Translation and Localization Project Management: The Art of the Possible, 379.
Van Der Aalst, W.M.P., Ter Hofstede, A.H.M. (2005) YAWL: yet another workflow language, Information Systems, 30(4), 245–275.
Van Der Aalst, W.M.P., Ter Hofstede, A.H.M., Kiepuszewski, B., Barros, A.P. (2003) Workflow patterns, Distributed and Parallel Databases, 14(1), 5–51.
Désilets, A. (2007) Translation Wikified: How will Massive Online Collaboration Impact the World of Translation?, Presented at the ASLIB, Translating and the Computer 29, London.
Désilets, A. (2010) Collaborative Translation: technology, crowdsourcing, and the translator perspective, Presented at the AMTA 2010 Workshop, Collaborative Translation: technology, crowdsourcing, and the translator perspective.
Désilets, A. (2011a) Identify Compatible Content [online], available: http://collaborative-translation-patterns.wiki4us.com/tiki-index.php?page=Identify+Compatible+Content [accessed 5 Nov 2013].
Désilets, A. (2011b) Publish Then Revise [online], available: http://collaborative-translation-patterns.wiki4us.com/tiki-index.php?page=Publish+then+Revise [accessed 5 Nov 2013].
Désilets, A., van der Meer, R. (2011) Co-creating a repository of best-practices for collaborative translators, Linguistica Antverpiensia, 10.
DiFranco, C. (2006) Localization Cost, Perspectives on Localization, 13, 47.
Dijkman, R.M., Dumas, M., Ouyang, C. (2008) Semantics and analysis of business process models in BPMN, Information and Software Technology, 50(12), 1281–1294.

Doan, A., Ramakrishnan, R., Halevy, A.Y. (2011) Crowdsourcing systems on the world-wide web, Communications of the ACM, 54(4), 86–96.
Dolmaya, J.M. (2011) The ethics of crowdsourcing, Linguistica Antverpiensia, New Series: Themes in Translation Studies, (10).
Van Dongen, B.F., de Medeiros, A.K.A., Verbeek, H., Weijters, A., Van Der Aalst, W.M. (2005) The ProM framework: A new era in process mining tool support, in Applications and Theory of Petri Nets 2005, Springer, 444–454.
Eagle, N. (2009) txteagle: Mobile crowdsourcing, Internationalization, Design and Global Development, 447–456.
Ellis, D. (2009) A Case Study in Community-Driven Translation of a Fast-Changing Website, Internationalization, Design and Global Development, 236–244.
Esselink, B. (2000) A Practical Guide to Localization, John Benjamins Publishing Company.
Estellés-Arolas, E., González-Ladrón-de-Guevara, F. (2012) Towards an integrated crowdsourcing definition, Journal of Information Science, 20(10).
Eurostat (2011) Information Society Statistics, Computers and the Internet in households and enterprises, Internet - Level of access, use and activities, Internet activities - Individuals, available: http://epp.eurostat.ec.europa.eu/portal/page/portal/information_society/data/database.
Exton, C., Wasala, A., Buckley, J., Schäler, R. (2009) Micro Crowdsourcing: A new Model for Software Localisation, Localisation Focus, 81.
Fasulo, D. (1999) An analysis of recent work on clustering algorithms, Department of
Computer Science & Engineering, University of Washington.
Fernández Costales, A. (2010) The role of Computer-Assisted Translation in the field of software localization, Evaluation of Translation Technology, 179.
Filip, D. (2012) Localization for the long tail: Part 1, Multilingual Computing, 23(7), 51–55.
Filip, D., Conchúir, E. (2011) An Argument for Business Process Management in Localisation, Localisation Focus, 10(1), 4–17.
Fink, A. (2003a) How to Sample in Surveys, Sage.
Fink, A. (2003b) The Survey Handbook, Sage.
Fink, A. (2009) How to Conduct Surveys: A Step-by-step Guide, Sage.
Fort, K., Adda, G., Cohen, K.B. (2011) Amazon Mechanical Turk: Gold mine or coal mine?, Computational Linguistics, 37(2), 413–420.
Freij, N. (2010) Enabling Globalization, GlobalVision International.
Frimannsson, A. (2011) Community-driven Translation of Software and E-content, in
Education Without Borders 2011, Abu Dhabi, United Arab Emirates.
Garcia, I. (2010) The proper place of professionals (and non-professionals and machines) in web translation, Tradumàtica: traducció i tecnologies de la informació i la comunicació, (8), 17.
Geiger, D., Rosemann, M., Fielt, E. (2011) Crowdsourcing information systems: a systems
theory perspective, Presented at the Proceedings of the 22nd Australasian Conference
on Information Systems (ACIS 2011).
Geiger, D., Seedorf, S., Schulze, T., Nickerson, R., Schader, M. (2011) Managing the
Crowd: Towards a Taxonomy of Crowdsourcing Processes, Presented at the
Proceedings of the Seventeenth Americas Conference on Information Systems.
Van Genabith, J. (2009) Next Generation Localisation, Localisation Focus, 4.
Georgakopoulos, D., Hornick, M., Sheth, A. (1995) An overview of workflow management: From process modeling to workflow automation infrastructure, Distributed and Parallel Databases, 3(2), 119–153.


Georgakopoulos, D., Hornick, M.F., Manola, F., Brodie, M.L., Heiler, S., Nayeri, F., Hurwitz, B. (1993) An extended transaction environment for workflows in distributed object computing, Data Engineering Bulletin, 16(2), 24–27.
Giaglis, G.M. (2001) A taxonomy of business process modeling and information systems modeling techniques, International Journal of Flexible Manufacturing Systems, 13(2), 209–228.
Glaser, B.G., Strauss, A.L. (2009) The Discovery of Grounded Theory: Strategies for
Qualitative Research, Transaction Books.
Guest, G., MacQueen, K.M., Namey, E.E. (2011) Applied Thematic Analysis, Sage.
Hall, P., Schäler, R. (2005) Development Localization, Multilingual Computing, 8(16).
Hatcher, J.S. (2005) Of otakus and fansubs: A critical look at anime online in light of current issues in copyright law, Script-ed, 2(4), 514–42.
Hecht, B.J., Gergle, D. (2010) On the localness of user-generated content, in 2010 ACM Conference on Computer Supported Cooperative Work, ACM, 229–232.
Van Hee, K., Oanea, O., Post, R., Somers, L., van der Werf, J.M. (2006) Yasper: a tool for workflow modeling and analysis, Presented at the Application of Concurrency to System Design, 2006. ACSD 2006. Sixth International Conference on, IEEE, 279–282.
Heer, J., Bostock, M. (2010) Crowdsourcing graphical perception: using Mechanical Turk to assess visualization design, Presented at the Proceedings of the 28th international conference on Human factors in computing systems, ACM, 203–212.
Hiroshi, U., Meiying, Z. (1993) Interlingua for multilingual machine translation, Proceedings of MT Summit IV, Kobe, Japan, 157–169.
Horton, J.J., Chilton, L.B. (2010) The labor economics of paid crowdsourcing, in Proceedings of the 11th ACM Conference on Electronic Commerce, EC '10, ACM: New York, NY, USA, 209–218, available: http://doi.acm.org/10.1145/1807342.1807376.
Howe, J. (2006a) Crowdsourcing: A definition, Crowdsourcing: Tracking the rise of the
amateur.
Howe, J. (2006b) The rise of crowdsourcing, Wired magazine, 14(6), 14.
Howe, J. (2008) Crowdsourcing: Why the Power of the Crowd Is Driving the Future of
Business, Century.
Hu, C., Bederson, B.B., Resnik, P. (2010) Translation by iterative collaboration between monolingual users, Presented at the Proceedings of Graphics Interface 2010, Canadian Information Processing Society, 39–46.
Hu, C., Bederson, B.B., Resnik, P., Kronrod, Y. (2011) MonoTrans2: A new human computation system to support monolingual translation, Presented at the Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 1133–1136.
Huberman, B.A., Romero, D.M., Wu, F. (2009) Crowdsourcing, attention and productivity, Journal of Information Science, 35(6), 758–765.
Hurrell, A., Woods, N. (1995) Globalisation and inequality, Millennium: Journal of International Studies, 24(3), 447.
Hutchins, J. (1998) Translation technology and the translator, Machine Translation Review, 7(1), 7–14.
Hutchins, J. (2005) Example-based machine translation: A review and commentary, Machine Translation, 19(3), 197–211.
IBM (2013) IBM Knowledge Center - Model Summary View [online], available: http://www-01.ibm.com/support/knowledgecenter/SSLVMB_20.0.0/com.ibm.spss.statistics.help/clusterviewer_modelsummary_panel.htm [accessed 17 Oct 2014].
Ipeirotis, P.G., Horton, J.J. (2011) The Need for Standardization in Crowdsourcing, in CHI
2011 Workshop on Crowdsourcing and Human Computation, Vancouver.
Jain, A.K. (2010) Data clustering: 50 years beyond K-means, Pattern Recognition Letters, 31(8), 651–666.
Janev, V., Vranes, S. (2009) Semantic Web Technologies: Ready for Adoption?, IT Professional, 11(5), 8–16.
Jarboe, G. (2012) The Best Length for a YouTube Marketing Video [online], ReelSEO,
available: http://www.reelseo.com/length-youtube-video/ [accessed 17 Oct 2013].
Jiménez-Crespo, M.A. (2013) Crowdsourcing, corpus use, and the search for translation naturalness: A comparable corpus study of Facebook and non-translated social networking sites, Translation and Interpreting Studies, 8(1), 23–49.
Jiménez-Crespo, M.A. (2011) From many, one: Novel approaches to translation quality in a social network era, Linguistica Antverpiensia, 10.
Jordan, P.W. (2002) Designing Pleasurable Products: An Introduction to the New Human
Factors, CRC Press.
Kagdi, H., Maletic, J.I. (2006) Mining for co-changes in the context of web localization, Presented at the Web Site Evolution, 2006. WSE '06. Eighth IEEE International Symposium on, IEEE, 50–57.
Kageura, K., Abekawa, T., Masao, U., Miori, S., Sumita, E. (2011) Has translation gone online and collaborative?: An experience from Minna no Honyaku, Linguistica Antverpiensia, 10.
Kaufman, L., Rousseeuw, P.J. (2009) Finding Groups in Data: An Introduction to Cluster
Analysis, John Wiley & Sons.
Kazman, R., Chen, H.-M. (2009) The metropolis model: a new logic for development of crowdsourced systems, Communications of the ACM, 52(7), 76–84.
Keats, D. (1999) Interviewing: A Practical Guide for Students and Professionals, NewSouth
Publishing.
Kelly, N. (2009) Freelance Translators Clash with LinkedIn over Crowdsourced Translation [online], available: http://www.commonsenseadvisory.com/Default.aspx?Contenttype=ArticleDetAD&tabID=63&Aid=591&moduleId=391.
Kelly, N., Ray, R., DePalma, D. (2011a) From crawling to sprinting: Community translation goes mainstream, Linguistica Antverpiensia, 10.
Kelly, N., Ray, R., DePalma, D.A. (2011b) From crawling to sprinting: Community translation goes mainstream, Linguistica Antverpiensia, New Series: Themes in Translation Studies, (10).
Kittur, A., Chi, E.H., Suh, B. (2008) Crowdsourcing user studies with Mechanical Turk, in Proceedings of the Twenty-sixth Annual SIGCHI Conference on Human Factors in Computing Systems, ACM: Florence, Italy, 453–456.
Kleemann, F., Voß, G.G., Rieder, K. (2008) Un(der)paid Innovators: The Commercial Utilization of Consumer Work through Crowdsourcing, Science, Technology & Innovation Studies, 4(1), 5.
Krogstie, J., Sindre, G., Jørgensen, H. (2006) Process models representing knowledge for action: a revised quality framework, European Journal of Information Systems, 15(1), 91–102.
Kwan, E. (2009) The Award Goes to...Translators [online], available:
http://blog.facebook.com/blog.php?post=204787062130.


Latva-Koivisto, A.M. (2001) Finding a complexity measure for business process models, Helsinki University of Technology, Systems Analysis Laboratory.
Lenihan, A. (2014) Investigating language policy in social media: translation practices on Facebook, The Language of Social Media: Identity and Community on the Internet, 208.
Lenihan, A., Thurlow, C., Mroczek, K. (2011) Join our community of translators: language ideologies and Facebook, Digital Discourse: Language in the New Media, ed. Crispin Thurlow and Kristine Mroczek, 48464.
Leonard, S. (2004) Celebrating two decades of unlawful progress: Fan distribution, proselytization commons, and the explosive growth of Japanese animation, UCLA Ent. L. Rev., 12, 189.
Levitina, N. (2011) Localization project management: Scope and requirements, Translation and Localization Project Management: The Art of the Possible, 95–118.
Lévy, P. (1997) Collective Intelligence: Mankind's Emerging World in Cyberspace, trans. Robert Bononno, Cambridge, Mass.: Perseus.
Lewis, D., Curran, S., Doherty, G., Feeney, K., Karamanis, N., Luz, S., McAuley, J. (2009) Supporting Flexibility and Awareness in Localisation Workflows, Localisation Focus, 8(1), 29.
Liang, T.P., Wang, J.M., Wang, C.E. (1993) Workflow Modeling and Implementation, PACIS 1993 Proceedings, 45.
Little, G., Chilton, L.B., Goldman, M., Miller, R.C. (2010) Exploring iterative and parallel human computation processes, Presented at the Proceedings of the ACM SIGKDD workshop on human computation, ACM, 68–76.
Lommel, A. (Ed.) (2003) The Localization Industry Primer [online], available: http://www.cit.griffith.edu.au/~davidt/cit3611/LISAprimer.pdf.
Lommel, A., Ray, R. (2009) Crowdsourcing: The Crowd Wants to Help You Reach New
Markets, Localization Industry Standards Association.
Lorenz, J., Rauhut, H., Schweitzer, F., Helbing, D. (2011) How social influence can undermine the wisdom of crowd effect, Proceedings of the National Academy of Sciences, 108(22), 9020–9025.
Losse, K. (2008) Facebook - Achieving Quality in a Crowd-sourced Translation Environment [online], available: http://www.localisation.ie/resources/presentations/videos/video2.htm.
Louridas, P. (2008) Orchestrating web services with BPEL, IEEE Software, 25(2), 85–87.
Ly, L.T., Rinderle, S., Dadam, P. (2006) Semantic correctness in adaptive process management systems, in Business Process Management, Springer, 193–208.
Mackenzie, A. (2006) Internationalization: software, universality and otherness, available: http://www.lancs.ac.uk/people/mackenza/papers/Mackenzie_internationalization_oct06_web.pdf.
Mangiron, C., O'Hagan, M. (2006) Game Localisation: unleashing imagination with restricted translation, The Journal of Specialised Translation, 6, 10–21.
Mans, R., Schonenberg, M., Song, M., van der Aalst, W.M., Bakker, P.J. (2009) Application of process mining in healthcare: a case study in a Dutch hospital, in Biomedical Engineering Systems and Technologies, Springer, 425–438.
Maria, A. (1997) Introduction to modeling and simulation, Presented at the Proceedings of the 29th conference on Winter simulation, IEEE Computer Society, 7–13.
Markoff, J. (2012) How many computers to identify a cat? 16,000, New York Times.
Meer, J., Rigbi, O. (2011) Transactions Costs and Social Distance in Philanthropy: Evidence
from a Field Experiment.


Meeyoung, C., Haewoon, K., Pablo, R., Yong-Yeol, A., Sue, M. (2007) I tube, you tube, everybody tubes: analyzing the world's largest user generated content video system, Proceedings of the 7th ACM SIGCOMM conference on Internet measurement.
Melby, A.K. (2008) TBX-Basic: Translation-oriented Terminology Made Simple, Tradumàtica: traducció i tecnologies de la informació i la comunicació, (6).
Mendling, J. (2008) Metrics for Process Models: Empirical Foundations of Verification, Error Prediction, and Guidelines for Correctness, Springer.
Mendling, J., Neumann, G., Van Der Aalst, W. (2007) Understanding the occurrence of errors in process models based on metrics, in On the Move to Meaningful Internet Systems 2007: CoopIS, DOA, ODBASE, GADA, and IS, Springer, 113–130.
Mendling, J., Reijers, H.A., van der Aalst, W.M. (2010) Seven process modeling guidelines (7PMG), Information and Software Technology, 52(2), 127–136.
Mesipuu, M. (2010) Translation Crowdsourcing: an Insight into Hows and Whys (at the Example of Facebook and Skype).
Miller, R.L., Brewer, J.D. (2003) The A–Z of Social Research: a Dictionary of Key Social Science Research Concepts, Sage.
Moody, D.L. (2009) The physics of notations: toward a scientific basis for constructing visual notations in software engineering, Software Engineering, IEEE Transactions on, 35(6), 756–779.
Moorkens, J. (2011) Translation Memories guarantee consistency: Truth or fiction?,
Presented at the Translating and the Computer Conference, London.
Morera, A., Aouad, L., Collins, J. (2012) Assessing Support for Community Workflows in Localisation, Presented at the Business Process Management Workshops, Springer, 195–206.
Muchnik, L., Aral, S., Taylor, S. (2013) Social Influence Bias: A Randomized Experiment, Science, 341(6146), 647–651.
Muegge, U. (2007) Why manage terminology? Ten quick answers, Globalization Insider, 7.
Munro, R. (2010) Crowdsourced translation for emergency response in Haiti: the global
collaboration of local knowledge, Presented at the AMTA Workshop on
Collaborative Crowdsourcing for Translation.
Munro, R., Bethard, S., Kuperman, V., Lai, V.T., Melnick, R., Potts, C., Schnoebelen, T., Tily, H. (2010) Crowdsourcing and language studies: the new generation of linguistic data, Presented at the Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, Association for Computational Linguistics, 122–130.
Murata, T. (1989) Petri nets: Properties, analysis and applications, Proceedings of the IEEE, 77(4), 541–580.
Mythili, S., Madhiya, E. (2014) An Analysis on Clustering Algorithms in Data Mining, International Journal of Computer Science and Mobile Computing, 3(1), 334–340.
Nickerson, R., Muntermann, J., Varshney, U., Isaac, H. (2009) Taxonomy development in
information systems: developing a taxonomy of mobile applications.
Notley, T., Salazar, J.F., Crosby, A. (2013) Online video translation and subtitling: examining emerging practices and their implications for media activism in South East Asia, Global Media Journal: Australian Edition, 7(1).
O'Brien, S. (2011) Collaborative translation, in Handbook of Translation Studies, Benjamins, 17.
O'Brien, S., Schäler, R. (2010a) Next generation translation and localization: Users are taking charge, Presented at the ASLIB: Translating and the Computer Conference, London.
O'Brien, S., Schäler, R. (2010b) Next generation translation and localization: Users are taking charge.
O'Hagan, M. (2009) Evolution of User-generated Translation: Fansubs, Translation Hacking and Crowdsourcing, The Journal of Internationalisation and Localisation, Volume I, 94.
O'Hagan, M. (2011) Community Translation: Translation as a social activity and its possible consequences in the advent of Web 2.0 and beyond, Linguistica Antverpiensia, 10, 11–23.
O'Hagan, M. (2012) From Fan Translation to Crowdsourcing: Consequences of Web 2.0 User Empowerment in Audiovisual Translation, Approaches to Translation Studies, 36.
Oates, B.J. (2005) Researching Information Systems and Computing, Sage Publications
Limited.
Oprea, T.I., Bologa, C.G., Boyer, S., Curpan, R.F., Glen, R.C., Hopkins, A.L., Lipinski, C.A., Marshall, G.R., Martin, Y.C., Ostopovici-Halip, L. (2009) A crowdsourcing evaluation of the NIH chemical probes, Nature Chemical Biology, 5(7), 441–447.
Orrego-Carmona, D. (2012) Internal Structures and Workflows in Collaborative Subtitling, Paper delivered to the First International Conference on Non-professional Interpreting and Translation, Università di Bologna, May 17–19.
La Pelle, N. (2004) Simplifying qualitative data analysis using general purpose software tools, Field Methods, 16(1), 85–108.
Petras, R. (2011) Localizing with community translation, Multilingual Computing, 23(7).
Plitt, M., Masselot, F. (2010) A Productivity Test of Statistical Machine Translation Post-Editing in a Typical Localisation Context, The Prague Bulletin of Mathematical Linguistics, 93, 7–16.
Poncin, W., Serebrenik, A., van den Brand, M. (2011) Process mining software repositories, Presented at the Software Maintenance and Reengineering (CSMR), 2011 15th European Conference on, IEEE, 5–14.
Pym, A. (2004) The Moving Text: Localization, Translation, and Distribution, John Benjamins Publishing Company.
Pym, A. (2005) Localization: On its nature, virtues and dangers, Synaps, 17, 17–25.
Pym, A. (2011a) Democratizing translation technologies: the role of humanistic research, Presented at the Atti del convegno Language and Translation Automation Conference, Roma.
Pym, A. (2011b) Translation research terms: a tentative glossary for moments of perplexity and dispute, Translation Research Projects, 3, 75–110.
Question #196237: Questions: Launchpad Itself [online] (2014) available:
https://answers.launchpad.net/launchpad/+question/196237 [accessed 6 Mar 2014].
Ramírez, A. (2010) Localización de Videojuegos: ¿alternativas al Excel? [online].
Ray, R., Kelly, N. (2011) Crowdsourced Translation Best Practices for Implementation,
Common Sense Advisory.
Razavian, N.S., Vogel, S. (2009) The web as a platform to build machine translation resources, Presented at the Proceedings of the 2009 international workshop on Intercultural collaboration, ACM, 41–50.
Resnik, P., Buzek, O., Hu, C., Kronrod, Y., Quinn, A., Bederson, B.B. (2010) Improving translation via targeted paraphrasing, Presented at the Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 127–137.
Reynolds, P. (2009) Translation Technology of Today in Theory and Practice, The Sustainability of the Translation Field, 480.

Rickard, J. (2009) Translation in the Community, in LRC XIV Localisation in The Cloud.
Rinsche, A., Portera-Zanotti, N. (2009) Study on the Size of the Language Industry in the EU,
available: http://www.pasmee.gr/download/EU_lang_industry_study_final_report.pdf.
Ritchie, J., Lewis, J. (2003) Qualitative Research Practice: A Guide for Social Science
Students and Researchers, SAGE Publications Limited.
La Rosa, M., ter Hofstede, A.H., Wohed, P., Reijers, H.A., Mendling, J., van der Aalst, W.M. (2011) Managing process model complexity via concrete syntax modifications, Industrial Informatics, IEEE Transactions on, 7(2), 255–265.
La Rosa, M., Wohed, P., Mendling, J., Ter Hofstede, A.H., Reijers, H.A., van der Aalst, W.M. (2011) Managing process model complexity via abstract syntax modifications, Industrial Informatics, IEEE Transactions on, 7(4), 614–629.
Russell, N., van der Aalst, W., ter Hofstede, A. (2006) Workflow exception patterns, Springer, 288–302.
Russell, N., van der Aalst, W.M.P., ter Hofstede, A.H.M., Edmond, D. (2005) Workflow resource patterns: Identification, representation and tool support, Springer, 216–232.
Russell, N., ter Hofstede, A., Edmond, D., van der Aalst, W. (2005) Workflow data patterns: Identification, representation and tool support, Conceptual Modeling – ER 2005, 353–368.
Russell, N., Hofstede, A.H.M., Aalst, W.M.P., Mulyar, N. (2006) Workflow Control-Flow Patterns: A Revised View, BPM Center Report BPM-06-22, BPMcenter.org.
Russell, N., ter Hofstede, A.H.M., Edmond, D., van der Aalst, W.M.P. (2007) newYAWL: achieving comprehensive patterns support in workflow for the control-flow, data and resource perspectives, BPM Center Report BPM-07-05, BPMcenter.org.
Ryan, L., Anastasiou, D., Cleary, Y. (2009) Using Content Development Guidelines to Reduce the Cost of Localising Digital Content, Localisation Focus, 8(1), 11–28.
Sakamoto, Y., Tanaka, Y., Yu, L., Nickerson, J. (2011) The crowdsourcing design space, Foundations of Augmented Cognition: Directing the Future of Adaptive Systems, 346–355.
Saldaña, J. (2012) The Coding Manual for Qualitative Researchers, Sage.
Sánchez, P.M. (2009) Video Game Localisation for Fans by Fans: The Case of Romhacking, The Journal of Internationalisation and Localisation, Volume I, 168.
Sargent, B., DePalma, D. (2008) Translation Management Systems: Assessment of Commercial and LSP Specific TMS Offerings, Common Sense Advisory.
Saris, W.E., Gallhofer, I.N. (2007) Design, Evaluation, and Analysis of Questionnaires for Survey Research, John Wiley & Sons.
Sarshar, K., Loos, P. (2005) Comparing the control-flow of EPC and Petri net from the end-user perspective, in Business Process Management, Springer, 434–439.
Savage, N. (2012) Gaining wisdom from crowds, Communications of the ACM, 55(3), 13–15.
Scannel, K. (2012) Translating Facebook into Endangered Languages, in Language Endangerment in the 21st Century: Globalisation, Technology and New Media, Presented at the 16th Foundation for Endangered Languages Conference, Auckland, 106–110.
Schäler, R. (2008a) Localization, Routledge Encyclopedia of Translation Studies.
Schäler, R. (2008b) Communication as a Key to Global Business, in Hayhoe, G.F. and Grady, H.M., eds., Connecting People with Technology: Issues in Professional Communication, Baywood Publishing Company.
Schenk, E., Guittard, C. (2009) Crowdsourcing: What can be Outsourced to the Crowd, and Why?, Presented at the Workshop on Open Source Innovation, Strasbourg, France.
Schmitt, D.A. (1999) International Programming for Microsoft Windows, Microsoft Press.

Somers, H. (2003) Translation memory systems, Benjamins Translation Library, 35, 31–48.
Sprung, R.C., Jaroniec, S. (2000) Translating into Success: Cutting-edge Strategies for Going
Multilingual in a Global Age, John Benjamins Publishing Company.
Straub, D., Schmitz, K.-D. (2010) tekom study: Cost and effectiveness of terminology work, tcworld, available: http://www.tcworld.info/tcworld/content-strategies/article/tekom-study-cost-and-effectiveness-of-terminology-work/.
Stray, J. (2010) The Wikipedia of news translation: Yeeyan.org's volunteer community, Nieman Journalism Lab, available: http://www.niemanlab.org/2010/06/the-wikipedia-of-news-translation-yeeyan-orgs-volunteer-community/ [accessed 5 Jun 2014].
Symantec (2011) Norton Together > Frequently Asked Questions [online], available:
http://together.norton.com/info/faq.
TED Conferences (2013) Quick Start on Amara [online], available: http://www.ted.com/pages/translation_quick_start [accessed 13 Mar 2014].
Thayer, A., Kolko, B.E. (2004) Localization of digital games: The process of blending for the global games market, Technical Communication, 51(4), 477–488.
Tongco, M.D.C. (2007) Purposive sampling as a tool for informant selection, Ethnobotany Research & Applications, (5), 147–158.
Translate (2014) Terminology - Pootle 2.5.1-rc1 Documentation [online], available: http://pootle.readthedocs.org/en/latest/features/terminology.html [accessed 3 Mar 2014].
Tsang, F. (n.d.) Crowdsourcing at Adobe, available: http://www.translationautomation.com/articles/crowdsourcing-at-adobe [accessed 20 Mar 2013].
Twitter (2011) It Takes a Community to Translate Twitter [online], available: http://blog.twitter.com/2011/08/it-takes-community-to-translate-twitter.html.
Unbabel (2014) Unbabel's Mission Is to Eliminate Language Barriers [online], Unbabel: Translation as a service, available: https://www.unbabel.com/ [accessed 25 Jul 2014].
Urquhart, C. (2001) An encounter with grounded theory: tackling the practical and philosophical issues, Qualitative Research in IS: Issues and Trends, 104–140.
Vashee, K. (2009) MT Technology in the Cloud: An evolving model, in LRC XIV, Localisation in The Cloud, available: http://www.localisation.ie/resources/conferences/2009/presentations/LRC09-KV.pdf.
VerbalizeIt (2014) How VerbalizeIt's Translation Platform Works, VerbalizeIt, available: https://www.verbalizeit.com/how-our-translation-platform-works/ [accessed 2 Jun 2014].
Viegas, F.B., Wattenberg, M., Kriss, J., Van Ham, F. (2007) Talk before you type: Coordination in Wikipedia, Presented at the System Sciences, 2007. HICSS 2007. 40th Annual Hawaii International Conference on, IEEE, 78–78.
Wagner, C. (2004) Wiki: A technology for conversational knowledge management and group collaboration, Communications of the Association for Information Systems, 13(13), 265–289.
Wagstaff, K., Cardie, C., Rogers, S., Schrödl, S. (2001) Constrained k-means clustering with background knowledge, Presented at the ICML, 577–584.
Warren, R., Airoldi, E., Banks, D. (2008) Network Analysis of Wikipedia, Statistical Methods in eCommerce Research, ed. G. Shmueli and W. Jank, Wiley, NY, 81–102.
Webb, L.E. (2000) Advantages and disadvantages of translation memory: a cost/benefit
analysis.


WfMC (1999) Workflow Management Coalition: Terminology & Glossary [online], available: http://www.wfmc.org/standards/docs/TC-1011_term_glossary_v3.pdf.
Wijayanti, D. (2013) TED Open Translation Project, A Bunch of Nonsense, available: http://abunchofnonsense.wordpress.com/2013/10/06/ted-open-translation-project/ [accessed 24 Mar 2014].
Yahaya, F. (2008a) Managing Complex Translation Projects Through Virtual Spaces: a Case Study, ASLIB Translating and the Computer, 30, 27–28.
Yahaya, F. (2008b) Managing complex translation projects through virtual spaces: a case study, in ASLIB Translating and the Computer, Presented at the ASLIB Translating and the Computer, London, UK, 27–28.
Yamada, M. (2011) The effect of translation memory databases on productivity, Translation Research Projects, 3, 63–73.
Yan, Z., Reijers, H.A., Dijkman, R.M. (2011) An evaluation of BPMN modeling tools, in Business Process Modeling Notation, Springer, 121–128.
Yang, H.H., Kuo, L.H., Yang, H.J., Yu, J.C., Chen, L.M. (2009) On-line PBL system flows and users' motivation, WSEAS Transactions on Communications, 8(4), 394–404.
Yang, J., Adamic, L.A., Ackerman, M.S. (2008) Crowdsourcing and knowledge sharing: strategic user behavior on Taskcn, Presented at the Proceedings of the 9th ACM conference on Electronic commerce, ACM, 246–255.
Yasper User Guide (2005) available: http://www.yasper.org [accessed 14 Apr 2014].
Zahariadis, T., Papadimitriou, D., Tschofenig, H., Haller, S., Daras, P., Stamoulis, G., Hauswirth, M., Domingue, J., Galis, A., Gavras, A., Lambert, D., Cleary, F., Krco, S., Müller, H., Li, M.-S., Schaffers, H., Lotz, V., Alvarez, F., Stiller, B., Karnouskos, S., Avessta, S., Nilsson, M. (2011) Towards a Future Internet Architecture, The Future Internet, in Lecture Notes in Computer Science, Springer Berlin / Heidelberg, 7–18, available: http://dx.doi.org/10.1007/978-3-642-20898-0_1.
Zaidan, O.F., Callison-Burch, C. (2011) Crowdsourcing translation: professional quality from non-professionals, Proceedings of ACL 2011.
Zhao, Y., Zhu, Q. (2012) Evaluation on crowdsourcing research: Current status and future direction, Information Systems Frontiers, 1–18.
Zhou, P. (2011) Managing the challenges of game localization, in Translation and Localization Project Management, John Benjamins Publishing Company.
Zouncourides-Lull, A. (2011) Managing projects using PMI methodology, in Translation and Localization Project Management, American Translators Association Scholarly Monograph Series.


Appendix 1 Survey Questionnaire


Appendix 2 Email Template


Dear [NAME],

I am a doctoral researcher at the Localisation Research Centre, which belongs to the Department of Computer Science and Information Systems at the University of Limerick. I am currently carrying out research about crowdsourced translation processes. It is my understanding that you are involved with [PLATFORM NAME] and will probably be able to answer a short survey about it, or know of someone who can. The survey has a total of 7 questions and should take you less than ten minutes to answer.
Please have a look at the attached Participant Information Sheet, which is also available at http://goo.gl/3I9CU. If you have any doubts, please contact me at aram.morera-mesa@ul.ie.
The survey can be found at https://sites.google.com/site/crowdst10ntaxonsurvey/
Looking forward to hearing from you.
Best regards,
Aram


Appendix 3 Ethical Clearance Application Form for Survey
Faculty of Science and Engineering Ethics Committee
Expedited Form for research involving human participants
1: Applicant's Details
Principal Investigator name (i.e. supervisor): J.J. Collins, Reinhard Schaler, David Filip
Principal Investigator email: J.J.Collins@ul.ie, reinhard.schaler@ul.ie, davidf@davidf.org
Student name: Aram Morera-Mesa
ID number: 0802298
Email address: aram.morera-mesa@ul.ie
Programme of study: Doctor of Philosophy
FYP, MSc or PhD Dissertation: PhD
Working title of study: A pattern language for crowdsourced translation
Period for which approval is sought: Start Date: 01-02-13 End date: 30-05-13
2. Human Participants
Does the research proposal involve:
Working with participants over 65 years of age? No
Any person under the age of 18? No
Adult patients? No
Adults with psychological impairments? No
Adults with learning difficulties? No
Adults under the protection/control/influence of others (e.g. in care/prison)? No
Relatives of ill people (e.g. parents of sick children)? No
People who may only have a basic knowledge of English? No
Hospital or GP patients (or HSE members of staff) recruited in medical facility? No
3. Subject Matter
Does the research proposal involve:
Sensitive personal issues (e.g. suicide, bereavement, gender identity, sexuality, fertility, abortion, gambling)? No
Illegal activities, illicit drug taking, substance abuse or the self reporting of criminal behaviour? No
Any act that might diminish self-respect or cause shame, embarrassment or regret? No
Research into politically and/or racially/ethnically and/or commercially sensitive areas? No
4. Procedures
Does the research proposal involve:
Use of personal records without consent? No
Deception of participants? No
The offer of large inducements to participate? No
Audio or visual recording without consent? No
Invasive physical interventions or treatments? No
Research that might put researchers or participants at risk? No
Storage of results data for less than 7 years? No

If you have answered 'Yes' to any of these questions in sections 2 to 4 above, you will need to fill in the ULREC application form and submit it to the Faculty Ethics Committee for review. However, if the research is to be conducted during teaching practice, and within the Department of Education subject syllabus outline, and provided the student has the permission of the class teacher and the school principal and that parents/guardians consent to participation, this expedited form can also be used. Please note that if the Faculty Ethics Committee deems it necessary you may be asked to fill in the full application form.
Please note that only 1 hard copy of the FREC form is required for the Faculty Ethics Committee. You can get more information and download the forms needed at this address: www.ul.ie/researchethics/ NB: If you answered 'Yes' to the last bullet point in section 2 then you will need to apply to the local HSE ethics committee, not the FREC.
If you have answered 'No' to all of these questions, please answer the following questions in section 5.

5 Research Project Information
5a Give a brief description of the research.
The survey will be used to collect data necessary to create a taxonomy of crowdsourced translation platforms.
5b How many participants will be involved?
At the moment 16 platforms/organisations have been identified; however, the survey will ask about other platforms and could be extended if other platforms emerge from the survey.
5c How do you plan to gain access to/contact/approach potential participants?
Email.
5d What are the criteria for including/excluding individuals from the study?
The individuals have been selected according to their involvement with the platforms/organisations that we need to classify (purposive sampling).
5e Have arrangements been made to accommodate individuals who do not wish to participate in the research? (NB This mainly relates to research taking place in a classroom setting)
Yes
If Yes, please state what these arrangements are.
The individuals are free not to answer the survey, and even if they answer they have the right to request that their answers are not used.


5f Can you identify any particular vulnerability of your participants other than those mentioned in section 2?
None.
5g Where will the study take place?
Survey via Google Docs.
5h What arrangements have you made for anonymity and confidentiality?
Names of the respondents will not be published.
5i What are the safety issues (if any) arising from this study, and how will you deal with them?
No safety issues are expected.
5j How do you propose to store the information? Will the file/computer be password protected?
The answers will be stored in a password-protected rar file on a thumb drive.
Where will the information be stored (room number): CS1036
5k Having referred to the University of Limerick's insurance policy, are you (Principal Investigator/Supervisor) reasonably confident that the study conforms to the policy? Yes

5l Please attach the relevant information documents and complete the following checklist to indicate which documents are included with the application:
Participant Information Sheet: Yes
Participant Informed Consent Form: Yes
Parent/Carer Information Sheet: No
Parent/Carer Informed Consent Form: No
School Principal Information Sheet: No
School Principal Informed Consent Form: No
Teacher Information Sheet: No
Teacher Consent Form: No
Child Protection Form: No
Questionnaire & Explanatory Cover Letter: No
Interview/Survey Questions: Yes
Recruitment letters/Advertisements/Emails, etc.: Yes

6. Declaration
The information in this form is accurate to the best of my knowledge and belief and I take full
responsibility for it.
I undertake to abide by the guidelines outlined in the UL Research Ethics Committee
guidelines http://www.ul.ie/researchethics/
I undertake to inform S&EEC of any changes to the study from those detailed in this
application.
Student:
Name: Aram Morera Mesa
Signature:
Date:

Principal Investigator*:
Name: J.J. Collins
Signature:
Date: 19-02-2013

* In the case where the principal investigator is not a permanent employee of the University,
the relevant head of department must sign this declaration in their place.
_____________________________________________________________________
You should return this form with signatures to the S&E Ethics Committee c/o Faculty Office, Faculty of Science
& Engineering, University of Limerick. In addition, a single pdf file containing the completed form and
additional information (e.g. participant information sheet) should be emailed to SciEngEthics@ul.ie. This form
must be submitted and approval granted before the study begins.


Appendix 4 Ethical Clearance Application Form for Interviews

Faculty of Science and Engineering Ethics Committee
Expedited Form for research involving human participants
1: Applicant's Details
Principal Investigator name (i.e. supervisor): J.J. Collins, Reinhard Schäler, David
Filip
Principal Investigator email: J.J.Collins@ul.ie, reinhard.schaler@ul.ie,
davidf@davidf.org
Student name: Aram Morera-Mesa
ID number: 0802298
Email address: aram.morera-mesa@ul.ie
Programme of study: Doctor of Philosophy
FYP, MSc or PhD Dissertation: PhD
Working title of study: A pattern language for crowdsourced translation
Period for which approval is sought: Start Date: 01-02-13 End date: 30-05-13

2. Human Participants
Does the research proposal involve:
Working with participants over 65 years of age?
No
Any person under the age of 18?
No
Adult patients?
No
Adults with psychological impairments?
No
Adults with learning difficulties?
No
Adults under the protection/
control/influence of others (e.g. in care/prison)?
No
Relatives of ill people (e.g. parents of sick children)?
No
People who may only have a basic knowledge of English? No
Hospital or GP patients (or HSE members of staff)
recruited in a medical facility?
No
3. Subject Matter
Does the research proposal involve:
Sensitive personal issues? (e.g. suicide, bereavement, gender
identity, sexuality, fertility, abortion, gambling)?
No
Illegal activities, illicit drug taking, substance abuse or the
self-reporting of criminal behaviour?
No
Any act that might diminish self-respect or cause shame,
embarrassment or regret?
No
Research into politically and/or racially/ethnically and/or
commercially sensitive areas?
No
4. Procedures
Does the research proposal involve:
Use of personal records without consent?
No
Deception of participants?
No
The offer of large inducements to participate?
No
Audio or visual recording without consent?
No
Invasive physical interventions or treatments?
No
Research that might put researchers or participants at risk?
No
Storage of results data for less than 7 years?
No

If you have answered Yes to any of these questions in sections 2 to 4 above, you will need to
fill in the ULREC application form and submit it to the Faculty Ethics Committee for review.
However, if the research is to be conducted during teaching practice, and within the
Department of Education subject syllabus outline, and provided the student has the
permission of the class teacher and the school principal and that parents/guardians consent to
participation, this expedited form can also be used. Please note that if the Faculty Ethics
Committee deems it necessary, you may be asked to fill in the full application form.
Please note that only one hard copy of the FREC form is required for the Faculty Ethics
Committee. You can get more information and download the forms needed at this address:
www.ul.ie/researchethics/ NB: If you answered Yes to the last bullet point in section 2, then
you will need to apply to the local HSE ethics committee, not the FREC.
If you have answered No to all of these questions, please answer the following questions in
section 5.

5 Research Project Information


5a Give a brief description of the research.
A series of interviews will be conducted to gather expert opinions on techniques
used in crowdsourced translation processes.

5b How many participants will be involved?


At the moment, the list of interviewees comprises 14 people.

5c How do you plan to gain access to /contact/approach potential participants?


Email.

5d What are the criteria for including/excluding individuals from the study?
The individuals have been selected according to their involvement with projects
related to the subject of the research: crowdsourced translation.

5e Have arrangements been made to accommodate individuals who do not wish
to participate in the research? (NB This mainly relates to research taking place
in a classroom setting)
Yes
If Yes
Please state what these arrangements are.
The individuals are free not to participate in the interviews, and even if they participate,
they have the right to request that their interviews not be used.


5f Can you identify any particular vulnerability of your participants other than
those mentioned in section 2?
None.

5g Where will the study take place?


The interviews will be carried out at the convenience of the interviewees, including
interviews by phone and other means of telecommunication.

5h What arrangements have you made for anonymity and confidentiality?


Names of the interviewees and their organisations will not be published.

5i What are the safety issues (if any) arising from this study, and how will you
deal with them?
No safety issues are expected.

5j How do you propose to store the information? Will the file/computer be
password protected?
The recordings will be stored in password-protected RAR files on a thumb drive.
Where will the information be stored (room number):
CS1036

5k Having referred to the University of Limerick's insurance policy (insurance
policy), are you (Principal Investigator/Supervisor) reasonably confident that the
study conforms to the policy? Yes

5l Please attach the relevant information documents and complete the following
checklist to indicate which documents are included with the application
Participant Information Sheet: Yes
Participant Informed Consent Form: Yes
Parent/Carer Information Sheet: No
Parent/Carer Informed Consent Form: No
School Principal Information Sheet: No
School Principal Informed Consent Form: No
Teacher Information Sheet: No
Teacher Consent Form: No
Child Protection Form: No
Questionnaire & Explanatory Cover Letter: No
Interview/Survey Questions: Yes
Recruitment letters/Advertisements/Emails, etc.: Yes

6. Declaration
The information in this form is accurate to the best of my knowledge and belief and I take full
responsibility for it.
I undertake to abide by the guidelines outlined in the UL Research Ethics Committee
guidelines http://www.ul.ie/researchethics/
I undertake to inform S&EEC of any changes to the study from those detailed in this
application.
Student:
Name: Aram Morera Mesa
Signature:
Date: 19-02-13

Principal Investigator*:
Name: J.J. Collins
Signature:
Date: 19-02-13

* In the case where the principal investigator is not a permanent employee of the University,
the relevant head of department must sign this declaration in their place.
_____________________________________________________________________
You should return this form with signatures to the S&E Ethics Committee c/o Faculty Office, Faculty of Science
& Engineering, University of Limerick. In addition, a single pdf file containing the completed form and
additional information (e.g. participant information sheet) should be emailed to SciEngEthics@ul.ie. This form
must be submitted and approval granted before the study begins.


Appendix 5 Survey Responses

Organisation: Red Hat, Zanata
Platform: All projects in an instance of Zanata are open for all to view. Each moderator (project owner) has to approve translator requests to be a part of the project and translate it. Any company can take the source code of Zanata and set up their own internal instance to ensure their projects are not open to the public. As Zanata is open source, development is always in motion, and how permissions are handled can change based on community input and how the community wishes for things to work.
Rationale: To increase user engagement. We believe that higher-quality work is created through communities.
Remuneration: Some developers on Zanata are paid, and some assist for free (or may be paid by another company to participate). All translators who use the system are not paid by Zanata, but can be paid by companies choosing to use Zanata as their translation tool.
Contributor preselection: Red Hat has numerous teams of full-time translators who are employed based on their experience and the quality of their translations (discovered through a short test). Zanata, as a tool used by organisations such as Red Hat, does not limit translator contribution at all. It is up to the owner of a project on Zanata to decide if they wish to allow someone to translate their work.
Accessibility (View, Assess, Modify): Yes, Yes, Yes, Yes, Yes, Yes
Organisation: Mozilla. Firefox desktop and Firefox for Android.
Platform: Our platform/system/tool philosophy is based on the idea that our contributors retain the freedom to localize with their platform of preference, not a prescribed platform that they're obligated to learn. That being said, we use primarily open systems, and our localizers use tools ranging from Omega, Pootle, Translate Toolkit, Virtaal, and MozillaTranslator to text editors and command-line tools. We also have our dashboard features hosted on l10n.mozilla.org/teams. The answers in this survey refer to how we use Pootle.
Rationale: To increase user engagement. As an open source project, collaboration and engagement are fundamental to everything we do. You could almost say that the open source community was one of the first adopters of crowdsourcing. You could say that our primary reason was that it most appropriately aligned with our development philosophy and methodology.
Remuneration: Contributors in my organisation who are considered to be active participants receive some sort of remuneration. We determine activity by looking for quantifiable evidence of frequent participation within a period of 6 months.
Contributor preselection: My organisation/platform does not preselect contributors.

Organisation: Transifex (https://www.transifex.com)
Platform: Transifex (https://www.transifex.com). Yes, Transifex is translated via Transifex =)
Rationale: To increase user engagement. To save costs.
Remuneration: Contributors in my organisation/platform always receive some sort of remuneration.
Contributor preselection: The creation of translation teams is handled by each project separately.
Accessibility (View, Assess, Modify): Yes, Yes, Yes, Yes, Yes, No, Yes, Yes, No
Organisation: Get Localization, GoogaSync
Platform: Get Localization (closed and open projects)
Rationale: To increase user engagement. To save costs.
Remuneration: Contributors in my organisation/platform do not receive any kind of remuneration.
Contributor preselection: My organisation/platform preselects contributors according to some contextual information (for example, only native speakers can translate into a certain language, or only people who have been involved with the organisation for at least a month can contribute).
Organisation: VerbalizeIt
Platform: VerbalizeIt
Rationale: To help companies go international and to provide meaningful income to multilingual individuals.
Remuneration: Contributors in my organisation/platform receive remuneration if their contribution is successful (for example, their translation is the published one).
Contributor preselection: My organisation preselects contributors by both submitting them to a test and using contextual information.

Appendix 6 Failed Mind Map for Pattern Language Development

