
Improving geo-information through guided participation

By Sabrina Grimsrud, August 2012

A thesis submitted in partial fulfillment of the degree of Master of Science in Geographical Information Management and Application (GIMA)

Professor: Prof. Dr. Menno-Jan Kraak
Supervisor: Dr. Ir. Rob Lemmens
Reviewer: Drs. Marian de Vries

DISCLAIMER
I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners. I understand that my thesis may be made electronically available to the public.

Abstract
In situations where as much geographical information as possible is required for a widespread area, such as during disaster management or land use management, professionals and non-professionals alike use satellite imagery to classify features and provide the information that is needed. Information provided by non-professionals is called volunteered or crowdsourced geographic information, and these volunteers are very often either locals in the area of concern or remote mappers. Volunteered geographic information (VGI) is often not considered a reliable source of information because of errors made when the information is submitted. In order for the information to be helpful, it must be accurate. However, since most of the people participating in crowdsourcing initiatives are not professional geographers, they are unaware of what elements constitute accurate geographic information. The desire to employ VGI in time-critical, humanitarian, or participatory land management situations, balanced against concerns over VGI data quality and credibility, leads to the investigation of whether VGI can be operationalized. In order for volunteer participants to give more valuable information, they must be guided to do so. Current crowdsourcing initiatives do not provide sufficiently helpful guidance. With the right guidance, crowdsourcing deployers can steer participants towards giving the right information. This has been tested in three experiments and demonstrated with a prototype, and the outcome of this research is a recommended methodology and a set of guided participation techniques that crowdsourcing deployers can use to improve the quality of the information they receive. The volunteered geographical information, in turn, can then be used by professionals as a viable source of information.

Keywords: crowdsourcing; VGI; guided participation; data quality



Acknowledgements
Although the official research period lasted six months, this thesis is the result of years of research. It all started in 2011 with the MSc GIMA module 6 in-depth study topics, when Rob Lemmens caught my attention with the benefits of crowdsourcing for solving many of the data collection challenges in large-scale projects. I have had the privilege of working with a number of people whose contribution to the research deserves special mention, and it is my pleasure to convey my gratitude to them in this humble acknowledgement.

Firstly, I would like to express my gratitude to Professor Rob Lemmens, not only for inspiring a greater interest in this topic, but especially for his supervision, advice and guidance throughout the research period. I would also like to thank Professor Dr. Menno-Jan Kraak for his advice, supervision and crucial contributions. I am grateful for having two supervisors who are experts in their respective and related fields and who could guide me through this process.

Many thanks go to my colleagues at Geodata for offering their expertise as experienced GIS users during the design of the prototype, and for their participation in the experiments and their valuable feedback. Special acknowledgement goes to Anne, who provided lessons on how to solve quality control issues with VGI data.

Words cannot express how thankful I am to my family and friends for your patience, support, understanding and encouragement during the research period. I would like to especially thank my husband Torleiv, who provided invaluable advice and moral support throughout the entire research period. Thank you for constantly motivating me.

Last but not least, I would like to thank all of the other GIMA staff who have helped me along the way. Your help has been truly appreciated, and it has been a pleasure to work with such kind colleagues. To my fellow GIMA classmates: thank you so much for contributing to the project and for your moral support during presentations. Good luck to you as well.

Sabrina
Oslo, July 2012

List of figures
Figure 1.1: Ground vs. satellite imagery classifications for damaged buildings ... 16
Figure 1.2: Geotagged photos for Oslo, Geo-Wiki project ... 16
Figure 1.3: Levels of VGI participation ... 18
Figure 1.4: Approaches to differentiating between VGI contributors ... 19
Figure 1.5: Schematic overview of research phases ... 21
Figure 2.1: OpenStreetMap Wiki menu listing help options ... 26
Figure 2.2: Ushahidi Haiti report ... 27
Figure 2.3: The Ushahidi program for the Haiti disaster ... 30
Figure 2.4: Geo-Wiki classification ... 32
Figure 2.5: Simple validation method ... 32
Figure 2.6: Geo-Wiki validation submission form ... 32
Figure 2.7: Help window showing difference between image resolutions ... 33
Figure 2.8: The Geo-Wiki validation process, showing area of disagreement ... 34
Figure 2.9: Top Geo-Wiki user classification scores ... 35
Figure 2.10: List of all Geo-Wiki users and their classification scores ... 35
Figure 2.11: Christchurch damage assessment map ... 37
Figure 2.12: Christchurch earthquake example, before imagery ... 37
Figure 2.13: Christchurch earthquake example, after imagery ... 37
Figure 2.14: Pre-disaster satellite image ... 39
Figure 2.15: Post-disaster aerial photo ... 39
Figure 2.16: EMS-98 building classification diagram ... 39
Figure 2.17: EMS-98 damage classifications based on imagery ... 40
Figure 2.18: Goodchild and Hunter's buffer comparison method ... 47
Figure 2.19: Haklay's comparison of OSM vs. OS data ... 47
Figure 2.20: Haklay's study area for comparison of data ... 48
Figure 2.21: Positional accuracy across five areas of London ... 48
Figure 2.22: An example of good positional accuracy ... 49
Figure 2.23: An example of poor positional accuracy ... 49
Figure 3.1: Study area, downtown Oslo ... 57
Figure 4.1: Oslo experiment wiki layout ... 68
Figure 4.2: Oslo VGI attribute table ... 72
Figure 4.3: Christchurch VGI attribute table ... 74
Figure 4.4: Wiki layout for final experiment ... 79
Figure 4.5: Mapping party participants ... 79
Figure 4.6: Mapping party Test 2 attribute table ... 80
Figure 4.7: JRC building damage assessment atlas ... 81
Figure 4.8: Order of scanning for damage ... 81
Figure 4.9: Test 3 attribute table ... 83
Figure 4.10: Importing .txt data into Excel ... 84
Figure 4.11: Delimited file type ... 84
Figure 4.12: Selecting commas as delimiters ... 85
Figure 4.13: Oslo experiment VGI ... 86
Figure 4.14: Incomplete Oslo experiment attributes ... 86
Figure 4.15: Christchurch VGI data ... 87
Figure 4.16: Location and description fields ... 88
Figure 4.17: Selection by location ... 89
Figure 4.18: One point intersecting 40m ... 89
Figure 4.19: Ushahidi Haiti data ... 90
Figure 4.20: Experiment data ... 90
Figure 4.21: Comparison of descriptions ... 91
Figure 4.22: Select all OSM buildings intersecting experiment data ... 92
Figure 4.23: Intersecting points ... 92
Figure 4.24: Attributes of the OSM VGI ... 93
Figure 4.25: Attributes of UNOCHA data ... 97

List of tables
Table 1.1: Crowdsourcing tools for disaster management and land use ... 13
Table 3.1: How to guide van Oort's accuracy factors ... 59
Table 3.2: Levels of participant computer literacy ... 60
Table 3.3: Levels of participant GIS/mapping experience ... 60
Table 3.4: Levels of GIS expertise ... 61
Table 3.5: Corresponding levels of guidance given ... 62
Table 4.1: Differences between the four experiments ... 66
Table 4.2: Experiment 1 participants ... 71
Table 4.3: Experiment 2 participants ... 74
Table 4.4: Experiment 3 participants ... 75
Table 4.5: Experiment 3 participants ... 82
Table 4.6: Comparison of OSM 2010 vs. experiment VGI ... 93
Table 5.1: Closing survey results: guidance given ... 98

Table of Contents
IMPROVING GEO-INFORMATION THROUGH GUIDED PARTICIPATION ... 1

CHAPTER 1: INTRODUCTION ... 11
1.1 The problem and its context ... 11
1.2 Background ... 12
1.2.1 VGI in disaster management ... 13
1.2.2 VGI in land use ... 14
1.3 Research issues ... 15
1.3.1 Geo-crowdsourcing is done in two distinct modes ... 15
1.3.2 Crowdsourced information can be unreliable or of insufficient quality ... 17
1.4 Research objective ... 19
1.5 Research question ... 19
1.6 Sub-research questions ... 20
1.7 Research methodology ... 21
1.7.1 Phase 1: Literature review ... 21
1.7.2 Phase 2: Testing & Investigation of existing tools and techniques ... 21
1.7.3 Phase 3: Design and development of an improved guidance methodology ... 22
1.7.4 Phase 4: Experiment & Analysis ... 22
1.7.5 Phase 5: Conclusion ... 23

CHAPTER 2: INVESTIGATION OF EXISTING CROWDSOURCING TOOLS AND GUIDED PARTICIPATION TECHNIQUES ... 24
2.1 Introduction ... 24
2.2 VGI concepts and applications ... 24
2.3 VGI tools ... 24
2.3.1 OpenStreetMap (OSM) ... 25
2.3.2 Ushahidi ... 27
2.3.3 The Geo-wiki project ... 31
2.4 The crowdsourcing tools in action: case studies ... 36
2.4.1 Use case study 1: Crowdsourcing for building use classifications in Oslo ... 36
2.4.2 Use case study 2: Crowdsourcing Christchurch earthquake damage ... 37
2.4.3 Use case study 3: Crowdsourcing for disaster mapping in Haiti ... 38
2.5 Quality assessment of crowdsourced data ... 42
2.6 Conclusion ... 50
2.6.1 Collaboration ... 51
2.6.2 Site functionality ... 52
2.6.3 Guidance ... 52

CHAPTER 3: DESIGN AND DEVELOPMENT OF A METHODOLOGY FOR IMPROVED GUIDED PARTICIPATION ... 56
3.1 Strategies for implementing improvements ... 56
3.1.1 An improved collaboration ... 56
3.1.2 An improved site functionality ... 58
3.1.3 An improved guidance of participation ... 58
3.2 A proposed prototype solution ... 62
3.2.1 Web design ... 62
3.2.2 Functionality and usability ... 63
3.2.3 Monitoring data edits and inputs ... 64
3.2.4 Assessing data quality ... 65

CHAPTER 4: EXPERIMENTATION AND ANALYSIS ... 66
4.1 Experimentation ... 66
4.2 Experiment 1: Mapping building usage types in Oslo ... 67
4.2.1 Design ... 67
4.2.2 Participants ... 68
4.2.3 Methodology: Participant instructions and guidance ... 70
4.2.4 Results ... 71
4.3 Experiment 2: Re-mapping earthquake-damaged buildings in Christchurch, New Zealand ... 73
4.3.1 Methodology ... 73
4.3.2 Results ... 74
4.5 Experiment 3.a mapping party: Re-enactment of Haiti damage mapping ... 75
4.5.1 Methodology ... 75
4.5.2 Test 1: No guidance ... 76
4.5.3 Test 2: Guidance using a wiki ... 77
4.6 Experiment 3.b: Re-enactment of Haiti damage mapping using an improved methodology for guidance ... 81
4.7 Analysis of experiment results ... 83
4.7.1 Data preparation: How to prepare CSV files for use in ArcGIS ... 84
4.7.2 Data quality ... 86

CHAPTER 5: DISCUSSION AND CONCLUSIONS ... 97
5.1 Introduction: Discussion of experiment results ... 97
5.2 Scope and limitations ... 99
5.3 Quality of VGI assessed data ... 99
5.4 Answers to sub-research questions ... 100
5.5 Conclusions ... 102
6. References ... 103
7. Appendices ... 110


Chapter 1: Introduction
1.1 The problem and its context
Volunteered Geographical Information (VGI) refers to geospatial data that are provided voluntarily by individuals through crowdsourcing initiatives. Although most of these individuals are untrained in the discipline of geography, the information they provide is an incredibly valuable resource. Not only does crowdsourcing lower costs by outsourcing to the general public a task that is typically performed by a professional, but it also offers insight from locals or people on the ground in situations where much information is needed but not easily accessed. With the abundance of handheld technology such as mobile phones and cameras incorporating GPS, and internet-connected sensors in homes and vehicles, most people are able to give location and descriptive data.

VGI is heavily relied upon in the domains of disaster management and land use. Most crowdsourcing sites integrate the local VGI with satellite-based imagery classifications, and this is instrumental in providing a complete and comprehensive overview of large areas. For instance, this type of information gathering was instrumental in helping disaster relief workers after disasters such as the 2010 earthquake in Haiti and the 2011 earthquake in Christchurch to make damage maps, where local people could provide real-time information that could be combined with remotely sensed imagery. Likewise, VGI has been used for land cover validation using free, online tools such as Google Maps, OpenStreetMap, WikiMapia and the Geo-Wiki project. The intention of these tools is to garner the attention and participation of as many people as possible, in order to create data and information for every part of the world. Validated land cover data are used further in research on global warming, land use planning and decision-making.

In disaster management, volunteers work under time pressure and in hostile environments, and this can result in incomplete data and missing or inaccurate descriptions. For instance, during the Haiti disaster, aid workers were prevented from providing food and medical aid due to inaccurate crowdsourced reports of violence. In land use, on the other hand, because so many high-resolution images are available, many more people are capable of classifying and validating features. However, the images are not always clear enough, and this can cause the user to validate with uncertainty. Likewise, the meaning of certain land classifications is not always described clearly enough: for instance, does the average person know what "mosaic grass" is, or the difference between broadleaf and deciduous tree types?

In both the disaster management and land use domains, VGI must be accurate in order to be useful for decision-makers. This, however, is the main challenge with VGI, since people can contribute without awareness of the geographic significance of their contributions. This is hard to criticize, since one of the defining characteristics of VGI is that it empowers anyone to make a contribution. It is important to remember that VGI is for the benefit of those who will use the maps, but it is also mutually advantageous for all involved: the VGI participant, who may be developing their skills while attempting to contribute, as well as the experts, who benefit from widespread cartographic engagement.

Crowdsourcing sites vary in the requirements they place on participants: some are open to all without the need to formally register or to have previous training, skills or qualifications, while others require these to varying extents and rate user trustworthiness based upon them. What is clear is that the number and variety of people participating is not the problem: participants often do not give good or accurate enough information because there has been little help or guidance in how to capture and/or submit it properly. As a consequence, VGI has been seen by professionals as asserted information that carries a risk. Mooney, Corcoran and Winstanley (2010) emphasize that "without some quantitative measures of assessing the quality of data the GIS community has been slow to consider [crowdsourcing sites] as serious sources of data". In order for the quality of crowdsourced data to improve, participants must be guided with explanatory and exemplary help information. This would allow more reliable geospatial information to be submitted and would facilitate increased professional use of the information.

1.2 Background
With the abundance of participatory web (Web 2.0) technologies and a trend for most information to be available online to everyone, it has become popular for everyday citizens to contribute information online. With spatial literacy growing outside of the professional cartography domain, there are more map producers and users than ever before, offering the potential for more collaborative mapping. But the quality of the resulting maps needs more attention in order for them to be more useful to society. Mapping services such as those created by Google (Google Maps, Google Earth and Google MapMaker) have capitalized on the idea of asking regular citizens to add their local knowledge to maps that are available to everyone. Table 1.1 provides a quick overview of the tools that are most relevant to this research. It does not include the Geo-Wiki project or Ushahidi, which are discussed in greater detail as use cases for this research.


Tool: Google Maps
Purpose: Web mapping service
User features: Customize and save maps; rate places
Guidance: Visual guide to Google Maps; interactive tutorials; help forum with getting started, learn more, fix an issue and additional resources sections
Moderation: None required, as all changes are saved within the user's own account; public ratings of places are reviewed by Google

Tool: Google MapMaker
Purpose: Crowdsource data for areas not already covered by Google Maps
User features: Draw features onto a map; add other new features; edit existing features; trace features from existing satellite imagery
Guidance: Getting started guide; useful help links; YouTube videos; how-to instructions
Moderation: "Report a Problem" link, with reported errors updated by Google; Google contact request form; restrictions on use: users must be registered

Tool: OpenStreetMap
Purpose: User-generated street maps
User features: Upload new data or edit existing data
Guidance: Introductory video; wiki; OSM blog; podcasts; help section with answers to questions, information about tags, users, etc.; documentation; user guide; beginners' guide
Moderation: Moderation of new users by more experienced users; credibility increases with contribution and length of membership; inputs are less moderated over time

Tool: Wikimapia
Purpose: "Let's describe the whole World!"
User features: Users can add information in the form of a note to the map
Guidance: None
Moderation: Moderation rights are gradually assigned to site users based on accumulated trust points; the ability to edit the map is controlled by the level of trust a user has accumulated after a period of time

Table 1.1 Crowdsourcing tools for disaster management and land use

The VGI submitted through these sites contributes greatly to disaster management and land use planning.

1.2.1 VGI in disaster management
After the immediate shock of a disaster event, the focus is on understanding exactly what has happened. A major challenge is to deliver useful information to the right actors at the right place and time. "Useful" in this context refers to the scale, accuracy and detail of the information. There are typically three types of actors with spatial information needs:
1. Public sector authorities, such as emergency managers and government agencies
2. Researchers, and
3. Private citizens

These actors use GIS to consolidate and visualize the information in order to convey it to many people in a short period of time. The needs of each actor
will vary but will overlap in many cases. For instance, both public officials and private citizens need information about the spatial extent of the disaster, evacuation routes, which areas were worst affected, and so on. Rapid and accurate assessment of the damage is essential. This is done from above using remote sensing and from the ground. For events that have caused extensive damage, ground-based mapping alone is too slow. Mappers on the ground cannot always access entire areas of the damaged sites due to danger or obstruction, and the mappers are not always professionals, but mostly volunteers who make many mistakes. In addition to ground mappers comes VGI from local people living in the affected area, submitted via SMS or online reports, which is insightful but may not contain the vital geographical information that is needed. VGI is also produced by (non-local) volunteers who edit and add data online. Based on a classification from Heipke (2010), producers of disaster VGI can be classified into the following categories:
(1) Crisis mappers: people who have purposely joined the disaster mapping community and support local users (both experts and casual reporters).
(2) Casual reporters: only start acting if a disaster is occurring in their direct neighborhood. They may have eye-witness reports or redistribute information from other sources.
(3) Experts: professionals who are in the field and collect (and post) data from their professional discipline. Amongst these groups are humanitarian aid workers as well as news agencies.
(4) Open mappers: a group of people who are mainly interested in creating better geographic information in general and are not specifically interested in the disaster at hand. They are most likely to contribute to base data or to redistribute existing information.

1.2.2 VGI in land use
Land cover data derived from classified satellite images are increasingly used in land use planning and environmental management. As a consequence, concern about the accuracy of these data has grown. Although the information needs in the land use domain are similar and the situation is less urgent, the purpose of the VGI is community decision-making, and the usefulness of the information still lies in its accuracy and certainty. Here the stakeholders or actors with spatial information needs are:
1. Public sector authorities, such as planning authorities
2. Interest organizations (housing, environmental)
3. Private citizens: public residents, business owners

Land planning is usually based on policy requirements: does VGI fulfill these? Planners may be reluctant to involve citizens in the planning and decision-making process and may doubt their ability to give high quality information. For instance, a leader of the leading producer of forest and land use data in Norway stated in an interview that they have no experience using VGI for land use purposes and would not consider it either, as VGI

Land planning is usually based on policy requirements: does VGI fulfill these? Planners may be reluctant to involve citizens in the planning and decision-making process and doubt their ability to give high quality information. For instance, a leader of the leading producer of forest and land use data in Norway stated in an interview that they have no experience using VGI for land use purposes and would not consider it either, as VGI


participants do not have enough expertise in the field to make any information they could give valuable. Involving the public in decision-making is also seen as part of the democratic process and empowering to the local citizens. In this case data is produced by volunteers both on the ground (Participatory GIS) and via the internet by anyone wanting to participate or with local knowledge of the area (VGI, which focuses more on use of social media tools for inputs). When VGI is paired with remotely sensed imagery, it is possible to gain a clearer picture of the full extent of the situation being captured. However there are frequently temporal and environmental limitations in the images taken. Clearly there are differences between information obtained from remotely sensed imagery and VGI obtained from crowdsourcing or field observations. But both are important in assessing and responding to disasters and land use issues. How can VGI be guided in a better way so that it is more useful?

1.3 Research issues

1.3.1 Geo-crowdsourcing is done in two distinct modes
Integrating local knowledge with satellite-based imagery classifications is instrumental to guided participation in VGI: without one of them, our understanding of the entire situation would be incomplete. However, this is not always easy. In disaster management, for instance, remotely sensed imagery has often underestimated the extent and severity of the devastation due to limited image coverage or cloud contamination. Figure 1.1 illustrates why field mappers may not always include all damaged buildings, and why the classifications they use differ from those of analysts working with remotely sensed imagery, due to their differing perspectives. While there are good examples of satellite imagery classifications of damaged buildings, these do not take into account lateral damage and especially damage to the internal structures of buildings.


Figure 1.1 Ground vs. satellite imagery classifications for damaged buildings

Geo-crowdsourcing is also based on the interpretations of volunteers on the ground, who use remotely sensed imagery to make classifications and identify features on a map. However, only Google makes use of Street View, the street-level perspective on the map. In the Geo-Wiki project, participants can turn on layers of geotagged photos which have been submitted by locals. In this case too, it is difficult to get the local perspective, as the photos are static images and a street view is not possible, as seen in Figure 1.2. If VGI participants are mapping building usage attributes for a land use experiment and they are uncertain about a particular building, referring to the satellite imagery or geotagged photos may not be particularly helpful.

Figure 1.2 Geotagged photos for Oslo, Geo-Wiki project


1.3.2 Crowdsourced information can be unreliable or of insufficient quality
The quality of the data collected from crowdsourcing initiatives is always of primary concern. Data quality is influenced by factors such as varying participant expertise, classification differences, fitness for use, the rigor of screening and the participants' extent of liability.

Varying participant expertise
The majority of data collected via crowdsourcing efforts is produced by non-experts or amateurs. Naturally, there are concerns surrounding the quality of these data, and experts feel that in order to be less skeptical towards this kind of data, there needs to be a measure of its quality to reduce uncertainty. However, not only are there many VGI providers in these arenas, but also many editors, which allows errors and conflicts to be resolved. Until there is a better system for guiding participants which will help to eliminate errors and increase data quality, mapping professionals using crowdsourced data should understand that data quality varies and include this possibility in their analysis.

Classification differences
While most crowdsourcing efforts such as OSM and Ushahidi provide examples of how to classify features, there are still occasional errors and differences between the classifications that different users contributing to the same effort produce.

Fitness for use
The data producer should describe the purpose of the data. Using tags, for example, provides a way for data users to evaluate whether the data is fit for the use they have planned.

Rigor of the quality assurance process
Most crowdsourcing efforts include some kind of process for screening or assuring the quality of the data before it is used, for example using peer reviews to build up the reputational quality of a data producer. If such a process is not present, the information can have flaws.

Participant extent of liability
Different actors have different levels of responsibility for the data. Usually amateur crowdsourcing participants have no rules, regulations or incentives for assuring that the data is of good quality, and they bear no responsibility for any effect it may have subsequently. This is part of the individual freedom that is inherent in giving VGI. VGI is by nature noncommittal: the crowd is used as labor without any compensation, and neither the VGI participants nor those collecting the information have an obligation to each other. People who give information about an area close to them (in feeling or proximity) are more likely to be careful about the quality of the information they give. It is for this reason that Geo-Wiki gives participants the option to validate points around their home area, and that users in Google MapMaker can mark areas as being in their neighborhood, in order to build trust about the certainty of their classification.


Certainty about the information to be submitted is built up by providing easy-to-follow instructions on how to submit information that is useful and reliable. By providing more reassuring information, participants may also feel more liable for the information they submit. In most cases of using crowdsourced information, there are enough people working together that errors made by certain individuals can easily be corrected, and generally the benefits of having so much valuable information outweigh the risks. However, even the most user-friendly crowdsourcing sites fail to guide participants well enough, both in disaster management and in land use initiatives. The disaster management sites use similar approaches and platforms, such as a wiki and an Ushahidi platform and/or an OpenStreetMap site, and although many resources are in place, not every site provides a central place to find out how to contribute.

Other factors
Various factors cause the crowd to contribute differently based on the type of situation they are in; these are also known as user mapping behavior patterns. According to Sui, Elwood and Goodchild (2012), there are four levels of public involvement in VGI, based on the extent of involvement and interest of the participants. Engagement begins with crowdsourcing, then distributed intelligence, then participatory science, and eventually becomes extreme citizen science (Figure 1.3).

Figure 1.3 Levels of VGI participation

Citizen science is a form of VGI that involves prior training on the subjects to ensure the quality of the output (Goodchild 2008). Citizens (the users) of the crowdsourcing initiative/issue are stakeholders and are motivated to contribute geographic information voluntarily. Coleman et al. (2009) have identified factors such as altruism, professional or personal interest, intellectual stimulation, enhancement of a personal investment, social reward,
enhanced personal reputation, an outlet for self-expression, pride of place, and social, economic or political agendas as the specific motivations for people to contribute to crowdsourcing initiatives. Craig (2005) also identified "idealism", "enlightened self-interest" and "involvement in a professional culture" as reasons for crowd contributions. The contributions can be constructive (providing valuable new content, constructive corrections, validation or correction of existing entries, minor edits) or damaging (large deletions of article/wiki content, irrelevant information, misinformation, false rumors), with the latter occurring sometimes intentionally but mostly unintentionally, with most participants truly believing that they have made a valuable contribution. VGI contributors can be distinguished by their apparent competence and accountability (Figure 1.4). Studies also show that contributors would like to receive recognition for their contributions and would like to see them used quickly.

Figure 1.4 Approaches to differentiating between VGI contributors

The major motivation behind voluntary contribution to VGI data-collection efforts in European countries is that geospatial data layers such as land-use or street data are not provided free of charge.

1.4 Research objective


The objective of this research study is to develop a methodology for guided participation of crowdsourced VGI and evaluate its effectiveness at different levels of guidance. These results will provide a model for improved crowdsourcing methods, eventually contributing to fields such as disaster relief operations and land use planning.

1.5 Research question


The main research question that this study will attempt to answer is: What generic methods for guided participation will improve crowdsourced VGI?


1.6 Sub-research questions


The following sub-research questions are proposed in order to answer the main research question, grouped by category:

Data quality
1. What are the best examples of satellite imagery snippets and field photos to represent specific map feature categories, such as a damaged building versus a severely damaged building?
2. How do crowdsourcing participants contribute to maps? What image data do they have access to, and what do they combine it with?
3. What is a good method to measure data quality, and how can we measure the improvement of VGI after guided participation?
4. How do existing crowdsourcing sites and software validate and filter information? Were factors such as positional and attribute accuracy, currency, completeness, consistency, lineage and precision checked?

Guided participation
5. Which tools and techniques existed for guided participation in the case studies, and what makes them effective or ineffective? How are they related or different?
6. What guided participation is used for amateur contributors versus others with expertise or trusted contributors? What criteria must contributors meet to be trusted?
7. What factors cause the crowd to contribute differently based on the type of situation they are in (user mapping behavior patterns)?
8. How are different forms of guided participation used for different forms of participant input (report, wiki edit, SMS)?
9. How and when should experts be involved in guiding participation?
10. What is the effect of different levels of guided participation on the resulting quality of VGI?


1.7 Research methodology


Figure 1.5 illustrates a schematic overview of the phases of research. In this figure it is also indicated where answers to the research questions can be found.

Figure 1.5 Schematic overview of research phases

The research project consists of five phases, each consisting of one or more core elements (the boxes), on which the next sections briefly elaborate as an introduction to the chapters of the thesis.

1.7.1 Phase 1: Literature review
Academic and scientific literature and experts were consulted with regard to the best examples of satellite imagery representing map features and how volunteers use these to create maps; the findings are presented in Chapters 1 and 2. In addition, a data corpus of VGI datasets and ground truth reference data from each of the use cases was compiled for use as comparison data for the test data in a later phase.

1.7.2 Phase 2: Testing & Investigation of existing tools and techniques
An investigation of existing methods for guidance and for evaluating the quality of VGI was made and used in the development of an original methodology. Existing tools and techniques for guided participation that were used in the Haiti, Australia and Geo-Wiki case studies were investigated, tested and assessed for effectiveness. These findings were used to develop an original methodology for experiments in Oslo, Christchurch and Haiti. Interviews with experts in the field or directly involved in previous case studies were conducted, resulting in an assessment of the needs of the actors involved in the use case situations. Based on these interviews it appears that these needs are not completely met by the current, existing tools and techniques. The results of the testing and investigation of existing methods and tools, as well as the expert insights, are presented in Chapter 2 of the thesis.


1.7.3 Phase 3: Design and development of an improved guidance methodology
Based on the theoretical and factual background knowledge gained through the literature review, investigations with experts and experimentation with user groups, an original methodology for the guided participation of VGI in crowdsourcing deployments has been designed. This methodology is suggested as a guideline for VGI application developers and deployers and is presented in Chapter 3. Various actions and products were designed: the expert interviews, a proposed method for guided participation, a prototype with different levels of guidance, and the experiments for Oslo, Christchurch and Haiti. In order for the experiments to be carried out in an organized manner, the following were designed: instructions for participants; different levels of group participant guidance; expert involvement; how to test the guided participation methods; and how to check for and assess factors such as positional and attribute accuracy, currency, completeness, consistency, lineage and precision.

A prototype solution has also been designed to demonstrate how this methodology can work and is presented in Chapter 4. The prototype is the interactive part of the proposed methodology and consists of a wiki with an OpenStreetMap-type interface that contains links or pop-ups with all of the instructions necessary to be properly informed about how to contribute VGI. The Ushahidi platform is used for submitting reports using Bing map layers. The crowdsourcing and guidance components are handled separately in order to provide the best guidance, but are reached through links found within the same portal. Introductory pages instruct users on the background of the study and on how to use the wiki site, while screenshots are used to highlight the functionality of the Crowdmap tool and guide participants through the process of submitting reports. Finally, a survey about the usability and ease of use of the experiment and the guidance given is conducted after each experiment via the site. The survey includes questions about participant user levels and which types of guidance they used and found most helpful.

1.7.4 Phase 4: Experiment & Analysis
Three experiments were designed to test the effectiveness of the proposed improved methodology for guided participation developed during the research; they are presented in Chapter 4. In the first experiment a local environment is used, where remote participants contribute VGI about building usage/types in Oslo. Participants then use the same guidance to map earthquake-damaged buildings in Christchurch, New Zealand. In the third experiment, participants at a mapping party use improved guidance to re-enact the mapping of earthquake-damaged buildings in Haiti. In the final part of this experiment, participants at another mapping party use the most improved methodology, developed from the findings of the first two experiments, to again re-enact the mapping of earthquake-damaged buildings in Haiti, but this time in a specific area. Qualitative and quantitative assessments of the VGI participants and the data they submit are performed during the experimental tests.

The proposed improved methodology was implemented in a prototype solution to illustrate how these guidelines can be provided in the right way (simple, comprehensive
and at the right time). Participants used the prototype to submit VGI during the experiments, which was collected and then analyzed using ArcGIS version 10. The effectiveness of the system and the improvement of the VGI were evaluated based on factors found in the literature review and investigations, but also on user satisfaction surveys. A link is provided between guided participation strategies and how they influence each type of data quality, as listed in the section on the extent of liability for data. In this way it can be indirectly assessed what guidance could have been helpful in the use case studies. Factors measured during the experiments include:

Timing: How much time it takes the participants to accomplish the mapping, and the time lag between mapping edits or validation and their visibility on the map, will be assessed. How much mapping can be done under pressure (for example, with a time limit), and at what quality, was also tested.

Guidance: Users were provided with various levels of guidance and the differences were observed; factors such as the involvement of experts or the extent of guidance given (photos, examples of classifications, etc.) were assessed.

Familiarity: Participant input quality was assessed based on the participants' familiarity or unfamiliarity with the area.

Education: A comparison of the effect of educating users on the elements of quality of the data they will submit (positional accuracy, attribute accuracy, currency, completeness, logical consistency, lineage, accuracy, resolution, precision) versus leaving them unaware.

User behavior: Patterns in user behavior will be analyzed.
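To make the quality assessment concrete, the sketch below shows one simple way such factors can be quantified outside a GIS: it compares a set of VGI points against reference features, computing completeness (the share of reference features with a VGI point within a tolerance distance) and the mean positional error of the matches. This is a minimal, hypothetical Python illustration; the actual analysis in this research was carried out in ArcGIS 10, and the 40 m tolerance and the sample coordinates are assumptions chosen purely for demonstration.

    from math import radians, sin, cos, asin, sqrt

    def haversine_m(lon1, lat1, lon2, lat2):
        """Great-circle distance between two WGS84 points, in metres."""
        lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
        a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371000 * asin(sqrt(a))

    def quality_summary(vgi_points, reference_points, tolerance_m=40):
        """vgi_points and reference_points are lists of (lon, lat) tuples.
        Returns completeness (%) and mean positional error (m) of matched points."""
        errors = []
        matched = 0
        for ref in reference_points:
            # distance from this reference feature to its nearest VGI point
            nearest = min(haversine_m(ref[0], ref[1], p[0], p[1]) for p in vgi_points)
            if nearest <= tolerance_m:      # counted as mapped
                matched += 1
                errors.append(nearest)      # positional error of the match
        completeness = 100.0 * matched / len(reference_points)
        mean_error = sum(errors) / len(errors) if errors else None
        return completeness, mean_error

    # Example: two reference buildings, one mapped closely, one missed
    reference = [(10.7461, 59.9127), (10.7480, 59.9135)]
    vgi = [(10.7462, 59.9128)]
    print(quality_summary(vgi, reference))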

After testing the different approaches, a choice was made as to what was least and most effective and what should be included in or excluded from the proposed methodology and prototype, and both were changed accordingly. For instance, findings from the methodology shown to be effective in the Oslo and Christchurch use case experiments were applied to the Haiti use case to re-enact the mapping of the area. A comparison of the results of the re-enactment (user-generated data) with the original 2010 reference data demonstrates whether there has been an improvement in VGI due to improved guided participation.

1.7.5 Phase 5: Conclusion
During the concluding phase, the main research question (What generic methods for guided participation will improve crowdsourced VGI?) is answered in Chapter 5 through a discussion of all of the results of the preceding phases and the answers to the sub-research questions, and the results are extrapolated to conclusions for the case study areas. The end product of the thesis is a proposed methodology that can serve as a guideline for VGI application developers and deployers. The methodology outlines ways in which guided participation can be provided in the right way (simple, comprehensive and at the right time) and is visualized and made available for testing via a small, final prototype solution.

Chapter 2: Investigation of existing crowdsourcing tools and guided participation techniques


2.1 Introduction
Guided participation is the act of helping crowdsourcing participants to take part and to provide the correct information for the project. Most crowdsourcing sites do this by including a help page, a wiki where participants can read specific instructions, or a forum where they can discuss issues with other participants. While the existing efforts are admirable, improvement is still needed in order to obtain better crowdsourced VGI. The three popular crowdsourcing tools OpenStreetMap, Ushahidi and Geo-Wiki are investigated, and the disaster cases in New Zealand and Haiti are used as examples.

2.2 VGI concepts and applications


Satellite-based imagery can be used to identify features over broad areas. But although these images are necessary for a broad view of an area, for a complete view it is also necessary for the organizations handling the imagery to collaborate with mappers on the ground, or through crowdsourcing initiatives.

2.3 VGI tools


Crowdsourcing tools allow users to generate or contribute geographical content to a map, with the contributed content becoming data that is stored in a database. VGI in most applications has taken the form of georeferenced point- and line-based data, and to a lesser extent area-based features. The amount of attribute data accompanying the contributions is usually relatively small. The map consists of two layers: a base layer that contains images and maps, and a second layer that contains the user-contributed content. The content that users contribute comes from GPS points they have recorded, from geocoded images they have taken, from new features they create based on local knowledge, or from edits they make to existing data. In order to understand why the quality of VGI is often insufficient for professional needs, it is necessary to understand the information needs of the actors involved, to investigate the existing crowdsourcing tools and their methods for guiding participants, and to examine how they evaluate the quality of the submitted information. Three popular crowdsourcing initiatives have been researched and compared: OpenStreetMap, Ushahidi and the Geo-Wiki project. All of these are popular crowdsourcing tools, but they all lack the detailed help or guidance for participants that would greatly improve the quality of the information given.
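As an illustration of what one such contribution looks like as data, the snippet below sketches a single point-based VGI record as a GeoJSON-style feature with a small attribute set, the kind of record stored in the user-contributed layer on top of the base layer. The field names and values are illustrative assumptions, not the schema of any particular tool.

    import json

    # One user-contributed point feature with a small attribute set (illustrative schema)
    vgi_feature = {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            "coordinates": [10.7522, 59.9139]   # lon, lat (WGS84)
        },
        "properties": {
            "description": "Residential building, four storeys",
            "category": "building:residential",
            "source": "local knowledge",        # e.g. GPS trace, geocoded photo, image tracing
            "contributor": "volunteer_042",
            "timestamp": "2012-05-14T10:32:00Z"
        }
    }

    print(json.dumps(vgi_feature, indent=2))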


2.3.1 OpenStreetMap (OSM)
OpenStreetMap is a mapping effort which relies on the contributions of thousands of volunteers, both to contribute entries and to edit the entries of others. The OSM map is a combination of different sources: uploaded GPS tracks, out-of-copyright maps and Yahoo! aerial imagery, not to mention crowdsourcing initiatives such as mapping parties where volunteers label and annotate features on the map together. Users have access to the editable street map, where they can upload their GPS traces as well as use remotely sensed images in order to trace and record the outlines of features such as buildings, streets and other points of interest. The editing process within OSM is very user-friendly: by selecting the Potlatch 2 editor, users are guided through an introduction and a step-by-step guidance system.

In the OSM guide for collecting data, there are extensive sections about the two main ways of collecting data: using GPS and tracing aerial imagery. OSM also specifies what data to add, giving guidelines for commonly mapped features and recommended tags in a section called Map features. Participants who have collected GPS data are provided with guidelines for doing so, and they can upload their GPS data to OSM by first saving the files in GPX format, uploading them to the site, and then using an editor (such as JOSM or Potlatch 2, which is also used by Geo-Wiki) to create OSM map data from the GPS traces.

Users start by navigating to www.openstreetmap.org and clicking on the Edit tab along the top of the screen. After logging in, the user comes to the editing page, where an editable map is displayed along with point-of-interest icons which can be dragged and dropped onto the map. Information about the points is displayed in the Basic and Misc tabs: name, address, source of data (e.g. local knowledge), etc. It takes a few minutes for new features to be added to the map, depending on the load on the server. Users can also trace roads and areas from Bing aerial imagery. Once they have finished adding or tracing a feature, they can add a tag in order to see the feature rendered on the map, and upload the changes to OSM.

OSM data are free to use and edit for users who have created an account. Users can download OSM XML files corresponding to specific regions specified by a bounding rectangle. When a user makes an edit to the map, the changes are applied to the main database and are immediately visible from the API, and the full editing history is stored for each user, as well as the expertise profile of the contributor. OSM encourages a process of peer review, so other mappers may correct errors that have been made and/or report edits they see as vandalism or abuse of the site. OSM employs various error detection tools that identify potential data errors, inaccuracies or sparsely mapped places. Any participant can view recent changes to the map data (using the recent changes setting), or even watch edits live using the LiveEditMapViewerJ tool available on the OSM wiki. Participants can submit reports to OSM about quality issues, although there is no standard tool in the map interface to do so; this requires investigation of the quality assurance section of the OSM wiki.
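As a sketch of the bounding-box download mentioned above, the snippet below requests OSM XML for a small area through the public OSM API (the version 0.6 "map" call) and lists the tags of a few ways, which show how contributors classified the features. The bounding box around central Oslo is chosen purely for illustration, and only small areas should be fetched this way; larger regions are better taken from planet extracts.

    import urllib.request
    import xml.etree.ElementTree as ET

    # OSM API 0.6 "map" call: bbox = min_lon, min_lat, max_lon, max_lat (keep the area small)
    bbox = "10.740,59.910,10.745,59.913"   # a few blocks of central Oslo, for illustration
    url = "https://api.openstreetmap.org/api/0.6/map?bbox=" + bbox

    with urllib.request.urlopen(url) as response:
        osm_xml = response.read()

    root = ET.fromstring(osm_xml)
    nodes = root.findall("node")
    ways = root.findall("way")
    print("Downloaded", len(nodes), "nodes and", len(ways), "ways")

    # The tags attached to each way show how contributors classified the feature
    for way in ways[:5]:
        tags = {t.get("k"): t.get("v") for t in way.findall("tag")}
        print(way.get("id"), tags)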


More advanced participants using the JOSM editor can make use of the JOSM validator, which highlights errors and warnings before the user submits the data. OSM uses a very comprehensive wiki to provide guidance for participants: there is a general wiki page covering almost every aspect of involvement and contribution, as well as a section especially for beginners and several video tutorials. In order to reduce errors, ways of mapping are outlined, along with examples of how OSM would like features to be mapped, edited and tagged (good practice guidelines). In addition, OSM suggests things that need to be mapped. What is effective about OSM is that it provides plenty of guidance and information for participants, located in a wiki accessed by clicking on the Documentation link in a menu beside the main application map, which leads to the wiki site (Figure 2.1). However, its advantage may also be its downfall: while the information there is extensive and illustrative, the volume of information is daunting, and in order to contribute users must complete lengthy tutorials and read through many illustrated guides. While all of this guidance is helpful, it does not allow users to contribute easily.

Figure 2.1 OpenStreetMap Wiki menu listing help options

What could make this site less intimidating is having the guiding information shown as pop-ups linked to the main map and options, so that the information is more easily associated with its use in the map. Participation is guided differently for the different kinds of OSM contributors. Lengthy basic information pages, usually entitled "for beginners" or "for first-time edits", are easily available and presented in simple language. There are also help documents and guidance for more advanced/experienced mappers, but these are mostly in the form of forum discussions with other mappers and articles with less explanatory text and more code, which experienced users understand. The choice of editors is also divided based on the target user group, beginner versus advanced, with beginners using Potlatch and advanced editors using the Java OpenStreetMap (JOSM) editor, whose interface is more like a traditional GIS. While several apps are available for download which allow users to make minor edits or contributions to the map, these are not free, and most edits can only be completed from the desktop. This includes wiki edits (which are approved by peers with a longer history of use) and map edits.


OSM played an important role during the Haitian and Australian disasters: OSM was able to provide relief workers with map coverage very quickly, and the spatial data was crucial for ground relief efforts. OSM is also used for land cover validation and land use planning.

2.3.2 Ushahidi
The Ushahidi platform uses an entirely different method of crowdsourcing. In addition to submitting a web-based report or an email, users can send SMS reports from their mobile device or use a Twitter hashtag to communicate their information to Ushahidi. The application interface is user-friendly and easy to understand. The application shows the main map and the categories of damage or reports, and users wishing to contribute simply click on Submit incident to send their VGI. Here they provide a description, category, location name, approximate coordinates and media uploads, and the form records the local date and time. The reports are then geo-tagged and displayed on an interactive map.

Unlike OpenStreetMap, Ushahidi participants cannot make edits directly to a map. The options for contributing VGI are limited to submitting a report (Figure 2.2), which asks the user to provide detailed information about whatever categories of information the deployers would like. As simple as the report may be, no guidance is given to participants about what kind of information they should submit, or about whether or not they must fill out all of the fields. For instance, if a user does not know what geotagging is, it is unlikely that they will submit a photo. The result is that the submitted information varies in detail and in completeness. While no guidance is offered to participants immediately around the main reporting interface, guidance documents are found in a wiki, a blog and a user community, all reached by exiting the report and searching the main Ushahidi site; these are not easily discoverable, and help is not easy to find for those who need it.

Figure 2.2 Ushahidi Haiti report


There is no distinction made between amateur and experienced users, only trusted or not trusted. Mobile users can view demos in order to see examples of deployments and/or reports already submitted or used. An additional resource is the Ushahidi guide, which is meant for those who will implement the platform, but which could also be helpful for those looking for guidelines on how to complete the report and what type of information is considered valuable. For instance, the Ushahidi guide specifies what constitutes valuable information for the fields on the report. The title should provide a brief description of the event (no more than one sentence), including the nearest landmark, and should give viewers an idea of what happened and where; the description should describe the event in a few sentences, including details of who, what, where, when, how and any additional contextual information that may be important, such as the source of the information; Date & Time should refer to the date and time of the event, not when the report is submitted; Category specifies the type of incident that occurred. The map allows contributors to visually provide the location of the incident. This is especially helpful when the address or GPS coordinates are not known, but the location can be identified on a map. It is important that the contributor zooms in to a scale that allows them to accurately place the marker at the correct location, as this action provides the coordinates and point on the Ushahidi map. The reporter is also required to write the name of the location in the Refine Location name box under the map. If the location cannot be identified on a map, there is a find tool where a name can be entered and found. Videos, photos and contact information can also be attached to the report. As with OSM, Ushahidi does have a quality control system in place: submitted reports will not be disclosed immediately, but are first reviewed by moderators to see if the information is accurate and reliable. Moderators give tags to the reports in order to highlight the level of trust they attach to the given information. The report will not appear on the public list of reports or on the map until it has been approved by an administrator with the proper permissions. Once approved, the reports immediately appear on the public map. Reports that have not been verified by an administrator may already be on the map, but are flagged as not verified in terms of source or content. Verifying the report is not a requirement for it to appear on the map, but it does let viewers know that the information in the report itself has been verified by either another source or the administrator of the platform. Reports are marked with the status unverified until it is confirmed that they meet the following criteria:
- The same information has been reported from multiple reliable sources
- The incident has been reported more than twice from different sources
- The incident has been reported in different sources by different people (such as Twitter, the news, etc.) and has also been confirmed
- The report has been discussed with the contributor personally, to verify more detailed information
- Photos or videos have been supplied documenting the report
- The contributor is a trained community member, part of a reputable network, or a trusted source (someone who has given credible information in the past or over a long period of time)

The administrator will verify that they have direct knowledge about the event and can be sure that it is true. They will also mark the report with the level of reliability and probability that they believe it has. Other users can comment on the report and/or give scores to the report by specifying whether they trust it or not. The trust function does not affect the verification of the report, but gives an indication to the administrator and other viewers of what is thought about its credibility. Although these quality assurance systems are in place, during disaster situations it can be valuable to focus on receiving the highest quantity of information, not only the quality of information. For instance, although individual tweets may not be particularly reliable, once they are on a map it is easy to notice clusters of incidents based on where most of the message sources come from. Although administrators can tag reports as verified or not verified, when reliability is in doubt it is still possible to use Ushahidi with this verification by aggregation method. This system is not perfect, but it has worked in previous disaster relief operations, since government and humanitarian actors were able to act upon these reports. An evaluation of the 2010 Ushahidi Haiti project (Morrow et al., 2011) explained that the information was used because it was the only aggregator of information coming from the affected area. Despite the risks associated with the information, the credibility of the project and project team was often cited as a reason for professionals to use the information. High levels of trust that had been built in earlier years by project team members among their graduate academic colleagues and professional networks were relied upon. Reports that had been handled with guidance from humanitarian response professionals were more trusted by decision makers as credible sources of information. Decision-makers had a suspicion of the crowd and feared that it might intentionally manipulate information. The information was also not used by professionals due to the rigidity of the information requirements of the actors (traditional disaster response organizations which typically require certain types of information at certain times) and to the inconsistency of the incident data aggregated by Ushahidi. Only 3,854 of the 60,000 messages submitted were placed on the map, because a large number of them did not contain geographic information. Another problem was the misclassification of category tags: for instance, only 47% of the reports classified as services available were actually related to service or resource availability; most were actually related to trapped persons or appeals for food and water. Chat discussions suggested that some of the misclassification was a deliberate attempt to move reports into what were perceived to be more closely monitored categories in order to improve the chances of response. On the other hand, the omission of many reports may in turn have led to the omission of important messages of need. It is heartbreaking to think of so many urgent messages being ignored simply because they were not perfect, especially when participants were not offered any kind of guidance, help or examples of how to complete the reports. Most of the messages received probably would have contained the required information had participants been properly guided to include it. While better quality assurance or reliability checks need to be developed to establish more trust in VGI reports, more importantly Ushahidi has to establish guidance for participants to submit better information.
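To illustrate the verification-by-aggregation idea described above, the following minimal Python sketch bins geo-located reports into grid cells and flags cells that have been reported through several independent channels. The report records, cell size and threshold are illustrative assumptions and not part of the Ushahidi platform itself.

from collections import defaultdict

# Hypothetical report records: (latitude, longitude, source channel)
reports = [
    (18.547, -72.339, "sms"),
    (18.548, -72.338, "twitter"),
    (18.547, -72.340, "web"),
    (18.601, -72.290, "sms"),
]

CELL = 0.005          # grid-cell size in degrees (roughly 500 m); an assumption
MIN_SOURCES = 2       # flag a cell once at least two independent channels report it

cells = defaultdict(set)
for lat, lon, source in reports:
    key = (round(lat / CELL), round(lon / CELL))   # snap the report to a grid cell
    cells[key].add(source)

# Cells reported through several independent channels are treated as corroborated
corroborated = {key for key, sources in cells.items() if len(sources) >= MIN_SOURCES}
print(corroborated)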

Figure 2.3. The Ushahidi program theory for the Haiti disaster

As we can see from Figure 2.3, showing the theory behind the Ushahidi operation in Haiti, a few factors stand out as those which might be most important for organizations looking to take action based on the crowdsourced information:
- Responders should be able to make decisions based on the information
- Information is value-added or has a perceived utility
- Some reports are verified
- Reports can be geo-located
- Reports are actionable and categorized
The relevance for decision-makers and the subsequent impact on those who need help (the timeliness of the report appearing on the map to others) can only materialize if the effectiveness and efficiency of the information supplied by participants improves.


2.3.3 The Geo-wiki project

The Geo-wiki project is a research project initiated by the International Institute for Applied Systems Analysis (IIASA), an international research organization that conducts policy-oriented research into large-scale global problems in the areas of energy and climate change, food and water, and poverty and equity. A product of their Ecosystem Services and Management program is the Geo-Wiki project: an effort to improve the quality of global land cover maps. The project is an online tool for validating land cover. Accurate and up-to-date land cover information plays an important role in many different research fields, but since there are many areas of disagreement, a wide community of participants is needed to validate this information. Upon first visiting the site, users are presented with a short summary of what the project is and how they can contribute. In the main menu they see that it is possible to read instructions or watch a video tutorial, and they also have the option of trying the site as a guest before making any commitment to becoming a registered user. What is helpful is that users can choose the language in which the site is presented. The instruction section is a very helpful document that guides potential users through the entire process of participation, so that they know what to expect at each stage. Volunteers are asked to review maps showing land use disagreement and to determine the map correctness based on Google Earth images and their local knowledge. If a map is determined to be incorrect, volunteers can take photos with orientation and geolocation. These photos can be uploaded to Geo-wiki and will be used by the Geo-Wiki project to produce hybrid land cover maps. There is no differentiation of guidance or submission form for different user levels: both VGI participants and remote sensing experts use the same classification terminology. The problem with this is that the participant may not be familiar with all of the terminology. Figure 2.4 shows part of the land validation form in the Geo-Wiki project; while participants may be unsure of the classification terminology used by experts, they can set their profile to choose a simpler validation method, although this provides only a slight change, as seen in Figure 2.5. A more user-friendly improvement would be to provide an illustration of each type of feature classification, containing clues and examples of what the feature type looks like.


Figure 2.4 Geo-wiki classification method

Figure 2.5 Simple validation

High-resolution satellite imagery is usually (though not always) used to assist participants in classifying land features. Areas of disagreement can be compared to high-resolution Google Earth imagery (4 m resolution or finer) and georeferenced photographs. Field photographs from Panoramio provide valuable information on what type of land cover is actually found on the ground. At the same time this information has to be used with care, as there is no indication of the date of capture. During validation participants must specify whether they have used high-resolution imagery or not (Figure 2.6), and if they are unsure they can use a help link which differentiates between high- and low-resolution imagery (Figure 2.7).

Figure 2.6. Geo-Wiki validation submission form



Figure 2.7. Help window showing the difference between image resolutions

A challenge is that it can be difficult to give the correct land cover validation if the land cover maps visualized are not clear or consistent enough. In addition, the images are not up to date. This must be addressed more clearly in the help documents. For instance, at the moment there is one link which produces a pop-up showing the difference between high-resolution and low-resolution images, but the information is very brief and not detailed enough. Although the Geo-Wiki project is already a good example of how guided participation can be used to help VGI participants give high quality information where it is needed, there is still room for improvement and fortunately these improvements are forthcoming. In the future, the project plans to implement the capability to discuss any point and flag it as difficult, so that an expert is alerted and joins the discussion directly within Geo-wiki. Information is considered valid if: a) it is given by a registered volunteer, b) the classification is detected as being correct, or c) it is automatically deemed acceptable. Examples of how participation is guided include: a 12-minute tutorial video, a quick start option for people just wanting to test how the site works by validating any random point, a tutorial which guides participants through the process of validation, an instruction section and numerous help links. In addition, a disagreement analysis tool helps by showing areas where there is disagreement and provides details of how many disagreements there are, between whom, the extent of disagreement, etc. For instance, when Oslo is chosen as the place to validate, users can display the land cover data from MODIS, GlobCover and GLC-2000 and the different areas of disagreement. MODIS imagery shows Oslo city center as urban and built-up areas, but also some evergreen needleleaf forests and mixed forests; GlobCover imagery classifies most of Oslo city center as sparse vegetation and artificial areas, with some closed broadleaved deciduous forest; and GLC-2000 imagery classifies most of Oslo city center as artificial surfaces and associated areas, with a few areas that are tree cover, needle-leaved, evergreen. There is a 5-10% forest disagreement, as shown in Figure 2.8.

Figure 2.8 The Geo-Wiki validation process, showing an area of disagreement

By loading additional data (base data, Panoramio photos and geo-tagged photos), it is possible to examine the disagreement more closely. Geo-wiki recommends that in areas where volunteers do not have local knowledge, they should choose areas that are covered by high-resolution images. When a user zooms in to an area that they are familiar with, they can confirm that the things they see are correct. When a user wants to validate an area, they are asked to rate the classification of MODIS, GlobCover and GLC-2000 from good or not sure to bad. When the MODIS classification of evergreen needleleaf forests is rated as bad, the user can also specify the correct class, urban/built-up area. The user can also specify how certain they are about their classification. Users can provide picture URLs for each direction of the point (N, E, S, W) and specify whether Google Earth high-resolution imagery was used to validate or not. If a user is unsure whether they have used high-resolution imagery or not, they can click show help, which opens a window showing an image and a text explanation of the difference between high- and low-resolution Google Earth images. While Geo-Wiki is trying to incorporate the confidence ratings that users have entered with each validation point, along with whether the image is low or high resolution, they have not yet perfected the system. After submitting, the user receives a notification that the validation was successful. The project evaluates the quality of the information that is submitted based on how accurate the inputs are relative to control points that three experts have agreed upon. This information is used to provide a score for each participant that changes over time as they do more validations on Geo-wiki. Users build up trust based on the classification scores they have accumulated, as can be seen in Figures 2.9 and 2.10. The score is calculated based on the user's validated land cover pixels in all three of the land cover datasets and is based on only the first validation. Right now it is based purely on how accurate they were relative to the control points, but Geo-Wiki would like to include other things such as the amount they have validated, their background, how much help they have given other people, etc. To provide an overall score for the purpose of publishing the results of the competition, they have included the number of points validated in combination with quality in the user classification scores.
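The following minimal Python sketch illustrates the kind of contributor score described above, combining agreement with expert control points and the volume of validations. The weights and the volume cap are assumptions made purely for illustration; they do not reproduce the actual Geo-Wiki formula.

def contributor_score(control_results, n_validations, quantity_weight=0.2):
    """Combine agreement with expert control points and validation volume.

    control_results  -- list of booleans: did the first validation of each
                        control point match the class agreed by the experts?
    n_validations    -- total number of points the volunteer has validated
    quantity_weight  -- illustrative weight for the volume component
    """
    if not control_results:
        return 0.0
    accuracy = sum(control_results) / len(control_results)
    # A volume bonus that saturates at 100 validations; an assumption for illustration
    volume = min(n_validations / 100.0, 1.0)
    return (1 - quantity_weight) * accuracy + quantity_weight * volume

print(contributor_score([True, True, False, True], n_validations=250))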

Figure 2.9. Top Geo-Wiki user classification scores

Figure 2.10 List of all Geo-wiki users and their classification scores

Many of the Geo-Wiki participants are already remote sensing experts, so incorporating their opinions through some kind of social interaction capability would be a very valuable addition. The project would like experts to be part of the discussion after participants flag pixels as difficult and begin to discuss them. Geo-Wiki is still at a very early stage in terms of incorporating quality control of crowdsourced information. They are currently using the crowd to validate a biofuel map. This has not been published yet, but they are trying to include only points that have been validated multiple times, and then only the majority answer is used, in combination with how well a particular individual scored overall on the control points. The validation points from the crowd via the Geo-Wiki project are not yet being used for creating hybrid land cover maps, nor will they be until there is a robust method of quality assurance. This demonstrates that actors already using some degree of guided participation are still not able to meet their own informational needs, and that improvement of guided participation techniques is needed.

2.4 The crowdsourcing tools in action: case studies


Existing tools and techniques for guided participation that were used in the Oslo, Haiti, Christchurch and Geo-Wiki case studies are investigated, tested and assessed for effectiveness. These findings are then used to develop an original methodology for experimentation.
2.4.1 Use case study 1: Crowdsourcing for building use classifications in Oslo

Crowdsourcing has not been the most popular method for gathering information in Norway to date. The method has been criticized in Norway by Wikipedia founder Jimmy Wales as being used by businesses for commercial profit without involving the users and their needs in the design of the crowdsourcing initiatives they will use. While storms and floods are the main natural risks affecting Norway, there have not been any large-scale natural disasters in Norway comparable to other international disasters for at least the last century, and the method is sometimes seen as an old-fashioned way of gathering information. However, there is a burgeoning group of geographers and mapping enthusiasts in Norway who have been actively mapping Norway for OpenStreetMap for the past three years. But as the project is still relatively new in Norway, and since roads, municipal borders and bicycle routes are the most mapped features, buildings are often overlooked. In addition, Google MapMaker has only recently become available for Norway. Thus local residents and volunteers have missed an opportunity to map important features of the country. Oslo will be used as a local experiment case where local residents can contribute their local knowledge to a crowdsourcing initiative to gather VGI about building usage types. The participants who will be involved are a mixture of beginners and mostly experienced GIS users, who will be using a methodology of guidance that is in its beginning stages. This case will be used to learn about user mapping behaviors in a local setting and which guidance is most effective. These results will then be extrapolated to a more international use case, leading to a methodology for guiding crowdsourcing participants in large-scale or unfamiliar initiatives.


Choosing Oslo as a use case with which to experiment provides an opportunity for more high-quality information about features and points of interest in Oslo, but it also contributes towards the development of quality control for VGI participation in general, which may later be applied to any domain, country or other use of VGI.

2.4.2 Use case study 2: Crowdsourcing Christchurch earthquake damage

When an earthquake struck Christchurch, New Zealand in February 2011, there was an effort to map damages using satellite imagery, shown in Figure 2.11.

Figure 2.11 Christchurch damage assessment map

Though the images were cluttered by cloud cover, some buildings were clear and damage could be identified, as in Figures 2.12 and 2.13 below, showing before and after images of the same area in Christchurch.

Figure 2.12 Christchurch before

Figure 2.13 Christchurch after

In both examples it is clear that the imagery is not helpful or detailed enough. The remotely sensed imagery lacks the local perspective, and the local perspective sometimes does not satisfy informational needs. This leads us to question the methods that participants use in creating their maps and map classifications. A group of CrisisCommons and Ushahidi professionals created an Ushahidi deployment entitled Christchurch Recovery Map (no longer active), which was launched less than 24 hours after the 22 February earthquake and maintained by volunteers. Local, on-the-ground volunteers were asked to submit reports about services for food, water, toilets, fuel, ATMs and medical care. Twitter, SMS and email messages were also accepted. The site was listed as a credible source of information by New Zealand's main news sites, as well as Google's Crisis Response site. OpenStreetMap was involved in the crisis response and via their wiki (http://wiki.openstreetmap.org/wiki/2011_Christchurch_earthquake) asked on-the-ground volunteers to help improve map data for Christchurch, primarily by editing the OSM basemap (which was already well mapped before the earthquake) with temporary disaster updates, e.g. medical facilities and blocked roads. For other immediate situational information (e.g. requests for help, or photos of the situation) they referred volunteers to the Ushahidi site, where they could log reports at http://eq.org.nz/. They were particularly interested in naming roads in the north of the city and highlighted these on a separate nonames map. Remote mappers were asked to contribute by tracing building outlines or infrastructure (including buildings, roads, electricity lines, etc.), both original and damaged, using Bing imagery. The Ushahidi site was eventually replaced by a more official, government-run website: http://canterburyearthquake.org.nz/.

2.4.3 Use case study 3: Crowdsourcing for disaster mapping in Haiti

To assess damage caused during the 2010 Haiti earthquake, volunteers first identified building damage points using high-resolution satellite imagery. Analysis of aerial photographs was then carried out to delineate building footprints of collapsed or very heavily damaged buildings. The photographs were also interpreted visually to classify land use and to estimate the total square footage of buildings requiring significant repairs or reconstruction. Pre-earthquake GeoEye and DigitalGlobe satellite imagery were compared to post-earthquake aerial photos, provided by the World Bank, Google and NOAA, as shown in Figures 2.14 and 2.15. The spatial resolution of the satellite imagery used was approximately 50 cm, while for the aerial photos it was approximately 15 cm. Buildings were categorized as destroyed, severely damaged, moderately damaged or having no visible damage according to the European Macroseismic Scale 1998 (EMS-98) definition, shown in Figure 2.16.


Figure 2.14. Pre- disaster satellite image

Figure 2.15. Post-disaster aerial photo

Figure 2.16. EMS-98 Building Classification Diagram

Over 300,000 individual buildings were evaluated for damage with post-disaster satellite and aerial imagery, 67,000 of which were classified as grade 4 or 5 damage, as shown in Figure 2.17. However, since the damage was assessed from an overhead viewpoint, lateral damage and damage to the internal structures of buildings was not detectable from the analysis, and the total damage assessed from these images therefore underestimated actual damages.

Figure 2.17. EMS-98 Damage classifications based on imagery

Field validation of remotely sensed damage maps was then carried out on approximately 6,000 building samples, with the aerial image classifications achieving an overall accuracy of 61%. It was concluded that the remotely sensed images were limited, for instance in their spatial resolution and angle of acquisition. It was also concluded that crowdsourcing could provide very important primary and secondary data collection, provided the right analysis guidance is given, and that future damage assessments should be more collaborative, using the same definitions of standards and validation methods.

---

The disaster response system used by relief actors after the 2010 earthquake in Haiti allowed actors to collect intelligence through internal channels and share the information with each other, but it did not utilize local knowledge from the Haitian community. Instead, Haitians bombarded UN groups by telephone, email and post. It was difficult for teams to verify the high volume of reports that were received. The solution consisted of manual monitoring and sorting of the information.

The situation was a disaster because there was a crisis in communicating within the community: a difficulty for people to get informed and to inform other people. During the Haiti disaster, the volunteer crisis mapping group (that eventually became Ushahidi) monitored social media sources to identify useful information that had a location attached to it. Two hundred students monitored and geolocated reports from Twitter, Facebook and Mission 4636 on the Ushahidi platform. They found GPS coordinates for the information through Google Earth and OpenStreetMap, and then published this information on a map available on the Ushahidi online site. Although they received between 1,000 and 2,000 SMS messages per day, useful reports were tagged with geo-coordinates and reported to teams on the ground within 10 minutes. General reports were handled by student volunteers who had received no more than impromptu training, while urgent reports were flagged and handled by a smaller team of experienced volunteers who could contact the information provider. Of the 25,186 SMS messages, emails and social media communications, 3,596 reports were actionable and included enough relevant information to be mapped on Ushahidi, although only 202 of these were tagged as verified. Although volunteers were under serious time pressure (they completed a map of all affected roads, buildings and refugee camps within just two days), it is obvious that a system that could quickly identify inaccurate information was also needed. For instance, there was a shortage of information about health facilities and their locations, which volunteers were asked to geo-locate. These inputs were verified by an OSM member, who located the facility on high-resolution imagery at 15 cm resolution to verify that the facility was located at the submitted coordinates. OSM mobilized 640 volunteers worldwide who scanned and rectified old atlases and maps and traced Haitian roads, bridges and buildings into OSM using tools that only required a simple browser. However, according to an OSM volunteer in Haiti who is part of the Humanitarian OpenStreetMap Team, after evaluating hundreds of building=collapsed earthquake damage objects in the OSM database, most of them were not useful or were deleted, partly because without any additional information (such as image links or the names of the collapsed buildings) they are no longer useful in OSM in general, but to a good extent also because of problems in the data. The biggest single problem was that many who had done the assessment had clearly never been to Haiti or any other similar country. There were countless buildings that were very clearly not collapsed but were marked as such simply because they were (and many still are) under construction and missing the roof, possibly only on the top floor. In many developing countries this is completely normal and quite common.


2.5 Quality assessment of crowdsourced data


Based on Aronoff (1989), Morrison (1995) and Longley et al. (1999), the five main reasons for concern about spatial data quality issues were identified as:
- There is an increasing availability, exchange and use of spatial data;
- There is a growing group of users less aware of spatial data quality;
- GIS enables the use of spatial data in all sorts of applications, regardless of the appropriateness with regard to data quality;
- GIS offers hardly any tools for handling spatial quality;
- There is an increasing distance between those who produced the data and the end users of the data. Furthermore, the producers do not provide enough information regarding the quality of the data.

As methods and models of GIS analysis become more sophisticated, the quality of data increases in importance. Many datasets undergo temporal adjustments, which add uncertainty to the analysis. For example, using one- or two-day-old data in disaster forecasting at the time of the crisis would lead to faulty conclusions: the conditions change from moment to moment. We must be able to analyze and incorporate such temporal uncertainty in the analyses and forecasts that we make. We must be able to quantify the uncertainty in the data (and the analysis) and express it in a satisfactory manner. A major problem exists in how we report uncertainty in GIS. Another major problem is the propagation of uncertainty through the dataset as we combine several sets of data of different levels of confidence and even potentially different types. According to Fisher (1999) there are three forms of uncertainty that arise when deriving a spatial dataset from the real world: error, vagueness and ambiguity. Error is the difference between the value of an object in a test dataset and in a reference dataset; vagueness can be caused by poor documentation; and ambiguity arises due to disagreement on the definition of objects. These problems may be addressed by agreeing on rules for the collection and description of data, and by agreeing on what constitutes the truth. One of the major advantages of crowdsourced data is that it is free and up to date. However, the main challenge is quality, since anyone can contribute and the information given may not be accurate. We cannot simply assume that data collected with a GPS is accurate. Common ways of judging VGI data quality are by the number and type of tags participants attach to their data, how many likes or external ratings their edits have received, or by what metadata they have provided. But to measure the quality, more specific methods are required. Traditionally, geographical maps have implied quality assurance indirectly by using quality constraints and rules for mapping. While this may have significance for professional map users, casual map users may judge the quality of a map by its resolution. However, digital geographical information can be presented at almost any scale and datasets of any scale may be combined, thus each dataset must individually be of high quality in order for the combination to work.


The International Organization for Standardization (ISO) is a network of the national standards institutes of 163 countries that coordinates and develops international standards for a range of technical fields for business, government and society. ISO (2002) defines quality as the totality of characteristics of a product that bear on its ability to satisfy stated and implied needs. But this has also been broken down into more detailed quality specifications in order to be useful. The ISO 9000 standard defines data quality as the degree to which a dataset's intrinsic characteristics meet the requirements of a user's purpose within a certain academic field, while ISO 19113 (2002) defines quality as the totality of characteristics of a product that bear on its ability to satisfy stated and implied needs and identifies the following elements as important to consider for data quality:
- Completeness
- Logical consistency
- Positional accuracy
- Temporal accuracy
- Thematic accuracy
- Purpose
- Usage
- Lineage

ISO standards also exist for general quality (ISO 19113, 2001), data quality measures (ISO 19138, 2006), quality evaluation (ISO 19114, 2001) and how to document the results of the quality assessment (ISO 19115, 2003). While these are certainly useful and recommended, they may not be entirely applicable to VGI, which is an informal method of data collection. Applying them may be possible to a certain extent, as long as it does not compromise the nature of crowdsourcing and giving geographical information voluntarily. With some degree of institutionalization, crowdsourced information could be available sooner after imagery is acquired, making it a more feasible resource. These elements of data quality are echoed in the academic and scientific literature by van Oort (2006), who has identified eleven elements of spatial data quality which should be used as commonly accepted standards to measure its quality:

1. Positional accuracy: According to Devillers and Jeansoulin (2006), positional accuracy may be defined as the degree to which the digital representation of a real-world entity agrees with its true position on the earth's surface. How much does the crowdsourced data correspond to the (absolute) true values? Is the coordinate system defined? Is the study area covered? Does it overlay with other datasets? Do mapped features match features in other datasets? Inaccuracy can be caused by measurement error, scale effects, projection distortion, or cartographic error. Goodchild and Hunter (1997) have presented a measure for positional accuracy. They propose that the positional accuracy of a spatial object or feature can be defined by measuring the difference between the location of the feature as recorded in a database and its true location (a location determined to have higher accuracy). The method consists of using basic GIS operations to calculate the percentage of the total length of the low-accuracy feature that lies within a specified distance (buffer zone) of the higher-accuracy data. Hence, a buffer of width x is created around the high-quality object; the percentage of the test object that falls within this buffer is evaluated using a simple overlay operation. This procedure is iterated 4-5 times until the results seem to stabilize, and for each iteration the size of the buffer is increased. The sum and count statistics are used to find the percentage length of lines within the buffers. In this way a statement to the effect that x% of the tested feature lies within y meters of its true position can be derived.

2. Attribute accuracy: Are the attributes close to their true values? How have the feature types been classified? Attribute accuracy is the correct description of a feature's non-spatial characteristics. Examples of attribute errors include the incorrect use of character case, incorrect numeric values, or misspelled street names.

3. Completeness: Descriptions serve to allow the user to evaluate the fitness of the data for their particular application. Does the data describe the features it is intended to represent? Is there a lack of data that accounts for what may be missing on the map? Does the metadata contain information about where the original data came from? How was it produced? By whom? How much area does the data cover? What scale was used to digitize the information on the map? What projection, coordinate system and datum were used for the map? How many observations were used to compile the dataset? What is the accuracy of the positional and attribute features? Has the data been checked for errors and/or verified? For instance, OSM data tends to cover most urban areas, but not rural or remote areas.

4. Logical consistency: Is the data consistent? Does it adhere to logical rules of data structure, attribution and relationships? For example, these can be characterized as: conceptual consistency, the degree of conformity to the conceptual schema; domain consistency, the constraining of values to the value domains; format consistency or validity, the storage of the data in conformance with the physical structure of the dataset; and topological consistency, the degree of topological fidelity. Is there consistency within the dataset? Tobler's First Law (Sui 2004) tells us that any location is likely to be similar to its surroundings, so information that appears to be inconsistent with the known properties of the location itself or of its surroundings can be subject to increased scrutiny.

5. Lineage: What is the history of the data? What sources were used, which methods, who was responsible? Does the dataset contain metadata? Are there details about how the dataset was collected and how it has changed over time? Does it contain information about the scale or projection? What is its history? How has the feature developed over time? How many different users contributed to the feature?

6. Semantic accuracy: Is the title/description given understandable to the interpreter?


7. Thematic accuracy: The accuracy of quantitative attributes and the correctness of non-quantitative attributes and classifications. To analyze this, look at what kinds of tags users have attached to features such as nodes, ways or areas. Have Ushahidi participants classified their report with the correct category? For example, did they need food and water, but only selected water needed as a category?

8. Temporal quality/currency: Is the data valid/fit for the time period? (This may be pre-, during or post-disaster, depending on the need.) What is the accuracy of the temporal attributes of the data, such as dates and times? VGI has an advantage when it comes to temporal accuracy, in that it is updated rapidly and can sometimes provide the only real-time data. Ushahidi data tends to have high temporal accuracy, since it accepts information collected from mobile devices and maps it almost instantaneously.

9. Variation in quality/precision: Are there any deviations in repeated measurements?

10. Meta-quality: Is there any metadata?

11. Resolution: What are the smallest units that can be detected? Spatial resolution determines the ability to view individual building features and damage conditions. A resolution of approximately 10 meters or less is necessary to discern the presence and location of individual buildings, while a high resolution of one meter or less allows us to distinguish the damage conditions of structures.

Of all of the elements mentioned, perhaps the most important is usage/purpose, also called fitness for use: an important factor to consider when judging the quality of a product is not only its degree of error, but how well it meets the expectations of the user. The quality requirements of one user may differ from another's; hence data quality is not absolute, but depends on the intended use and expectations of the user. What was the rationale for creating this dataset? What will be the application of the data? What is the ability of this data to satisfy my need? Fitness for use also reflects how the user judges the quality of the data. For instance, the participant may have specified its level of uncertainty, which can be quantified. This uncertainty can then be factored into decision-making processes. The quality of geographic information may have different effects, depending on how it is to be used. For any given level of uncertainty, there will be some applications for which uncertainty is not an issue and some for which it is. For instance, if a disaster occurs at a location, time-dependent information is critical for determining a response. Two types of error exist in this situation: a false positive or a false negative. For instance, during the Haiti disaster there were false rumors of violence in many areas (a false positive error in information) which prevented aid workers from delivering food and medical relief that was needed. On the other hand, the complete absence of information (a false negative) about the existence of a problem does not help anyone. To analyze this, we can look at imagery from an OSM, Ushahidi or Geo-Wiki screenshot, then examine the position of a particular feature. For instance, does a crowdsourced road run down the center or along the side of the roads in the imagery? If the roads run along the side in the imagery, the map might be suitable for navigation but not for planning of services relative to the centerline of the street. The OSM wiki outlines what OSM defines as accuracy and completeness for crowd-contributed information. Sources of error and how to correct the errors of others are defined. Positional accuracy from GPS, aerial imagery, geo-rectification and older maps is discussed, as well as topology and naming accuracy. They state that minimal expected accuracy should be based on the existing accuracy; the only specification is that any edit should leave the map more accurate than it had been before the change was made, for instance by uploading several GPS traces to the map and comparing how many of the traces are accurate in relation to the map feature.

Haklay's evaluation of the accuracy of OSM data

In OSM, topology is frequently checked: how do roads and other features connect? OSM also judges the naming accuracy of features by comparing to on-the-ground reality: what do street signs say? What do the local inhabitants call the place? Haklay believed that VGI datasets should be assessed in ways different from traditional GIS data because the two data types are so different themselves. Haklay argues that a meaningful evaluation of OSM data should take into account the characteristics of the dataset, for instance that the quality of GPS data can only be as good as the GPS receiver used to collect the information, which is usually within 6-10 meters of a location, or that Yahoo! imagery provides 15 m resolution. Considering such factors, we know that any OSM data will be within 20 m of the true location and should take this into account in our analysis and any further use of the data. Research by Haklay (2008) demonstrated that compared to professional reference data provided by Ordnance Survey, OpenStreetMap datasets can be quite accurate: as much as within 6 meters of the OS position and with 80% overlap of motorway objects. The preliminary investigation consisted of a visual comparison of OSM with Google Maps, Bing and Yahoo maps, assessing road completeness in five towns in Ireland, and gradually developed into an automated comparison of an authoritative spatial dataset with OSM data. In the first part of the analysis, a grid-cell vector layout was generated in order to query spatial data from several different datasets from the same geographic region to establish their relationship to a specific grid cell. In the second part, buffer analysis for road features was performed: how many kilometers of OSM road features lie outside a 10 meter buffer of the corresponding OS road features? Almost all of the OSM motorways were outside of the buffer. The accuracy was determined using the method developed by Goodchild and Hunter (1997). OSM data was analyzed in 5 km grid cells to gather statistics about the completeness, accuracy and fitness for purpose of the spatial data. Buffers were used to determine the percentage of lines from one dataset within a certain distance of the same feature in another dataset of higher quality. In this case British OS data were used as the higher quality data. Haklay also indicated that OSM datasets provide great coverage in urban areas, but not in rural or remote areas, and thus OSM data completeness is low.
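The buffer-overlap measure of Goodchild and Hunter (1997) used in these evaluations can be expressed compactly in code. The following minimal sketch, using the Shapely library and toy geometries, computes the percentage of a test feature's length falling within increasing buffer widths around a reference feature; the specific roads and widths are illustrative assumptions only.

from shapely.geometry import LineString

def percent_within_buffer(test_line, reference_line, buffer_width):
    """Goodchild & Hunter (1997) style measure: percentage of the test
    feature's length that falls within a buffer around the reference feature."""
    zone = reference_line.buffer(buffer_width)
    inside = test_line.intersection(zone)
    return 100.0 * inside.length / test_line.length

# Toy example: a crowdsourced road against a reference centerline (units in meters)
osm_road = LineString([(0, 2), (100, 3), (200, -1)])
reference_road = LineString([(0, 0), (100, 0), (200, 0)])

# Iterate with increasing buffer widths until the percentage stabilizes
for width in (1, 2, 5, 10, 20):
    print(width, round(percent_within_buffer(osm_road, reference_road, width), 1))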


Haklay's comparison of OSM data to OS data (for motorways) was based on only two of the quality measures mentioned previously, completeness and positional accuracy, with the view that these are the most important quality indicators for the data. The comparison was carried out using the method developed by Goodchild and Hunter (1997) and Hunter (1999), using buffers to determine the percentage of a line from one dataset lying within a certain distance of the OS feature buffer (Figure 2.18).

Figure 2.18. Goodchild and Hunter buffer comparison method

This study showed that VGI can have a high level of positional accuracy and thus be useful for urban planning purposes, as an example. Analysis was also carried out by creating a buffer around each dataset and then evaluating the overlap. The buffer for the OS data was 20 meters and for the OSM data 1 meter. The results were displayed in a table (Figure 2.19).

Figure 2.19. Haklay's comparison of OSM vs. OS data

The table shows that the percentages vary between 60-89%, with an average of almost 80% overlap, demonstrating that OSM data can give a satisfactory representation of features (motorways). A second analysis was done using five tiles (5 km x 5 km) of OSM data and buffers around roads, in the areas shown in Figure 2.20.

Figure 2.20 Haklay's study area for comparison of data

The overlap was between 77% and 88%, quite a large range of variability. In addition, a visual comparison was carried out across 113 km2 in London using five OS 1:10,000 raster tiles. One hundred samples were taken in each tile to evaluate the differences between the OS centerline and the recorded OSM location. The average distances between the dataset locations are provided in the table in Figure 2.21:

Figure 2.21. Positional accuracy across five areas of London


Many of the locations are within 1-2 meters of the OS data (Figure 2.22, New Cross area); however, some had distances of up to 20 meters (Figure 2.23, Highgate area).

Figure 2.22. An example of good positional accuracy

Figure 2.23. An example of poor positional accuracy

To evaluate completeness, Haklay compared the lengths of the roads. A grid at a resolution of 1 km was created across England and OSM data was projected onto the British National Grid (the OS projection). For each cell, the following formula was calculated: (OSM roads length) - (Meridian roads length). The rest of the analysis was carried out through SQL queries, which added up the length of lines that were contained in or intersected the grid cells. The results showed the state of OSM completeness in 2008: the length of OSM data was just 69% of the OS data, and OS data provides a better, more comprehensive coverage. This difference is of course affected by factors such as crowdsourcing volunteers being more prevalent in larger cities and urban areas, and by sloppy digitization. In places where the participant was diligent and committed, the information quality was very good. The roads were also compared based on their attribute quality, and the results again showed that OS data was of higher quality in this respect (64.7% more detailed). Overall, the evaluation showed that OSM data was satisfactory in terms of its positional accuracy, but inconsistent in terms of completeness. Questions suggested for further investigation were: at which point does the information become useful? Is there a point beyond which the coverage is good enough? If the answer to these questions is yes, then OSM data may be more satisfactory than assessed.
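The per-cell completeness comparison can be sketched as follows. This minimal example, again using Shapely with toy road geometries, computes (OSM road length) - (reference road length) for each grid cell; in Haklay's study the equivalent calculation was run as SQL queries over national datasets, so the data and cell choices here are assumptions for illustration only.

from shapely.geometry import LineString, box

# Toy road datasets in a projected CRS (meters); real data would come from OSM and Meridian
osm_roads = [LineString([(50, 100), (950, 120)]), LineString([(100, 800), (400, 820)])]
ref_roads = [LineString([(50, 95), (950, 110)]), LineString([(100, 790), (900, 810)])]

CELL = 1000  # 1 km grid cells, as in Haklay's completeness analysis

def length_in_cell(roads, cell):
    # Sum the length of every road segment clipped to the cell
    return sum(road.intersection(cell).length for road in roads)

# Two example cells; a full analysis would tile the whole study area
for (x, y) in [(0, 0), (0, 1000)]:
    cell = box(x, y, x + CELL, y + CELL)
    diff = length_in_cell(osm_roads, cell) - length_in_cell(ref_roads, cell)
    print(f"cell ({x},{y}): OSM minus reference road length = {diff:.0f} m")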

Positional and attribute accuracy

Another research project at UNB (Sabone, 2009) investigated the extent to which GPS-enabled mobile devices commonly used by VGI contributors were compliant with Canadian Geospatial Data Infrastructure (CGDI) standards for positional and attribute accuracy. Based on criteria specified in the standards, 90% or more of the features in a dataset should be within 10 meters of the true position. Sabone used road, street and POI data from the GeoBase portal to compare with the VGI. The approach used by Goodchild and Hunter (1997), Hunter (1999) and Haklay of overlaying VGI data onto buffers was used to assess the percentage of linear VGI features falling within buffers of defined accuracy. The distances between points of VGI data and their corresponding features in the reference data were also measured manually. Street data from three different VGI datasets were compared using this approach (data downloaded from OpenStreetMap, collected from an iPhone, and collected from a Garmin eTrex GPS). The conclusion was that OSM data was the most accurate in terms of positional accuracy, followed by the eTrex and iPhone respectively. The accuracy of the OSM data may be a result of contributors using high-quality Yahoo imagery to digitize their features, or perhaps the peer-reviewing process of the OSM community reduced the number of errors present. Other methods of measuring data quality use overlay and selection techniques, which will be tested and reported on during the experiment phase and included in the final chapters.
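A minimal sketch of testing compliance with such a standard is shown below: it measures the distance between matched VGI and reference points and checks whether at least 90% fall within the 10 meter tolerance. The point pairs are toy values, not data from the study.

from math import hypot

# Hypothetical matched pairs: (VGI point, corresponding reference point), projected meters
pairs = [((10.0, 5.0), (11.2, 5.4)),
         ((200.0, 40.0), (203.0, 48.0)),
         ((55.0, 90.0), (54.1, 90.5))]

TOLERANCE = 10.0   # meters, from the accuracy standard
distances = [hypot(vx - rx, vy - ry) for (vx, vy), (rx, ry) in pairs]
share_ok = sum(d <= TOLERANCE for d in distances) / len(distances)

print(f"{share_ok:.0%} of points within {TOLERANCE} m; compliant: {share_ok >= 0.9}")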

2.6 Conclusion
Based on the review of existing tools, techniques and related work for crowdsourcing and guided participation of VGI, the following points have been listed and organized by category, in order that they could be used to form a prototype design and methodology which was tested in the case study experiments.

2.6.1 Collaboration

There is a need for collaboration between organizations handling satellite imagery and ground/VGI mappers.

Damage assessments/classifications must incorporate the street view, eyewitness reports or field validation (the local perspective): Damage assessment maps commonly underestimate the building damage present due to a lack of lateral and ground perspective, which can provide insight into the level of structural damage as well.

Image analysts and field validators should use the same definitions and standards for validation: Without cooperation between image analysts and field damage validators, damage assessment maps will only contain a one-dimensional and inaccurate view of the damages present on the ground. In order to cooperate, both actors need to use the same definitions of damage levels.

The local perspective must satisfy informational needs: While VGI from participants on the ground in the study area is valuable because it contains local knowledge that no satellite interpretation can match, the information given must satisfy the information needs of the professionals who are gathering it. For instance, in order for search and rescue teams in Haiti to prioritize urgent reports and help those in most urgent need, they needed reports to be submitted with the most appropriate category. Upon inspecting the original 2010 Ushahidi reports, it was clear that many of the reports submitted were given categories that were not applicable to that specific case, but were chosen in order to get attention. Many people submitting reports enquiring about missing persons gave their report every possible category, including medical emergency and damaged structure. This causes time-consuming sorting of accurate vs. inaccurate reports in a time-critical situation where precision and accuracy are helpful to emergency responders.

All contributions should be verified (and then have this status displayed afterwards) by a professional, in order to gain trust in the data: Inspection of the 2010 Ushahidi Haiti data revealed that not all reports had been verified. Although they had been automatically approved due to the urgency of the situation, not all reports that had been approved were also verified. This made it challenging for professional actors to know which reports were trustworthy.

Details of how inputs are approved/verified should be made easily available for all: Neither OpenStreetMap nor the Geo-Wiki project provides details to participants of how the information received will be verified and/or approved. While Ushahidi has published some details about how this is done, this information is not easily discoverable and participants must search the Ushahidi blog to find out.

The more participants who validate the same feature/area, the better: None of the VGI tools investigated guide participants to give information about a feature or area that has already been mapped, even though repeated validations would increase confidence in the data.

Include expert help to ensure mapping skills are transferred to the public through training: For instance, experts can offer mapmaking lessons to ensure that data quality standards are met during the collection and creation of VGI data.

2.6.2 Site functionality

If there is a wiki, blog or forum associated with the map, it should be accessible directly from the same area/page as the map via a link; these must be easily discoverable: This is not the case with OpenStreetMap, where there is no link to the wiki from the main map. Nor is it possible from the main reporting sites in Ushahidi deployments; participants must first find and then search the Ushahidi website, where there is a blog, a wiki and a user community.

Urgent reports should appear on the map/to others in a timely manner: Participants should be given an indication of the possible length of time it may take for their information to be verified or acted upon.

It should be possible to filter the map display based on whether the inputs have been verified or are unverified: While it is possible to do this in Ushahidi deployments, it is not possible in the Geo-Wiki project.
2.6.3 Guidance

There needs to be guidance for every form of input, whether it is GPS points, geocoded images or videos, creating new features, or editing existing features, and whether by email, SMS, report or direct input online: While OpenStreetMap offers guidance for all of the editing possibilities it offers, Ushahidi only offers some guidance about what type of information to provide. Not only is this guidance only found after searching outside of the main site, but there is no guidance offered for how to submit information by email or SMS, even though these are options for submitting information.

Like OSM, crowdsourcing initiatives should have a step-by-step guide for all ways of collecting and/or contributing data: They should specify what data to add, give recommended examples, and provide information on how and when a contribution will be added to the map.


Both the contributor and the administrator should be prompted to specify the certainty of their input and give it a rating: While it is possible for Ushahidi deployment administrators to provide a certainty rating, the users do not, which allows them to submit information without any indication of accuracy.

All VGI submitted should contain some geographical information; all reports should be geo-located (in order to be actionable): According to OpenStreetMap Haiti volunteers, most of the VGI inputs could not be used because they did not contain geographical information. This is not only a waste of participant effort, but also of information which could be important. The geographical element should be a prerequisite of submitting the information.

Land use classifications should be illustrated with examples of what each class could look like: These examples should also include some written details for clarification. The Geo-wiki project does not give any help or examples of the classification terms it asks participants to use, so participants may submit information that they are unsure about, which is not very helpful.

High-resolution imagery and field photographs should always be available as an option to turn on for help identifying an area: This is not the case with Ushahidi, where only three map layers of the deployer's choice are shown. Even if these are the best images available to choose from, the Ushahidi map interface is so small that it is challenging for participants to make a judgment, particularly if the imagery provided is not high-resolution imagery.

Participants should be aware of the rules for data collection and description: In order for participants to be aware of what is expected in the description of the data they are submitting, this information must be given to them. While OSM does give guidance in this area, Ushahidi does not.

Participants should try to fulfill the elements of spatial data quality for all contributions: While this could be a built-in checklist as part of the submission process, it would be more helpful if participants were to receive specific guidance on how to submit data containing these elements. While all crowdsourcing projects expect spatially accurate data, none of the projects give guidance specifically helping participants to fulfill these expectations.

Participants should be shown examples of errors and how to fix them: None of the crowdsourcing tools investigated provides examples of errors or bad submissions.


Video tutorials are helpful: Of the VGI tools investigated, only OpenStreetMap provides these. Even so, the videos provided for beginners would not be easy for absolute beginners to follow: that is, they presuppose some prior mapping or map editing knowledge.

Information should not be too extensive, otherwise it may be daunting: There should be enough to create understanding, and an extra option (for more information, see here) for those who would like it. This is something that OpenStreetMap should consider, for while their help documentation for beginners is extensive, it is daunting even for non-beginners to try to retain all of the information given in these wiki pages.

Guiding information can be given in pop-ups or links that are directly tied to the area the participant is questioning (Wikipedia is a good example): This could be an improvement for both OpenStreetMap and Ushahidi, while Geo-Wiki does a little of this already. It would provide the information that the user needs, directly when and where they need it, rather than trying to educate with masses of overwhelming orientation information that the user will probably forget half of before they even get to the map.

There should be different levels of guidance for different participant user levels, beyond the categories beginner, experienced or expert: OpenStreetMap guidance for beginners is not actually easy enough for absolute mapping beginners to understand. It would be more appropriate if there were at least three categories: beginner, somewhat experienced and experienced, or at least a definition of what each of these statuses implies so that users can choose the information they feel is best suited to them. Ushahidi does not differentiate the information for its potential users; they are all provided the same information, which is not easy to find.

The site should include a Google Translate toolbar, so participants can select their language: Crowdsourcing tools which can be deployed anywhere in the world need to accommodate the language needs of the users. For instance, many of the 2010 Ushahidi Haiti reports were in Creole and needed to be translated before they could be understood and responded to. Configuring, or making it possible to configure, the tools in different languages would not only be very useful to participants, but would also make the tool more attractive to more users.

Instructions should let participants know what to expect at each stage of the process: While Geo-wiki does this well, this is an area where both OpenStreetMap and Ushahidi could improve.


- Use guidance towards a specific course of action: in order to obtain reports that contained accurate categories, the Ushahidi Haiti 2010 project should perhaps have provided specific guidance, such as examples, for each possible category. This would have helped participants to provide the correct category.

Based on these criteria (developed from factors missing in the use cases and areas of improvement in popular crowdsourcing tools) it was possible to develop an original methodology.


Chapter 3: Design and development of a methodology for improved guided participation


This chapter presents the design and development of an improved methodology for guiding participation in VGI applications. The methodology is based on offering different types of guidance for different user levels and backgrounds, as well as step-by-step instructions that lead participants effectively towards submitting the information that is desired. The idea is that participants should feel as if they have an expert standing beside them telling them how to do everything, with only a wiki guiding them. It is more explanatory and better tailored to user needs than existing methods, but above all it is much simpler. The methodology is suggested as a guideline for VGI application developers and deployers, and a prototype solution has also been designed to visualize the guidelines. The elements of the methodology which needed to be designed were: the expert interviews; the proposed method for improving guided participation; the prototype solution with different levels of guidance; and the crowdsourcing experiments for Oslo, Christchurch and Haiti, including instructions for participants on how to classify buildings, different levels of group participant guidance, how to test guided participation methods, and how to check for and assess factors such as positional and attribute accuracy, currency, completeness, consistency, lineage, and precision.

3.1 Strategies for implementing improvements


A methodology is a system of guidelines for solving a problem, with specific components such as tasks, methods, techniques and tools. In this case an improved methodology is outlined for use in crowdsourcing initiatives. Based on the investigation of existing tools and techniques for guiding participation in crowdsourcing projects, the conclusions in Section 2.6 indicate that an improved methodology needs to take into account the following: improved collaboration, improved site functionality and improved guidance of participation. These are discussed individually in terms of how they have been used as methods during the design of a prototype solution.

3.1.1 An improved collaboration

Based on the examples of existing tools, techniques and use cases, a few major elements in need of improvement were identified: collaboration, site functionality and guidance of participation. One of the main problems of crowdsourcing initiatives is that there is not enough collaboration between field and remote mappers. By asking field mappers and remote mappers to use the same definitions and standards for classifying information, collaboration already improves, as the two different data types can more easily be combined and shown on the same map. In addition, incorporating Street View allows remote mappers to take the same perspective as on-the-ground mappers or field validators.


However, perhaps the biggest problem with current crowdsourcing initiatives is that there is no collaboration between professionals and non-professionals, for instance between the organizations deploying the initiative (or that will benefit from using the data) and the mappers providing the VGI. The only way for organizations or professionals to meet their informational needs is to specify these needs directly to the participants and to narrow the focus area and submission possibilities, so that this specific information is more likely to be received, rather than a wide variety of inputs which are not helpful enough. One way to accomplish this is to provide specific examples of classifications and to illustrate the information required. For example, the study areas were defined and illustrated for each of the experiments in this research: downtown Oslo within the borders illustrated on the wiki (as shown in Figure 3.1); the Christchurch central business district as illustrated on the wiki; and central Port-au-Prince, later narrowed even further to just one grid cell, which was also illustrated on the wiki. This made it possible to obtain information about the specific study areas only, and also to potentially get multiple validations for the same features or areas (in order to confirm the quality of inputs). Even though participants were made aware that any inputs received for areas outside of the defined borders would not be useful to the study, a few reports were received for areas outside the defined areas. Nevertheless, defining the study area did focus the data results overall. A measure to further ensure that only study-area-specific data is received could be to clip the main map to the study area, so that there is no possibility to zoom or give inputs outside of that area.

Figure 3.1 Study area: downtown Oslo
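Such a spatial filter can also be applied programmatically when reports are processed. The sketch below is a minimal, hypothetical example using the shapely library; the polygon vertices, report values and field names are placeholders rather than the actual study-area boundary shown in Figure 3.1.

```python
# Minimal sketch: flag reports that fall outside the defined study area.
# The polygon below is a placeholder rectangle, not the real Oslo boundary.
from shapely.geometry import Point, Polygon

study_area = Polygon([
    (10.72, 59.90), (10.78, 59.90), (10.78, 59.92), (10.72, 59.92)  # (lon, lat)
])

def inside_study_area(lon, lat):
    """Return True if a submitted report location lies within the study area."""
    return study_area.contains(Point(lon, lat))

# Hypothetical reports: one inside and one outside the boundary
reports = [
    {"title": "Office building", "lon": 10.75, "lat": 59.91},
    {"title": "Roundabout",      "lon": 10.80, "lat": 59.93},
]
for report in reports:
    status = "keep" if inside_study_area(report["lon"], report["lat"]) else "outside study area"
    print(report["title"], "->", status)
```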


Better collaboration between inexperienced and more experienced mappers can be fostered by encouraging a process of peer review, where VGI contributions receive feedback from other users. While the site administrator/professional will approve and verify each contribution, as well as provide participants with a rating and/or correction, it would be even better if other participants were motivated to rate the credibility of the source and comment on the input. These ratings will be displayed on the contributions in order to provide trust in the information provided, and information about how contributions are approved and verified will also be provided on the site, so that the process is transparent. Not only is this element of transparency important for participants, but encouraging peer review creates more participant interaction and learning, and in turn a better feeling of collaboration and teamwork.

3.1.2 An improved site functionality

One of the other reasons that participants may find it difficult to participate in a crowdsourcing initiative (and therefore not have a good understanding of how to contribute properly) is that the crowdsourcing site does not have simple or effective enough functionality. For instance, help links and information should be easy to find and, if possible, reachable directly from the area where participants give their input, so that they are not forced to leave that page and search through other pages. One way to do this is to provide pop-ups that open as separate windows while leaving the input map or report open. A good approach is to configure the pop-ups so that they appear only when hovering over a "?" icon placed directly where participants might need guidance, for instance just after an input field. This concept of having only one input page and never having to leave it is useful not only for help links and guidance, but for participant input in general. If it is possible for the crowdsourcing deployer to create one map that is configured to allow participants to give inputs through pop-ups that open and accept information (i.e. reports) separately without leaving the main page, this would be ideal. While this is not shown in the prototype due to the limitations of the Crowdmap platform, it would be ideal for anyone who has enough web programming expertise to design this functionality.

3.1.3 An improved guidance of participation

When participants are guided towards a specific course of action, i.e. submitting a report classifying building use types in a specific area, it is more likely that the information needed by deployers will actually be received. For instance, once it is determined which factors are of the greatest importance in the data collected, these factors should not only be expected, but they must also be guided in a methodological way. In turn, there must be a way to assess each type of guidance given for each specific outcome wanted.


Table 3.1 lists some of the spatial accuracy factors that are most commonly sought after in VGI and how they can be guided, as well as how they can later be assessed.
Accuracy factor | Method of guidance | Method or questions for assessment
Positional/locational accuracy | Help to trace the right building; place a point on the right spot on the map | Goodchild & Hunter's buffer method; intersect
Attribute accuracy | Help to give the correct description of the feature's characteristics (correct spelling of the name and address of the place) | Are the attributes close to their true values?
Completeness | Help to fill out the form completely | Indicate the percentage of attribute completeness for each contribution and in total, compared to the reference data
Lineage | Help to describe how the information was obtained (i.e. by using satellite imagery or Google Street View) | Have details been provided? Which has more details?
Semantic | Help to provide the building with the correct category, i.e. with illustrated examples | Compare to reference data; check imagery
Thematic | Help to give information about building damage, not any other kind of information (repeat the purpose: to submit reports about buildings damaged by the 2010 earthquake) | Is this information fit for use, i.e. about buildings?

Table 3.1 How to guide van Oort's accuracy factors

When examples of the data wanted are provided and examples of what is useful vs. not useful are given, the needs become even clearer. However, the type of guidance must be different for each participant type or user level. It is not enough to simply classify participants either as beginners or experts. The guidance given must be tailored specifically to the user's needs, considering factors such as their computer literacy, mapping or GIS expertise, or familiarity with the subject matter. The result is that a deployer will not only be able to offer some guidance or no guidance, but guidance that is more helpful for the variety of users that will be contributing. An improvement to tailoring guidance to the user level is to define how many levels exist within each general user category, such as beginner (yellow), experienced user (orange) or expert (red), as one broad umbrella generally does not fit all users. This can be done by first considering what the possible computer literacy levels of the users will be, since this type of literacy is necessary in order for any user to participate.


Table 3.2 illustrates the level of computer literacy corresponding to each sort of guidance given.
Computer literacy level | Criteria
0 | No computer skills
1 | Basic computer skills: knowing how to power on the computer; being able to use a mouse to interact with elements on the screen; being able to use the computer keyboard; being able to shut down the computer properly after use
2 | Intermediate skills: functional knowledge of word processing; how to use e-mail; how to use the Internet; installing software; navigating a computer's filesystem
3 | Advanced skills: programming; understanding the problems of data security; use of a computer for scientific research; fixing software conflicts; repairing computer hardware

Table 3.2 Levels of participant computer literacy

While all of the potential participants who will contribute to the project will be either GIS users or non-GIS users, it is necessary to also quantify their level of experience using (or not using) GIS. Table 3.3 illustrates the various possible levels.
Mapping/expertise level | Criteria
0 | No GIS experience
1 | Associated with a GIS initiative
2 | 0+ years of GIS work
3 | 10+ years of GIS work
4 | 20+ years of GIS work
5 | 30+ years of GIS work

Table 3.3 Levels of participant GIS/mapping experience

The GIS user levels can be defined even further by specifying the GIS expertise of the user.


Table 3.4 defines these GIS expertise levels from 1 (GIS end-users) to 5 (GIS professional).
GIS user level | Criteria
1 | GIS data user (sometimes uses Google Maps or other online mapping tools). Data User: these are GIS end-users, who are concerned with querying and viewing GIS data, and perhaps with creating hardcopy output of GIS maps and associated information. Data Users do not create or modify data. While they may use GIS routinely in their day-to-day work, their primary job description is not GIS-oriented.
2 | Interested in GIS: has used GIS in studies / occasionally. Data Analyst: Data Analysts can be either GIS professionals or end-users. Like Data Users, they also query and view GIS data, and likely create hardcopy output using GIS. However, they tend to employ more sophisticated GIS methods, and create more complex and technically demanding maps than Data Users. They likely create and maintain GIS data for project-level use, but generally do not create and maintain data that is used beyond the scope of their immediate work group. Data Analysts may also provide support to other GIS users.
3 | Database/System Administrator: Database/System Administrators and other IT support personnel may or may not be familiar with GIS concepts and software functionality, yet have a need to deal with GIS from a technical standpoint (such as maintenance of an RDBMS containing a geodatabase, or administration of desktop computers loaded with GIS software) as part of the IT support services they provide.
4 | GIS Decision-maker: while Decision-makers may properly be assigned to any of the other user categories, this category includes those individuals who have little or no experience with GIS, but must still deal with GIS issues. Decision-makers may be supervisors or managers who have GIS personnel working for them, or they may be project managers whose projects have a GIS component.
5 | GIS professional (GIS consultant, course instructor). Data Maintainer (Steward): Data Maintainers are GIS professionals who are usually also Data Analysts, but are also stewards of enterprise data (that is, data used by multiple groups or agencies). They deal with data-related issues of conversion, quality assurance and control, and metadata. Data Maintainers likely provide support to other GIS users. Developer: these are GIS professionals who are responsible for the development of GIS scripts, programs, and applications that are typically used by others. They deal with software development issues such as requirements definition, design, testing, deployment, troubleshooting, and operation. Developers nearly always provide support to other GIS users.

Table 3.4 Levels of GIS expertise

Based on these user levels, crowdsourcing deployers can determine what kinds of guidance should be given for the three different levels: beginners, experienced GIS users/mappers, and experts.


As shown in Table 3.5, deployers may even choose to give participants more options than the three to choose from.
Guided participation level | Criteria
0 | In-person/Skype assistance
1 | Autofilled example answers in fields; look at a good example
2 | Illustrated examples - video
3 | Illustrated examples - written/descriptive step-by-step instructions (what information to give, how to give it)
4 | Help links / pop-ups
5 | Pre-disaster imagery, bookmarked damage sites
6 | General overview/description of what to do
7 | None

Table 3.5 Corresponding levels of guidance given

In the experimentation phase of the research (Chapter 4), three experiments are performed with these suggestions for improvement taken into account as the method used to guide participants. The methodology improves drastically from the first experiment to the third, with the resulting data quality being greatly improved. For instance, level 6 guidance is given to experts, with the option of a quick start (meaning as little guidance as possible and jumping straight into providing data); level 3 guidance is given to experienced GIS users in the form of general overviews; and level 1 guidance is provided to beginners, with extensive how-to information allowing them to see what to expect at every stage of the process, and many optional examples and video tutorials. While for this project only reports submitted from the Crowdmap site (whether by PC or mobile technology) were accepted, if other forms of input, such as SMS, e-mail or Twitter messages, are accepted, separate guidance must be offered for these as well. One of the most important things to guide is that participants give actionable geographical information: that is, they provide an address, coordinates, a drawing on the map, or all of the above. As participants at the beginner user level are usually not geographers, or are unaware of what actionable geographic information is, it is not enough to simply tell them about the elements of spatial data quality; participants must actually be guided to provide information that contains these elements. This can be done by providing tips, examples of good vs. bad information, helpful hints, etc.
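One way a deployer could operationalize this tailoring is to derive a guidance level (Table 3.5) from a participant's computer literacy (Table 3.2) and GIS/mapping experience (Table 3.3). The sketch below is only illustrative: the threshold values and the function name are assumptions, not part of the methodology itself.

```python
# Illustrative sketch: pick a guidance level (Table 3.5) from a participant's
# computer literacy (Table 3.2, 0-3) and GIS experience (Table 3.3, 0-5).
# The thresholds below are assumptions for demonstration only.

GUIDANCE = {
    0: "In-person/Skype assistance",
    1: "Autofilled example answers in fields; look at a good example",
    2: "Illustrated examples - video",
    3: "Illustrated examples - written step-by-step instructions",
    4: "Help links / pop-ups",
    5: "Pre-disaster imagery, bookmarked damage sites",
    6: "General overview/description of what to do",
    7: "None",
}

def guidance_level(computer_literacy, gis_experience):
    """The less experience a participant has, the more guidance they receive."""
    if computer_literacy <= 1:      # little or no computer skills
        return 0                    # in-person assistance
    if gis_experience == 0:         # computer-literate mapping beginner
        return 3                    # written step-by-step instructions
    if gis_experience <= 2:         # some GIS exposure
        return 4                    # help links / pop-ups
    return 6                        # experts: overview plus a quick start

print(GUIDANCE[guidance_level(computer_literacy=2, gis_experience=0)])
```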

3.2 A proposed prototype solution


A prototype solution example has been made in the form of a VGI web-mapping application in order to demonstrate that the above suggestions can be implemented. Here the various elements of its design are discussed.

3.2.1 Web design

The interactive part of the proposed methodology consists of a wiki with an OpenStreetMap-type interface that contains links or pop-ups with all of the instructions necessary to be informed properly about how to contribute to the map. The map is provided by the Crowdmap platform, and users have the opportunity to submit data by submitting a report about individual features they see and identify on the map. The Crowdmap platform provides the map as a mashup of existing services, such as Google or Bing Maps. Ideally, one application that integrates both the crowdsourcing and guidance components would be created. As an alternative, this experiment handles these separately, but through links found within the same portal. The Ushahidi Crowdmap platform was chosen as the VGI collection platform because it is easy to set up and use, as well as a straightforward way to show a map. Another positive factor is that it can be accessed from a mobile device, which is helpful in a crowdsourcing activity. Additionally, it is interesting to deploy one of the applications most commonly used for crowdsourcing projects and to see what its strengths and weaknesses are in practice. Most web apps do not allow for tracing, so by combining the required layers and using Crowdmap this has been made possible. While ideally the Ushahidi Crowdmap platform would be configured and deployed as the main user interface for participants to submit VGI for the project, the Crowdmap platform limits the extent of personalization and functionality that a site designer can control. Thus a Wikispaces site has been configured and is used as a supplementary site to the Crowdmap deployment. As there should ideally not be too much information on the Crowdmap site, and its customization is limited, using the wiki allows for extra flexibility in terms of design and personalization; most importantly, it serves as the main information site for the project, where extra information can be found by participants who need it. While the Crowdmap site contains all of the most basic information needed for VGI participants to provide information, the wiki provides an arena for the more in-depth information and instructions to be presented to specific user levels separately, i.e. for beginners, for experienced GIS users and for experts. Different users have different requirements for instructions, thus an arena for separate instructions is important. The map options displayed are those commonly used for humanitarian or land use mapping. They are thematic maps: visual representations of data from the areas of Oslo, Christchurch, and Port-au-Prince. The primary aim is communication of a specific topic and themes of information, namely displaying building classifications on a map. The symbology used draws the reader's attention to important aspects of the distribution being mapped. These cartographic representations can be a powerful means of converting complex datasets into actionable information, allowing busy decision makers to quickly grasp overall trends and take appropriate action.

3.2.2 Functionality and usability

Studies by Peng (2001) and Carver et al. (2001) have outlined recommendations for a successful application, which include elements such as allowing the user to evaluate geographic data; participate in some type of forum; provide data; submit information; and see the results, in order to build trust and create a sense of contribution and engagement.


Participants can use the prototype wiki to read instructions, see examples, watch videos and send an e-mail with questions or comments. It is also a portal which provides access to the crowdsourcing sites that have been created to gather VGI about building classifications in Oslo, Christchurch and Haiti. (This is not done simultaneously, but in sequence, so the current wiki only contains guidance for the last experiment.) In terms of usability the wiki is not unlike most of the other wikis that exist in crowdsourcing initiatives, for example those used by Ushahidi and OpenStreetMap, providing a general overview along with more specific help tips, contact details and links. However, unlike these other sites, this wiki does not include any kind of forum. Since this is a short-term project which does not have an ongoing purpose and a group of contributors who can become members, a forum is unnecessary. The Crowdmap platform allows participants to read about the project, view the study area on a map, view reports submitted and comment on or rate them, and to submit reports themselves. They can submit a report either directly from the site or from their mobile. From their mobile they will only be able to submit a title, description and a location that they describe verbally, while using the Ushahidi app they will be able to provide more information and can view reports, but will have to submit the report from the actual location they would like to report about, since there is no possibility for them to specify the building on the map. These mobile options have much less functionality but can still have a high degree of usability, if the participant is guided to submit accurate geographical information. The introductory web pages instruct all participants on the background of the study, what is involved and how to contribute by submitting a report. Screenshots are used to highlight the functionality of the tool. After the introduction, participation is guided differently based on the user's level of expertise: whether they are a beginner, an experienced GIS user or an expert. Beginners are guided through every step of the participation process and given examples of what information to submit, how to obtain that information, etc. They also have video tutorials to use as an example. All of the guidance offered to beginners is also summarized on one central help page, which answers commonly asked questions and can be viewed by any other users who are experiencing trouble, including experienced and expert users. Otherwise, experienced and expert users are only given a quick overview of the goal and how to participate, as well as a quick-start option, rather than receiving step-by-step instructions for every part of the participation. As the tools are more familiar to these users, it is expected that they will only need guidance if they are unsure of something, and at that point they are independent and familiar enough to seek the help themselves. After participants have finished contributing to the project, they are asked to complete a short survey about the usability and ease of use of the tool and the guidance given.

3.2.3 Monitoring data edits and inputs

Each individual report contributed is listed in the report summary visible to all participants on the Crowdmap site; however, the reports do not appear on the map until they have been verified. Once they have been verified, reports are displayed with (and can be filtered by) the status approved or not approved. For instance, information could have been verified but still be incorrect, and therefore not approved. The sites were monitored daily and reports were handled on the same day they were submitted.
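Daily monitoring of incoming reports can also be partly automated. The sketch below assumes the deployment exposes the classic Ushahidi 2.x REST API (an /api endpoint with task=reports) and that reports carry an incidentverified flag; the deployment URL, endpoint and field names are assumptions that would need to be checked against the platform version actually in use.

```python
# Sketch: poll a Crowdmap deployment for reports that still await verification.
# Endpoint and field names follow the Ushahidi 2.x REST API as documented at the
# time; both the URL and the keys below are assumptions to verify in practice.
import requests

DEPLOYMENT = "https://example.crowdmap.com"  # placeholder deployment URL

def unverified_reports(deployment=DEPLOYMENT):
    response = requests.get(
        f"{deployment}/api",
        params={"task": "reports", "resp": "json"},
        timeout=30,
    )
    response.raise_for_status()
    incidents = response.json().get("payload", {}).get("incidents", [])
    # Each list item is expected to wrap the report under an "incident" key.
    return [
        item["incident"]
        for item in incidents
        if str(item.get("incident", {}).get("incidentverified", "0")) == "0"
    ]

if __name__ == "__main__":
    for report in unverified_reports():
        print(report.get("incidenttitle"), report.get("incidentdate"))
```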

3.2.4 Assessing data quality

The quality of the data will be assessed using the following methods found in the literature research:
- by considering van Oort's elements of accuracy;
- by using Goodchild & Hunter's buffer method, if applicable;
- by performing a visual assessment of the attribute data;
- by using simple GIS analysis tools, such as proximity (buffers) and overlay (intersect, union), to analyze spatial relationships and patterns;
- by observing VGI clusters;
- by comparing current results to the 2010 and 2011 VGI to check for improvement;
- by comparing current results to the 2010 and 2011 reference data;
- by comparing the results obtained when different kinds of guidance are given, for instance no guidance vs. the guidance used in 2010 or 2011 vs. using the improved methodology.
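As an illustration of how two of these checks might be scripted, the sketch below implements a Goodchild & Hunter-style buffer comparison and a simple attribute-completeness percentage. It assumes the contributed and reference geometries have been exported to shapely objects in a projected (metric) coordinate system; the required-field list and the 5 m buffer width are placeholder choices, not values prescribed by the thesis.

```python
# Sketch of two assessment steps: a Goodchild & Hunter-style buffer comparison
# and an attribute-completeness percentage. Geometries are assumed to be
# shapely objects in a projected (metric) coordinate system; the field list and
# buffer width are placeholder choices.
from shapely.geometry import LineString

def buffer_overlap(vgi_line, reference_line, width=5.0):
    """Share of the contributed geometry lying within a buffer of the
    reference feature (1.0 means the whole trace falls inside the buffer)."""
    buffer_zone = reference_line.buffer(width)
    return vgi_line.intersection(buffer_zone).length / vgi_line.length

REQUIRED_FIELDS = ["title", "description", "category", "location", "certainty"]

def attribute_completeness(report):
    """Percentage of required report fields that were actually filled in."""
    filled = sum(1 for field in REQUIRED_FIELDS if report.get(field))
    return 100.0 * filled / len(REQUIRED_FIELDS)

# Example: a traced building edge compared with the reference footprint edge,
# and a report missing its certainty rating (80 % attribute completeness).
vgi = LineString([(0, 0), (10, 0.5)])
reference = LineString([(0, 0), (10, 0)])
print(round(buffer_overlap(vgi, reference), 2))
print(attribute_completeness({"title": "Collapsed school", "description": "Roof down",
                              "category": "Building damaged", "location": "18.54,-72.34"}))
```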


Chapter 4: Experimentation and Analysis


The experimentation comprises the qualitative part of the research, while the analysis comprises the quantitative part of the research.

4.1 Experimentation
Empirical (observational) design and applied research are carried out through experimentation. The purpose of the experimentation is to use and test the effectiveness of the proposed method for guided participation that has been developed during this research. First, using a local environment, an experiment was built around participants contributing VGI about building usage types in Oslo. Participants were then led to provide geographic information about buildings damaged by earthquakes in both Christchurch and Haiti. Qualitative and quantitative assessments of VGI participants and the data they submitted were performed during the experimental tests. Table 4.1 provides an overview of how the use cases relate to and differ from the experiments, with the variables listed in the rows. In order to make a direct comparison and connection with the methodologies used in actual use case scenarios, the use cases are also included in the table.
Columns: Use case 1: Oslo; Use case 2: Christchurch; Use case 3: Haiti; Experiment 1: Oslo; Experiment 2: Christchurch; Experiment 3.A: Haiti; Experiment 3.B: Haiti.
Rows (variables): building use type VGI; grade of building damage; remote mapping guidance; wiki guidance; participant familiarity; mapping party; tested vs. no guidance; defined study area; defined grid within the study area; suggested order for scanning the grid; in-person mapping guidance; observe user mapping behavior patterns; influence user mapping behavior; improved method for guiding.

Table 4.1 How the experiments relate to the use cases

The original improved methodology proposed was implemented in a prototype solution in order to illustrate how these guidelines can be provided in the right way (simply, comprehensively and at the right time). Participants used the prototype to submit VGI during the experiments, which was later analyzed using ArcGIS. The effectiveness of the system and the improvement of the VGI are evaluated based on factors found in the literature review and investigations, but also based on user satisfaction surveys. Factors measured during the experiments included: timing - how much mapping, and of what quality, could be done within a certain time limit; guidance - the extent of guidance used and the differences observed in the results; familiarity with the study area; and patterns in user behavior, especially as affected by outside influences. After testing the different approaches, a choice was made as to what was least and most effective and should be included in or excluded from the proposed methodology and prototype. The final experiment presents the most effective approach.

4.2 Experiment 1: Mapping building usage types in Oslo


Lessons learned from the participatory guidance techniques of the use cases were formed into an original methodology that was tested by collecting VGI about Oslo. While there have not been any disasters in Oslo that would be directly comparable to the disasters in Haiti or New Zealand, there are certainly opportunities to use VGI for land use planning.

4.2.1 Design

Purpose: volunteer participants of varying levels of expertise are given a task with varying levels of guidance, and the differences are measured and analyzed. They are asked to map various attribute information of buildings in Oslo.

Technology used: a wiki is used as the main platform to introduce and explain the project to participants. It is not enough to have a map: participants must also know what the purpose of using it is, how to use it and what to do. Many Web 2.0 sites use some kind of wiki and/or blog format, so this is already a familiar format to participants. Figure 4.1 shows the layout of the wiki. A full overview of all wiki pages is included as an appendix.


Figure 4.1 Oslo experiment wiki layout

4.2.2 Participants

With the belief that it is possible to reinforce control of the production chain by establishing a standardized data creation method and by working with a limited number of well-trained volunteers, 55 target people were sent an e-mail invitation to participate: 9 family members, all beginners; 14 friends, 8 of them beginners, 5 of them experienced mappers or GIS users and 1 an expert; and 38 work colleagues, 5 of them beginners, 11 of them experienced GIS users and 22 experts. Each group was sent a separate e-mail, either in English or Norwegian, consisting of the following:

Dear ______,

I am specifically contacting you because I need test persons for my MSc thesis project and I believe that you would give an excellent contribution based on your background. Participating may take anywhere from a few minutes to many hours, depending on how much you would like to contribute! Every little contribution helps, so I appreciate 1-2 classifications, but I also appreciate more if you have the time to do so! Thus the time required for participating varies hugely depending on how much help you decide to view and how many reports you would like to submit. For experienced GIS users this may go quite quickly as these are all techniques that you are familiar with.


You can go to this site to start, which is the home site of my project: http://improvinggeocrowdsourcing.wikispaces.com/ - hopefully it explains everything that you will need to know. Participation consists of 2 different experiments:
1. Experiment 1: The Oslo experiment - you will need the password vgitestoslo to participate.
2. Experiment 2: The re-mapping of earthquake damage in Christchurch, New Zealand - password: vgitestchristchurch

After you have finished participating, could you please complete the following short survey and send it back to me as a reply to this email? Thank you.

Survey
1. How would you classify your experience using computers? a) None b) A little c) Experienced d) Advanced
2. What is your experience using GIS? a) None b) A little c) Experienced d) Advanced
3. How difficult did you find it to participate? a) Very easy b) Easy c) Ok d) Slightly difficult e) Difficult f) Very difficult
4. Please elaborate more about why you found it so easy or difficult here.
5. What information, pages or sites of the project did you use to help you? Please list them here.
6. What did you find the most helpful?
7. Did you find that you had enough guidance for your user level (beginner, experienced GIS user, expert)? If not, did you use other help materials, and which?
8. How would you describe the fitness for use of the information you submitted? I.e. will the information you submitted be helpful for the purposes of: a) making improvements to the buildings of downtown Oslo? b) mapping damages in Christchurch for emergency relief?
9. Do you think there were any outside factors that influenced your performance? Please list those here.
10. Did you pay attention to the accuracy factors mentioned on the site?
11. How could this experiment improve or become better?

The sooner that you are able to contribute to the project, the better. Thanks so much for your help, I really appreciate it!

4.2.3 Methodology: Participant instructions and guidance

One invitation was sent to all of the potential participants, asking them to participate in Experiment 1 and Experiment 2. Since these experiments were done remotely, the wiki needed to be informative. Instructions were published in English, but there was also a translated version for Norwegian participants. Altogether, there were 11 main pages on the wiki. First, a welcome page explained the steps involved in participating, especially the fact that participating involved contributing to two different experiments: one for Oslo and one for Christchurch. Then an introduction page provided participants with the background for the experiment, the goals and purpose of the project, examples of other similar projects they may already be familiar with, and the geographical terminology used; the use of a wiki and Crowdmap sites was also explained. A getting started page reiterated the fact that two different sites would be used, mentioned the help page for all users, and explained the badges to be awarded to participants. The experiment was explained in the same way to all participants: the goal of the experiment was briefly explained in one sentence, along with an illustration of the study area limits. However, this page also had links for different user levels to follow ("If you are a beginner/experienced GIS user, please continue reading here"; "If you are an expert, you can continue reading here, or begin with a quick start"). The quick start led expert users directly to the Crowdmap site, where they could begin submitting reports immediately, without any instruction. There was also a link to a video tutorial for the Oslo experiment for anyone wishing to learn more. From the getting started page onwards, participants were led to contribute information through different forms of guidance, using separate instructions and techniques. Beginners were led to a page with instructions tailored specifically for them. The text used was more descriptive and explanatory than the other instructions, using a simple vocabulary. Everything was explained, as it was assumed that these participants had no prior knowledge of this type of work. An explanation of why they were classified as beginners was offered; an illustrated preview of what they could expect to see when they clicked the Crowdmap link was shown and the interface explained; a link to the Crowdmap site, which would open as a separate window, was provided; a link to a list of tips for providing accurate geographical information was provided, as well as a link to more information about how reports would be approved; an illustrated example of what the Crowdmap report looks like and what to fill out for each field was provided; and extra examples of building classification in Oslo were provided. Experienced GIS users were led to a page explaining briefly the goal of the experiment, as well as the same preview of what a report looks like and what information to fill in that the beginners saw; links were provided to the list of tips for submitting accurate information and to how the reports are approved, but experienced users were also encouraged to take part in the process of peer review, that is, leaving a comment, correction or credibility rating on inputs that others have given. Experts who chose to keep reading about the experiment were led to a page where the goal of the experiment was explained in one sentence, links were provided to examples, tips and more information, as well as the same illustration of how a report looks and should be filled out; experts were also encouraged to participate by providing a peer review of the inputs of others. After contributing to the experiment, all participants were led from their respective instructional pages to a final communal page where they could copy the survey text, paste it into an e-mail, complete the answers and submit. There were eleven questions about user background, difficulty experienced and which guidance was most helpful.

4.2.4 Results

Nine people contributed thirteen reports of building type classifications for Oslo. Their details are outlined below in Table 4.2.
Participant | User level | Familiarity with topic | Domain expertise
Torleiv | Beginner | Lives in Oslo; not familiar with disaster mapping or Christchurch | Environmental risk consultant; has used a little GIS
Anne | Beginner | Raised in Oslo; not familiar with disaster mapping or Christchurch | Guidance counselor; some computer literacy; no experience using GIS
Torkild | Beginner | Raised in Oslo; not familiar with disaster mapping or Christchurch | Elementary school teacher; computer literacy; no experience using GIS
J.P. | Beginner | No familiarity with Oslo; not familiar with disaster mapping or Christchurch | Antiques dealer; computer literacy; no experience using GIS
Martijn | Experienced | No familiarity with Oslo; not familiar with Christchurch | MSc GIMA student; computer literacy; 1+ years using GIS
Ove | Experienced | Lives in Oslo; not familiar with Christchurch | GIS consultant; advanced computer literacy; 4+ years using GIS
Stian C. | Experienced | Lives in Oslo; not familiar with disaster mapping or Christchurch | GIS consultant; advanced computer literacy; 6+ years using GIS
Gaute | Expert | Lives in Oslo; familiar with VGI and data collection | GIS consultant; advanced computer literacy; 25+ years using GIS
Unknown | Unknown | Unknown | Unknown

Table 4.2 Experiment 1 participants


Two of the reports had errors. The first error was that one participant submitted information about a roundabout in Oslo. Although it was clearly stated that participants should only submit building type information, perhaps the category "other" misled this participant into thinking that other types of information could also be submitted, when "other" meant other types of building category that were not already listed. The error is surprising considering that it came from an experienced GIS user, but this user also mentioned needing to read the beginner instructions to understand what information to give, which shows not only that guiding participants is important, but that giving the right kind of guidance for each user level is even more important. The second error was that a non-Norwegian participant provided accurate attribute information for a building in Oslo, but placed a marker on the wrong spot on the map, providing the wrong coordinates for the location. This is hardly surprising, considering the user had no familiarity with Oslo; however, the guidance given on using the different map layers could have been better. These errors are highlighted in Figure 4.2, showing the attribute data for the Oslo experiment.

Figure 4.2 Oslo VGI attribute table

Feedback from surveys

While most of the participants who responded to the survey classified themselves as having experienced or advanced computer literacy and GIS expertise, only half of them found it easy to participate and were able to submit satisfactory information. The participants who found it challenging thought it was difficult to give information about places totally unfamiliar to them, in which case the guidance given was not good enough to reassure them that they too could submit information about this area. Most participants found the differentiated wiki instructions, and in particular the instructions about how to complete the reports, most helpful, while all other guidance was ignored and outside technologies such as Google Maps and Google Earth were used. In addition, all participants stated that the pre- and post-disaster imagery provided was essential in order for them to make a classification. On the other hand, as most participants cited time as the greatest factor influencing their performance, it is likely that participants did not bother to look at all of the guidance information available. However, this may be an indication that too much guidance was given - so much that participants could not be bothered to look at all of it. Fifty-five participants, varying from beginner to experienced mapper to expert, were contacted by e-mail and invited to participate. Thirteen reports were received for Oslo.

4.3 Experiment 2: Re-mapping earthquake-damaged buildings in Christchurch, New Zealand


4.3.1 Methodology

One e-mail had been sent out to 55 potential participants inviting them to participate in both Experiment 1 and Experiment 2. The second part of the experiment involved mapping earthquake-damaged buildings in Christchurch. From the Experiment 1 instructions, all participants were led back to a communal page explaining the goal of Experiment 2. They were told that they would attempt to map Christchurch building damages as if the earthquake had just happened; however, there was no way to control or observe this. Background information about why and how VGI is usually collected in these types of disaster situations was given, then links directing participants to instructions made for their user level were provided. This time beginners and experienced GIS users were directed to the same instructions, while experts were directed to further instructions or another quick-start option (leading directly to the Crowdmap site). Links to a video tutorial about how to contribute to the experiment were provided for anyone wanting to use this extra help. Beginners and experienced GIS users were led to the same page, where the goal of the experiment was briefly explained, a link to the Crowdmap site provided, a preview of the Crowdmap report shown along with instructions for how to fill out each field in the report, the study area limits shown, illustrated examples of damage classes provided, as well as extra examples of building damage classifications, and an explanation of how satellite imagery is used by volunteers to map disaster damage. A written and illustrated example of how to identify damage from satellite imagery was also provided. Experts who chose to read more instructions were led to a page explaining briefly the goal of the experiment, as well as the same descriptive and illustrated example of what information to provide for the reports, and links to further examples.


After contributing to this second part of the experiment, all participants were led from their respective instructional pages to a final communal page where they could copy the survey text, paste it into an e-mail, complete the answers and submit. There were eleven questions about user background, difficulty experienced and which guidance was most helpful.

4.3.2 Results

Three people contributed five reports of earthquake building damages for Christchurch. Their details are outlined below in Table 4.3:
Participant | User level | Familiarity with topic | Domain expertise
Jelle | Experienced | Not familiar with Christchurch; familiar with VGI | MSc GIMA student; computer literacy; 1+ years using GIS
Rene | Experienced | Not familiar with Christchurch; familiar with VGI, has studied it in depth | MSc GIMA student; computer literacy; 3+ years using GIS
Gaute | Expert | Not familiar with Christchurch; familiar with VGI | GIS consultant; advanced computer literacy; 25+ years using GIS

Table 4.3 Christchurch experiment participants

In contrast with the first experiment, most participants did not feel confident about volunteering information about damaged buildings in Christchurch, New Zealand; only three people submitted information. This may be explained by several factors: most participants were Norwegian and felt that they could only contribute information about areas that were familiar to them, but it was probably also too demanding to ask participants to contribute to two different experiments, which may have seemed too complicated or even impossible to them. Figure 4.3 shows the attribute data for the reports submitted about Christchurch building damages.

Figure 4.3 Christchurch VGI attribute table

Feedback from surveys

One experienced user commented that there was not enough guidance for his user level and that he needed to look at the beginner instructions in order to understand how to contribute. This was obviously still confusing, as the user did not submit correct information and their report was not approved. Most participants found the differentiated wiki instructions, and in particular the instructions about how to complete the reports, most helpful, while all other guidance was ignored and outside technologies such as Google Maps and Google Earth were used. On the other hand, as most participants cited time as the greatest factor influencing their performance, it is likely that participants did not bother to look at all of the guidance information available. However, this may be an indication that too much guidance was given - so much that participants could not be bothered to look at all of it.

Design changes based on experiment results

Fifty-five participants, varying from beginner to experienced mapper to expert, were contacted by e-mail and invited to participate, yet only five reports were received for Christchurch. Although the guidance given was developed based on factors which were missing in the 2011 crisis, was it good enough? The conclusion was no, and so the entire wiki was re-worded and re-made, along with a new experiment and a more methodological way of giving guidance, including quantifying user levels beyond beginner, experienced and expert, tailoring the types of guidance given based on these quantified levels, and simpler (yet still effective) guidance.

4.5 Experiment 3.a mapping party: Re-enactment of Haiti damage mapping


4.5.1 Methodology

Rather than inviting people to participate by e-mail and remotely, a group of five volunteer participants was invited to a (mock) emergency meeting at the United Nations Office for the Coordination of Humanitarian Affairs (UNOCHA) and briefed on the situation at hand, as though the date were January 24, 2010: "Welcome to this urgent UNOCHA volunteer meeting. On January 12th a magnitude 7.0 earthquake struck a town 25 km west of Port-au-Prince, Haiti's capital city. At least 52 aftershocks measuring 4.5 and above have been recorded. 316,000 people have died; 300,000 have been injured; 1,000,000 people are homeless; and 280,000 homes and residential buildings have collapsed or have been severely damaged. We already have a team of mappers on the ground that is gathering information, but we need your help to assess building damages from the satellite imagery as well, in case they are not able to map all of the damage." Thus participants (listed in Table 4.4) were led to re-enact the mapping of earthquake-damaged buildings in Port-au-Prince, Haiti in person, in an official setting.
Participant | User level | Familiarity with topic | Domain expertise
Torleiv | Beginner | Not familiar with Haiti; a little familiar with damage mapping | Environmental risk consultant; computer literacy; a little GIS/mapping experience
Stefano | Beginner | Not familiar with Haiti; not familiar with damage mapping | Trainee, Italian embassy in Oslo; political scientist background; no GIS experience
Henning | Beginner | Not familiar with Haiti; not familiar with damage mapping | IT consultant, systems developer; no GIS experience; familiar with data management, IT and map development
Erlend | Beginner | Not familiar with Haiti; not familiar with damage mapping | BSc student; no GIS experience; familiar with free online mapping tools
Thomas | Beginner | Not familiar with Haiti; not familiar with damage mapping | Military officer; some GIS experience from military field operations; familiar with using digital and non-digital maps to pinpoint locations

Table 4.4 Experiment 3 participants



4.5.2 Test 1: no guidance

Methodology

Due to the multitude of different actors, methods and tools used separately and in combination during the 2010 Haiti earthquake disaster, it was impossible to reproduce exactly the same methodology for guidance as given in the Haiti 2010 mapping. It is important to keep in mind that the Ushahidi Haiti project aimed to crowdsource information from local Haitians on the ground in the affected crisis area about trapped people and urgent, time-critical information, not from remote volunteers, and it did not concentrate solely on damaged buildings. "The louder the voices from the ground, the better the response will be" and "access to accurate and timely information from the ground during post-crisis response periods will enable humanitarian responders to act more efficiently" were the philosophies of Ushahidi at the time, and the Haiti deployment became one of the defining moments for the organization. However, the same level of guidance as used by the Ushahidi team in 2010 could be reproduced, as it was effectively none, apart from the help options on the forum and blog, which people on the ground in Haiti would not have had time to use. Experiment participants were told: "You will map and report as many damaged buildings as possible, using the site www.haitidamage.crowdmap.com. During this first mapping test, you will not receive any guidance. Are you confident in participating?" All but one optimistically answered along the lines of "Yes, fine" or "no idea yet". The one not feeling confident had also participated in the earlier Oslo/Christchurch experiment, which he had found challenging and impossible to contribute fully to. Start time: 20:10; end time: 21:00.

Results

The five participants had a lot of important questions, such as: Where is the damage? What do the classifications mean? How can we mark the map? How can we find damage just by looking? One participant felt the need to write notes on paper, and after a few minutes a participant who had said they felt confident in beginning admitted that they were not able to contribute: they had no idea what to do. After almost one hour, thirteen reports had been submitted, but only a few participants felt confident enough to submit anything, and those who did were very uncertain.

Feedback

Participants found the experiment stressful and extremely difficult. They were unsure about what they observed in the imagery because they had nothing to compare it with, nor any information about what damage should look like. They also had no idea what else to write in the description other than "damaged building in Haiti" and had no idea how to use the report tools or why their report sometimes could not be submitted because of errors. While participants had many software-specific criticisms, such as the constant zooming in required being a nuisance, most of the stress was caused by having no guidance whatsoever, and they did not find the Crowdmap platform software to be intuitive.
4.5.3 Test 2: Guidance using a wiki

Still re-enacting that the earthquake had just happened in Haiti, participants were told: "The website www.improvinggeo-crowdsourcing.wikispaces.com has been set up for you to report on damages from satellite images, with instructions for how to do so. Although all of the instructions are there, I will also be here to offer in-person assistance. The goal is still to map as many damaged buildings in Port-au-Prince as possible." Most participants felt more confident about participating with instructions and more guidance. Another version of the wiki was created especially for participants of the mapping party, using the following methodology.

Methodology

In order to assess what guidance could have been helpful in the use case studies, a link was made between guided participation strategies and how they influence each type of data quality, by developing methods for guiding and for assessing each type of guidance. For instance, it has been stated that van Oort's accuracy factors are good factors by which to measure the quality of the VGI given, but specific methods needed to be defined for how to guide participants to submit information with these kinds of accuracy, and specific methods for assessing these also had to be defined. While up until this point user levels had simply been defined as beginner, experienced or expert, the user levels were now elaborated, and the specific methodology (not just "guidance, little guidance, no guidance") developed in Chapter 3 (Section 3.1.3) was tested in this experiment. The participants' computer literacy (all but one participant, who was advanced, had orange intermediate skills) and GIS and/or mapping experience (one participant had mapping/GIS level 2 skills, all others had none, thus yellow beginner skills) were considered. The wiki was adjusted so that only the guidance designed specifically for the experiment, based on the relevant user levels, was shown. Although level 7 of guided participation (none) was tested, this was with the intention of comparing our test results to the original guidance (also none) used by Ushahidi in 2010. However, as most of the experiment participants were either beginner or intermediate level users, they received level 0-4 guidance:
- Level 0: in-person assistance
- Level 1: example answers / good examples
- Level 2: illustrated examples - video
- Level 3: illustrated examples - descriptive
- Level 4: help links


The wiki then guided participants in the following way. A homepage welcomed all participants and gave them an overview of the three steps required to participate: read the instructions (either for beginners, experienced GIS users or experts), submit reports and complete the survey. An introduction page briefly presented the goal of the experiment, provided a link to a video about how to use the wiki, explained the use of two sites, and referred to the help page. Then a getting started page asked users to click the link appropriate for their user level: beginner, experienced GIS user/mapper, or expert. Beginners were reminded of their mission, given a link to the Crowdmap site, and given a short bullet-point list of three main steps of instructions to follow in order to complete every field in the reports: 1) click "submit a report"; 2) provide information in the report fields (with 15 other points to consider); and 3) submit the report. The steps were provided in the order necessary to complete the report correctly, not in any random order. The only illustrations were one showing how to submit a report from the Crowdmap site (a need expressed earlier by Oslo experiment participants) and one chart showing the classes of damage. Otherwise, previews, examples and videos were optionally available by clicking on links, for those wanting to read or get more help. Accurate information was guided by including tips for participants to consider. Experienced GIS users and mappers were informed about the goal of the experiment with a short sentence and provided with a short list of twelve bullet-point instructions for what information to complete in the report. These participants were not offered links to further help or examples, and their instructions were worded simply but assumed that they had some prior knowledge or understanding. For instance, "identify a damaged building by placing a mark on it or tracing around it" implies that they are already familiar with the types of tools needed to do this and require no further guidance, and "use Google Maps in satellite mode to search for the same spot" implies that they are already familiar with Google Maps. Overall, the language used to instruct experienced users was appropriate for their expertise and level of understanding. They were again encouraged to review the work of their peers. The instructions for experts were even less explanatory, only mentioning the most important information to provide in the report and giving a reminder that reports must be complete in order to be approved. This implies that an expert would be familiar with the concept of completeness for metadata. They were also encouraged to review the work of their peers.

Survey: the third and final step for all participants, after submitting reports, was to complete the survey. The survey had 10 questions, but also asked more in-depth questions about which specific guidance methods were used and which were most helpful. At the end of the experiment participants were told: "The information you submitted will be useful, because it will be compared to the original 2010 information that volunteers submitted, and if it is better, there will be a method of guidance for improving the way we get this information in the future."


Figure 4.4 shows the layout of the wiki. A more detailed overview of the wiki pages is included as an appendix.

Figure 4.4 Layout of the wiki for the final experiment: re-mapping Haiti

Figure 4.5 Mapping party participants



Start: 21:17, Finish: 22:25. Results: Five participants submitted twelve reports; this time all of the participants felt confident enough to submit information. However, five of the reports were incomplete, missing details such as Name, Certainty and Imagery used. Figure 4.6 shows that two of the reports were not approved because the information provided in the description was vague and incorrect.

Figure 4.6 Mapping party Test 2 attribute table

Feedback: Participants found it challenging to use the Crowdmap platform because they constantly needed to find the spot in the map they had previously identified, and they did not find it easy to classify damage without having a pre-disaster image to compare with. It was challenging for them to use different sources of information (the report map and Google Maps) to grade damage. Ideally, in a fully developed application, participants would be given one map interface with the ability to switch different layers on and off, instead of having to use different sources of information. Participants also commented that they would prefer the instructions to be even simpler and to be told exactly what to do in this simple way. However, only one participant reported that they found it difficult to participate. Interestingly, the two participants who had the most IT and domain experience classified themselves as having "a little experience" using computers, while the participant with the least GIS experience classified himself as experienced. While this does not matter, since I know their true user levels and have included this in the findings, the results did show that domain expertise directly affected the results obtained: the least experienced user, who classified himself as experienced, provided the worst quality VGI, while the more experienced users, who gave themselves a humble classification, provided better information. Participants found the wiki instructions, the in-person assistance and the chart explaining the damage classifications to be the most helpful and did not use the other guidance methods. However, it was observed that time played a critical role here: because the experiment was performed during the summer holidays, when most participants had very limited time, they did not feel that they had enough time for, or needed to look at, all of the guidance offered. Again, this may have been an indication that too much guidance was offered.


4.6 Experiment 3.b Re-enactment of Haiti damage mapping using an improved methodology for guidance
Based on the participant feedback and results from Test 2, a final and improved methodology was developed and tested. Participants were given an even more narrowly defined area (shown in Figure 4.7) in order to obtain enough information about the target area. The target area was determined using areas already defined by the EU JRC as having the most severe damage.

Figure 4.7 JRC building damage assessment atlas

In addition, participants were given an order in which to scan the target area for damages, as shown in Figure 4.8:

Figure 4.8 Order of scanning for damage



The task at hand was explained more simply, with only the information that was necessary. As the only task in this experiment was to submit VGI via reports, participants were given a short but concise bulleted list of instructions to follow which would allow them to complete the reports with good information. Participants were also given in-person/verbal examples of what is not helpful information. This time participants were told that they could submit only reports that contained fully completed fields and that followed the guidelines provided, to ensure they had followed the guidance given. Since the only participants were beginners, only the beginners' guidance on the wiki was improved. Again there were only three pages for all participants to follow on the wiki, corresponding to the three steps in participation. The instructions for beginners followed the suggestions of previous experiment participants: keeping it simple and telling them exactly what to do. The goal of the experiment was presented in one sentence and a link to Crowdmap was provided. A step-by-step bulleted list of instructions guided participants in filling out the reports in the correct order and in an accurate way, with links to examples, previews and videos for those wanting more information.

Results: Two participants (details in Table 4.5) submitted thirteen high-quality (accurate) reports about a specific area of Port-au-Prince.

Participant: Torleiv
User level: Beginner
Familiarity with the topic: No familiarity with Haiti; a little familiar with damage mapping
Background: Environmental risk consultant; computer literate; a little experience using GIS

Participant: Stian H.
User level: Beginner
Familiarity with the topic: No familiarity with Haiti; not familiar with damage mapping
Background: HR Manager; computer literate; no experience using GIS; familiar with online mapping tools

Table 4.5 Experiment 3.b Participants

Although neither of the participants was an experienced GIS user, they had enough computer literacy and previous mapping skills to participate successfully. While previous experiments had included participants with greater GIS expertise, the factors that played the biggest roles in affecting the results of this experiment were: a) the methodology used was more effective and resulted in simpler yet effective instructions which participants were able to follow easily, and b) all of the guidance offered was consumed by the participants. This demonstrates that with the proper guidance given, any kind of crowdsourcing participant may contribute accurate VGI.


Figure 4.9 shows that a large improvement can be observed: the data is 100% complete (all of the fields have been completed), and participants have followed the step-by-step instructions precisely and have paid attention to the tips for accuracy, resulting in data that is described in a way that is more geographically accurate and helpful.

Figure 4.9 Experiment 3b attribute table

The data is also free from any errors. Thus it is possible that, with the right kind of guidance, crowdsourcing deployers can receive VGI that is complete and geographically actionable. This is the type of result that crowdsourcing deployers should aim for.

Feedback: Both participants found it "ok" or "fine" to participate in this experiment (meaning not easy, but not difficult either). While they found the in-person help extremely useful (something that could be replicated through Skype conversations or live chat in future deployments), they still found the Crowdmap platform frustrating to use and missed having pre-disaster imagery to compare with. Although both participants used all of the guidance provided in the wiki, they found using Google Maps, the instructions on how to submit a report, the examples of how damaged buildings appear in satellite imagery, the chart explaining damage classifications and the wiki instructions to be most helpful. While all of these forms of guidance had been available in previous experiments, they had not been used, due to the way they were presented. More of the guidance was used in this experiment because it was provided in a simpler and less complicated manner (short bullet points, with links for those wanting to read more), and participants were guided to complete every individual field in the report, in order, using these simple steps; they felt that they had enough guidance, or even that there could be less. While time was still the biggest outside factor influencing the results, the guidance given was simple enough that it could be consumed within the time available for the experiment.
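Completeness of this kind can also be checked mechanically rather than by eye. The following sketch is a minimal, illustrative example only: it assumes the reports have been exported to a CSV file, and the file name and field names (such as CERTAINTY or IMAGERY_USED) are placeholders that may differ from the actual Crowdmap export. It reports the share of filled-in values per field and flags incomplete reports.

```python
import pandas as pd

# Load the exported reports (assumed file name).
reports = pd.read_csv("experiment_reports.csv")

# Treat empty or whitespace-only strings the same as missing values.
reports = reports.replace(r"^\s*$", pd.NA, regex=True)

# Share of non-empty values per field; 1.0 means the field is 100% complete.
completeness = reports.notna().mean().sort_values()
print(completeness)

# Flag reports missing any of the fields the deployer requires (assumed names).
required = ["TITLE", "DESCRIPTION", "LATITUDE", "LONGITUDE",
            "CERTAINTY", "IMAGERY_USED", "NAME"]
required = [field for field in required if field in reports.columns]
incomplete = reports[reports[required].isna().any(axis=1)]
print(f"{len(incomplete)} of {len(reports)} reports are incomplete")
```

A check like this could be run by a deployer before approving reports, so that incomplete submissions are caught immediately instead of only during later analysis.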

4.7 Analysis of experiment results


The quantitative part of the research lies in the analysis of the experiment results. In order for the data to be analyzed in ArcGIS, it first needed to be prepared. The Ushahidi reports were downloadable in CSV (comma-separated values) format, which requires a great deal of work before the data is presented in a format that ArcGIS can read. The following list outlines the steps that were taken to prepare the data for analysis, but may also be used as a recipe for others who need help in dealing with geographic data that is stored in CSV files.
4.7.1 Data preparation: How to prepare CSV files for use in ArcGIS

Data preparation involved the following steps:
- In the CSV file, use the replace function to replace all characters not accepted by ArcGIS in fields (such as quotation marks, parentheses, asterisks, plus signs and semicolons) with nothing.
- Ensure that the number of commas in each report field is the same as the number of fields.
- Open the saved CSV file using a text editor such as TextPad and save it in .txt format.
- Open a new Excel document and import the data from the .txt file.

Figure 4.10 Importing .txt data into Excel

- In the Text Import wizard, select "delimited".

Figure 4.11 Selecting delimited file type

- Then choose commas as the separator, select Next, then Finish, and save.

Figure 4.12 Selecting commas as the delimiters

- Ensure that the Latitude and Longitude cells are formatted to Number format with 8 decimal places.
- Ensure that there are no spaces in the field names.
- Remove any extra or unnecessary commas from the description field; this can be time-consuming when manually sorting through many records.
- Delete any extra sheets, keeping only the sheet with the lat/long data.
- Save as an Excel 97-2003 Worksheet (.xls).
- Import the .xls into ArcMap, then export it to .dbf format.
- Open the attribute table and add two new fields called Lat and Long with the data type double.
- For each new field, use Field Calculator to populate the new fields with the old values. ArcGIS will only be able to read these new double fields as the lat/long point information.
- Use "Display XY Data" for the .dbf file, ensuring that the Lat/Long fields are selected for the X and Y fields.
- Since the data does not have any coordinate system, use "Define Projection" to select the coordinate system (i.e. GCS WGS 1984) that will align the report data with the rest of your data and/or data frame.
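For larger exports, most of the manual clean-up steps above could in principle be scripted instead of being carried out by hand in Excel. The following sketch is a minimal, illustrative example only, using the pandas library: the file name and the column names (LATITUDE, LONGITUDE) are assumptions based on a typical Ushahidi CSV export and may differ in practice. The cleaned file it writes can then be brought into ArcGIS with Display XY Data and Define Projection as described above.

```python
import pandas as pd

# Read the raw Ushahidi report export (assumed file name), keeping everything as text.
reports = pd.read_csv("ushahidi_reports.csv", dtype=str)

# Strip characters that ArcGIS does not accept in field values.
bad_chars = r'["()*+;]'
reports = reports.replace(bad_chars, "", regex=True)

# Field names must contain no spaces.
reports.columns = [c.strip().replace(" ", "_") for c in reports.columns]

# Coerce the coordinate columns to numbers; invalid values become NaN and are dropped.
for col in ("LATITUDE", "LONGITUDE"):   # assumed column names
    reports[col] = pd.to_numeric(reports[col], errors="coerce")
reports = reports.dropna(subset=["LATITUDE", "LONGITUDE"])

# Write a cleaned CSV that ArcGIS can read via Display XY Data.
reports.to_csv("ushahidi_reports_clean.csv", index=False)
```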


4.7.2: Data quality

a) Oslo VGI

Figure 4.13 shows the reports submitted for Oslo displayed on the map.

Figure 4.13 Oslo experiment VGI

There is a high degree of positional accuracy. While Goodchild and Hunter's buffer method is not viable with this experiment data, all but one of the reports are from local inhabitants who are familiar with the areas they reported on. In addition, these were verified by comparison with Google Maps imagery. Unfortunately, the attributes for this dataset were not complete. This is partly a fault of the application, which allows required fields to be configured yet accepts empty required fields anyway, but also of the participants, who chose not to complete all fields. Figure 4.14 shows that not every participant identified themselves or provided information about their computer literacy or GIS expertise, nor did any of the participants note the start and end time for their reports.

Figure 4.14 A selection of the incomplete Oslo experiment attributes



Although participants were encouraged in the guidance information to use Google Maps and Street View and to include this information in their report description, participants included this information in the closing survey rather than in the data description. This indicates that the guidance given was not sufficient for participants to supply information about lineage. All of the participants gave their report the title of the building concerned; in this case all of the reports were about public places. This was semantically accurate and there were no spelling errors. All but one of the reports were fit for use, meaning that they concerned building types. The one unfit report concerned a roundabout.

*Reports can still be viewed on the Crowdmap site: https://osloexperiment.crowdmap.com/

b) Christchurch VGI

While the attribute accuracy, completeness, semantic accuracy and thematic accuracy of the data submitted for Christchurch were satisfactory (the participants used correct spelling for the names and descriptions of places; all of the fields had been completed; participants entitled their reports with short descriptions of the damage observed to the buildings; and all of the information given concerned buildings that had been damaged by the earthquake and was fit for use), the elements of positional accuracy and lineage did not satisfy the requirements set by the deployer. While a study area was defined and illustrated, three of the five points were outside of the study area, as shown in Figure 4.15 with the study area defined in red.

Figure 4.15 Christchurch VGI data


While the descriptions used to specify location were sometimes specific ("corner of ferry road and Mathesons Rd" or "Beresford St. New Brighton, Christchurch, 8061"), they were not specific enough: for instance, there were no specific building addresses. Other reports had very imprecise and unhelpful ways of describing the location, such as "Just above the intersection of Aikmans road" or "The building located closest to Latimer Square", as shown in Figure 4.16. And while participants specified the imagery used to make the classification in the closing survey, this was not included in the data description as requested.

Figure 4.16 Location and description fields

Although reference data from LINZ (the New Zealand government) is available for Christchurch, none of it overlaps with the experiment VGI, and there is unfortunately not any original Ushahidi VGI from 2010-11 to compare with.

*Reports can still be viewed on the Crowdmap site: https://christchurchearthquakemap.crowdmap.com/

c) Haiti building damage VGI

As there were over 4,000 reports from the original 2010 Ushahidi Haiti project, it was not feasible to manually review each individual report. Instead, all reports containing the classification or the words "collapsed structure" were used. These sometimes contained information about several issues, such as people trapped or emergencies, so it is likely that had the category "people trapped" also been used, there would have been several hundred more reports to include in the analysis. While this could influence the results of my analysis positively, it was not feasible to manually sort so many reports in the given timeframe. In addition, it became clear after manually reviewing hundreds of these reports that it is not even feasible to compare them with the current experiment data, simply because they were produced by volunteers in two totally different contexts, experiencing completely different factors which affected the information. However, for the sake of having an example, a small comparison will be made. Goodchild and Hunter's buffer method is not applicable to this data, and while intersect methods such as selection by location are useful, they do not yield substantial results with this data.


Figure 4.17 shows the search parameters used for the selection.

Figure 4.17 Selection by location

Two points come within 50 meters of the National Palace; both are reports for the same building. The shortest distance that returns any result is 40 m, where one point is within 40 meters of another report, and these are reports about the same damaged building. Figure 4.18 shows the result of the search within 40 m.

Figure 4.18 One point intersecting at 40 m
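The same kind of proximity check can also be reproduced outside ArcGIS with a simple great-circle distance calculation. The sketch below is a minimal illustration only; the coordinate pairs are placeholder values, not the actual report locations from the dataset.

```python
import math

def distance_m(lat1, lon1, lat2, lon2):
    """Haversine (great-circle) distance between two WGS84 points, in metres."""
    earth_radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius * math.asin(math.sqrt(a))

# Placeholder coordinates for two reports about the same building (not real report data).
report_a = (18.5430, -72.3390)
report_b = (18.5433, -72.3392)

d = distance_m(*report_a, *report_b)
print(f"Separation: {d:.1f} m, within 50 m: {d <= 50}")
```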


This is the only example of reports that are within a certain distance (50 m) of each other and concern the same building. No other reports about the same building were found within this radius. Therefore a buffer analysis is unnecessary.

Experiment data vs. 2010 Ushahidi data

When comparing the experiment data to the original Ushahidi Haiti VGI from 2010, it is observed that while the title of the Ushahidi report is more descriptive, the experiment data gives a more precise location and explanation of the damage levels, as well as whether the information given is certain or not and how the certainty was determined. Thus the attribute accuracy can be said to have improved due to the guidance given during the experiment. Figure 4.19 shows the attribute data for the 2010 Ushahidi data, while Figure 4.20 displays the attribute information for the experiment data.

Figure 4.19 Ushahidi Haiti data

Figure 4.20 Experiment data

In terms of completeness, the experiment data for this building is more complete. For instance, in the above illustrations you see that the experiment reports have three extra fields of information: Certainty, Imagery_used and Name.


Figure 4.21 shows that the description is also more precise.

Figure 4.21 Comparison of the descriptions

While many of the Ushahidi reports had varying levels of category accuracy (sometimes a report about a damaged building was given every category that was available), none of the reports were missing information for any of the fields. However, because experiment participants believed that some of the report fields were optional, five of the experiment reports were missing information about certainty, imagery used and name. Thus, for this particular experiment, while the accuracy of the information was better, the data was not as complete in total as the 2010 data. Most of the Ushahidi reports provide very detailed and precise information about the location and incident description. However, it is evident that the two datasets (the Ushahidi Haiti report data from 2010 and the experiment data) should not be compared, for a few reasons. Most of the Ushahidi Haiti reports from 2010 were submitted by people on the ground in Haiti, directly affected by the earthquake. Not only were they under time pressure and situational danger, but most of the information submitted was about emergencies, trapped people and rescue information. Although it is possible to infer damaged-building information from many of the reports categorized as "people trapped", this cannot be directly compared to any experiment data, since it contains only general information about collapsed buildings and more information about the people affected. There were 4052 Ushahidi Haiti reports in total, and thus only approximately 200 of these, which were categorized specifically as "Collapsed structure", were analyzed. While use of more categories would have allowed for more points to compare the experiment data against on the map, the comparison of the two data types is not advisable regardless, considering the two different contexts in which the information was provided. While the experiment was an effort to re-enact the mapping of damaged buildings in Haiti, experiment participants would never have been able to provide the same kinds of information that people in Haiti did in 2010. The experiment descriptions specifically describe building damage only and can thus only be compared to VGI data such as OpenStreetMap or from others who collected data only about specific features. However, in those cases the methodologies were different and have not been directly tested.


Experiment data vs. other VGI

A selection by location to find experiment VGI that intersects with 2010 OSM data results in five buildings that overlap. The search parameters are shown in Figure 4.22, while the resulting selection is shown in Figure 4.23.

Figure 4.22 Select by location all OSM buildings that intersect experiment VGI

Figure 4.23 Result of selection: Intersecting experiment and 2010 VGI
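The same intersect selection can also be scripted in ArcGIS. The snippet below is a rough sketch only: the workspace path and the feature class names (osm_buildings, experiment_vgi) are assumptions, not the actual names used in this project.

```python
import arcpy

arcpy.env.workspace = r"C:\data\haiti.gdb"  # assumed workspace

# Create layers from the feature classes (assumed names).
arcpy.MakeFeatureLayer_management("osm_buildings", "osm_lyr")
arcpy.MakeFeatureLayer_management("experiment_vgi", "vgi_lyr")

# Select all OSM buildings that intersect the experiment VGI points.
arcpy.SelectLayerByLocation_management("osm_lyr", "INTERSECT", "vgi_lyr")

# Report how many buildings were selected.
count = int(arcpy.GetCount_management("osm_lyr").getOutput(0))
print(f"{count} OSM buildings intersect the experiment reports")
```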


The attributes of the selected features are highlighted in Figure 4.24. As is plain to see, the attribute information provided by OSM volunteers is neither detailed nor complete.

Figure 4.24 Attributes of the OSM VGI

In Table 4.6, a comparison of the attribute information for the reports shows that the experiment data contains more and better information about the same buildings. The table pairs the OSM damaged building classifications with the corresponding experiment reports:

OSM damaged building classification FID 1306 vs. Experiment data FID 12
OSM FID 1794 vs. Experiment FID 7
OSM FID 53 vs. Experiment FID 29
OSM FID 1690 vs. Experiment FID 19
OSM FID 356 vs. Experiment FID (not given)

Table 4.6 Comparison of OSM 2010 VGI vs. Experiment data


Experiment data vs. Reference data

Although the methodologies cannot be directly compared, the quality of the information can. When the experiment data is compared with the UNOCHA damage assessment data, the outcome is similar to that of the previous comparison: there is more and better information provided by the experiment VGI, and thus an improvement has been made. This is evident when looking at the attributes of the UNOCHA building data.

Figure 4.25 UNOCHA data attributes

Of the two datasets, only the experiment data provides details of how the information was obtained (imagery used). Compared to other VGI datasets such as OSM, this is an improvement in the lineage element of data quality. The semantic accuracy of the experiment data is also improved and better than that of the UNOCHA data. During the experiment, plenty of illustrated examples and descriptions of building damage categories were provided, and thus only one of the reports underestimated the level of damage to a building. Many of the Ushahidi reports, by contrast, provide mixed categories that may or may not be appropriate for the report submitted. Lastly, the experiment data was also improved and better than the UNOCHA data in terms of thematic accuracy, because the information given was fit for use, in the sense of it being about building damage. This was a specific difference when compared to the Ushahidi 2010 data, although, as discussed, the experiment data is perhaps better compared to OSM VGI, which was gathered in a similar fashion and with similar goals.
*Reports can still be viewed on the Crowdmap site: https://haitidamage.crowdmap.com/main .


Chapter 5: Discussion and conclusions


5.1 Introduction: discussion of experiment results
Participants in the three experiments that were carried out during this research reported that lack of time was the biggest influence on their mapping behavior. Although user mapping behavior was best when experiments were carried out in person, participants still felt that a lack of time was a considerable influence on their performance. The lack of time was due to the experiments taking place during the participants' summer holiday period. As anticipated, invitees were more eager to participate in classifying buildings for areas they were familiar with. However, given the proper guidance in a controlled setting, participants were able to give just as many and even better quality reports about locations they were totally unfamiliar with.

The effect of differentiating the types of guidance based on the user level is positive, and this is encouraged for all crowdsourcing projects. Over the course of three different experiments it was observed that the more differentiated these forms of guidance were, and the better focused each type was, the better the quality of the VGI. Based on feedback from participants in the closing survey (Table 5.1), guidance that was short, simple, easy to understand and easy to follow directly was most helpful; in particular, participants found the guided illustrations on how to complete the reports, the bulleted list of ordered instructions and the damage classification chart to be the most helpful and useful guidance tools for submitting the necessary VGI.


Guidance offered, with participants' comments on whether and why it was used:

Videos
1. How to use the wiki. Comment: "Already familiar"
2. How to use the maps. Comment: "Already familiar"

Illustrated images
3. How to submit a report. Comment: "The image was missing"
4. How damaged buildings appear in imagery. Comment: "Very useful"

Illustrated examples
5. How to use Google Maps to find damaged buildings
6. How to use Google Street View. Comment: "Already familiar"

7. Chart explaining Grades 1-5 building damage classifications
8. Bulleted list of wiki instructions. Comment: "Keep it simple + expect people to understand some content"
9. Help page. Comment: "Use ? help symbols"

Tips
10. Make sure you mark the right building
11. Make sure you spell the building name or address correctly
12. Use geographical words for direction (N, E, S, W of) to describe the location

In-person
13. Ask Sabrina

Table 5.1 Experiment guidance survey results


5.2 Scope and limitations


There were a few factors which limited the extent of this research and which, if considered in future efforts, could improve the possible results:

Web programming expertise: Due to limitations in web programming expertise, an all-encompassing web mapping application (combining an editable web service and guidance) was not possible, nor was setting up the Ushahidi platform itself. An alternative was to use the hosted version of the Ushahidi platform, Crowdmap.

Pre-disaster satellite imagery for Haiti: The only downloadable pre-earthquake imagery available for Haiti was for use in the OSM editor JOSM, so VGI experiment participants could not compare pre- and post-earthquake imagery. Pre- and post-earthquake imagery were, however, available to import into ArcGIS for analysis. It would then be possible to publish the map service and create a unique web mapping application showing pre- and post-disaster imagery as layers.

Time of year: It was challenging to get participation from colleagues during the summer months, when most people are away and/or on summer holiday. However, experiments with smaller groups of people were possible and perhaps even better for maintaining control. And although experts were involved throughout the research process (in identifying professional information needs and areas where VGI initiatives could improve, in the design of the improved methodology and prototype, in testing the prototype as experiment participants and in checking the accuracy of results), it was not possible to involve experts in micro-training of participants during the summer months.

5.3 Quality of the assessed VGI data


As analysis of the experiment results has shown, there has been an improvement in the quality of VGI data due to adjustments and improvements made to the methodology during the three experiments. By the third experiment, participants were able to provide spatial data with the exact elements of accuracy that were needed by the crowdsourcing deployer. The methodology that was presented and used for the third experiment is an example of a methodology that has used the most effective techniques and that has resulted in the most accurate VGI. While level of expertise did influence results, the final experiment shows that such accurate VGI can be obtained regardless of expertise levels, so long as the right methodology is used to guide the participants.


5.4 Answers to sub-research questions


Satellite imagery representing specific map feature categories: While it is possible to detect building damage in satellite imagery, research by Yamazaki et al. (2004) indicates that this is only possible with high-resolution satellite imagery that has a resolution of at least 0.6 m. Even then, only damage levels 3 (buildings surrounded by debris), 4 (partially collapsed buildings) and 5 (collapsed buildings) are visible. In most cases, it is necessary to compare pre-disaster imagery with post-disaster imagery to provide the most accurate damage classification. Further examples are provided with the Christchurch and Haiti use cases and their respective figures in Section 2.4.

How crowdsourcing participants contribute to maps: Three different crowdsourcing tools were investigated: OpenStreetMap, in which participants make direct edits or contributions to an editable map by importing or creating new data and placing it onto the map; Ushahidi, in which participants can submit reports about specific features which will appear on a main map after being approved; and the Geo-Wiki project, in which participants contribute to a global land use map by validating areas of conflicting categorization. In a disaster management situation, participants also often use pre- and post-disaster satellite imagery to judge levels of damage before they make a contribution to a map. These tools are discussed in detail in Section 2.3.

Good methods to measure data quality: While some of the recommended methods (such as that proposed by Goodchild and Hunter) were not feasible to perform on such small point datasets as the VGI data collected from the experiments, other approaches were good methods of measuring the quality improvement of the VGI data: the most important elements of spatial data accuracy as suggested by van Oort, comparing the tested methodology against no methodology and against the methodology used in 2010, and comparing the test data against the original 2010 VGI and reference data. These methods are discussed in detail in Section 2.5.

How existing sites validate and filter information: While Ushahidi provides a little information about how information in its deployments is verified and approved, this is not made known to everyone who participates; it is only found in an administrator guide document. This information is not made publicly known at all for OpenStreetMap or Geo-Wiki project participants. This is discussed in further detail in Section 2.3.

Existing tools and techniques for guiding participation: Most crowdsourcing tools offer wikis and blogs to guide their participants, but the guidance is not always offered in the simplest and most effective way. Some tools only offer limited instructions, and all of them differ in the extent and methodology of the guidance used. The techniques that the three investigated tools have used are critically discussed in Section 2.4.

How participation is guided for amateur vs. trusted contributors: Some sites do not acknowledge different user levels at all in the guidance provided, while others divide only between beginner and expert levels. The guidance given for beginners is not appropriate for all beginners, and having only these two categories excludes a multitude of user levels and potential participant needs. How each of the investigated crowdsourcing tools handles guidance for the different user levels is presented in Section 2.3.


Factors affecting user mapping behavior patterns: Two different types of use cases were identified in this research, land use and disaster management, and the factors resulting from these different scenarios alone explain much of the differences in user mapping behavior patterns. For instance, VGI participants contributing information about a disaster are influenced by a sense of urgency and panic. The context in which information is given plays a large role, but the varying extent of participant expertise and interest sometimes plays a larger one. The factors that affect user mapping behavior patterns are discussed in Sub-section 1.3.2. They are also discussed as experiment participant feedback in the experiment results sub-sections in Chapter 4.

Different forms of guidance for different forms of input: In OpenStreetMap, participants are able to contribute input in several different ways and guidance is provided for these. Ushahidi participants can contribute information by SMS, email, Twitter message or online report; however, there is only limited guidance given for what kinds of information to submit in the report, and none for the other forms of input. This is part of the findings presented in Section 2.3.

When experts should be involved in guiding participation: While this was impossible to investigate through literature research, it was possible, through expert interviews and through the design of an original methodology for guiding VGI participants, to outline when and how often experts should be involved in the crowdsourcing project. This is mentioned in Sub-section 3.2.2 and also in Section 5.2.

The effect different levels of guidance have on VGI quality: Through experimenting with different levels of guidance for different user levels in three different experiments, it was possible to observe that such differentiation has a positive effect on the spatial accuracy of the VGI data. These results are discussed for each individual experiment in Chapter 4, and also in Section 5.1.


5.5 Conclusions
The main research question ("What generic methods for guided participation will improve crowdsourced VGI?") can now also be answered. The goal of this project was to develop an improved methodology for crowdsourcing deployers to use as a guideline for future deployments. Through an investigation of existing techniques and testing of original methods, the experiment results show that when the proposed improved type of guidance is given, any kind of crowdsourcing participant may contribute accurate VGI. Based on the experiment observations, the following generic methods for guiding participation are recommended. The crowdsourcing initiative must take into account collaboration, site functionality and guidance of participation, in particular by applying the following suggestions:
1) The application should be as intuitive as possible, without too much guidance.
2) Geographers and/or interest groups should be encouraged to volunteer, while other participants must be given special guidance so that they are not careless about the quality of the information they submit.
3) Participants should be given step-by-step instructions guiding them to submit exactly what is required. They should feel as if an expert is standing next to them providing guidance every step of the way, even though they are only using a wiki.
4) The more guidance, the better. This is especially true for crowdsourcing initiatives where participants are asked to contribute information about an unfamiliar area. While participants need guidance even for areas that are familiar to them, more and better guidance should be given for unfamiliar areas.
5) The guidance instructions must be short and simple. Participants contribute more and better information faster when there is a good and simple method to follow. If instructions are not simple and very explicit, participants will see some parts as optional and will not provide complete information.
6) Educational elements must be optional, only for those who are interested or feel the need for them (or worked into the guidance offered).
7) Experienced GIS users and experts need guidance too. This means that guidance should be given for all different kinds of user levels.
8) The guidance must be different for each user level, but all levels should find it easy to participate.
9) Teamwork and/or peer review should be encouraged.
The above-mentioned recommendations are proposed as an improved methodology to guide VGI application developers and deployers, and while it has been tested for the domains of disaster and land use management, it is applicable to crowdsourcing initiatives in other domains as well. While this methodology may be used for future crowdsourcing projects, it is also hoped that existing crowdsourcing tools such as OpenStreetMap, Ushahidi and the Geo-wiki project might take these recommendations into consideration for making improvements.


6. References
Australian Broadcasting Corporation. (2010). QLD Flood Crisis Map: Mapping community reports for QLD floods and recovery. Ushahidi-based crowdsourcing deployment. Available at http://queenslandfloods.crowdmap.com/main. Australian Broadcasting Corporation. (2010). ABC Emergency Floods. National emergency portal. Retrieved from http://www.abc.net.au/emergency/flood/queensland2010. Al-Bakri M. and Fairbairn D. (2010). Assessing the accuracy of 'crowdsourced' data and its integration with official spatial data sets. Paper for the Accuracy 2010 Symposium, July 2023, Leicester, UK. Available at http://www.spatial-accuracy.org/system/files/imgX06165606_0.pdf, Barrington, L., Ghosh, S., Greene, M., Har-Noy, S., Berger, J. Gill, S., Yu-Min Lin, A. and Huyck, C. (2011).Crowdsourcing earthquake damage assessment using remote sensing imagery. Article featured in the ANNALS OF GEOPHYSICS, Vol. 54, Pages 680 686. Available at http://www.google.no/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&sqi=2&ved=0CFUQFj AB&url=http%3A%2F%2Fwww.annalsofgeophysics.eu%2Findex.php%2Fannals%2Farticle%2F view%2F5324%2F5492&ei=v_IOULDCcL_4QT9sYC4Cg&usg=AFQjCNEGKgQg_fORd_tw9b7_DIEBfpcTVw&sig2=dSmdUDHGrkofh Nyuvg8fEg. Carver, S. (2001). Public Participation Using Web-Based GIS. Guest editorial featured in Environment and Planning B, Vol. 28. Available at http://www.envplan.com/epb/editorials/b2806ed.pdf. Ciepuch, B. and Mooney, P. (2011). Assessing the quality of open spatial data for mobile location-based services research and applications. Article featured in Archives of Photogrammetry, Cartography and Remote Sensing, Vol. 22, pages 105-107. Available at http://www.sgp.geodezja.org.pl/ptfit/wydawnictwa/krakow2011/APCRS%20vol.%2022%20p p.%20105-116.pdf. Coleman, D. (2010). Volunteered Geographic Information in Spatial Data Infrastructure: An Early Look At Opportunities And Constraints, page 6. Academic research paper. Department of Geodesy and Geomatics Engineering, University of New Brunswick. Available at http://www.gsdi.org/gsdiconf/gsdi12/papers/905.pdf. Coleman, D., Georgiadou, Y. and Labonte, J. (2009). Volunteered Geographic Information: the nature and motivation of producers, page 8-12. Article under Review for the International Journal of Spatial Data Infrastructures Research, Special Issue GSDI-11, submitted 2009-03-27. Available at http://ijsdir.jrc.ec.europa.eu/index.php/ijsdir/article/viewFile/140/223. Coote, A. and Reckham, L. (2008). Neogeographic data quality is it an issue? Paper given on behalf of consultingWhere Ltd. at the AGI Conference in September 2008. Available at


http://www.consultingwhere.com/resources/Neogeography+Data+Quality++is+it+an+issue+-+V1_1.pdf. Cooper, A., Coetzee, S., Kaczmarek, I., Kourie, D., Iwaniak, A. and Kubik, T. (2011). Challenges for quality in volunteered geographical information. Presentation of a report for the AfricaGEO2011 conference in Cape Town, 1 June 2011. Available at http://africageodownloads.info/5c_115_cooper.pdf. Colt Kommunikasjon. (2009). Jimmy wales kritiserer crowdsourcing. Media blog post. Available at http://coltpr.no/sosiale-medier/jimmy-wales-kritiserer-crowdsourcing/. CrisisCommons.(2011). Christchurch NZ Earthquake 21.02.2011 . Wiki for Crisis Commons crisis mappers. Available at http://wiki.crisiscommons.org/wiki/Christchurch_NZ_Earthquake_21.02.2011. CrisisMappers.(2011). Ushahidi Project Evaluation Final Report. Blog for Crisis Mappers. Available at http://blog.ushahidi.com/index.php/2011/04/19/ushahidi-haiti-projectevaluation-final-report/. Das, T. and Kraak, M.J. (2011). Does neogeography need designed maps? Paper for the 25th International Cartographic Conference. Presented on behalf of ITC, University of Twente, Enschede, Netherlands. Available at http://icaci.org/files/documents/ICC_proceedings/ICC2011/Oral%20Presentations%20PDF/B 3-Volunteered%20geographic%20information,%20crowdsourcing/CO-123.pdf. De Longueville, B., Smith, R.S., and Luraschi, G (2009). OMG, from here, I can see the flames!: A use case of mining location based social networks to acquire spatio-temporal data on forest fires. Paper for the 2009 International Workshop on Location Based Social Networks, Pages 73-80. Available at http://www.cs.rochester.edu/twiki/pub/Main/HarpSeminar/OMG_from_here_I_can_see_th e_flames-_a_use_case_of_mining_Location_Based_Social_Networks_to_acquire_spatio_temporal_data_on_forest_fires.pdf. DellOro, L. (2011). Geospatial Approaches to Damage Assessment: The Example of Haiti. Earthquake. Presentation on behalf of UNITAR-UNOSAT for the World Reconstruction Conference 10/5/2011 in Geneva, page 17. Available at http://www.preventionweb.net/files/globalplatform/entry_presentation~1lucawrcitinnovati onsgeospatialapproachestodamageassessmenttheexampleofhaitildov1.pdf. Eguchi, R., Huyck, C., Adams, B., Mansouri, B., Houshmand, B. and Shinozuka, M. (2003). Resilient Disaster Response: Using Remote Sensing Technologies for Post-Earthquake Damage Detection. Research paper. Available at http://mceer.buffalo.edu/publications/resaccom/03-sp01/09eguchi.pdf. Exel, van M., Dias, E. and Fruijtier, S. (2010). The impact of crowdsourcing on spatial data quality indicators. Academic paper for Vrije Universiteit. Available at http://www.giscience2010.org/pdfs/paper_213.pdf.

Fritz S., McCallum, I., Schill, C., Perger, C., Grillmayer, R., Achard, F., Krazner, F. and Obersteiner, M. (2009). Geo-Wiki.Org: The Use of Crowdsourcing to Improve Global Land Cover. Letter featured in Remote Sensing, Vol.1, Pages 345-354. Available at http://www.google.no/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&sqi=2&ved=0CFcQFj AB&url=http%3A%2F%2Fwww.mdpi.com%2F20724292%2F1%2F3%2F345%2Fpdf&ei=E_4OUJODKcfl4QTM5oDwBA&usg=AFQjCNHXrJ7yfqbAVs HgMsL1kldoRiZeDg&sig2=KjqTQxvzx5HN9aotGYsBIA. Goodchild, M. (2007). Citizens as sensors: The world of volunteered geography. Online publication for Springer Science+Business Media B.V. GeoJournal , Vol. 69, Pages 211221. Available at http://www.ncgia.ucsb.edu/projects/vgi/docs/position/Goodchild_VGI2007.pdf. Goodchild, M. (2008). "Commentary: whither VGI?" Article for GeoJournal Vol.72, Pages 237244. Goodchild, M. (2009) The Changing Face of GIS. Powerpoint presentation . Univesity of California, Santa Barbara. Available at http://www.geog.ucsb.edu/~good/presentations/geoinformatics.pdf. Goodchild, M. and Hunter, G. (1997). A simple positional accuracy measure for linear features. Article featured in the International Journal of Geographical Information Science, Vol.11, No.3, Pages 299-306. Available at http://www.geog.ucsb.edu/~good/papers/269.pdf. Goretti, A. and Di Pasquale, G. (2002). An overview of post-earthquake damage assessment in Italy. Paper for the 2002 EERI Invitational workshop, An action plan to develop earthquake damage and loss data protocols in Pasadena, California. Page 9. Available at http://www.eeri.org/lfe/pdf/italy_molise_goretti_pasadena_paper.pdf. Greene, R. (2002). Confronting Catastrophe: A GIS Handbook. Publication by ESRI Press, Redlands, California, Pages 18, 41-95. Haklay, M., (2008). How good is Volunteered Geographical Information? A comparative study of OpenStreetMap and Ordnance Survey datasets. Article featured in Environment and Planning B: Planning and Design, Pages 9-12. Available at http://www.ucl.ac.uk/~ucfamha/OSM%20data%20analysis%20070808_web.pdf. Haklay, M. and Weber, P.(2008). OpenStreetMap: User-Generated Street Maps. Article for IEEE Pervasive Computing, Vol. 7. No.4, Pages 12-18. Available at http://discovery.ucl.ac.uk/13849/1/13849.pdf. Heipke, C. (2010). Crowdsourcing geospatial data. Article featured in ISPRS Journal of Photogrammetry and Remote Sensing, Vol.65, No.6, Pages 550-557. Available at http://www.sciencedirect.com/science/article/pii/S0924271610000602.


Heinzelman, J. and Walters, C. (2010). Crowdsourcing Crisis Information in Disaster-Affected Haiti. Special report 252 for the United States Institute of Peace, Center of Innovation for Science, Technology and Peacebuilding, Pages 1-16. Available at http://www.usip.org/publications/crowdsourcing-crisis-information-in-disaster-affectedhaiti. Helleranta, J. (2012). Humanitarian OpenStreetMap member in Haiti. Email communication 11.7.12. Hunter, G.(1999). New Tools for Handling Spatial Data Quality: Moving from Academic Concepts to Practical Reality. Article for URISA Journal Vol.11, No.2, Pages3, 25-34. Available at http://www.urisa.org/files/HunterVol11No2-3.pdf. Iacucci, A. (2011). Ushahidi Guide: A step-by-step guide on how to use the Ushahidi platform. The Ushahidi platform user instruction manual, page 87. Available at http://community.ushahidi.com/uploads/documents/Ushahidi-Manual.pdf. International Organization for Standardization. (2012). About. ISO Website. Available at http://www.iso.org/iso/about.htm. JRC/ISFEREA Team. (2010). HAITI Earthquake January 2010 (Port au Prince Centre) Damage Assessment Map. Available at http://ec.europa.eu/dgs/jrc/downloads/jrc_pp_haiti_map_05.pdf. King County GIS Center. (2007). KCGIS Training Curriculum. Framework for GIS education. Available at http://www.kingcounty.gov/operations/GIS/About/TrainingPlan.aspx. Koukoletsos, T. (2010). VGI data quality: a dynamic way of assessing OpenStreetMaps accuracy. Position Paper for the GIST Workshop on The Role of VGI in Advancing Science. Available at http://www.ornl.gov/sci/gist/workshops/2010/papers/Koukoletsos.pdf. Leson, H. (2011). How the Eq.org.nz site came about to help with the Christchurch earthquake. CrisisCommons wiki post. Available at http://crisiscommons.org/2011/02/24/how-the-eq-org-nz-site-came-about-to-help-with-thechristchurch-earthquake. Mooney, P., Corcoran, P. and Winstanley, A. (2010). Towards Quality Metrics for OpenStreetMap. Article for the GIS 10 proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, Pages 514-517. Available at http://www.cs.nuim.ie/~pmooney/websitePapers/ACMGIS-peter-SUBMITTED-FINAL.pdf. Morrow, N., Mock, N., Papendieck, A. and Kocmich, N. (2011). Independent evaluation of the Ushahidi Haiti Project. Independent evaluation by DISI for Ushahidi, page 25. Available at http://www.alnap.org/pool/files/1282.pdf.


Nordheim-Hagtun, I. and Meier, P. (2010). Crowdsourcing for Crisis Mapping in Haiti. Article for Innovations: Technology, Governance, Globalization, Fall 2010 issue, vol.5, No.4, Pages 81-89. Available at http://www.scribd.com/doc/48600321/Ida-and-Patrick. Oort, P. van. (2006). Spatial data quality: from description to application, PhD Thesis, Wageningen University, NL, Pages 11, 13, 16, 132. Available at http://www.ncg.knaw.nl/Publicaties/Geodesy/pdf/60Oort.pdf. OpenStreetMap.(2011). 2011 Christchurch earthquake. OSM wiki for the Christchurch earthquake mapping. Available at http://wiki.openstreetmap.org/wiki/2011_Christchurch_earthquake. Oslo Kommune.(2006). Gateadresser og tilhrende bydel. Municipal information webpage. Available at http://www.oslo.kommune.no/om_oslo_kommune/bydelsoversikt/?WT.svl=global_menu. Peng, Z. (2001). Internet GIS for Public Participation. Article featured in Environment and Planning B: Planning and Design, Vol.28, No.6, Pages 889-905. Available at http://www.google.no/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CFgQFjAB&url =http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.128.260 3%26rep%3Drep1%26type%3Dpdf&ei=TgoPUJ2QL4Xl4QSCxICACg&usg=AFQjCNHlO6jTmAOd tTUGyUdsYlrheyaBtw&sig2=bJPPnaZ0RQ-cdWmCLuvQog. Poser, K., Dransch, D. (2010). Volunteered geographic information for disaster management with application to rapid flood damage estimation. Article featured in Geomatica, Vol.64, No.1, Pages 89-98. Available at http://edoc.gfzpotsdam.de/gfz/get/15132/0/4528bbeaa81db7bf9fc24b501313277c/15132.pdf. Poser, K., Kreibich, H. and Dransch, D. (2009). Assessing Volunteered Geographic Information for Rapid Flood Damage Estimation. Paper for the 12th AGILE International Conference on Geographic Information Science 2009. Pages 1-9. Available at http://www.ikg.unihannover.de/agile/fileadmin/agile/paper/117.pdf. Rak, A., Coleman, D. and Nichols, S. (2012) Legal liability concerns surrounding volunteered geographic information applicable to Canada. Pages 125-138. Academic paper for the Department of Geodesy and Geomatics Engineering, University of New Brunswick, Canada. Available at http://www.gsdi.org/gsdiconf/gsdi13/papers/256.pdf. Ramage, S. (2010). User-generated spatial content and the need for SDI standards. OGC paper for GSDI 10 Conference. Available at http://www.gsdi.org/gsdiconf/gsdi12/papers/105.pdf. Rinner, C. and Bird, M. (2009). Evaluating Community Engagement through Argumentation Maps - A Public Participation GIS Case Study. Article featured in Environment and Planning B, Vol.36, No.4, Pages 588-601. Available at http://digitalcommons.ryerson.ca/cgi/viewcontent.cgi?article=1010&context=geography.


Satellite Imaging Corporation (2012) Hurricane, Tornados and cyclone hazard mitigation. Webpage about damage assessment using remotely sensed imagery. Available at http://www.satimagingcorp.com/svc/hurricane_mitigation.html. Schade, S., Luraschi, G., De Longueville, B., Cox, S. and Diaz, L. (2010). Citizens as sensors for crisis events: Sensor web enablement for volunteered geographic information. Paper for the WebMGS 2010 Conference, page 2. Available at http://www.isprs.org/proceedings/XXXVIII/4-W13/ID_02.pdf. See, Dr. Linda. (2012). IIASA Geo-Wiki development team member, email communication 24.5.2012. Senaratne, H., Gerharz, L., Pebesma, E., Schwering, A. (2012). Usability of Spatio-Temporal Uncertainty Visualization Methods. Accepted paper AGILE 2012 conference. Available at http://ifgi.uni-muenster.de/~epebe_01/agile2012.pdf. Silverman, C. (2011). Best Practices for Social Media Verification. Online article for Columbia Journalism Review, page 6. Available at http://www.crowdsourcing.org/document/bestpractices-for-social-media-verification/4411. SERTIT. (2011). Christchurch, New Zealand city center damage assessment map - detailed. Available at http://reliefweb.int/sites/reliefweb.int/files/resources/D11973B5BAB988618525784200694 DCE-map.pdf. Sithole, M. (2012). Making order out of noise, reconfiguring the VGIS applications into meaningful spatial data structures and environmental decision support systems, theoretical concepts behind the integration of VGI and spatial data structure. A Critical review of VGI data quality. Online article. Available at http://uwaterloo.academia.edu/MunyaradziSithole/Papers/1533072/Making_order_out_of _noise_reconfiguring_the_VGIS_applications_into_meaningful_spatial_data_structures_and _environmental_decision_support_systems_theoretical_concepts_behind_the_integration_ of_VGI_and_spatial_data_structure._A_Critical_review_of_VGI_data_quality. Stark, H. (2010). QUALITY ASSESSMENT OF VGI BASED ON OPEN WEB MAP SERVICES AND ISO/TC 211 19100-FAMILY STANDARDS. Paper for the University of Applied Sciences Northwestern Switzerland, Institute of Geomatics Engineering.. Pages 1-17. Available at http://2010.foss4g.org/papers/3264.pdf. Sui, D., Elwood, S. and Goodchild, M.(2012). Volunteered Geographic Information, Public participation, and crowdsourced production of geographic knowledge. Post on Po Ve Sham, Muki Haklays personal blog. Available at http://povesham.wordpress.com/2011/11/27/citizen-science-as-participatory-science/. Tveite, H. and Langaas, S.(1999). An accuracy assessment method for geographical line data sets based on buffering. Article featured in the International Journal of Geographical


Information Science, Vol.13, No.1, Pages 27-47. Available at http://spatialnews.geocomm.com/whitepapers/pap-scan.pdf. URISA. (2007). GISCorps Volunteer Deployment Handbook. Handbook for URISA volunteers. Available at http://www.giscorps.org/documents/vol_handbook.pdf . Ushahidi. (2011). Ushahidi Guide to verification. Community resource. Available at http://community.ushahidi.com/uploads/documents/c_Ushahidi-Verification-Guide.pdf. UNITAR, UNOSAT, EC, JRC and The World Bank. (2010). Atlas of building damage assessment, Haiti earthquake 12 January 2010. In support to Post-disaster needs assessment and recovery framework. Atlas series of main affected cities in Haiti, Page 1. Available at http://unosat-maps.web.cern.ch/unosatmaps/HT/EQ20100114HTI/PDNA_HTI_EQ2010_AtlasPaP_v1_HR.pdf. UNOSAT. (2010). Haiti: Building Damage Assessment - In support to Post Disaster Needs Assessment and Recovery Framework (PDNA). Building damage poster for Haiti. Available at http://unosat-maps.web.cern.ch/unosatmaps/HT/EQ20100114HTI/PDNA_HTI_EQ2010_BuildingDamagePosterA0_v1_LR.pdf Wikipedia. (2011). Computer Literacy: Computer skills. Wiki article. Available at http://en.wikipedia.org/w/index.php?title=Computer_literacy&action=history. Yamazaki, F., Kouchi, k., kohiyama, M., Muraoka, N. and Matsuoka, M. (2004). Earthquake damage detection using high-resolution satellite images. Pages 1--4. Academic paper. Available at http://staff.aist.go.jp/m.matsuoka/others/IGARSS04_yama.pdf Zielstra, D. and Hochmair, H., (2011). Digital Street Data - Free versus Proprietary. Online article for GIM International, July 2011, Vol.25, No.7, Pages 29-33. Available at http://www.gim-international.com/issues/articles/id1739-Digital_Street_Data.html.


7. Appendices
The appendices include all of the original wiki guidance given to experiment participants and their survey answers. These can be found on the wiki: http://improvinggeocrowdsourcing.wikispaces.com/.

