
How and When Do Hackathons for Scientific Software Work? Insights from a Multiple-Case Study


1st Author Name
Affiliation
City, Country
e-mail address

2nd Author Name
Affiliation
City, Country
e-mail address

3rd Author Name
Affiliation
City, Country
e-mail address
ABSTRACT

Scientific communities are experimenting with hackathons, short-term intense software development events, in order to
advance the technical and social infrastructure that supports
science. We know little empirically, however, about the
outcomes of these hackathons and even less about how to
plan them to maximize the likelihood of success. This paper
aims to fill that gap by presenting a multiple-case study of
the stages a hackathon goes through as it evolves and how
variations in how stages are conducted affect outcomes. We
identify practices across the preparation, execution, and
follow-through stages of a hackathon that meet the
specialized needs of scientific software. Differences in the
kinds of disciplines included, classes of users, and team
formation strategies reveal tradeoffs among technical
progress, surfacing user needs, and building community.
Our findings have implications for future empirical studies,
the kinds of technology support that need to be in place for
hackathons, and funding policy.
Author Keywords

Scientific software; hackathons; multiple-case study; qualitative methods.

ACM Classification Keywords

H.5.3. [Information interfaces and presentation (e.g., HCI)]: Group and Organization Interfaces - Computer-supported cooperative work.
INTRODUCTION

Software is of central importance to modern scientific practice [14-16,40]. Scientists often write their own software, from small scripts that process data and create figures for publications to large workbench applications that integrate visualization, simulation, and analysis. Scientists quite often replicate each other's efforts,
however, because they do not realize they have common
needs. Some proportion of this software is an invisible

resource that scientists might be willing to share [15].


Because software is easily replicated and distributed, it can
in theory be collectively enhanced and maintained.
In practice, however, even when qualified and motivated
people are available, a lack of human infrastructure [24]
and users' unfamiliarity with the codebase [3,34] present
barriers to contribution. To overcome these barriers,
scientific communities are experimenting with hackathons,
short-term intense events where teams of scientists from
academia and industry, postdocs, graduate students, and
software developers collaborate face-to-face to share and
develop software. Prior research suggests that hackathons
may be effective ways to attract and train new contributors,
learn about the technical details of users' needs, and create
and enhance ad hoc teams [18,22,25,39]. Other than
informal evidence from scientific communities who have
held hackathons, CSCW researchers know little empirically
about the immediate outputs of these hackathons, and we
know even less about how to plan them to maximize the
likelihood of success.
A key challenge for design is that hackathons vary along
many dimensions (e.g., duration, size, goals, agendas). Prior
research, however, has identified underlying practices that
seem to be common across different instances [25,39]. For
example, soliciting use cases from participants provides an
occasion to identify community needs, which is important
even if the rest of the hackathon is unsuccessful. These
practices influence downstream hackathon activities, such
as defining technical objectives, who will work on what and
with whom, outputs of the hackathon, and what lasting
impact, if any, the hackathon will have. Therefore, to
provide useful, evidence-based design guidance, it seems
desirable to understand the entire lifecycle of a hackathon.
We therefore ask the following questions:
(1) What are the stages a hackathon goes through as it
evolves?
(2) How do variations in how stages are conducted affect
outcomes?
To answer these questions, we conducted a multiple-case
study [42] of three hackathons applied to scientific
software. We collected multiple sources of evidence,
including online documentation to understand hackathon
planning practices and task progress, 71 hours of on-site
observations to understand event dynamics, and 23 semi-structured interviews to understand in more detail the kinds of interactions we observed and the reasons for them. We conducted pre- and post-surveys to understand participants'
preparation activities, their perceived outcomes, their
satisfaction with outcomes, and their reasons for being
satisfied or dissatisfied. Finally, we extracted committed
changes and issues from hackathon source-code repositories
to triangulate on our qualitative data and compare outputs
from each hackathon.
Our findings group hackathon activities under three stages:
a preparation stage marked by idea brainstorming, learning
about tools and research profiles, and preparing tools and
datasets; an execution stage marked by team formation,
building solutions, knowledge sharing, and building social
ties; and a follow-through stage marked by reification of
ideas, stimulation of user engagement, and maintenance of
social ties. Differences in the kinds of disciplines included,
classes of users, and styles of team formation reveal
tradeoffs among technical progress, surfacing user needs,
and building community. In the following sections we
review related research, describe our study, present our
results, and discuss the implications of our findings.
BACKGROUND
What is a Hackathon?

A hackathon can be defined as a short-term event where computer programmers and others involved in software
development collaborate intensively on software projects.
The term is a portmanteau of the words hack and marathon.
In this context, the word hack refers to computer
programming in the exploratory, not criminal, sense.
A hackathon typically begins with presentations about the
event, as well as about the subject of the hackathon if any.
Attendees then suggest ideas and form teams based on
individual interests and skills. The main work then begins,
generally lasting from one day up to one week. At the
conclusion of the hackathon, there are presentations where
teams demonstrate their results. If the hackathon is a
competition, a panel of judges selects winning teams and
awards prizes.
What is the Point of a Hackathon?

Briscoe and Mulligan [2] loosely group hackathons as being either tech-centric or focus-centric. Tech-centric
hackathons aim at software development with a specific
application or technology. Focus-centric hackathons, in
contrast, apply software development to address a social
issue or business objective.
Technology companies such as Google and Yahoo! tend to
hold tech-centric hackathons to grow their user base and
create demand for their products. For instance, Open Hack
Day, a hackathon run publicly by Yahoo! since 2006, has
focused on having third-party developers learn about and
use Yahoo! APIs (e.g., Flickr) to build novel software
applications and win prizes [11].

On the one hand, open-source software projects like PyPy, OpenBSD, and Linux put on tech-centric hackathons to
rapidly advance work on specific development issues. On
the other, incorporating new developers into the project to
replace ones who leave is a key concern. As such, during
hackathons, core developers work in pairs with newcomers
to help them learn project conventions and details of the
codebase [32].
National and local government agencies tend to hold focus-centric hackathons in order to address social issues such as
crisis management, open government, and health. For
instance, in 2014 the British Government ran
DementiaHack, a hackathon dedicated to improving the
lives of people living with dementia. CareUmbrella, an app
that lets users create tags for things around the house that
when tapped, activate audio recordings explaining how they
work or the story behind them, won first prize [9].
In contrast, technology companies like Google, Facebook,
and Yahoo! put on focus-centric hackathons to encourage
new product innovation. For instance, Facebook's "Like" button was conceived at one of the company's hackathons
[19]. Hackathons also complement routine software
development, addressing the need to explore ideas that
involve high market and technical uncertainties [30].
While hackathons have been commonplace among
professional developers for some time [1,28], now
hackathons organized by and for students and scientists are
surging in size, scale, and frequency for networking,
recruiting, pitching, and learning. For instance, in 2014,
there were some 40 intercollegiate hackathons. This year,
more than 150 are expected [23]. A large reason for this is
that, in comparison to a job fair, the chaotic environment of
a hackathon allows recruiters to more easily identify
students who are likely to thrive in a technology career.
Likewise, students can test-drive the experience of working
in a technology company before committing to a job.
Scientists are using hackathons to advance the ecosystem of
software that supports research, encourage collaboration
between communities working on related problems, and
train scientists in software development. For example, at
Mozilla Science Lab's 2014 global hackathon, over 22
cities were represented over two days, including scientists
from Auckland to Melbourne, to Paris and London, to New
York and San Francisco. Products included tutorials and
other learning materials on topics like data analysis and
software development, and tools for reproducibility in
science, such as extracting scientific facts from publications
[38].
The Tension Between Needs of Scientific Software and
Open-Source Software in General

The open-source software development model is widely held up as an approach for scientific software developers to
follow. In both cases, communities who are geographically
and organizationally dispersed write and maintain the

software. Both depend on new contributors [20:179], but
newcomers face both technical and social barriers to
contribution. For instance, developers placing their first
contribution into an open-source software project often
have trouble using a project's libraries and frameworks, as
well as finding able and willing mentors [3,34].
However, directly applying the open-source model to
scientific software development neglects important
differences between scientific software and open-source
software in general. First, scientists who build tools are
serving their own, possibly idiosyncratic short-term needs
[21]. As a result, there is little opportunity for other
scientists to learn what tools are available. Second, even if
the tools could be of much more value to a community if
built in a particular way (for instance, using popular data structures and the newest versions of libraries and frameworks), the scientists building them often lack
knowledge of these needs as well as the incentives to meet
them [14,15].
In contrast, reputation is an effective incentive for
contribution in open-source software [31], where the
number of followers a developer has is a signal of social
status [7]. Third, unlike open-source software, much
scientific software only remains active for the span of the
research grants supporting it. Even if scientists
understand the larger community's needs and work to serve
them, the time scale of their available resources may not
match the community needs. Fourth, scientists who would
be willing to invest the time to adapt, extend, and maintain
these tools may be deterred by the lack of human
infrastructure [24] and unfamiliarity with the codebase. It is
daunting to learn about a code base to make useful
modifications, especially when one has little or no
connection with the code's authors. Furthermore, most
scientists are never formally taught how to build, validate,
and share software well [10,29].
Fortunately, informal evidence suggests that hackathons
may be a good fit to both the specialized problems of
scientific software and open-source software in general.
Interactions with other attendees and tutorials expose
participants to new tools [25,39]. Setting the agenda of the
hackathon provides an occasion to discuss and prioritize
community issues and needs [18,22,39]. Incentives are
built-in; to do their work, scientists need tools, and will
invest the time and resources to create them. Hackathons
can be repeated, potentially bringing in new recruits [25,32]
and renewing the human infrastructure over time. Finally,
people new to a code base can get a gentle introduction,
with hands-on experience and mentoring [5,22,25,32,39].
However, there is as of yet little empirical support for these
claims.
Benefits of Face-to-Face Interaction

Temporary collocation can speed up software development work that is normally coordinated remotely [27,37]. When
team members are collocated, they can move visible

artifacts, mark them to reflect mutually agreed-on changes,


and easily consider issues and alternatives [27]. In a field
study of an automobile company using radical collocation,
an extreme form of collocation where all team members
work a few feet from one another in the same room, Teasley
et al. [37] found that the ability to overhear allowed team
members to have informal training sessions and meetings
around project artifacts. Radically collocated teams doubled
their productivity compared with the previous company
baseline.
Organizational research shows that face-to-face meetings in
the life of distributed teams can create and enhance social
ties among team members. Starting a project with a face-to-face meeting may jumpstart this process because team
members can develop a shared understanding of the work
[13]. The literature recommends face-to-face meetings
throughout a project's lifetime to maintain the social ties
underlying professional relationships [26]. Hinds and
Cramton [12] found that the effects of site visits, where
team members travel to the location of their coworkers to
spend time working and socializing with them, can be long-lasting. After returning home, globally distributed team
members tended to be more responsive, communicate more,
and disclose more information to one another.
To summarize, the possible benefits of hackathons are: (1)
use the affordances of temporary collocation to rapidly
advance technical work; (2) use formal and informal
communication to create awareness of community needs
and thus facilitate extra work [40] outside the hackathon;
and (3) use face-to-face interactions to build durable social
ties. We aim to contribute to this body of knowledge by
understanding whether and how hackathons achieve these
benefits, and how to manage various aspects of design
throughout the hackathon timeline to increase the likelihood
of reaching desired outcomes.
METHOD

To address our research questions, we conducted a multiple-case study [42] of three hackathons applied to scientific software. Several considerations led us to this set.
Our first criterion was to pick a clearly single-disciplinary
hackathon and an interdisciplinary hackathon to see how
design considerations could facilitate or hinder achieving
balance between software developers' interests and scientists' needs, which is a key challenge in Science of
Team Science 1 research (e.g., [36]). For instance, we
expected to see different mechanisms for achieving
common ground between communities than when only one
community was present.
We found two hackathons meeting this criterion in
OpenBio Codefest 2014 (referred to hereafter as OBC)
and the 2014 NSF Polar DataVis Hackathon (referred to hereafter as PDV). OBC was a two-day hackathon aimed
at giving developers of open-source bioinformatics software
libraries such as Biopython and scientific workflow
platforms like Galaxy a chance to be fully focused on their
projects.
Attendance was informal, with about 45
participants on the first day, and 35 on the second day.
PDV was a two-day hackathon aimed at fostering
collaboration between data visualization experts and polar
scientists. Expected outcomes were novel and high impact
prototypes and visualizations. Attendance was more stable
than OBC, with the same 39 participants on the first and
second day. Because of the difference in collaborative
orientations, PDV served as a theoretical replication [42] of
OBC.

1 Science of Team Science (SciTS) is a field aimed at understanding and improving the processes and outcomes of collaborative, team-based research [8].
We sought a third case that would contrast with our original
pair on other dimensions, serving as another theoretical
replication. We selected the 2015 R PopGen Hackathon
(referred to hereafter as RPG), a hackathon that aimed to
foster an interoperating ecosystem of tools and resources
for population genetics data analysis using the popular R
platform. Whereas PDV comprised two different disciplines
working on related problems, RPG comprised a single
discipline. We therefore expected to see fewer mechanisms
for developing common ground. Although both OBC and
RPG comprised participants from a single discipline, OBC
had primarily developers while RPG had different classes of
users, including end users contributing use cases, end users
with some programming experience wanting to learn how
to develop reusable packages (the unit of code distribution
in R), and pure method developers. We expected this
contrast in roles and programming experience to be helpful
in forming theory about the kinds of knowledge exchanged,
and how it is exchanged during a hackathon. It should also
reveal differences in how awareness of common needs arises compared to when only developers are present. Attendance
for RPG was 28 participants, and in contrast to OBC and
PDV, it was five days long.
Data Collection

We collected multiple sources of evidence, including event documentation (e.g., mailing list discussions, agendas,
announcements, idea lists, and team progress reports) to
understand planning practices, 71 hours of on-site
observations (OBC = 17 hours; PDV = 17 hours; RPG = 37
hours) to understand event dynamics (e.g., how teams form
around tasks), and 23 semi-structured interviews (Table 1)
to understand in more detail the interactions we observed
and the reasons behind them. At each hackathon we
captured photographs of the event space, daily team stand-up reports, work breaks, technical sessions, and team
meetings. The organizers of OBC and RPG allowed us to
video record participant introductions, stand-up reports, and
final demonstrations. We were unable to video record any
portion of PDV due to legal and insurance requirements
associated with the venue.

Hackathon  ID   Team(s)
OBC        P1   Seven Bridges
OBC        P2   ADAM
OBC        P3   Arvados, CloudBioLinux
OBC        P4   Arvados
OBC        P5   Khmer, Galaxy
OBC        P6   Arvados
OBC        P7   ADAM
PDV        P8   Crawl Polar Data
PDV        P9   Visual Story, Temporal Vis
PDV        P10  Crawl Polar Data, Temporal Vis
PDV        P11  TangeloHub, GISCube
PDV        P12  Crawl Polar Data, Event Metrics
PDV        P13  Visual Story, Crawl Polar Data, Polar Imagery
PDV        P14  Visual Story, Polar Imagery
PDV        P15  Crawl Polar Data, PolarHub
RPG        P16  Community website
RPG        P17  Community website
RPG        P18  Streamline VCF data flow
RPG        P19  Outliers in multi-variate stats
RPG        P20  Simulation
RPG        P21  Simulation
RPG        P22  Estimating population size
RPG        P23  Outliers in multi-variate stats

Table 1. Summary of interview participants and their teams, grouped by hackathon (D = developer; U = end user, often little to no development expertise; D-U = end user with moderate development expertise; O = organizing team; M = manager).

In selecting interviewees we aimed for coverage across hackathon teams. For RPG, we looked across the spectrum of participant roles (Table 1), aiming to see examples of teaching and learning as well as end user feedback. Our on-site observational notes helped us develop probes around
the motivations for concrete interactions, how they
happened, and their results. We solicited participants by e-mail and interviewed them using either Skype or Google
Hangouts. We interviewed one participant by phone.
Interviews typically lasted just under an hour. A
professional transcription services firm transcribed all
interviews.
We designed a pre-survey to understand participant expectations (i.e., "What would the ideal outcome of this hackathon be to you?"), tasks participants desired to work on (i.e., "Please specify one or more tasks you want to accomplish at the hackathon"), and preparation for those tasks (i.e., "What preparation did you do for the above tasks? Select all that apply. [list]"). One week before each
hackathon, the organizers e-mailed a link to our survey to
all registered participants.
We created a post-survey to assess if, how, and why or why not outcomes matched expectations. The survey asked questions about participants' satisfaction with their team's work (i.e., "To what extent were you satisfied or dissatisfied with the work completed in your team?" from "Very dissatisfied" to "Very satisfied"), reasons for this (i.e., "What were the reasons for the extent to which you were satisfied or dissatisfied with the work completed in your team?"), perceived outcomes (i.e., "In your opinion, what were your most important outcomes of the event?"), and whether outcomes matched expectations (i.e., "Think about what your ideal outcome coming into the event was. To what extent was this outcome achieved?" from "Not at all" to "Perfectly"). On the last day of OBC, the organizer
e-mailed a link to our post-survey to participants, resulting
in a response rate of 68%. To achieve a higher response rate
for PDV we handed out and collected paper copies of the
post-survey after final demonstrations, but before
participants left. This resulted in a 100% response rate. We
tried this approach for RPG as well, but only a few participants returned paper copies to us before leaving the venue. We therefore immediately sent e-mails to participants
with the link to the post-survey, and then sent reminders a
few days later. The final response rate was 75%.
Finally, we obtained work artifacts (e.g., presentation
slides, committed source-code changes) throughout each
hackathon in order to triangulate on our qualitative data and
compare outputs from each hackathon.
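To triangulate on repository activity of the kind summarized in Figure 3, per-day commit counts can be extracted directly from a local clone. The following sketch shows one straightforward way to do this; it is an illustration of the idea rather than the exact extraction pipeline we used.

    # Illustration only: count commits per day from a local git clone.
    import subprocess
    from collections import Counter

    def commits_per_day(repo_path):
        # "git log --pretty=%ad --date=short" prints one ISO date per commit.
        dates = subprocess.run(
            ["git", "-C", repo_path, "log", "--pretty=%ad", "--date=short"],
            capture_output=True, text=True, check=True,
        ).stdout.split()
        return Counter(dates)

    if __name__ == "__main__":
        for day, count in sorted(commits_per_day(".").items()):
            print(day, count)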
Data Analysis

We applied standard qualitative analysis techniques [4] to our interview transcripts, observational notes, and event
documentation. We first imported these materials into the
Dedoose qualitative data analysis software [33]. Three of
the authors independently conducted open coding on the
text about activities before, during, and after each
hackathon, differences among them, and hackathon outputs.
In the next phase of analysis we wrote, shared, and
discussed descriptive memos about emerging themes in the
data. We used the video recordings to corroborate and
augment our observational notes. We met weekly to unify,
refine, and collapse codes where there was commonality,
using themes from our memos as support. We applied the
resulting set of codes to the remaining data, adding codes
when necessary. We continued this process until theoretical
saturation.
RESULTS

We group hackathon activities under three primary stages: preparation, execution, and follow-through.

Preparation
Idea Brainstorming

Participant engagement begins a few weeks before the hackathon. Participants brainstorm ideas for hackathon
tasks using information and communication technologies
(ICTs) that organizers have provided. Common among
them is the ability to communicate asynchronously via text
in a way that is publicly viewable. For instance, OBC
attendees used a shared Google Document to indicate their
interest in different ideas. Because most attendees were
affiliated with software projects, they had identified tasks in
the usual ways, e.g., directly from customers and the issue
tracker. As such, the content on this page tended to be
scant, with only a few phrases documenting the idea. In
contrast, PDV and RPG attendees used GitHub's Issues
feature to propose and discuss ideas. GitHub issues are a
way to keep track of tasks, enhancements, and bugs. After
someone opens an issue, a linear discussion view allows
anyone to comment on the issue or reply to other
comments.
The general process of evolving an idea is as follows. First,
a participant posts a textual description of their idea, often
in the form of a use case. The participant may also say
something about their general strategy for implementation
of the idea, as well as provide source-code and hyperlinks
to supporting datasets and technologies.
After the idea is proposed, other participants start to ask for
clarifications about the use case provided and details of the
datasets and technologies suggested. People from different
disciplines are involved here, typically in suggesting
potentially useful technologies with which they have
familiarity (P8, P9, P11). For instance, a computer scientist
may reply asking for a description of what each column
represents in a dataset that a domain scientist provided.
Participants tend to begin making use of social networking functionality to direct attention to questions, e.g., "@[name1] ^^ please see @[name2] question above" (P8), and to notify others who may have relevant expertise but are not yet participating in the discussion (P8, P11). As proposers clarify their ideas, other participants begin to understand their needs and their skillsets (P8, P19). For example, after a polar scientist (P10) clarified their dataset, a computer scientist was able to write a script to transform it into a format that visualization tools could easily consume.
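As an illustration of the kind of transformation script described above, the sketch below converts a tabular dataset into GeoJSON that common visualization tools can load. The column names and the output format are hypothetical and are not taken from the PDV dataset itself.

    # Hypothetical sketch: convert a CSV of point observations into GeoJSON.
    # The columns ("lat", "lon", "date", "value") are illustrative only.
    import csv
    import json

    def csv_to_geojson(csv_path, geojson_path):
        features = []
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                features.append({
                    "type": "Feature",
                    "geometry": {
                        "type": "Point",
                        "coordinates": [float(row["lon"]), float(row["lat"])],
                    },
                    "properties": {"date": row["date"], "value": row["value"]},
                })
        with open(geojson_path, "w") as f:
            json.dump({"type": "FeatureCollection", "features": features}, f)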
As the idea becomes clearer, support begins to build.
Participants may begin making positive comments in
support of the idea (e.g., P8, P9, P19), but some critiquing
goes on as well. For instance, participants point out that
additional use cases should be considered (P8, P9, P23), or
that more effective technical solutions exist than those
proposed, e.g., more useful visualization layouts (P9).
Participants, however, do not critique at length. Unless it is
clear to proposers that addressing the critique will benefit a
larger number of participants, they will not significantly

alter the idea (e.g., merge it or remove it). Instead, participants defer to the team formation stage to make a decision (e.g., P8).
While evaluating proposed ideas, other participants use
hyperlinks to cross-reference related ideas previously
posted and compare and contrast objectives (P11, P17). If
the idea stems from a personal research or work need,
proposers do tend to feel ownership and will advocate for it
as a separate idea (P8, P19). In other cases, the idea may
stem from the proposer's desire to learn about a particular
technique or technology (P9) or to provide an obvious
community service, e.g., a tool demonstration, reference
document for available software to do analyses (P11, P17).
With agreement from other participants, these ideas are
merged with related ones.
Brainstorming comes to a halt a few days before the
hackathon. Proposed ideas are in various states. A few do
not have any comments. Many others have been
clarified, have participant support, and have suggested
enhancements. Few ideas have concrete development
objectives and task assignments because they are still just
collections of use cases, technologies, and datasets. No
ideas have been ruled out. The work of translating ideas
into tasks begins on the first day of the hackathon.
Learning about Tools, Datasets, and Research Profiles

While participants are brainstorming ideas, they are simultaneously learning about tools and datasets that may
address their own needs (P3, P9, P10), as well as the needs
of others (P11, P20, P22). They use the @mention notation
to bring potentially useful tools and datasets to others'
attention. In some cases, references to these resources
resolve some of the proposed ideas due to an existing
solution already being in place (e.g., P9).
Our results suggest that this process can help people from
different disciplines characterize other ones. For instance,
P11, a developer of a scientific visualization tool who was
looking to learn about open problems in the polar science
community where his tool could contribute, told us:
So that GitHub pre-meeting activity was helpful to me to
orient, learn the problems that are interesting, to learn who
the sum of the profiles of the researchers, Ah, this is a very
visionary person who wants to do this. This is somebody
who's providing data specific to this community. This is
somebody who is doing experiments and using this. (P11)
A practice unique to RPG was that organizers encouraged
participants to introduce themselves using a mailing list set
up for the event (P16). We found some evidence that this
helps participants engage others when they arrive at the
hackathon (e.g., P22, P23). For instance, even though P22
did not contribute to the task brainstorming discussions, he
reached out to another participant doing related research,
with whom he would eventually work together at the
hackathon.

Alignment: Preparing Tools and Datasets

In the days leading up to the hackathon, participants prepare tools (P4, P11, P14, P15) and datasets, as well as install
software (P20) to be used during the hackathon. This
generally involves making sure documentation is available
and the code is in a buildable state (P4, P11, P14, P15).
The ideas discussed during brainstorming serve as
important inputs to this process. Developers modify their
tools to address additional use cases brought up in
discussions (P11). For instance, P11 added new features to
demonstrate a workflow that used a dataset discussed in
another idea page. Domain scientists ensure that their
datasets are in formats that can be easily understood,
queried and processed at the hackathon (P9, P10). For
instance, after receiving questions about the formatting
standards used in her data set, P10 updated her readme file
to explain the different formats.
One useful practice used in both PDV and RPG was the
creation of a list of software that should be installed in
advance to ensure efficient completion of the proposed
tasks (P12, P18). The organizers of PDV provisioned an
Amazon machine that people could get login credentials
for, and then install any software they wanted on it, which
eliminated the barrier of having to install unfamiliar
libraries and frameworks on one's own machine (P12).
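Such an install list can also be paired with a short script that participants run before the event to verify their setup. The sketch below is one possible form; the package names are placeholders, not the actual PDV or RPG lists.

    # Sketch of a pre-event environment check; the package list is a placeholder.
    import importlib

    REQUIRED_PACKAGES = ["numpy", "pandas", "matplotlib"]  # substitute the event's list

    def check_environment(packages=REQUIRED_PACKAGES):
        missing = []
        for name in packages:
            try:
                importlib.import_module(name)
            except ImportError:
                missing.append(name)
        if missing:
            print("Please install before the hackathon:", ", ".join(missing))
        else:
            print("All listed packages are available.")

    if __name__ == "__main__":
        check_environment()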
In general, social networking functionality such as
@mentions allowed participants to quickly ask directed
questions of other participants, receive updates about idea
clarifications, and bring in participants with relevant
interest and expertise to bear on the discussion. The ability
to cross-reference other ideas helped participants identify
and merge related ideas. However, there were some
difficulties in using the ICTs provided. Some of the
computer scientists at PDV suspected that the domain
scientists did not propose ideas because they could not
figure out how to use GitHub (P9, P12). This is likely to be
a problem whenever bringing multiple disciplines together
at a hackathon. Different disciplines have familiarity with
different tools, and some members of one discipline will be
unintentionally excluded from participation by the medium selected.
Execution
Team Formation

Ideas identified in the preparation stage seed team formation. Team formation is the very first activity on day
one of the hackathon after brief introductions from
organizers and participants.
Our observations revealed three distinct team formation
strategies. In the open shepherding style of OBC, most
participants came to the hackathon already associated with
a project (and therefore a team) since OBC's objective was
to give these developers focused time on their projects. As a
result, most participants by default sat with their colleagues.
There were, however, "free agents," attendees not associated with these projects. During individual introductions, the event organizer suggested matches between free agents and existing teams, and between teams.
In contrast, PDV and RPG used project pitches, short
presentations made by individuals to the group describing
ideas intended for wider adoption. Most were based on
ideas discussed in the preparation stage. Pitches were
followed by an opportunity for questions and group
interactions with the proposers for participants to pick
teams. In the selection by organizer style of PDV,
participants indicated their interest in ideas by writing their
names on flip charts (one flip chart per idea). The
organizers selected a few ideas with high interest to work
on first. Other high-interest ideas were reserved for later, according to the organizers, in order to balance them with ideas that had lower interest. On the second day,
participants were encouraged to work on different ideas to
disperse participants' enthusiasm and energy across ideas
(P8). Periodically the organizers walked around and
determined which teams were complete and which
needed more time. When teams were complete, new teams
formed around remaining ideas.
In the selection by attraction style of RPG, ideas that
people got behind were de facto selected. Participants wrote
down ideas they thought would be interesting, one idea per
sheet. Participants then discussed their ideas with others
sitting at their table, and each table was asked to pitch the
most important idea. The organizer wrote this idea down on
a chart and attached the relevant post-it notes. Each table
used different color notes. This was repeated in round robin
fashion. If ideas from other tables were similar, the post-its
were attached to the same chart (see Figure 1). Volunteers
were then asked to stand next to the flip charts, and
everyone else was free to wander around the room,
discussing pitches, offering suggestions, and deciding how to fit in. In contrast to the selection by organizer style, teams in the selection by attraction style stayed together for the duration of the hackathon.

Figure 1. An idea with high interest from participants (left) and low interest (right).

Building Solutions

As soon as team formation concludes, teams spend focused time on their projects for the remainder of the hackathon. Event spaces are configured to seat all participants in a single room, and to accommodate multiple teams (see Figure 2). According to survey responses, the average size of a team in OBC (n=31) was 4 (sd=1.8, low=1, high=8), in PDV (n=32) it was 7 (sd=2.5, low=2, high=14), and in RPG (n=19) it was 6 (sd=1.06, low=4, high=8).

Figure 2. One of the event spaces, with separate tables for each team.
How Did Teams Work? Different working styles could be observed. Teams of developers with a priori development
targets (e.g., addressing the backlog of issues in the project
tracker) did not generally need input from participants
outside of their teams. Therefore, they often worked
independently seated with their colleagues. Bursts of
independent work were followed by team discussions of the
code they were writing (P2, P7). These face-to-face
discussions supplemented the more formal code review that
occurs in open-source software development:
We have a bit of informal discussion at the table.
Normally we mark up each other's [source-code changes]
pretty heavy. But since everybody was sitting right around
the table it was a lot easier just to say, Oh, I would change
this. Oh, I would change that, instead of actually
commenting directly on the [source-code changes] (P2).
Another common pattern was pairs of individuals
communicating while their other team members worked
independently (P3, P18, P20, P22, P23). Here we saw team
members talking through their ideas with one another, and
showing each other errors and successes on their screens.
What Was the Role of End Users? We observed a
generative approach to requirements gathering. To take
advantage of the expertise in the room, developers on teams
would implement a series of new features, ask attendees in
other teams to try them out, and have those same attendees
approach them throughout the event to address issues and

Figure 2. One of the event spaces, with separate tables for


each team.

bugs they encountered in use (P3, P5). We also observed


that some teams had end users who would approach other
teams in the hackathon to clarify use cases or needs (P20,
P21). In team discussions about design of the tools, end
users and developers realized that certain use cases were
unclear (e.g., what format the data is in when the tool reads
it). Team members decided that while developers wrote
code, end users should initiate these conversations with
other teams. Other teams expected end users to approach
them with such questions because the teams looking to
clarify needs would announce that they needed input during
their daily progress reports to all participants.
Our observations revealed end users working to keep task
boundaries clear. This happens because of the short-term
nature of the hackathon; participants need something to
show by the end of it. For example, one participant working
on a project to help users build simulations of population
genetics data told us about her role in staving off
developers' desire to build a user interface:
That was sort of, I guess, at least two to three times a day
there would be us talking about [a user interface] and like,
oh, yeah, yeah, I could have done that, and I would just
say, ok, so does that help with our immediate task? (P20)
How Did ICTs Support Collaboration? Certain
technology needs to be in place to support technical work.
For instance, participants need version control to capture
and merge contributions. They need shared documents
(e.g., wikis) to keep track of individual assignments and
progress, and shared repositories to store these documents,
as well as datasets, relevant publications, and
documentation.
Overall, GitHub worked well because it provided
integration of these technologies. Even developers,
however, talked about the learning curve associated with
the GitHub workflow (e.g., P12, P13, P18, P23). For
instance, P18 and P23 recalled that their team members

would occasionally overwrite each other's code changes while working. However, participants initially unfamiliar
with GitHub (e.g., P23) acknowledged that using it in the
preparation stage to brainstorm ideas helped lower the
barrier to use during execution. Participants used many of
the same features in both stages (e.g., posting issues,
authoring on the wiki). This suggests that using consistent
ICTs in preparation and execution is an important design
consideration for advancing technical work.
Scientists developing software that analyzes large datasets
(on the order of gigabytes) require storage support beyond
what is currently provided by state-of-the-art version
control. Members of the RPG Outliers in multi-variate stats
team, for instance, found that storing datasets on GitHub
slowed down their package. Moving the datasets to a shared
Google Drive folder eliminated this problem (P19, P23).
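A common way to implement this workaround is to keep large datasets out of the repository entirely and fetch them from shared storage on first use. The sketch below illustrates the pattern; the URL is a placeholder rather than the team's actual Google Drive folder.

    # Sketch: download a large dataset from shared storage on first use and
    # cache it locally, instead of committing it to version control.
    import os
    import urllib.request

    DATA_URL = "https://example.org/shared/genotypes.csv"  # placeholder location
    LOCAL_PATH = os.path.join("data", "genotypes.csv")

    def fetch_dataset(url=DATA_URL, path=LOCAL_PATH):
        """Download the dataset once; later calls reuse the cached copy."""
        if not os.path.exists(path):
            os.makedirs(os.path.dirname(path), exist_ok=True)
            urllib.request.urlretrieve(url, path)
        return path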
Types of and Satisfaction with Outputs. In contrast to
OBC and RPG teams, we found that PDV teams generated
many discussions of collaboration plans but few novel
software prototypes (Figure 3). PDV teams were also less
satisfied with their technical output. Only 66% (21/32) of
PDV participants were satisfied or very satisfied with what
was achieved in their team. In contrast, 81% (25/31) of
OBC and 86% (18/21) of RPG participants were satisfied or
very satisfied with technical work achieved in their team.
Interviews and open-ended survey responses revealed that
this stemmed from not having clear team objectives, and
was exacerbated by the team formation strategy chosen.
Because inputs needed from each discipline were not clear,
polar scientists and data visualization developers were
uncertain how they could concretely contribute (P9, P13,
P14). As a result, participants joined teams based primarily
on their interests; polar scientists joined teams working on
ideas proposed by other polar scientists, and data
visualization developers joined teams working on ideas
proposed by other developers. This led to relatively homogenous teams. When some teams dissolved mid-day, continuing teams had to rehash previous discussions for newcomers (P9), leaving little time to take ideas from concept to realization.

Figure 3. Average number of source-code commits (left) and discussions (right) per day two weeks before, during, and after each hackathon. For PDV and RPG we show unique comments posted to the GitHub issue tracker. In contrast, because OBC participants used a shared Google Document, we show unique edits to that document extracted from the revision history.

Figure 4. Two different knowledge sharing practices: bootcamps (left) and watching others code (right).
Bootcamps: how to download, install, configure, and use software; programming conventions and practices.
Tool demonstrations: datasets and tools under development; how to download, install, configure, and use software; how to construct workflows; user needs.
Watching others code: data structures; programming conventions and practices.
Round robin discussions: datasets and tools under development; user needs.

Table 2. Knowledge sharing practices and kinds of knowledge exchanged.
Knowledge Sharing

We found seven different types of knowledge shared, and four supporting practices (Table 2). One important type of
knowledge that we expected to find, and did find, was
knowledge about user needs. For instance, while
demonstrating a metadata search tool for polar data, the
developer of the tool learned:
[Polar scientists] want to be able to search on [a single
meta data attribute on a file] and they want to be able to
say like give me all the files from this specific data set or
this slice of the data set that have cumulus clouds and over
this region, and the cumulus clouds attribute is right down
nested within the data set. It's not like top level, it's not

explicit, it's very implicit within the data set. So that was
one thing that fall that was kind of a shortcoming that
we've managed to address now (P12)
A knowledge sharing practice that we observed only at
RPG was the bootcamp, an interactive tutorial designed to
get participants up to speed on a particular technology or
codebase. Unlike OBC where all participants were
developers and unlike PDV where domain scientists were
not necessarily expected to write code, all RPG participants
were expected to collaboratively write and share software.
Yet because of the spectrum of users to developers, the
organizers anticipated that some participants would not
have much familiarity with popular tools for software
development (P16). As a result, the organizer of RPG ran a
bootcamp where the topic was GitHub. Participants sat
around the organizer with their laptops and the organizer
projected his laptop on a screen. The organizer went
through the basics of setting up a Git repository,
transferring it to GitHub, and then adding files to the
repository. Participants followed along on their computers.
Anyone with questions would shout out, and either the
organizer or someone else with experience would answer
the question, sometimes coming around to the asker's
screen to examine the issue (Figure 4, left). The close
proximity of bootcamps to other teams allowed the team
members to move freely in and out, depending on their
interest and expertise in the topic (P17, P18, P19, P22).
Bootcamps and tool demonstrations are similar in that they
both teach how to download, install, configure and use
software, but there are important differences between them.
Bootcamps focus on well-known tools that have gained
widespread adoption (e.g., GitHub, R), teaching broad skills
that will benefit most participants as they work. As a result
they are typically held before teams start work. In contrast,
tool demonstrations focus on tools developed within
research labs to do specialized analyses. They are thus
likely of interest to fewer scientists and occur within teams.
They often use specific datasets and use cases that scientists
provide during brainstorming activities in the preparation
stage. Scientists need these resources to do their work. As a
result, during tool demonstrations, scientists learn about datasets relevant to their work, and how to construct workflows using the tools. Scientists provide datasets and
use cases that developers of the tools may have not
originally anticipated, allowing developers (and end users
to some extent) to learn more about user needs.
Watching others code allows participants to understand data
structures and programming conventions and practices (P7,
P10, P17, P22, P23) more effectively compared to learning
on their own. For example, two team members using their
own computers to go through a coding tutorial would look
over each other's shoulders occasionally to see if their
output matched and discuss errors (P23). To learn how to
use particular frameworks and data structures, participants
would go over to team members who were using those
frameworks to code and watch over their shoulder as they
were coding. The more experienced team members would
vocalize what they were doing and why they were doing it.
Watching experts code worked well because it was more effective than copying code examples without having sufficient context to understand why they were useful, and at the same time it did not burden the experts to the point where they could not get anything done.
Building Social Ties

Participants build community both inside and outside of working on technical tasks. Intense teamwork under
pressure allows participants to learn more about their
collaborators' personalities, see how they react to problems
along the way, and develop strong connections with them
(P2, P5, P7, P18). For example, in the hours before final
demonstrations, participants rush to integrate code they
have been writing independently. Often, they must solve
errors together such as missing dependencies or overwriting
each other's changes in the code repository. Participants
told us that this intense collaboration lowers the barrier to
future collaboration (P5, P6, P9, P13, P18, P22):
I had some big asks about [Galaxy] workflows, how they
do their workflows, and I knew it was just going to be more
productive for me to start to build that working relationship
in person, and that's exactly what happened. We understand each other's personalities and perspectives and
what motivates us, and we can drop each other notes. (P5)
Spending time together outside of the intense technical
work (e.g., during coffee breaks, meals, and bus rides to the
hackathon venue from the hotel) allows participants to learn
about each other's interests and reflect on opportunities for
collaboration. These informal discussions lead to
collaboration plans for writing grant proposals (P9, P13),
working on manuscripts (P15, P20, P22) and collaborating
on source-code projects outside the purview of the
hackathon (P2, P7, P18, P20, P21). For example, these
discussions resulted in the creation of a new Working
Group to create a tool definition and workflow language to
make workflows portable across different platforms (e.g.,
Galaxy), so that users can easily move their workflows
among these platforms and share them with other scientists.

This objective was not identified a priori as a priority of any one team. Instead, it emerged from informal discussions
among participants from multiple teams in a kitchen area
adjacent to the main hackathon space.
Responses to our surveys provide additional support for
building upon existing social ties. We found that 68% (23/34) of OBC participants had worked previously remotely with other participants. Of these participants, 35% (8/23) responded that these relationships were now "a little better" and 57% (13/23) responded "much better," with only 9% (2/23) saying the relationships had not changed. Similarly, 63% (20/32) of PDV participants had worked together previously remotely. Of these 20 participants, 10% (2/20) described these relationships as "a little better" and 90% (18/20) described them as "much better." In contrast, only 43% (9/21) of RPG participants had worked previously remotely. All nine described their relationships with these participants as "much better."
The results for OBC seem obvious when considering that most participants at the hackathon worked on their primary open-source software projects; they would therefore naturally already have prior experience with their teammates. The number of PDV participants who
collaborated previously remotely initially surprised us,
since the objective of the hackathon was to bring together
two disjoint communities. From looking at participants'
institutional affiliations and speaking with participants in
interviews, it seems that some proportion of participants
within each community had worked together previously.
Because teams ended up being relatively homogenous,
participants strengthened these existing ties. The organizers
of RPG, in contrast, sought to diversify participants along
demographics and expertise within the same community
(P16). Although participants described knowing of one
another, e.g., in package documentation and on mailing lists
(P18, P20, P22), they had not worked together before, even remotely.
Follow-through
Reification of Ideas

Teams often have a good idea about what the next steps are
regarding their tasks because there are naturally some
objectives that are incomplete and feedback that needs to be
addressed. For instance, developers who give
demonstrations of their tools can incorporate feedback
given to them about important use cases (P11, P12, P15).
OBC and RPG participants seemed quite motivated to
continue working on hackathon tasks.
For these
participants there were obvious motivations to continue the
work. OBC participants had selected tasks that they knew
would provide value to users. These included fixing
reported bugs and implementing features requested by
potential customers (P1, P2, P7). To do their work, RPG
participants need tools. Making these tools interoperate
more effectively, e.g., readily using the output of one tool as input for another or enhancing a tool to support multiple data formats, would reduce extra work required in their daily use.
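As a hypothetical illustration of what such interoperability work can look like, the sketch below adapts one tool's JSON output into the tab-separated table another tool expects. The tools, fields, and formats are invented for illustration; the RPG teams themselves worked with R packages.

    # Invented example of an output-to-input adapter between two tools.
    import csv
    import json

    def adapt_output(json_path, tsv_path):
        # Rewrite "tool A's" JSON records as the tab-separated table "tool B" reads.
        with open(json_path) as f:
            records = json.load(f)  # e.g., [{"sample": "s1", "freq": 0.12}, ...]
        with open(tsv_path, "w", newline="") as f:
            writer = csv.writer(f, delimiter="\t")
            writer.writerow(["sample", "freq"])
            for rec in records:
                writer.writerow([rec["sample"], rec["freq"]])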

Figure 3 provides evidence that these participants followed through, continuing to commit source code to their teams' repositories. Team members who had collaborated previously outside of the hackathon (e.g., P2 and P7, P18) were confident about being able to wrap up development work on their tasks fairly quickly. Several participants mentioned having regularly scheduled teleconferences with their team members to finish hackathon tasks (P1, P5, P6, P17, P20, P21), and arranging to meet at other hackathons to continue the work (P20, P21). Participants mentioned targeting source-code packages and manuscripts.

Most notably, the working group that emerged from OBC is still meeting every two weeks via Google Hangouts (P1, P5, P6). To coordinate work and make decisions they use a Google Group. Since OBC, the group has grown to 99 members, and is still active 10 months after the event, averaging 41 posts per month. They have produced two drafts of a tool definition language for formally describing tools used in scientific workflows, as well as a reference implementation. They are also working on a publication.

Reactions from PDV participants were mixed. On the one hand, the computer scientists who had demonstrated their tools were able to get some feedback that could address use cases that domain scientists provided (P8, P15). On the other hand, many teams comprising domain scientists never got to the point where concrete development tasks were proposed. As a result, there was nothing for the few computer scientists in these teams to follow up on, and no obvious incentive to do so:

"So [P9] thinks that he's going to get a proposal in second day's hackathon and I expect he will run that by me and I will have input to that. Other than that, I have no plans to follow up or maybe to see people in the future." (P13)

Stimulation of User Engagement

In the weeks following the hackathon, developers, especially those who gave tool demonstrations in the execution stage, reach out to potential users to see if and how they are using the tools (e.g., P7, P11, P12, P14). This generally involves incorporating issues that users submitted during and after the hackathon and notifying them of this (P12), or working with users to help them make their own source-code contributions (P5, P7, P12).

Multiple developers expressed a desire to meet users face-to-face at future events to facilitate this process, as well as to describe their projects' roadmaps going forward (P2, P7, P12, P14, P18), though we do not have much evidence for whether this actually happened. An exception was that in a follow-up e-mail to us two months after the hackathon, P12 mentioned that at a recent research conference, he met with some data archive managers to whom he had demonstrated his tool. The data managers were still using his tool, and using the feedback he received there, he was able to add more features to the tool.
Maintenance of Social Ties

At the same time developers are reaching out to end users, attendees may be working to maintain and follow up on the relationships they have built at the hackathon. Participants talked about plans to follow up on these relationships, including continuing to give each other
feedback on ideas for development features (P5, P21),
exchanging resources like scripts and datasets (P10, P18),
hashing out plans for manuscripts (P9, P15, P19, P20), and
making site visits to give research talks and explore
possibilities of future collaborations (P2, P7, P18).

Knowledge Sharing: 1. research profiles; 2. datasets & tools; 3. use, configure, install software; 4. construct workflows; 5. user needs; 6. programming conventions & practices; 7. data structures.
Technical Work: 1. brainstorming tasks; 2. building solutions.
Community Building: 1. establishing ties; 2. maintaining ties.

Table 3. Summary timeline of hackathon outcomes and component activities across the preparation, execution, and follow-through stages. We use ovals rather than straight lines to indicate that start and end times and extent of overlap with other activities are approximations.

DISCUSSION

Table 3 summarizes outcomes and component activities within the hackathon lifecycle. Surprisingly, some, such as learning about users' needs, actually begin to take shape before the face-to-face portion of the hackathon begins.
Similarly, technical work and community building start in
the preparation stage, but also continue in follow-through.
Below, we discuss how differences between the kinds of
disciplines, team formation strategies, and classes of users
mean there are likely tradeoffs among these outcomes.
Mixing Domain Scientists with Computer Scientists

In selecting our cases for this study, we expected to see differences in how stages were conducted when domain
scientists were included in addition to computer scientists.
At OBC, open-source software teams had mostly identified
their tasks ahead of time, jotting them down in a shared
document but having no public discussion. Because
participants had clear goals and expertise, they were able to
make rapid progress on their technical work. However, not
including domain scientists in brainstorming discussions or
at the event meant that there were fewer opportunities to
gain awareness of the needs of end user domain scientists.
The interdisciplinary nature of PDV and the team formation
strategy selected seem to have combined to produce many
collaboration plans but limited technical progress.
Compared with OBC, computer scientists were able to learn
about domain scientists needs, since visualization
developers learned about polar scientists research
questions and datasets during preparation. During
execution, the organizers selected which tasks would
happen when, without specifying specific contributions
needed from each discipline and without ensuring that
teams included a mix of people from each. Moreover,
continuing teams had to rehash discussions for newcomers
when other teams stopped, distracting from the time
available to develop software.
Including Different Classes of Users

By selecting RPG, we expected to see how including participants from all areas of the spectrum from end users to developers would influence the types of knowledge exchanged, how it was exchanged, and how common needs would arise. We found knowledge sharing practices unique to RPG, such as bootcamps and watching others code, that can help incorporate newcomers and build community. Including only developers at OBC seems to have advanced technical work, but come at the cost of not incorporating newcomers and not learning about users' needs. Although there was more training than at OBC, RPG participants were also able to make quite a bit of technical progress. Why was this the case? Watching others code resulted in participants learning about the programming conventions and practices needed for their work without burdening developers. The longer duration of this hackathon may have also helped offset any losses in productivity due to experts mentoring less experienced programmers. In general, however, there seem to be tradeoffs among technical progress, building community, and awareness of user needs.
Comparisons to Other Engagements

How do hackathons compare with other forms of engagement and professional knowledge exchange? On the one hand, they seem quite similar to other events. The formal exchange of knowledge in bootcamps resembles tutorial workshops, such as Software Carpentry [41], which focuses on teaching researchers about software program design, version control, and testing. The close ties that participants form with their teammates resemble abbreviated versions of the bonds that form between Summer of Code students and mentors working remotely from one another [39]. The hands-on training participants receive is similar to the experiences of newcomers to Agile Sprints [32]. The informal conversations among the large number of participants at coffee breaks and dinner, some of which may lead to collaboration plans, resemble those of academic conferences.
On the other hand, hackathons seem quite different. In
particular, the nature of the social ties that hackathon
participants develop seems a level deeper than what can be
expected out of a tutorial/workshop or academic
conference. Our findings have shown that hackathon
participants discuss their work with one another, observe
each other working, socialize, and exchange knowledge.
The pressure to produce before a deadline forces
collaboration that leads to understanding people's
personalities and their working styles, resembling some of
the antecedents to situated coworker familiarity [12]. At the
same time participants are prototyping tools, they seem to
be prototyping working relationships with their peers.
Future Work

We can think of several avenues for future work. One way to further test confidence in our findings would be to add a series of replications to our case study. For instance, we suspect that PDV's failure to deliver new prototypes was due, in part, to the team formation strategy chosen. However, a rival explanation might be that all interdisciplinary hackathons face this challenge: less common ground might have necessitated more discussion, which took time away from technical work. To rule out this rival explanation, we might perform one or more theoretical replications of PDV with multi-disciplinary teams, but with different team formation strategies. A survey could also be used to augment our qualitative data on the prevalence of hackathons and the practices they use (i.e., planning, knowledge sharing, working styles), the reported experiences of participants, the perceived outcomes, and the relationships among these variables.
We also see a need for longitudinal studies that follow hackathon participants for months afterward, perhaps administering surveys, to assess the longer-term impact of the event. The impact may be more interesting in terms of the social relationships than the maturation of hackathon prototypes into production-quality software, because participants are busy professionals with their own objectives, priorities, and responsibilities. It seems unrealistic to expect them to continue working toward hackathon goals when they are no longer supported to do so. In contrast, we found evidence that hackathon participants developed collaboration plans with their team members and other attendees outside their teams. To follow up on our findings that hackathons seem to increase familiarity and build social ties, it seems useful to explore quantitative measurements of these constructs in surveys (e.g., [6,12]) and in co-authorship of contributions to scientific software source-code repositories, both before and after hackathons.
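As a rough illustration of the repository-based measurement we have in mind, the sketch below counts pairs of authors who committed to the same project repositories before and after a hackathon. It is a minimal sketch rather than part of our study: the repository paths and hackathon date are hypothetical placeholders, and shared commit activity is only a crude proxy for a social tie.

import itertools
import subprocess
from collections import Counter
from datetime import date

# Hypothetical inputs: replace with the community's repositories and event date.
REPOS = ["/data/repos/vis-tool", "/data/repos/data-archive"]
HACKATHON = date(2015, 6, 1)

def authors(repo, since=None, until=None):
    # List commit author names in a date window using plain `git log`.
    cmd = ["git", "-C", repo, "log", "--pretty=format:%an"]
    if since:
        cmd.append("--since=" + since.isoformat())
    if until:
        cmd.append("--until=" + until.isoformat())
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return {line.strip() for line in out.splitlines() if line.strip()}

def ties(repos, since=None, until=None):
    # Count author pairs who committed to the same repository (a crude proxy for a tie).
    pairs = Counter()
    for repo in repos:
        for pair in itertools.combinations(sorted(authors(repo, since, until)), 2):
            pairs[pair] += 1
    return pairs

before = ties(REPOS, until=HACKATHON)
after = ties(REPOS, since=HACKATHON)
print("new ties after the hackathon:", len(set(after) - set(before)))

Comparing such before-and-after tie counts for participants versus non-participants would provide one simple quantitative complement to survey measures of familiarity.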
Implications for Design

Our results have important implications for the kinds of technology support needed for hackathons. In the preparation phase, social networking functionality should be in place to allow participants to quickly ask directed questions of other participants, receive updates about idea clarifications, and bring participants with relevant interests and expertise into the discussion. It should provide transparency that allows participants to find ideas, merge related ones, and suggest modifications to ideas they care about. Especially if the hackathon is multi-disciplinary, the functionality should provide transparency that allows interested parties to find out about others' research interests and the technologies and datasets they use.
In the execution phase, participants need version control to
capture and merge contributions.
They need shared
documents (e.g., wikis) to keep track of individual
assignments and progress, and shared repositories to store
these documents, as well as datasets, relevant publications,
and documentation. When possible, the technology should
be consistent across preparation and execution so that
participants do not have to learn a new technology or tool.
A platform that integrates social networking functionality
with software development seems ideal. When using a
hackathon to link multiple disciplines, however, organizers
should keep in mind that different disciplines have
familiarity with different tools, and members of one
discipline may be unintentionally excluded from
participation. One potential solution here may be to align
two different technologies in the same system by creating
an application programming interface (API) that allows
them to interoperate.
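To make the API suggestion concrete, the sketch below mirrors project ideas from a discussion platform into GitHub issues, so domain scientists can keep proposing ideas in a familiar tool while developers track them alongside the code. It is a minimal sketch under stated assumptions: the idea-board URL, repository name, and JSON fields are hypothetical placeholders; only the GitHub issue-creation endpoint is an existing public API.

import os
import requests

# Hypothetical discussion-platform export and target repository.
IDEA_BOARD_URL = "https://example.org/hackathon/ideas.json"
GITHUB_REPO = "example-org/hackathon-projects"
TOKEN = os.environ["GITHUB_TOKEN"]  # personal access token with repo scope

def fetch_ideas():
    # Assumes the discussion platform can export proposed ideas as a JSON list.
    resp = requests.get(IDEA_BOARD_URL, timeout=30)
    resp.raise_for_status()
    return resp.json()  # e.g., [{"title": ..., "description": ..., "proposer": ...}]

def mirror_idea(idea):
    # Create a GitHub issue for the idea via the public REST API.
    resp = requests.post(
        "https://api.github.com/repos/%s/issues" % GITHUB_REPO,
        headers={"Authorization": "Bearer " + TOKEN,
                 "Accept": "application/vnd.github+json"},
        json={"title": idea["title"],
              "body": idea["description"] + "\n\nProposed by " + idea["proposer"],
              "labels": ["hackathon-idea"]},
        timeout=30)
    resp.raise_for_status()
    return resp.json()["html_url"]

for idea in fetch_ideas():
    print(mirror_idea(idea))

A similar bridge could run in the opposite direction, posting issue status back to the discussion platform so that non-developers can follow progress without having to learn the development tools.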
Existing ICTs are likely to fall short when considering
hackathons that include remote participants, both
individuals and whole teams. As others (e.g., [37]) have
pointed out, the real benefits of collocation come from
people being at hand. For hackathons, individuals need to
be included in one of the teams. Current solutions for
communication via cameras and microphones (e.g., Google
Hangouts) are not ideal, as sharing documents (e.g.,
sketches, whiteboards) and brainstorming require the ability
to fluidly shift one's visual attention. Knowledge shared via watching others code, overhearing group discussions, and attending tutorials is also likely to be hampered. Future studies of such hackathons are needed.
Implications for Funding Agencies

We view this paper as part of a growing body of work (e.g., [17,40]) on scientific collaboration that has important
implications for funding policy. In particular, we focus on
software written by scientists. Unless this software is
maintained, it soon becomes useless. Yet as we noted
previously, funding is generally limited to a specific
research project. Even software that has generated much
interest across multiple scientific communities can founder
if the primary maintainers find it impossible to meet the
evolving needs of users.
It is clear that funding agencies are interested in producing
and sustaining software that advances scientific knowledge
[35]. Other than developing better indicators for tracking
software usage (e.g., Scientific Software Map), hackathons
seem like a potential strategy. Indeed, the National Science
Foundation sponsored PDV, suggesting genuine interest in
the hackathon model. In the longer term, we see
opportunities for producing policy prescriptions on
incorporating hackathons as an element of scientific
software sustainability. For instance, hackathons could be
included in software maintenance plans for proposal
applications. Policy prescriptions could help funders design
and evaluate these plans. More careful empirical study,
especially of design considerations and lasting impact, will
be needed in the meantime.
CONCLUSION

In this paper we examined the stages a hackathon goes through as it evolves and how variations in how stages are
conducted relate to outcomes. We identified practices
across the preparation, execution, and follow-through
stages of a hackathon that meet the specialized needs of
scientific software. Differences in the kinds of disciplines
included, classes of users, and team formation strategies
suggest tradeoffs among technical progress, surfacing user
needs, and building community. Surprisingly, quite a few
activities begin to take shape before the hackathon begins
and have implications for the kinds of technology that need
to be in place. Our hope is that in addition to informing
future empirical studies, our results bring attention to the
hackathon model and raise the level of discussion in the
scientific community about planning and conducting
successful engagements.
REFERENCES

1. Mariva H. Aviram. 2015. JavaOne's Palm-sized winner: How 3Com stole the show, palms down. Retrieved May 11, 2015 from http://www.javaworld.com/article/2076473/mobile-java/javaone-s-palm-sized-winner.html
2. Gerard Briscoe and Catherine Mulligan. 2014. Digital Innovation: The Hackathon Phenomenon. Retrieved August 4, 2014 from http://www.creativeworkslondon.org.uk/wp-content/uploads/2013/11/Digital-Innovation-The-Hackathon-Phenomenon1.pdf
3. Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, and Sebastiano Panichella. 2012. Who is Going to Mentor Newcomers in Open Source Projects? Proceedings of the ACM SIGSOFT International Symposium on the Foundations of Software Engineering, ACM Press, 44:1–44:11. http://doi.org/10.1145/2393596.2393647
4. Juliet Corbin and Anselm Strauss. 2014. Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. SAGE Publications, Inc., Thousand Oaks, CA.
5. Michael R. Crusoe and C. Titus Brown. 2014. Channeling Community Contributions to Scientific Software: A Sprint Experience.
6. Jonathon N. Cummings and Sara Kiesler. 2008. Who Collaborates Successfully? Prior Experience Reduces Collaboration Barriers in Distributed Interdisciplinary Research. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 437–446. http://doi.org/10.1145/1460563.1460633
7. Laura Dabbish, Colleen Stuart, Jason Tsay, and Jim Herbsleb. 2012. Social Coding in GitHub: Transparency and Collaboration in an Open Software Repository. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 1277–1286. http://doi.org/10.1145/2145204.2145396
8. Holly J. Falk-Krzesinski, Noshir Contractor, Stephen M. Fiore, et al. 2011. Mapping a research agenda for the science of team science. Research Evaluation 20, 2, 145–158. http://doi.org/10.3152/095820211X12941371876580
9. HackerNest. 2014. DementiaHack TORONTO by the British Govt & HackerNest. Retrieved May 11, 2015 from http://www.eventbrite.com/e/dementiahack-toronto-by-the-british-govt-hackernest-tickets-12349265987?aff=estw
10. Jo Erskine Hannay, Carolyn MacLeod, Janice Singer, Hans Petter Langtangen, Dietmar Pfahl, and Greg Wilson. 2009. How Do Scientists Develop and Use Scientific Software? Workshop on Software Engineering for Computational Science and Engineering, IEEE Computer Society, 1–8. http://doi.org/10.1109/SECSE.2009.5069155
11. Ross Harmes. 2008. Open! Hack! Day! Retrieved May 13, 2015 from http://code.flickr.net/2008/09/03/open-hack-day/
12. Pamela J. Hinds and Catherine Durnell Cramton. 2014. Situated Coworker Familiarity: How Site Visits Transform Relationships Among Distributed Workers. Organization Science 25, 3, 794–814.
13. Pamela J. Hinds and Suzanne P. Weisband. 2003. Knowledge Sharing and Shared Understanding in Virtual Teams. In Virtual Teams that Work: Creating Conditions for Virtual Team Effectiveness, Cristina B. Gibson and Susan G. Cohen (eds.). Jossey-Bass, San Francisco, CA, 21–36.
14. James Howison and James D. Herbsleb. 2011. Scientific Software Production: Incentives and Collaboration. Proceedings of the ACM Conference on Computer-Supported Cooperative Work. http://doi.org/10.1145/1958824.1958904
15. James Howison and James D. Herbsleb. 2013. Incentives and Integration in Scientific Software Production. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 459–470. http://doi.org/10.1145/2441776.2441828
16. Xing Huang, Xianghua Ding, Charlotte P. Lee, Tun Lu, Ning Gu, and Sieg Hall. 2013. Meanings and Boundaries of Scientific Software Sharing. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 423–434. http://doi.org/10.1145/2441776.2441825
17. Steven J. Jackson, Stephanie B. Steinhardt, and Ayse Buyuktur. 2013. Why CSCW Needs Science Policy (and Vice Versa). Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 1113–1124. http://doi.org/10.1145/2441776.2441902
18. Toshiaki Katayama, Mark D. Wilkinson, Kiyoko F. Aoki-Kinoshita, et al. 2014. BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains. Journal of Biomedical Semantics 5, 5. http://doi.org/10.1186/2041-1480-5-5
19. Pedram Keyani. 2012. Stay focused and keep hacking. Retrieved May 11, 2015 from https://www.facebook.com/notes/facebook-engineering/stay-focused-and-keep-hacking/10150842676418920/
20. Robert E. Kraut and Paul Resnick. 2011. Building Successful Online Communities: Evidence-Based Social Design. MIT Press, Cambridge, MA.
21. Grace de la Flor, Marina Jirotka, Paul Luff, John Pybus, and Ruth Kirkham. 2010. Transforming Scholarly Practice: Embedding Technological Interventions to Support the Collaborative Analysis of Ancient Texts. Computer Supported Cooperative Work (CSCW) 19, 3-4, 309–334. http://doi.org/10.1007/s10606-010-9111-1
22. Hilmar Lapp, Sendu Bala, James P. Balhoff, et al. 2007. The 2006 NESCent Phyloinformatics Hackathon: A Field Report. Evolutionary Bioinformatics 3, 287–296.
23. Steven Leckart. 2015. The Hackathon Fast Track, From Campus to Silicon Valley. The New York Times. Retrieved May 11, 2015 from http://nyti.ms/1CawQxH
24. Charlotte P. Lee, Paul Dourish, and Gloria Mark. 2006. The Human Infrastructure of Cyberinfrastructure. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 483–492. http://doi.org/10.1145/1180875.1180950
25. Steffen Möller, Enis Afgan, Michael Banck, et al. 2014. Community-driven development for computational biology at Sprints, Hackathons and Codefests. BMC Bioinformatics 15, Suppl 14, S7. http://doi.org/10.1186/1471-2105-15-S14-S7
26. Bonnie A. Nardi and Steve Whittaker. 2002. The Place of Face-to-Face Communication in Distributed Work. In Distributed Work, Pamela J. Hinds and Sara Kiesler (eds.). MIT Press, Cambridge, MA, 83–110.
27. Judith S. Olson, Stephanie Teasley, Lisa Covi, and Gary Olson. 2002. The (Currently) Unique Advantages of Collocated Work. In Distributed Work, Pamela J. Hinds and Sara Kiesler (eds.). MIT Press, Cambridge, MA, 113–135.
28. OpenBSD. Hackathons. Retrieved May 11, 2015 from http://www.openbsd.org/hackathons.html
29. Prakash Prabhu, Yun Zhang, Soumyadeep Ghosh, et al. 2011. A Survey of the Practice of Computational Science. State of the Practice Reports, 1–12. http://doi.org/10.1145/2063348.2063374
30. Mikko Raatikainen, Marko Komssi, Vittorio Dal Bianco, Klas Kindstom, and Janne Jarvinen. 2013. Industrial Experiences of Organizing a Hackathon to Assess a Device-Centric Cloud Ecosystem. Proceedings of the IEEE Annual Computer Software and Applications Conference, IEEE Computer Society, 790–799. http://doi.org/10.1109/COMPSAC.2013.130
31. Jeffrey A. Roberts, Il-Horn Hann, Sandra A. Slaughter, and John F. Donahue. 2006. Understanding the Motivations, Participation, and Performance of Open Source Software Developers: A Longitudinal Study of the Apache Projects. Management Science 52, 7, 984–999. Retrieved May 9, 2014 from http://pubsonline.informs.org/doi/abs/10.1287/mnsc.1060.0554
32. Anders Sigfridsson, Gabriela Avram, Anne Sheehan, and Daniel K. Sullivan. 2007. Sprint-driven development: working, learning and the process of enculturation in the PyPy community. In Open Source Development, Adoption and Innovation, J. Feller, B. Fitzgerald, Walt Scacchi and A. Sillitti (eds.). Springer US, 133–146. Retrieved May 28, 2014 from http://cs.anu.edu.au/iojs/index.php/ifip/article/view/11308
33. SocioCultural Research Consultants, LLC. 2014. Dedoose Version 5.0.11, web application for managing, analyzing, and presenting qualitative and mixed method research data. Retrieved from http://www.dedoose.com
34. Igor Steinmacher, Marco Aurélio Gerosa, and David F. Redmiles. 2015. Social Barriers Faced by Newcomers Placing Their First Contribution in Open Source Software Projects. Proceedings of the ACM Conference on Computer-Supported Cooperative Work & Social Computing, ACM Press, 1379–1392. http://doi.org/10.1145/2675133.2675215
35. Craig A. Stewart, Guy T. Almes, and Bradley C. Wheeler. 2010. Cyberinfrastructure Software Sustainability and Reusability: Report from an NSF-funded Workshop. Indiana University, Bloomington, IN. Retrieved from http://hdl.handle.net/2022/6701
36. Daniel Stokols, Kara L. Hall, Brandie K. Taylor, and Richard P. Moser. 2008. The Science of Team Science: Overview of the Field and Introduction to the Supplement. American Journal of Preventive Medicine 35, 2 Suppl, S77–89. http://doi.org/10.1016/j.amepre.2008.05.002
37. Stephanie Teasley, Lisa Covi, M.S. Krishnan, and Judith S. Olson. 2000. How Does Radical Collocation Help a Team Succeed? Proceedings of the ACM Conference on Computer-Supported Cooperative Work, ACM Press, 339–346. http://doi.org/10.1145/358916.359005
38. Kaitlin Thaney. 2014. The #mozsprint heard round the world. Retrieved May 11, 2015 from http://www.mozillascience.org/the-mozsprint-heard-round-the-world/
39. Erik H. Trainer, Chalalai Chaihirunkarn, Arun Kalyanasundaram, and James D. Herbsleb. 2014. Community Code Engagements: Summer of Code & Hackathons for Community Building in Scientific Software. Proceedings of the ACM Conference on Supporting Group Work, ACM Press, 111–121. http://doi.org/10.1145/2660398.2660420
40. Erik H. Trainer, Chalalai Chaihirunkarn, Arun Kalyanasundaram, and James D. Herbsleb. 2015. From Personal Tool to Community Resource: What's the Extra Work and Who Will Do It? Proceedings of the ACM Conference on Computer-Supported Cooperative Work & Social Computing, ACM Press, 417–430. http://doi.org/10.1145/2675133.2675172
41. Greg Wilson. 2006. Software Carpentry: Getting Scientists to Write Better Code by Making them More Productive. Computing in Science & Engineering 8, 66–69.
42. Robert K. Yin. 2014. Case Study Research. SAGE Publications, Inc., Thousand Oaks, CA.