International Studies in the Philosophy of Science
Vol. 22, No. 2, July 2008, pp. 165–183

Franklin, Holmes, and the Epistemology of Computer Simulation

Wendy S. Parker

Allan Franklin has identified a number of strategies that scientists use to build confidence
in experimental results. This paper shows that Franklin’s strategies have direct analogues
in the context of computer simulation and then suggests that one of his strategies—the so-
called ‘Sherlock Holmes’ strategy—deserves a privileged place within the epistemologies of
experiment and simulation. In particular, it is argued that while the successful application
of even several of Franklin’s other strategies (or their analogues in simulation) may not be
sufficient for justified belief in results, the successful application of a slightly elaborated
version of the Sherlock Holmes strategy is sufficient.

1. Introduction
Allan Franklin (1986, 1989, 2002) has argued that the results of experiments in physics
usually come to be accepted primarily on rational evidential grounds, contrary to the
suggestions of some scholars in science studies, who have instead emphasized the
importance of social factors. In his efforts to show how scientists come to have good
non-social reasons for believing in experimental results, Franklin identifies numerous
experimental checks that scientists perform, as well as features of experimental results
that they look for, in light of which they can be more confident that the results are
‘valid’. In doing so, Franklin explicitly takes himself to be offering an epistemology of
experiment.
At least two authors (Weissart 1997; Winsberg 1999a, 1999b, 2003) have claimed
that many, if not all, of Franklin’s strategies have analogues in the context of computer
simulation. However, very little detailed analysis of what those analogous strategies
involve has been given. This paper provides such analysis and then considers how the
sets of strategies in the two contexts might be used to justify believing or accepting
results.

Wendy S. Parker is at the Department of Philosophy, Ohio University. Correspondence to: Department of
Philosophy, Ohio University, Ellis Hall 202, Athens, OH 45701, USA. E-mail: parkerw@ohio.edu

ISSN 0269–8595 (print)/ISSN 1469–9281 (online) © 2008 Open Society Foundation


DOI: 10.1080/02698590802496722
Section 2 provides relevant background on computer simulation models and their
evaluation. Section 3 presents five of Franklin’s strategies for building confidence in
experimental results and identifies analogous strategies in the context of computer
simulation, noting some special difficulties and caveats associated with the latter.
Section 4 suggests that one of these strategies—the so-called ‘Sherlock Holmes’
strategy—deserves a privileged place within the epistemologies of simulation and
experiment. In particular, it is argued that while the successful application of even
several of the other strategies may not be sufficient for justified belief in results, the
successful application of a slightly elaborated version of the Sherlock Holmes strategy
is sufficient. Finally, Section 5 offers concluding remarks.

2. Computer Simulation, Model Evaluation, and Code Evaluation


Scientists often represent real or imagined systems of interest using mathematical
models. Not infrequently, it is difficult or impossible to find exact solutions to the sets
of equations associated with these mathematical models. This often happens, for
instance, when the equations of interest are nonlinear partial differential equations. In
such cases, scientists may have little choice but to transform the equations of interest
in various ways—some of the terms in the equations may need to be combined, simpli-
fied, given alternative mathematical expression or omitted entirely—until they are in a
form such that approximate, local solutions can be found using brute-force numerical
methods. Call this set of differential equations for which scientists eventually set out to
find approximate solutions the continuous model equations.
Having selected methods by which to estimate solutions to the continuous model
equations, scientists often turn to the digital computer to perform the calculations that
are required to estimate solutions by those methods. This is especially true when scien-
tists are interested in how the modelled system will behave over time and thus need to
estimate solutions to the continuous model equations repeatedly, for a sequence of
values of the variable that denotes time. A computer program is written to tell the
computer which calculations to perform and in what order. When actually imple-
mented on a digital computer, this program is a computer simulation model—a physical
implementation of a set of instructions for repeatedly solving a set of equations in order
to produce a representation of the temporal evolution (if any) of particular properties
of a target system. The execution or ‘running’ of the computer simulation model with
specified initial and/or boundary conditions is a computer simulation.1
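
For concreteness, here is a minimal illustrative sketch, in Python, of a computer simulation model in this sense; the cooling-law example and all names and values are invented for illustration and are not drawn from this article. The continuous model equation is a simple cooling law, dT/dt = -k(T - T_env); it is discretized with an explicit Euler scheme, and running the resulting program with a specified initial condition constitutes a computer simulation.

    # Illustrative toy 'computer simulation model': explicit Euler time stepping
    # for the continuous model equation dT/dt = -k * (T - T_env).

    def simulate_cooling(T0, T_env, k, dt, n_steps):
        """Estimate the temperature at successive time steps from the initial condition T0."""
        temperatures = [T0]
        T = T0
        for _ in range(n_steps):
            T = T + dt * (-k * (T - T_env))  # Euler update: T_(n+1) = T_n + dt * f(T_n)
            temperatures.append(T)
        return temperatures

    if __name__ == "__main__":
        # Executing the model with a specified initial condition is a 'computer simulation'.
        trajectory = simulate_cooling(T0=90.0, T_env=20.0, k=0.1, dt=0.5, n_steps=100)
        print(trajectory[:5])
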
Scientists typically hope to use a computer simulation model to infer something of
interest about a target system—what the future state of the target system will be, or why
the target system displays particular properties, or how the target system would behave
under various counterfactual conditions, etc. Consequently, the question of whether
the computer simulation model is an adequate representation of the target system,
relative to the goals of the modelling study, is of utmost importance. The activity of
model evaluation (also sometimes known as ‘validation’) aims to collect evidence
regarding precisely this question. Depending on the goals of the modelling study, the
process of model evaluation might treat the simulation model as a black box and focus
only on its output, or it might involve opening the black box to investigate the accuracy of
particular modelling assumptions and/or the adequacy of the process by which solu-
tions to the continuous model equations are estimated.2
Investigation of the latter—of the adequacy of the process by which solutions to the
continuous model equations are estimated—is often considered an important activity
in its own right and will be referred to here as code evaluation (also sometimes known
as ‘verification’). Code evaluation is ultimately a mathematical activity, concerned with
the accuracy with which a piece of computer code can estimate solutions to a set of
equations, regardless of what those equations might or might not be selected to repre-
sent. As implied above, model evaluation sometimes, but not always, explicitly includes
code evaluation as a part. Code evaluation may also be undertaken when there is an
expectation that a set of equations will need to be solved as part of many different simu-
lation studies in the future; in that case, future model building and evaluation can be
streamlined by having a trustworthy piece of code that can be used in an off-the-shelf
manner for a range of solution tasks.

3. Analogues of Franklin’s Strategies in Model and Code Evaluation


Before discussing Franklin’s strategies and their analogues in computer simulation, it
is important to consider what Franklin means when he says that his strategies can
increase confidence that an experimental result is ‘valid’.
Two kinds of experimental validity are commonly distinguished: internal validity
and external validity.3 To help see the difference, assume that an experimental result
comes in the form of a statement about purported states of affairs during the experi-
ment; this statement will often be one that attributes properties to entities believed to
be involved in the experiment. For instance, the result statement might be of the form,
‘The sample of liquid boiled when its temperature reached T ± ε degrees Celsius’, or ‘On
seven out of fifty trials, a collision of type C produced a particle with mass M ± ε GeV/
c2’. To say that an experimental result is internally valid is to say that its associated result
statement is true. Internal validity is concerned with what is true of a particular exper-
imental system—the particular volume of liquid used in the experiment, the particular
set of collision events involved in the experiment, etc.4 An experimental result is exter-
nally valid when what its associated result statement says about the experimental system
is also (or would also be) true of other specified entities under specified conditions, e.g.
other volumes of liquid, other sets of particle collisions under such-and-such condi-
tions; external validity is concerned with the generalizability of the experimental result.
Franklin does not say explicitly whether by a ‘valid’ result he means one that is inter-
nally valid or externally valid, or something else altogether, but his primary concern
seems to be the internal validity of results. That is, when he speaks of building confi-
dence in an experimental result, or in its validity, in general this should be understood
to mean building confidence that a result statement (i.e. a statement about the experi-
mental system) is true.
The concepts of internal and external validity can also be applied in the context of
computer simulation modelling. Again assume that a result is presented as a statement,
in this case a statement about the behaviour of the programmed digital computing
system during the computer simulation (e.g. that a value of such-and-such was calcu-
lated for a particular variable at a particular time step). Because knowledge of this
behaviour is usually thought to be obtainable in an unproblematic way via printouts
and screen displays, the internal validity of simulation results is usually taken for
granted. It is their external validity—their being indicative of what is (or would be) true
of a specified target system—that is the primary focus of the epistemology of computer
simulation; both code evaluation and model evaluation are best understood as activi-
ties concerned with external validity.5
Despite this difference in focus, many if not all of Franklin’s confidence-building
strategies do have straightforward analogues in the context of computer simulation, for
reasons that will become clearer in Section 4. These simulation-related analogues are
strategies that can be, and in some cases already are, used to increase confidence in
simulation results. The remainder of this section presents five of Franklin’s strategies
and their model evaluation and code evaluation analogues, noting some special caveats
and difficulties along the way. (A summary of the analysis appears in Table 1.)

(1) Apparatus gives results that match known results. This strategy builds confi-
dence in a result by showing that an experimental apparatus used in obtaining the
result gives accurate results in other relevant instances.6 For example, when a ther-
mometer accurately registers the temperatures of volumes of liquid whose tempera-
tures are known in advance, this counts as prima facie evidence that the thermometer
is working properly and that it has registered the correct temperature for the volume of
liquid in our experiment (see Franklin 1989, 447–448). Franklin sometimes describes
this practice as ‘calibration’.7

Table 1 Computer Simulation Analogues of Five of Franklin’s Strategies


Franklin’s strategies              Model evaluation strategies          Code evaluation strategies
Show that…                         Show that…                           Show that…

(1) Apparatus gives other          Simulation output fits closely       Estimated solutions fit closely
    results that match known       enough with various                  enough with analytic and/or
    results                        observational data                   other numerical solutions
(2) Apparatus responds as          Simulation results change as         Solutions change as expected
    expected after intervention    expected after intervention on       after intervention on
    on the experimental system     substantive model parameters         algorithm parameters
(3) Capacities of apparatus are    Simulation model is constructed      Solution method is underwritten
    underwritten by well-          using well-confirmed                 by sound mathematical
    confirmed theory               theoretical assumptions              theorizing and analysis
(4) Experimental results are       Simulation results are reproduced    Solutions are produced using
    replicated in other            in other simulations or in           other pieces of code
    experiments                    traditional experiments
(5) Plausible sources of           Plausible sources of significant     Plausible sources of significant
    significant experimental       modelling error can be               mathematical/computational
    error can be ruled out         ruled out                            error can be ruled out

When it comes to model evaluation, an analogous confidence-building strategy
involves demonstrating that simulation output other than that whose trustworthiness
is at issue fits well with empirical data that are available for the target system. Call the
output whose trustworthiness is ultimately of interest R and the output to be compared
with empirical data R′. At a minimum, R′ should be selected such that the quality of the
model’s performance with respect to R′ is not independent of the quality of its perfor-
mance with respect to R; the more highly correlated the quality of the model’s perfor-
mances with respect to R and R′ are thought to be, the more one’s confidence can
increase in light of finding that R′ fits well with empirical data. For instance, suppose
that a climate model is used to produce predictions of yearly rainfall amounts in a
region for the coming decades. In most cases, confidence in the rainfall predictions will
be increased more by demonstrating that the model can simulate to a specified accu-
racy the past evolution of yearly rainfall amounts in that region than by demonstrating
that the model can simulate to a comparable accuracy the past evolution of average
yearly temperature for a different region. This is because the former is thought to be a
more reliable indicator than the latter of the model’s ability to accurately predict yearly
rainfall in that region.8
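
Purely for illustration (all numbers below are invented placeholders, not real rainfall data), such a model–data comparison might take the following minimal form: simulated past yearly rainfall, playing the role of R′, is compared with observations using a simple error measure and a threshold tied to the accuracy the study requires.

    # Invented illustration: compare simulation output R' (simulated past yearly rainfall)
    # with observational data for the target system, using a simple error measure.

    observed_rainfall = [812, 794, 830, 805, 788, 821]    # hypothetical observations (mm/year)
    simulated_rainfall = [805, 801, 818, 799, 795, 815]   # hypothetical model output, same years

    squared_errors = [(s - o) ** 2 for s, o in zip(simulated_rainfall, observed_rainfall)]
    rmse = (sum(squared_errors) / len(squared_errors)) ** 0.5

    tolerance_mm = 15.0  # accuracy judged adequate for the purposes of the (hypothetical) study
    print(f"RMSE = {rmse:.1f} mm/year; fit adequate: {rmse <= tolerance_mm}")
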
When the focus is on code evaluation, an analogous confidence-building strategy is
often referred to as ‘benchmarking’. Benchmarking comes in different varieties, but it
always involves comparing model output with ‘known’ solutions. Most obviously,
model output can be compared with any exact analytic solutions that are available for
the continuous model equations.9 However, computer simulation is often turned to
when it is difficult or impossible to find such solutions for conditions similar to those
of interest in the simulation study. Thus, alternative benchmarking activities might
demonstrate that the programmed solution algorithm accurately solves other, related
equations for which either analytic solutions or highly accurate numerical solutions are
antecedently known (see e.g. the Method of Manufactured Solutions in Roache 2002).
Through benchmarking, modellers build confidence that the computer simulation
model is able to deliver accurate-enough solutions for a set of equations of a relevant
type and thus that it can deliver accurate-enough solutions in the simulation at hand.
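
As a simple illustration of benchmarking in this sense (again an invented toy example, not one discussed in the article), the explicit Euler cooling code sketched in Section 2 can be checked against the exact analytic solution of its continuous model equation, T(t) = T_env + (T0 - T_env)e^(-kt):

    import math

    def euler_cooling(T0, T_env, k, dt, n_steps):
        # Toy solver: explicit Euler for the continuous model equation dT/dt = -k * (T - T_env).
        temperatures = [T0]
        T = T0
        for _ in range(n_steps):
            T += dt * (-k * (T - T_env))
            temperatures.append(T)
        return temperatures

    def exact_cooling(T0, T_env, k, t):
        # Known analytic solution, used here as the benchmark.
        return T_env + (T0 - T_env) * math.exp(-k * t)

    T0, T_env, k, dt, n = 90.0, 20.0, 0.1, 0.5, 100
    numerical = euler_cooling(T0, T_env, k, dt, n)
    errors = [abs(numerical[i] - exact_cooling(T0, T_env, k, i * dt)) for i in range(n + 1)]
    print(f"maximum absolute error against the analytic solution: {max(errors):.4f}")
    # If the maximum error is within the accuracy the study requires, the benchmark is passed.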

(2) Apparatus responds as expected following interventions. This strategy involves
demonstrating that the response of the experimental apparatus following an interven-
tion on the experimental system is as one expects it to be if the apparatus is functioning
properly. As with the previous strategy, through such a demonstration, ‘we increase our
belief in both the proper operation of the apparatus and in its results’ (Franklin 1989,
440). Consider the thermometer again: if a volume of liquid in which the thermometer
is placed is heated and the temperature registered by the thermometer increases by an
amount that is to be expected if the thermometer is working properly, then one has
prima facie evidence that the thermometer is working properly and therefore that it will
accurately indicate the temperature of the volume of fluid in our experiment.
An analogous model evaluation strategy involves demonstrating that changes to
the values of substantive model parameters affect the simulation output in ways that
are expected, given what is already known about the target system. For example, if
background knowledge of a mechanical system leads one to expect that decreasing fric-
tion in the system would increase its efficiency, one might check whether decreasing the
value of the variable denoting friction in the simulation model does affect the values
taken by other variables in such a way that the calculated efficiency of the system is
increased. As with the previous strategy, the output variables selected when pursuing
this strategy should be ones for which the quality of the model’s performance is
thought to be at least weakly (and preferably strongly) correlated with the quality of its
performance for the variables that are ultimately of interest. By showing that the model’s
responses to such interventions match with the expected responses of the target system,
one builds confidence that the model can provide the information that is ultimately
desired about the target system.
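
A toy illustration of such an intervention check (the incline example and all values are invented here, not taken from the article): a block sliding down an incline is simulated by time stepping, and lowering the friction parameter is expected, on background knowledge, to raise the calculated efficiency.

    import math

    def simulate_block_on_incline(mu, theta=0.4, length=5.0, g=9.81, dt=1e-3):
        """Toy simulation: a block slides from rest down an incline with friction coefficient mu.
        Returns efficiency = kinetic energy at the bottom / potential energy released (per unit mass)."""
        x, v = 0.0, 0.0
        a = g * (math.sin(theta) - mu * math.cos(theta))  # acceleration along the incline
        while x < length:
            v += a * dt
            x += v * dt
        kinetic = 0.5 * v ** 2
        potential_released = g * length * math.sin(theta)
        return kinetic / potential_released

    # Intervention on the friction parameter: decreasing it should increase the efficiency.
    eff_high_friction = simulate_block_on_incline(mu=0.3)
    eff_low_friction = simulate_block_on_incline(mu=0.1)
    assert eff_low_friction > eff_high_friction
    print(f"efficiency: mu=0.3 -> {eff_high_friction:.3f}, mu=0.1 -> {eff_low_friction:.3f}")
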
When it comes to code evaluation, an analogous confidence-building strategy
involves demonstrating that interventions on the solution algorithm change the simu-
lation output in ways that are expected on mathematical grounds. For instance, to
check whether the solution algorithm has the desired feature of convergence, modellers
can reduce the simulation model’s time step or its spatial discretization mesh (i.e. the
distance between points on the spatial grid for which calculations are performed) and
see whether the new solutions that are generated more closely approximate the exact
solutions (or at least seem to be converging, in the case where the exact solution is not
known). An even more demanding code evaluation test involves determining whether
the rate at which solutions are observed to converge in response to such an intervention
matches the rate at which formal analysis of the equations predicts the solutions should
converge (see e.g. Roy 2005, 134–136). Again, the aim is to build confidence in the
model qua solver of a particular kind of equation (or set of equations) in order to build
confidence in the solutions that it produces in the simulation of interest.
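
A minimal sketch of such a convergence test (an invented example, with illustrative assumptions only): the time step of an explicit Euler solver for dy/dt = -ky is repeatedly halved, and the observed order of convergence, estimated from successive errors against the analytic solution, is compared with the first-order rate that formal analysis predicts for this scheme.

    import math

    def euler_decay(y0, k, dt, t_end):
        """Explicit Euler estimate of y(t_end) for the equation dy/dt = -k * y."""
        y = y0
        for _ in range(round(t_end / dt)):
            y += dt * (-k * y)
        return y

    y0, k, t_end = 1.0, 0.7, 2.0
    exact = y0 * math.exp(-k * t_end)

    # Halve the time step repeatedly and record the error against the exact solution.
    time_steps = [0.1, 0.05, 0.025, 0.0125]
    errors = [abs(euler_decay(y0, k, dt, t_end) - exact) for dt in time_steps]

    # Observed order of accuracy from successive error ratios; formal analysis
    # predicts order 1 for explicit Euler, so the ratios should approach 2.
    for coarse, fine in zip(errors, errors[1:]):
        print(f"observed order of convergence: {math.log(coarse / fine, 2):.2f}")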

(3) Apparatus based on well-confirmed theory. If the theoretical principles used in
designing an experimental apparatus as a probe of a particular sort are themselves
well-confirmed, then the evidence that supports those theoretical principles ‘inspires
confidence’ in the apparatus (Franklin 2002, 5) and gives one some reason to believe in
the observations that are made with the apparatus during an experiment (Franklin
1989, 440). Franklin offers as examples the electron microscope and the radio tele-
scope; our confidence in results obtained with these instruments comes in part from
our recognizing that key principles relied upon in their construction are themselves
well-confirmed.
In the context of model evaluation, there is often a similar attempt to ‘inspire confi-
dence’ by pointing out that key modelling assumptions come from accepted scientific
theories; it may be emphasized that a model is ‘based on well-established physical
principles’. As Winsberg (2003) makes clear, however, the relationship between the
theoretical principles used in constructing a simulation model and results obtained
using that model is often far from straightforward, both because simulation models
often include simplified and idealized versions of the applicable theoretical principles
and because so many extra-theoretical considerations (including, but not limited to,
those having to do with solution methods) are involved in generating simulation
results (see ibid.). This suggests that the increased confidence that can be conferred
through this strategy may often be modest at best, at least without further argument
concerning the relationship between the relied-upon theoretical principles and the
simulation results.10
In the context of code evaluation, an analogous confidence-building strategy
involves showing that the numerical solution methods used in producing the simula-
tion results are underwritten by sound mathematical theory and analysis. For example,
one might point to accepted analyses showing the method to be stable for the kinds of
equations to be solved during the simulation. As noted by Roache (1998, 50), Roy
(2005, 133), and others, however, while such analyses have sometimes been given for
simple, linear equations, the methods and theorems used in those analyses are not
necessarily valid for the kinds of nonlinear equations often used in simulating natural
and social systems. So, once again, the prospects for using this strategy to increase
confidence in the context of code evaluation may be somewhat limited, at least at
present.

(4) Independent confirmation of results. Another strategy involves showing that
the results of an experiment can be replicated in other experiments using different appa-
ratus (Franklin 1989, 438). The example discussed by Franklin follows Ian Hacking’s
argument (Hacking 1983) that when two or more different kinds of microscopes
produce the same result, this can count as strong evidence that the result is genuine,
rather than an artefact of the apparatus. According to Franklin, two experiments can
be considered ‘different’ when they involve apparatus based on different theories, and
perhaps even when they simply involve differences of size, geometry, or personnel
(1989, 438). It should be noted, however, that not all replications count the same; the
increase in confidence that replication provides depends on how unlikely it is that the
two different experiments would both deliver the same mistaken result (i.e. would err
in the same way).
An analogous model evaluation strategy involves showing that a simulation result
whose external validity is at issue closely matches a result generated in another study
(whether simulation, traditional experiment or ordinary observation) addressing the
same question about the target system; to the extent that it is unlikely that the two stud-
ies would give the same mistaken result, confidence in the simulation result is increased
by the match. Note that in the case of comparing results from two simulation studies,
scientists often may not have good reason to think it unlikely that results from the two
studies would err in the same way; two simulation models may rest on some of the
same questionable and/or untested assumptions about the target system or even
borrow pieces of computer code from one another. In the case of comparing simula-
tion results with the results of traditional experiments, including those carried out on
the target system itself, one might think that if such experiments could be performed,
there would be no need to bother with computer simulation studies in the first place.
This need not be the case, however. Suppose, for instance, that conducting the tradi-
tional experiment would be a very delicate or difficult undertaking, prone to producing
results with very significant error. In that case, there might be good reason to think that
a simulation study would have a better chance of telling one what one wants to know
about the target system. Moreover, if the traditional experiment were later conducted
in a careful way and found to deliver a result that closely matched the simulation result,
then to the extent that it is unlikely that the studies would produce the same mistaken
result, this match can increase confidence in the result of the simulation. (Presumably,
the argument could work in the other direction as well, with the simulation result
increasing confidence in the experimental result! See possible examples in Roache
1998, 310–311.)
An analogous code evaluation strategy involves demonstrating that calculated solu-
tions are reproduced (to a desired degree of accuracy) by another piece of code
designed to estimate solutions to the same set of equations but implemented on a
different physical machine and/or with the use of a different numerical solution tech-
nique (see also Weissart 1997, 123). As an example of the latter, one might switch from
one finite differencing scheme to another or use a spectral solution method instead;
insofar as it is unlikely that models employing the different solution techniques would
each incorporate significant mathematical or programming errors and still produce the
same result, confidence in the result is increased.11 However, Roy (2005) notes that in
general it may not be that unlikely that two different solution algorithms would have
errors in common, since new algorithms often are based at least loosely (and some-
times directly) on earlier ones. Once again, then, the increase in confidence provided
by this strategy may often be relatively modest.
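
A small sketch of this strategy at the level of code (once more an invented example): the same toy decay equation is solved with two genuinely different schemes, explicit Euler and a second-order midpoint (Runge-Kutta) method, and the two results are checked for agreement within a tolerance.

    def euler_step(y, dt, f):
        # First-order explicit Euler step.
        return y + dt * f(y)

    def midpoint_step(y, dt, f):
        # Second-order midpoint (Runge-Kutta) step: a different solution technique.
        k1 = f(y)
        k2 = f(y + 0.5 * dt * k1)
        return y + dt * k2

    def run(step, y0, dt, n_steps, f):
        y = y0
        for _ in range(n_steps):
            y = step(y, dt, f)
        return y

    def decay(y):
        return -0.7 * y  # continuous model equation dy/dt = -0.7 * y

    y_euler = run(euler_step, 1.0, 0.001, 2000, decay)
    y_midpoint = run(midpoint_step, 1.0, 0.001, 2000, decay)

    # Close agreement between independently implemented schemes raises confidence that
    # neither result is corrupted by a significant coding or discretization error.
    print(abs(y_euler - y_midpoint) < 1e-3, y_euler, y_midpoint)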

(5) Elimination of plausible sources of error / alternative explanations of the results. This strategy, which Franklin often refers to as the ‘Sherlock Holmes’ strategy,
involves showing that all plausible sources of error and alternative explanations of the
results can be ruled out (Franklin 1988, 422; 2002, 4).12 By ‘sources of error’ Franklin
seems to mean problems with the experimental apparatus, while by ‘alternative expla-
nations’ he seems to mean accounts of the data according to which they were produced
by processes other than just those implied by the result statement (but correctly
detected/reported by the experimental apparatus). His illustrations include a case in
which a spacecraft collected data that seemed to indicate electrical discharges in the rings
of Saturn. Confidence that there really had been electrical discharges in the rings of
Saturn was increased by ruling out plausible sources of error, such as defects in the space-
craft’s telemetry, and alternative explanations of the data, such as the presence of electric
discharges near the spacecraft due to environmental phenomena, lightning, or dust; in
the end, that there really had been electrical discharges in the rings of Saturn was judged
the only plausible way of accounting for the data (Franklin 1989, 446–447; 2002, 4).
When it comes to computer simulation studies, there is usually little reason to worry
that the resulting ‘data’ have been produced by anything other than an attempt to run
the computer simulation model, so the possibility of alternative explanations of the
data in the sense described above is rarely a source of concern.13 However, it typically
is plausible (prima facie) that any of several sources of error might have impacted simu-
lation results of interest in such a way as to render them inadequate for the purposes
for which they are to be used. A few examples include error related to the mathematical
form of the continuous model equations (e.g. due to simplifications and idealizations),
error due to instability in the chosen solution method, and error due to programming
mistakes. A model evaluation strategy analogous to Franklin’s Sherlock Holmes strat-
egy would involve showing that these and various other plausible sources of error can
be ruled out or shown to be of acceptably small magnitude (relative to the goals of the
simulation study).
When the focus is on code evaluation, an analogous confidence-building strategy
would involve showing that one can rule out (or bound the magnitude of) all of the
plausible sources of error that would prevent the simulation code from delivering accu-
rate-enough solutions of interest to the continuous model equations. These plausible
sources of error might include such things as: computational instability, round-off
error, truncation error, iterative convergence error, programming mistakes, and hard-
ware malfunction. Confidence that the simulation code does deliver the desired
approximate solutions to the continuous model equations can increase as the various
plausible sources of error are ruled out (as unlikely) or shown to be of acceptably small
magnitude.14

4. Sherlock Holmes and the Epistemologies of Experiment and Simulation


The strategies just presented, along with several others, are understood by Franklin to
constitute an epistemology of experiment. He says of them, ‘They provide us with good
reasons for belief in experimental results’ (Franklin 2002, 6). One of Franklin’s primary
aims, recall, is to defend a view of experimental practice on which results come to be
accepted not because of social factors, but because there are good non-social reasons
for doing so. His strategies are meant to illustrate some of the good non-social reasons
on which belief in experimental results actually rests.
What is unclear, however, is exactly how Franklin understands his strategies (and
perhaps others—he emphasizes that his list is not meant to be exhaustive) to provide a
basis for justified belief in or acceptance of experimental results. He does not consider
any fixed combination of his strategies to be always sufficient for justified belief, nor
any individual strategy to be always necessary (1989, 459; 2002, 6). Doubting that there
is any general method for establishing the validity of experimental results (1989, 459),
he suggests that in practice scientists simply ‘use as many of the strategies as they can
conveniently apply in any given experiment’ (2002, 6).
Reflecting on Franklin’s strategies, however, one cannot help but notice that the
support provided by some of them, especially when considered in isolation, will be
weak indeed. For instance, learning that a key experimental observation was made
using an instrument whose proper functioning depends upon a well-confirmed and
applicable theory should not, on its own, give one much confidence in the experimen-
tal result—what if the experiment was poorly designed or the instrument was miscali-
brated? Of course, additional strategies might be employed to build confidence that the
experiment did not go wrong in these other ways, but nothing in Franklin’s account
seems to require it. At the end of the day, it is possible that one could have successfully
applied several of Franklin’s strategies yet still be far from justified in believing or
accepting an experimental result. For instance, if studies investigating the phenomenon
of interest have been plagued by well-known confounding factors in the past, yet no
effort has been made to ensure that such confounders have not interfered in the present
experiment, then it seems clear one is not yet justified in believing or accepting the
results of the experiment.
At the same time, it appears that there is one strategy identified by Franklin whose
successful application would avoid this problem: the Sherlock Holmes strategy, which
as Franklin presents it involves showing that all plausible sources of error and alterna-
tive explanations of the result can be ruled out (as unlikely). The central claim of this
section is that the successful application of a slightly elaborated version of the Sherlock
Holmes strategy is sufficient for justified belief in experimental results—and simula-
tion results as well—and can provide helpful structure to the task of investigating the
validity of such results.

4.1. Sherlock Holmes and the Results of Traditional Experiments


If the Sherlock Holmes strategy is to provide the sufficient basis just claimed, the
conditions for its successful application need to be made clearer and more stringent.
Otherwise, the problem mentioned above may still arise—there may be cases in which
it is claimed that the strategy has been successfully applied while believing or accept-
ing the result is clearly not justified.
First, in order to claim that the Sherlock Holmes strategy has been successfully
applied, there must have been a thorough and good-faith attempt to identify all plau-
sible sources of error and alternative explanations of the results. Franklin says little
about how this should be done, but one possibility, which is advocated here, would be
to look to a set of standard ways in which experiments can go wrong (e.g. poor exper-
imental design, inadequate control of confounding factors, instrument miscalibration,
etc.)—what Mayo (1996) refers to as ‘canonical’ sources of experimental error—and
then attempt to identify the specific instances of these canonical types of error plausible
in the context of the experiment at hand.15 So, for instance, one would ask: In what
ways might the instruments in this particular experiment have plausibly malfunc-
tioned? What are the plausible potential confounding factors in this experiment?
Which design assumptions of this experiment might not have been met? And so on.
Second, the situation must not be one in which there is little clue how to answer
questions like those just identified. That is, the situation must not be one in which there
is little sense of what the plausible instances of these canonical types of error would be
in the study at hand. Such a situation might arise, for example, if the processes hypoth-
esized to be at work in the experiment were poorly understood because they were novel
or otherwise mysterious. If one has very little confidence when it comes to identifying
plausible sources of error and alternative explanations of the results, then conditions
are not right for using the Sherlock Holmes strategy as a justificatory strategy.
Third, there needs to have been a thorough and good-faith effort to uncover any
indications that the experiment did go wrong in the ways that have been identified as
plausible, and the methods used in this investigation must be ones that have a good
chance of finding the problematic sources of error, should they be present. It is not
enough to half-heartedly check for plausible sources of error, or to do so using proce-
dures that have little chance of revealing them (if they are present).
With this elaboration, it is suggested here, the successful application of the Sherlock
Holmes strategy provides a sufficient basis for justified belief in or acceptance of exper-
imental results. This is not a once-and-for-all justification, of course, for it is doubtful
that such a thing exists; one can be justified in believing or accepting something at one
time, yet not at a later time, if new information comes to light. If plausible sources of
error that one had not previously recognized are subsequently called to one’s attention,
for instance, then one may no longer be in a position to claim that the Sherlock Holmes
strategy has been successfully applied, and the status of one’s belief in or acceptance of
the associated result may need to be re-evaluated.
Nevertheless, the Sherlock Holmes strategy is applicable in a wide range of cases16
and, with the elaboration above involving canonical sources of error, has the attractive
feature of providing significant structure to the task of justifying experimental results.


Moreover, granting a special status to the Sherlock Holmes strategy does not render
extraneous the other strategies identified by Franklin; many, if not all, of them can
continue to be of value in the course of our investigating specific sources of error and
alternative explanations of results. Most obviously, for instance, one or more of the
strategies that focus on the experimental apparatus might be used in the course of
investigating whether results include artefacts due to apparatus malfunction or inade-
quacy (particular sources of error). Likewise, in some cases the strategy involving inde-
pendent confirmation of results might be used to dismiss a number of alternative
explanations at once.
An epistemology of experiment that gives a special place to the Sherlock Holmes
strategy can also be connected with other recent work on experiment. For instance,
Franklin attributes to Peter Galison the view that experiments end—i.e. their results are
accepted—‘when the experimenters believe they have a result that will stand up in
court’ (Franklin 2002, 6; see also Galison 1987). Successful application of the Sherlock
Holmes strategy might be understood as one answer to the question: Under what
circumstances should a result be able to stand up in court? And perhaps also to the
question: Under what circumstances will a result typically be able to stand up in court?
Given that Franklin’s other strategies can continue to play a contributing role in the
way suggested above, this way of looking at the Sherlock Holmes strategy is consistent
with Franklin’s claim that a result that will stand up in court is one whose validity is
supported using his strategies (Franklin 2002, 6).
More direct connections can be found between a Sherlock Holmes approach to the
justification of experimental results and recent work by Deborah Mayo (1996, 2000).
Mayo’s error-statistical epistemology of experiment, like the elaborated version of the
Sherlock Holmes approach sketched above, requires that one be in a position to rule
out (as unlikely) various canonical sources of error, before one accepts experimental
results. Although Mayo rejects Franklin’s Bayesian analysis of his strategies, claiming
that it misses their real epistemological rationale (Mayo 1996, 100), she might well be
sympathetic to something like the interpretation of Franklin’s strategies given above,
namely, that they can play a role in developing arguments concerning the presence/
absence of various canonical sources of experimental error.17
Furthermore, it would seem that experimenters often are sophisticated and diligent
in their investigation of canonical sources of experimental error and that the results of
these investigations often do figure prominently in their discussions of the trustwor-
thiness of experimental results. Though there is not space for significant defence of
these claims here, they are given some support by the detailed studies of experimental
practice that have been carried out by Franklin, Mayo, and others. It is suggested
below, however, that such a picture—one in which the thorough and systematic
investigation of familiar sources of error is taken to be of paramount importance—is
often rather less fitting as a depiction of current practice in the context of computer
simulation modelling.

4.2. Sherlock Holmes and Computer Simulation Results


At present, scientists’ approach to the evaluation of computer simulation models and
to the task of defending particular simulation results very often has more in common
with the unstructured epistemology of experiment sketched by Franklin than with the
Sherlock Holmes approach advocated above. Model and code evaluation activities are
often determined largely by convenience—by how much computing power is available,
which analytic solutions and/or observational data one has in hand, which visualiza-
tion tools are available, etc.—and frequently are not accompanied by any explicit argu-
mentation concerning what the evaluation activities that are undertaken indicate, if
anything, about the adequacy of the model for the purposes for which it is to be used.
From a practical point of view, this state of affairs is worrisome if computer simula-
tion models and their results are to be used in situations in which far-reaching deci-
sions hang in the balance, e.g. to address questions whose putative answers will
influence large-scale environmental policy decisions. There is a danger similar to that
identified above in the context of experiment: in light of the successful application of
several convenient confidence-building strategies, one might proceed under the
assumption that the simulation results are likely to be indicative of what one wants to
know about the target system, when in fact one does not have strong support for such
a belief. This danger may be especially acute when colourful graphical displays of simu-
lation results appear ‘believable’ or ‘realistic’ to the eye and/or when the results seem to
provide support for ideas or actions that one antecedently favours.
Once again, however, this danger can be mitigated by allowing the Sherlock Holmes
strategy (in its elaborated form) to structure the task of justifying results. In the case of
simulation results, the canonical sources of error that are relevant to the application of
the strategy are somewhat different from those discussed in connection with traditional
experimentation. Some of these canonical sources of error in simulation studies were
already mentioned in the last section, in the course of discussing model and code eval-
uation analogues of Franklin’s Sherlock Holmes strategy; they include such things as:
computational instability, truncation error, iterative convergence error, programming
mistakes, hardware malfunction, error in the mathematical form of the continuous
model equations, error in the parameter values included in those equations, etc.
Taxonomies of sources of error in simulation studies are presented in more detail in
various places, including Oberkampf et al. (1995), Roache (1998), Winsberg (1999a),
and Parker (2008).
Just as in the case of traditional experiment, recognizing this special role for the
Sherlock Holmes strategy in the epistemology of computer simulation need not render
extraneous the other strategies identified above for building confidence in simulation
results, nor additional ones that might be in use; many if not all of them can continue
to be of value in the course of investigating canonical sources of error. For instance, the
strategy that involves showing that a simulation model gives results that closely match
analytic solutions for other initial and boundary conditions might be employed in
arguing that it is unlikely that significant programming errors are present in the model
(and thus that it is unlikely that the simulation results of interest reflect such errors).
As with investigations of error in experimental contexts, sometimes one may be able to
rule out several sources of error with a single strategy or test, while other times one may
have to employ several strategies or tests before even the presence of a single source of
error can be considered unlikely (or within acceptable bounds).
Ruling out what might be called ‘substantive modelling error’—error related to the
form of the continuous model equations, or the parameter values included in those
equations, or the chosen initial and boundary conditions—seems to present a particu-
lar challenge. How might these potential sources of error be investigated? One option
is to understand them as assumptions about the target system and to test them via
traditional experiment and observation. If such tests reveal the assumptions to be inac-
curate in various ways, it may be possible to estimate (perhaps via mathematical anal-
ysis) how the results of the simulation will likely be impacted by the recognized
inaccuracies. When this sort of analysis cannot be performed, one may be able to
explore how much one’s uncertainty regarding which assumptions will be adequate
matters when it comes to the simulation output of interest. For example, for parame-
ters for which one is uncertain which values would be best (given the aims of the
modelling study), one might vary the values assigned to the parameters over ranges that
one considers plausible (given the aims of the modelling study) and see how the model-
ling results of interest change in response. In principle, the same sort of investigation
can be undertaken when there is uncertainty concerning the form that the continuous
model equations should take.18
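
A minimal sketch of such a sensitivity exploration (an invented example; the parameter, its ‘plausible’ range, and the output of interest are all placeholders): an uncertain parameter is varied over a range judged plausible given the aims of the study, and the spread in the output of interest is recorded.

    def simulate_cooling(T0, T_env, k, dt, n_steps):
        # Toy simulation model; the output of interest is the temperature at the final time.
        T = T0
        for _ in range(n_steps):
            T += dt * (-k * (T - T_env))
        return T

    # Vary the uncertain parameter k over a range considered plausible for the study.
    plausible_k_values = [0.05 + 0.01 * i for i in range(16)]  # 0.05 to 0.20
    outputs = [simulate_cooling(90.0, 20.0, k, 0.5, 40) for k in plausible_k_values]

    spread = max(outputs) - min(outputs)
    print(f"output of interest ranges from {min(outputs):.1f} to {max(outputs):.1f} (spread {spread:.1f})")
    # A small spread suggests the result is robust to this source of uncertainty; a large
    # spread signals that the uncertainty matters, given the aims of the modelling study.
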
Unfortunately, in many cases it is difficult or impossible to directly test key assump-
tions relied on in constructing computer simulation models, because the target systems
of interest are inaccessible in space and/or time. Moreover, identifying ‘plausible’
ranges of parameter values or variant modelling assumptions can also be a tricky
matter. For one thing, a model that includes parameter values or equations that are
strikingly inaccurate (if understood as assumptions about the target system) can never-
theless remain adequate for the purposes for which it is to be used (see e.g. the case of
artificial viscosity discussed in Winsberg 2006).19 In light of these difficulties, Parker
(2008) presents one way in which the task of probing specifically for the errors that fall
under the heading of substantive modelling error might sometimes be circumvented;
this involves collecting statistical evidence of the model’s ability to deliver adequate
results but is itself subject to certain caveats and pitfalls (see ibid.). Additional work is
needed on how the task of ruling out (or bounding the magnitude of) substantive
modelling error can be approached and how the challenges associated with doing
so affect the prospects for arriving at justified belief in or acceptance of computer
simulation results.
As in the context of traditional experiment, a justificatory approach built upon the
Sherlock Holmes strategy would dovetail with other recent work on the epistemology
of computer simulation. Most notably, Winsberg (1999b) has highlighted the impor-
tance of error management in the context of computer simulation modelling, taking
inspiration, in fact, from Franklin’s work on traditional experiment. However,
Winsberg at times seems sanguine about the extent to which the results of simulation
studies are already being thoroughly scrutinized for error, whereas it is suggested here
that—in practice—there is significant room for improvement in this regard. Granting
a central epistemological role to the Sherlock Holmes strategy in the way suggested
above is one way such improvement could be facilitated while preserving the insights
already had by Winsberg.20

4.3. The Bigger Picture


It has now been claimed that many, perhaps all, of Franklin’s strategies have direct
analogues in the context of computer simulation and that the Sherlock Holmes strategy
can provide a sufficient basis for justified belief in results in both contexts. The question
naturally arises: What explains these parallels? Are they made possible by some special
underlying similarities between traditional experiment and computer simulation?
On the contrary, the view taken here is that such parallels can be drawn because
Franklin’s strategies are instantiations of more general confidence-building approaches
that have quite broad applicability. For instance, the first of Franklin’s strategies might
be understood as an instantiation of the following: (i) identify an apparatus or activity,
a, whose performance of a function, f, is thought to be necessary for the inquiry at hand
to deliver sought-after information; (ii) identify circumstances, c, such that, if a is
placed in c and a can perform f, then a will deliver expected results, e; (iii) demonstrate
that a, when placed in c, does deliver e. This confidence-building strategy might be
implemented in a variety of contexts; as illustrated by the code evaluation version in
Section 3, it can be used even when the sought-after information is information about
a set of equations, rather than a natural or social system.
Similarly, a generalized form of the Sherlock Holmes strategy can be given as: (i) iden-
tify the canonical types of error for the kind of inquiry being undertaken; (ii) identify
the plausible instances of each of those canonical types of error for the specific inquiry
at hand; (iii) show that it is unlikely that those instances have impacted the results of
interest by more than some acceptable amount. Again, this confidence-building strategy
might be pursued in various contexts, though it is worth noting that implicit in (i)–(iii)
are preconditions for its successful use (e.g. the inquiry at hand must be of a kind for
which canonical types of error can be identified). What is special about this strategy is
that its successful application allows not only for increased confidence in results but also,
it is claimed above, for justified belief in results.
At this point, it is worth attempting to allay what might be a lingering concern. It
might be worried that the Sherlock Holmes strategy is not really a ‘strategy’ at all—that
it is really little more than a restatement of the goal of model evaluation (or validation),
rather than a method that prescribes a definitive course of action for pursuing that goal.
Indeed, that would explain why the other strategies can have the contributing role
claimed for them above. Such a worry only seems legitimate, however, if one mistak-
enly understands the content of the Sherlock Holmes strategy to be something like,
‘Show that it is unlikely that the results are in error’. It should be clear by now that the
Sherlock Holmes strategy has much more internal structure; it does prescribe a definite
course of action, one that requires attention to the kinds of error that commonly arise
in the type of study being undertaken as well as the plausible ways in which those kinds
of error could have been instantiated in the specific study at hand.

5. Conclusions
Many, perhaps all, of Franklin’s strategies have direct analogues in the context of
computer simulation. In addition, Franklin’s depiction of the unstructured use of his
strategies in justifying belief in experimental results fits well with the way in which
scientists commonly investigate and defend the trustworthiness of computer simula-
tion results. Unfortunately, this approach to justification is inadequate; it is possible
that one might apply several of Franklin’s confidence-building strategies (or their
analogues in computer simulation) yet clearly not be justified in believing or accepting
results of interest.
There is at least one strategy identified by Franklin, however, that does provide a suffi-
cient basis for justified belief in experimental results, namely, the Sherlock Holmes strat-
egy. This strategy, in a slightly elaborated form, involves (i) identifying the canonical
types of error associated with the kind of study at hand; (ii) identifying the plausible
instances of each of those canonical types of error for the specific study being undertaken;
(iii) showing that it is unlikely that those instances have impacted the results by more than
some acceptable amount (where what is acceptable depends on the goals of the study).
An approach to justification built around the Sherlock Holmes strategy would dove-
tail with other recent work on the epistemologies of experiment and simulation and
would provide needed structure to the task of model evaluation. However, further
work remains to be done to better understand in what circumstances the requirements
of the Sherlock Holmes approach are likely to be met in the context of computer simu-
lation modelling and what can be said about simulation results when they are not met.

Acknowledgements
Thanks to Phil Ehrlich, Allan Franklin, Francis Longworth, John Norton, Eric Wins-
berg, two anonymous referees, and the editor of this journal—James McAllister—for
valuable feedback on earlier versions of this paper.
Notes
[1] In this paper, attention is restricted to computer simulations that involve estimation of solu-
tions to differential equations. Other kinds of computer simulation studies are also carried
out in science. Perhaps best known are those involving cellular automata, where the discrete
state of each node in a network is updated according to rules that reference the discrete states
of neighbouring nodes. Some details of strategies discussed in Section 3—especially those
related to code evaluation—would be different if these and other types of simulations were
included in the analysis, but the more general epistemological points made in section 4 would
remain the same.
[2] It is worth reiterating that model evaluation should be understood as an investigation of a
model’s adequacy-for-purpose, not an investigation of its truth or falsity, whatever that might
mean. A model that is constructed with the use of a variety of false assumptions about a target
system might nevertheless be an adequate representation of that target system, relative to the
goals of the modelling study.
[3] Campbell and Stanley (1963) provide an early discussion. As there is some variation (and
sometimes fuzziness) in the use of this terminology, the following is my own attempt to char-
acterize these two kinds of validity. Guala (2003) presents the distinction in a slightly different
way.
[4] A realist perspective is not necessarily assumed here. The non-realist could offer a different
statement of the experimental result or could interpret the same result statement differently.
For instance, while the realist might say, ‘On seven out of fifty trials, a collision of type C
produced a particle with mass M ± ε GeV/c2’, the non-realist might say instead, ‘On seven out
of fifty trials, after the experimental apparatus E was placed in arrangement A, the detector
gave a signal of type S.’ Or the non-realist might offer a result statement that is syntactically
similar, if not identical, to the result statement given by the realist but intend it to mean that
conditions during the experiment were ‘as if’ particles with mass such-and-such were
produced.
[5] In the former, it is generalization of the simulation results (i.e. results about the behaviour of
a computer) to conclusions about the continuous model equations that is at issue, while in the
latter it is generalization of the simulation results to conclusions about a natural or social
target system that is at issue (see also Parker forthcoming).
[6] ‘Relevant’ instances are ones for which there is reason to think that the quality of the appara-
tus’ performance will be indicative of the quality of its performance in the experiment of
interest. Franklin does not emphasize this point about relevance, but it is important; I incor-
porate it explicitly into the simulation-related analogues that I identify below, both for this
strategy and for the next one.
[7] In collaboration with Howson, Franklin has offered a Bayesian analysis/justification for many
of the strategies discussed in this section (Franklin and Howson 1988); in the interest of space,
those analyses will not be rehearsed or examined here.
[8] It should be noted, however, that if model parameters have been tuned in an ad hoc fashion
for the very purpose of ensuring a close match with past rainfall data, then the finding of such
a match should result in little or no increase in confidence in the adequacy of the model for
predicting future rainfall. This is easy to see if we think of Franklin’s strategies in a Bayesian
framework, as he does: we get little or no boost in confidence when close model–data fit has
been achieved through ad hoc tuning, since in that situation we expect the fit to be close and
so have to assign p(e) ≈ 1 in the Bayesian updating equation, p(h|e) = [p(e|h)*p(h)] / p(e).
Here, p(h|e) is the probability that the hypothesis (i.e. a statement about the target system) is
true, given that such a close model–data fit was obtained; p(h) is the probability assigned to
the hypothesis before the close model–data fit was obtained; p(e|h) is the probability of
obtaining such a close model–data fit, given that the hypothesis is true; p(e) is the probability
of obtaining such a close model–data fit. For the same reason, none of Franklin’s strategies will
provide any significant increase in confidence if we have engaged in ad hoc fiddling in order
to guarantee the demonstration that the strategy requires. This should be kept in mind
throughout the discussion that follows.
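As a purely numerical illustration of this point, the sketch below computes the posterior p(h|e) for arbitrary, invented probability values: once for a close fit that was not guaranteed in advance, and once for a fit guaranteed by ad hoc tuning, so that p(e|h) and p(e) are both near 1.

```python
# Illustrative Bayesian update showing why a close model-data fit guaranteed
# by ad hoc tuning yields little or no boost in confidence. All probability
# values are arbitrary and chosen only for illustration.


def posterior(p_h, p_e_given_h, p_e):
    """Bayes's theorem: p(h|e) = p(e|h) * p(h) / p(e)."""
    return p_e_given_h * p_h / p_e


p_h = 0.5  # prior probability of the hypothesis about the target system

# Case 1: no ad hoc tuning, so a close fit was not especially expected.
print(posterior(p_h, p_e_given_h=0.9, p_e=0.55))  # roughly 0.82: confidence rises

# Case 2: parameters tuned to guarantee a close fit, so p(e|h) and p(e) are near 1.
print(posterior(p_h, p_e_given_h=1.0, p_e=1.0))   # 0.5: no boost over the prior
```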
[9] Weissert (1997, 123) and Winsberg (1999a, 39) both follow Franklin in calling this practice
‘calibration’. However, in the context of computer simulation modelling, calibration is some-
times synonymous with tuning, i.e., with adjusting model parameter values in an ad hoc way
to achieve a better fit between model output and empirical data (see e.g. Oberkampf and
Trucano 2002). It seems best to stick with the terminology of ‘benchmarking’, as suggested by
Oreskes et al. (1994).
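To make the contrast vivid, the following hypothetical sketch shows what tuning in this second sense involves: a free parameter is adjusted by grid search solely to minimize mismatch with past observations. The toy model, parameter range, and data are invented for illustration only.

```python
# Hypothetical illustration of ad hoc tuning ('calibration' in the second
# sense above): a free parameter is adjusted by grid search solely to
# minimize mismatch with past observations. Model and data are invented.

past_observations = [2.1, 2.9, 4.2, 4.8]  # invented 'past' data values


def toy_model(alpha):
    """A toy model whose outputs at times 1-4 depend on the parameter alpha."""
    return [alpha * t for t in range(1, 5)]


def mismatch(alpha):
    """Sum of squared differences between model output and past observations."""
    return sum((m - o) ** 2 for m, o in zip(toy_model(alpha), past_observations))


# Adopt whichever candidate value fits the past data best, regardless of
# whether that value is independently well motivated.
candidates = [1.0 + 0.01 * i for i in range(100)]
tuned_alpha = min(candidates, key=mismatch)
print("tuned parameter value:", tuned_alpha)
print("fit to the very data used for tuning:", mismatch(tuned_alpha))
```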
[10] Moreover, this strategy applies only for simulations grounded in accepted theoretical princi-
ples; not all simulations fall into this category.
[11] Weissert (1997, 123–124) suggests that decreasing the computational time step and rerunning
the simulation model would also be an analogous strategy. This would seem to increase confi-
dence even less than employing a different solution technique, since it seems more likely that
solutions would err in the same ways when the simulations differ only in their time-step
lengths than when they also incorporate different solution techniques.
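A minimal sketch of such a time-step sensitivity check appears below; the differential equation (dy/dt = -y), the forward Euler scheme, and the step sizes are chosen purely for illustration and are not meant to represent any particular simulation study.

```python
# Minimal sketch of a time-step sensitivity check: solve dy/dt = -y with a
# forward Euler scheme at two step sizes and compare the results at t = 1.
# Equation, scheme, and step sizes are illustrative only.

import math


def euler_solve(dt, t_end=1.0, y0=1.0):
    """Forward Euler estimate of y(t_end) for dy/dt = -y, y(0) = y0."""
    y = y0
    for _ in range(round(t_end / dt)):
        y += dt * (-y)
    return y


coarse = euler_solve(dt=0.1)
fine = euler_solve(dt=0.01)

print("coarse step:", coarse)          # about 0.349
print("fine step:  ", fine)            # about 0.366
print("exact:      ", math.exp(-1.0))  # about 0.368
print("change on refining the step:", abs(fine - coarse))
```

A small change under refinement gives some reassurance, but solutions obtained with both step lengths could still err in the same ways, which is the point made in the note above.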
[12] In a footnote, Franklin offers a quote in which Holmes speaks of determining alternatives to
be 'impossible' and concluding that what remains 'must be the truth' (see Franklin 2002,
250n8). It is clear, however, that Franklin does not take his Sherlock Holmes strategy (or any
other) to deliver certainty concerning the validity of experimental results. It is the idea of
rejecting alternative hypotheses, rather than doing so with certainty, that inspires his
'Sherlock Holmes' label for the strategy.
[13] The fact that there is relatively little reason to worry about ‘confounders’ here might be
considered an epistemic advantage or strength of computer simulation. Note, however, that
this does not mean that the algorithm in fact implemented as the computer simulation
model is the algorithm that one intended to implement or an algorithm that estimates accu-
rate-enough solutions to the continuous model equations; various shortcomings in the
design and programming of the algorithm are possible, but these are better categorized
(given the characterizations above) as ‘sources of error’ than as ‘alternative explanations’ of
the data.
[14] Franklin (1989, 2002) discusses at least nine confidence-building strategies. That only a subset
of those strategies is discussed here does not seem problematic, since Franklin considers
none of his strategies to be necessary for rational belief in experimental results (see Section 4).
Four of Franklin’s strategies not discussed here relate to the following: observation of
expected artefacts; explanation/prediction of the result by a theory of the phenomenon;
coherence or naturalness of the results; and statistical considerations. Though I would argue
that these also have analogues in the context of computer simulation, even if they do not, the
five strategies just discussed seem sufficient to demonstrate that interesting parallels exist
between the confidence-building strategies available in the contexts of these two practices.
[15] Both of Franklin’s categories are encompassed by ‘sources of error’ in the broader sense of
‘ways that a study can go wrong’. In what follows, when speaking of ‘canonical sources of
error’ or ‘canonical types of error’, I intend this broader meaning, i.e. ‘canonical ways that a
study of this type can go wrong’.
[16] That is, in a wide range of cases, it is a strategy that is in principle appropriate to employ; it
may be that in practice various plausible sources of error or alternative explanations of the
results cannot be ruled out.
[17] More specifically, she likely would argue that if the strategies are epistemically significant, it is
because they can sometimes be used (whether singly or in combination) in carrying out
‘severe tests’ of hypotheses concerning the absence of specific sources of experimental error
(see Mayo 1996).
[18] This kind of probing of the implications of uncertainty in substantive modelling assumptions
is being undertaken now by climate modellers who want to see how uncertainty in representing
the climate system translates into uncertainty with regard to projections of future climate
change (see Stainforth et al. 2005; Parker 2006; IPCC 2007).
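In schematic form, this kind of probing can be illustrated as a perturbed-parameter ensemble: the same model is run under several plausible values of an uncertain parameter, and the spread in the projected quantity is examined. The toy model, parameter values, and forcing constant below are invented for illustration and bear no relation to the cited studies.

```python
# Hypothetical sketch of a perturbed-parameter ensemble: run the same toy
# model under several plausible values of an uncertain parameter and examine
# the spread in the projected quantity. All numbers are invented.


def toy_projection(sensitivity, forcing=3.7):
    """Toy 'projection' that scales a fixed forcing by an uncertain parameter."""
    return sensitivity * forcing


plausible_values = [0.5, 0.65, 0.8, 0.95, 1.1]  # hypothetical plausible range
projections = [toy_projection(s) for s in plausible_values]

print("ensemble of projections:", projections)
print("spread across the ensemble:", max(projections) - min(projections))
```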
[19] Indeed, the idea that a ‘plausible’ parameter value is one that is believed to be somewhat
‘close’ to the ‘real’ value of some quantity may need to be reconsidered or even abandoned in
this context (see Smith 2006 for a related discussion).
[20] An error-statistical epistemology of computer simulation, taking inspiration from Mayo
(1996; 2000), is another option. Parker (2008) sketches the beginnings of such an account. An
error-statistical approach to the epistemology of computer simulation would overlap in
important ways with the Sherlock Holmes approach presented above but would highlight the
importance of ‘severe testing’ for error within a non-Bayesian framework. There is not space
here to explore whether there is a need to choose between error-statistical and Sherlock
Holmes approaches to the epistemology of computer simulation or, if so, which should be
chosen and why.

References
Campbell, D. T., and J. C. Stanley. 1963. Experimental and quasi-experimental designs for research.
Chicago, IL: Rand McNally.
Franklin, A. 1986. The neglect of experiment. Cambridge: Cambridge University Press.
———. 1989. The epistemology of experiment. In The uses of experiment, edited by D. Gooding,
T. Pinch, and S. Schaffer, 437–460. Cambridge: Cambridge University Press.
———. 2002. Selectivity and discord: Two problems of experiment. Pittsburgh, PA: University of
Pittsburgh Press.
Franklin, A., and C. Howson. 1988. It probably is a valid experiment: A Bayesian approach to the
epistemology of experiment. Studies in History and Philosophy of Science 19: 419–427.
Galison, P. 1987. How experiments end. Chicago, IL: University of Chicago Press.
Guala, F. 2003. Experimental localism and external validity. Philosophy of Science 70: 1195–1205.
Hacking, I. 1983. Representing and intervening. Cambridge: Cambridge University Press.
IPCC (Intergovernmental Panel on Climate Change). 2007. Climate change 2007: The physical science
basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental
Panel on Climate Change, edited by S. Solomon, D. Qin, M. Manning, Z. Chen, M. Marquis,
K. B. Averyt, M. Tignor, and H. L. Miller. Cambridge: Cambridge University Press.
Mayo, D. 1996. Error and the growth of experimental knowledge. Chicago, IL: University of Chicago
Press.
———. 2000. Experimental practice and an error statistical account of evidence. Philosophy of
Science 67 (Proceedings): S193–S207.
Oberkampf, W. L., F. G. Blottner, and D. P. Aeschliman. 1995. Methodology for computational fluid
dynamics code verification/validation. AIAA Paper 1995-2226.
Oberkampf, W. L., and T. G. Trucano. 2002. Verification and validation in computational fluid
dynamics. Sandia National Laboratory Report, SAND2002-059.
Oreskes, N., K. Shrader-Frechette, and K. Belitz. 1994. Verification, validation, and confirmation of
numerical models in the earth sciences. Science 263: 641–646.
Parker, W. S. 2006. Understanding pluralism in climate modeling. Foundations of Science 11:
349–368.
———. 2008. Computer simulation through an error-statistical lens. Synthese 163: 371–384.
———. Forthcoming. Does matter really matter? Computer simulations, experiments, and materi-
ality. Synthese.
Roache, P. J. 1998. Verification and validation in computational science and engineering. Albuquerque,
NM: Hermosa.
———. 2002. Code verification by the method of manufactured solutions. Journal of Fluids
Engineering 124 (1): 4–10.
Roy, C. J. 2005. Review of code and solution verification procedures for computational simulation.
Journal of Computational Physics 205: 131–156.
Smith, L. A. 2006. Predictability past predictability present. In Predictability of weather and climate,
edited by T. Palmer and R. Hagedorn, 217–250. Cambridge: Cambridge University Press.
Stainforth, D. A., T. Aina, C. Christensen, M. Collins, N. Faull, D. J. Frame, J. A. Kettleborough, S.
Knight, A. Martin, J. M. Murphy, C. Piani, D. Sexton, L. A. Smith, R. A. Spicer, A. J. Thorpe,
and M. R. Allen. 2005. Uncertainty in predictions of the climate response to rising levels of
greenhouse gases. Nature 433: 403–406.
Trucano, T. G., M. Pilch, and W. Oberkampf. 2003. On the role of code comparisons in verification
and validation. Sandia National Laboratory Report, SAND2003-2752.
Weissert, T. 1997. The genesis of simulation in dynamics. New York: Springer.
Winsberg, E. 1999a. Simulation and the philosophy of science: Computationally intensive studies of
complex physical systems. Ph.D. diss., Indiana University.
———. 1999b. Sanctioning models: The epistemology of simulation. Science in Context 12: 275–292.
———. 2003. Simulated experiments: Methodology for a virtual world. Philosophy of Science 70:
105–125.
———. 2006. Models of success vs. the success of models: Reliability without truth. Synthese 152: 1–19.