This book presents an introduction to the set of tools that has become
known commonly as geostatistics. Many statistical tools are useful
in developing qualitative insights into a wide variety of natural phenomena; many others can be used to develop quantitative answers
to specific questions. Unfortunately, most classical statistical methods make no use of the spatial information in earth science data sets.
Geostatistics offers a way of describing the spatial continuity that is an
essential feature of many natural phenomena and provides adaptations
of classical regression techniques to take advantage of this continuity.
The presentation of geostatistics in this book is not heavily mathematical. Few theoretical derivations or formal proofs are given; instead,
references are provided to more rigorous treatments of the material.
The reader should be able to recall basic calculus and be comfortable
with finding the minimum of a function by using the first derivative
and representing a spatial average as an integral. Matrix notation is
used in some of the later chapters since it offers a compact way of writing systems of simultaneous equations. The reader should also have
some familiarity with the statistical concepts presented in Chapters 2
and 3.
Though we have avoided mathematical formalism, the presentation
is not simplistic. The book is built around a series of case studies on
a distressingly real data set. As we soon shall see, analysis of earth
science data can be both frustrating and fraught with difficulty. We
intend to trudge through the muddy spots, stumble into the pitfalls,
and wander into some of the dead ends. Anyone who has already
Introduction
Figure 1.1 A location map of the Walker Lake area in Nevada. The small rectangle
on the outline of Nevada shows the relative location of the area within the state.
The larger rectangle shows the major topographic features within the area.
in; this reflects both the historical roots of geostatistics and the
experience of the authors. The methods discussed here, however, are
quite generally applicable to any data set in which the values are spatially continuous.
The continuous variables, V and U, could be thicknesses of a geologic horizon or the concentration of some pollutant; they could be soil
strength measurements or permeabilities; they could be rainfall measurements or the diameters of trees. The discrete variable, T, can be
viewed as a number that assigns each point to one of two possible categories; it could record some important color difference or two different
species; it could separate different rock types or different soil lithologies; it could record some chemical difference such as the presence or
absence of a particular element.
For the sake of convenience and consistency we will refer to V and
U as concentrations of some material and will give both of them units
of parts per million (ppm). We will treat T as an indicator of two
types that will be referred to as type 1 and type 2. Finally, we will
assign units of meters to our grid even though its original dimensions
are much larger than 260 x 300 m².
The Walker Lake data set consists of V, U, and T measurements at
each of 78,000 points on a 1 x 1 m² grid. From this extremely dense
data set a subset of 470 sample points has been chosen to represent a
typical sample data set. To distinguish between these two data sets,
the complete set of all information for the 78,000 points is called the
exhaustive data set, while the smaller subset of 470 points is called the
sample data set.
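The relationship between the two data sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the values are synthetic lognormal numbers rather than the actual Walker Lake measurements, the orientation of the grid (260 columns by 300 rows) is assumed, and the sample locations are drawn uniformly at random, which is not necessarily how the 470 sample points were chosen. The function names are hypothetical.

```python
import numpy as np

# Grid size from the text: 260 x 300 points on a 1 x 1 m grid (78,000 points).
# Which dimension runs east-west is an assumption made for this sketch.
NX, NY = 260, 300
N_SAMPLES = 470


def make_exhaustive(seed=0):
    """Return a synthetic stand-in for an exhaustive data set.

    The values are lognormal random numbers, NOT the real V values.
    """
    rng = np.random.default_rng(seed)
    return rng.lognormal(mean=5.0, sigma=1.0, size=(NY, NX))


def draw_sample(exhaustive, n=N_SAMPLES, seed=1):
    """Pick n distinct grid nodes and return their x, y indices and values.

    Uniform random selection is an assumption for illustration only.
    """
    rng = np.random.default_rng(seed)
    ny, nx = exhaustive.shape
    flat = rng.choice(nx * ny, size=n, replace=False)
    rows, cols = np.unravel_index(flat, (ny, nx))
    return cols, rows, exhaustive[rows, cols]


exhaustive = make_exhaustive()
x, y, v = draw_sample(exhaustive)
print(exhaustive.size, len(v))
```

The sample retains only the locations and values at the chosen points; everything else in the exhaustive grid is treated as unknown when estimating.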
6. The use of sample values of one variable to improve the estimation of another variable.
The first question, despite being largely qualitative, is very important. Organization and presentation is a vital step in communicating
the essential features of a large data set. In the first part of this book
we will look at descriptive tools. Univariate and bivariate description
are covered in Chapters 2 and 3. In Chapter 4 we will look at various
ways of describing the spatial features of a data set. We will then take
all of the descriptive tools from these first chapters and apply them
to the Walker Lake data sets. The exhaustive data set is analyzed in
Chapter 5 and the sample data set is examined in Chapters 6 and 7.
The remaining questions all deal with estimation, which is the topic
of the second part of the book. Using the information in the sample
data set we will estimate various unknown quantities and see how well
we have done by using the exhaustive data set to check our estimates.
Our approach to estimation, as discussed in Chapter 8, is first to consider what it is we are trying to estimate and then to adopt a method
that is suited to that particular problem. Three important considerations form the framework for our presentation of estimation in this
book. First, do we want an estimate over a large area or estimates for
specific local areas? Second, are we interested only in some average
value or in the complete distribution of values? Third, do we want our
estimates to refer to a volume of the same size as our sample data or
do we prefer to have our estimates refer to a different volume?
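The idea of estimating from the sample data set and then checking against the exhaustive data set can be illustrated with a minimal sketch. It assumes synthetic lognormal values in place of the real data, a uniformly random sample, and the plain sample mean as the estimator of the global average; none of these choices is claimed to be the method used in later chapters.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the 78,000 exhaustive values (an assumption).
exhaustive = rng.lognormal(mean=5.0, sigma=1.0, size=78_000)

# A hypothetical sample of 470 distinct points, chosen uniformly at random.
sample_idx = rng.choice(exhaustive.size, size=470, replace=False)
sample = exhaustive[sample_idx]

# Estimate the global mean from the sample alone ...
estimate = sample.mean()

# ... then check it against the value computed from all of the points.
truth = exhaustive.mean()
error = estimate - truth
relative_error = error / truth

print(f"estimate={estimate:.1f} truth={truth:.1f} "
      f"relative error={relative_error:+.1%}")
```

In practice the true value is never available; having an exhaustive data set is what makes this kind of check possible here.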
In Chapter 9 we will discuss why models are necessary and introduce the probabilistic models common to geostatistics. In Chapter
10 we will present two methods for estimating an average value over
a large area. We then turn to the problem of local estimation. In
Chapter 11 we will look at some nongeostatistical methods that are
commonly used for local estimation. This is followed in Chapter 12
by a presentation of the geostatistical method known as ordinary point
kriging. The adaptation of point estimation methods to handle the
problem of local block estimates is discussed in Chapter 13.
Following the discussion in Chapter 14 of the important issue of
the search strategy, we will look at cross validation in Chapter 15
and show how this procedure may be used to improve an estimation
methodology. In Chapter 16 we will address the practical problem of
modeling variograms, an issue that arises in geostatistical approaches
to estimation.
In Chapter 17 we will look at how to use related information to
improve estimation. This is a complication that commonly arises in