A State-Space Model Approach To Optimum Spatial Sampling Design Based On Entropy

M. C. BUESO, J. M. ANGULO and F. J. ALONSO
Departamento de Estadística e I.O., Universidad de Granada, Campus de Fuentenueva s/n,
E-18071 Granada, Spain
Received February 1996. Revised July 1997
We consider the spatial sampling design problem for a random field X. This random field is in
general assumed not to be directly observable, but sample information from a related variable Y
is available. Our purpose in this paper is to present a state-space model approach to network
design based on Shannon's definition of entropy, and describe its main points with regard to
some of the most common practical problems in spatial sampling design. For applications, an
adaptation of Ko et al.'s (1995) algorithm for maximum entropy sampling in this context is
provided. We illustrate the methodology using piezometric data from the Vélez aquifer
(Málaga, Spain).
Keywords: Shannon's entropy, spatial sampling, state-space model
1352-8505 © 1998 Chapman & Hall
1. Introduction
Spatial sampling design is a common problem in many fields of application (geology,
geophysics, agriculture, etc.), with spatial dependence playing a crucial role. Of
particular significance is the Gaussian case, for which, without considering the effect
of the deterministic trend defined by the mean, the stochastic spatial dependence is
determined by the covariance structure, which can be derived from a specific model
or represented by an empirical function. This problem can be formulated differently
depending upon the situation and, of course, on the purpose. The design problem is
to find a set of sampling locations (optimum under some specific criterion) either
observing X or some related variable (random field) Y, with or without assuming any
restrictions, and with or without considering any prior sample or model information. It
would be a difficult, if not impossible, task to give a full answer to this general problem
and all its many possible derivations. Different approaches have been introduced in the
literature (see, for example, De Gruijter and Ter Braak, 1990). A geostatistical
approach is used by Bras and Rodríguez-Iturbe (1976), Bogárdi et al. (1985), Aspie
and Barnes (1990), Samper and Carrera (1990, Chapter 19), Trujillo-Ventura and
Ellis (1991), Haas (1992), and Journel (1994), among others. Cressie summarises the
main aspects of the general geostatistical approach (1991, Sections 4.6.2 and 5.6.1).
Environmental and Ecological Statistics 5, 29–44 (1998)
A more random-field focused formulation can be found in Christakos (1992, Chapter
10). Using an information theory approach, a point of view that we adopt in this paper,
Caselton and Hussian (1980) propose the choice of a network which maximizes the
entropy of the random variables at gauged sites. Along the same lines, Caselton and
Zidek (1984) consider the problem in a Bayesian framework, formulating it as a decision
problem. Their optimal choice maximizes the information in the random variables
at gauged sites on the random variables at ungauged sites. Caselton et al. (1991) assume
that the random vector depends on a parameter with a prior distribution and their
purpose is also to reduce uncertainty about this parameter. Thus, they select stations
to be observed that minimize the residual uncertainty. Wu and Zidek (1992) study the
problem of reducing a network considering clusters, applying for each cluster the
method proposed by Caselton et al. (1991). Guttorp et al. (1993) examine the complementary
problem of extending an existing network; their optimal network reduces the
uncertainty about future observations and model parameters. In the Gaussian case, Ko
et al. (1995) provide an upper bound for the entropy and develop an exact algorithm
based on this bound for solving the design problem.
In this paper we use the Shannon entropy (Shannon, 1948) for sampling network
design within a state-space model framework. The method is applicable
to discrete or continuous parameter spatial processes, and is described in detail for the
Gaussian case, where the special distribution properties lead to a particularly convenient
mathematical treatment. In section 2 we present the general formulation of the
problem, analyse the problems of extending and reducing a pre-existing network, and
provide an adaptation to a state-space model framework of Ko, Lee and Queyranne's
(1995) algorithm for maximum entropy sampling. An application to piezometric data
from the Vélez aquifer (Málaga, Spain) is presented in section 3, where the non-observable
process of interest is assumed to satisfy a Laplace stochastic partial differential
equation. This model has been considered by several authors (Whittle, 1954; Jones,
1989; Angulo et al., 1994, among others).
2. Fundamentals and the method
Below we describe in detail the main formal aspects of the procedure proposed for
sampling network design based on entropy.
2.1 General formulation
In this section we formulate the spatial sampling design problem.
Assume that the variable of interest $X$ is not directly observable, but that information
on $X$ is obtained by sampling a variable $Y$ related to $X$, the dependence relationship
between both variables being known. A simple case of practical interest appears
when $Y$ is obtained by adding an observation error to the variable $X$. In general,
however, $X$ and $Y$ may represent different physical magnitudes. Let us also consider
that $Y$ is potentially observable on $\Psi$, and we are interested in knowledge of $X$ on a
(possibly) different set $\Lambda$. For practical purposes, we assume discrete sampling and
finite $\Psi$ and $\Lambda$. Let $S \subset \Psi$ be the subset (to be determined) of the locations where $Y$
is actually to be observed, and let $S'$ be its complement in $\Psi$, $S' = \Psi \setminus S$. Denote by
$X_\Lambda$ the vector of the random variables $X(s_i)$, for all $s_i \in \Lambda$, and denote by $Y_S$ the
vector of sampled random variables ($s_i \in S$). The Shannon entropy of $X_\Lambda$ is defined as
$$H(X_\Lambda) = E_{X_\Lambda}[-\log f(X_\Lambda)],$$
where $f(X_\Lambda)$ is the density (or probability mass) function of $X_\Lambda$. The conditional
entropy of $X_\Lambda$ given $\{Y_S = y_S\}$ is defined as
$$H(X_\Lambda \mid Y_S = y_S) = E_{X_\Lambda \mid Y_S = y_S}[-\log f(X_\Lambda \mid Y_S = y_S)].$$
Thus, the mean conditional entropy of $X_\Lambda$ given $Y_S$ is defined as
$$H(X_\Lambda \mid Y_S) = E_{Y_S}[H(X_\Lambda \mid Y_S = y_S)] = E_{(X_\Lambda, Y_S)}[-\log f(X_\Lambda \mid Y_S)],$$
where $f(X_\Lambda \mid Y_S)$ is the conditional density (or probability mass) function of $X_\Lambda$ given
$Y_S$. ($E_Z$ denotes the expectation with respect to the distribution of $Z$.)
The amount of information on $X_\Lambda$ in $Y_S$ is given by
$$I(Y_S, X_\Lambda) = H(X_\Lambda) - H(X_\Lambda \mid Y_S).$$
Demanding optimum knowledge of $X$ on set $\Lambda$ forces us to find
$$\max_{(S)} I(Y_S, X_\Lambda), \tag{1}$$
or, equivalently, since $H(X_\Lambda)$ is fixed,
$$\min_{(S)} H(X_\Lambda \mid Y_S), \tag{2}$$
where $(S)$ stands for a set of restrictions such as imposing a maximum cardinality for $S$
or sampling cost. Note that
$$H(X_\Lambda \mid Y_S) = H(X_\Lambda, Y_S) - H(Y_S), \tag{3}$$
where both terms on the right-hand side depend on $S$.
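As a small numerical illustration of these definitions, the sketch below evaluates the entropies and the amount of information for a discrete toy case. The joint probability table is hypothetical (it is not taken from the paper), natural logarithms are assumed, and the variable names are ours.

```python
import math

def entropy(probs):
    """Shannon entropy -sum p log p of a collection of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical joint pmf of (X, Y): X is a binary state, Y a noisy reading of X.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginal distributions of X and Y.
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

h_x = entropy(p_x.values())      # H(X)
h_y = entropy(p_y.values())      # H(Y)
h_xy = entropy(joint.values())   # H(X, Y)

# Mean conditional entropy from its definition: E_Y[H(X | Y = y)].
h_x_given_y = sum(
    p_y[y] * entropy([joint[(x, y)] / p_y[y] for x in p_x])
    for y in p_y
)

# Amount of information on X contained in Y.
information = h_x - h_x_given_y

print(h_x_given_y, h_xy - h_y, information)
```

The two printed conditional entropies agree, which is the decomposition used in the text to rewrite the design criterion in terms of joint and marginal entropies.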
If $Y_S$ and $X_\Lambda$ are jointly Gaussian for any admissible $S$ (meeting the restrictions), we
have
$$H(X_\Lambda \mid Y_S) = \frac{N}{2}(1 + \log(2\pi)) + \frac{1}{2}\log\frac{|C_{X_\Lambda Y_S}|}{|C_{Y_S}|}, \tag{4}$$
where $N$ is the cardinality of $\Lambda$, $C_{X_\Lambda Y_S}$ is the covariance matrix of the joint vector
$(X'_\Lambda, Y'_S)'$, and $C_{Y_S}$ is the covariance matrix of vector $Y_S$. The parameters involved in
the model and the latter matrices are estimated from historical data, the estimates
obtained then being assumed to be the true values. Note that the quotient of determinants
in the last term of expression (4) is equal to $|C_{X_\Lambda \mid Y_S}|$, where $C_{X_\Lambda \mid Y_S}$ is the
conditional covariance matrix of vector $X_\Lambda$ given $Y_S$ (which does not depend on actual
observed values, but on the locations of the sampled random variables in $Y_S$), and
$|\cdot|$ denotes the matrix determinant. Then problem (2) is reduced to finding
$$\min_{(S)} \log |C_{X_\Lambda \mid Y_S}|, \tag{5}$$
or, equivalently, since the logarithm is an increasing function,
$$\min_{(S)} |C_{X_\Lambda \mid Y_S}|. \tag{6}$$
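A minimal sketch of the Gaussian criterion in a toy setting may help fix ideas. It assumes a one-dimensional transect, an exponential covariance for the field of interest, and observations equal to the field plus independent error; the site coordinates, `kernel`, `noise_var` and all helper names are illustrative assumptions, not part of the paper. The log-determinant is computed through the quotient of determinants in expression (4), so no matrix inversion is needed.

```python
import itertools
import math

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n = len(a)
    d = 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[p][k] == 0.0:
            return 0.0
        if p != k:
            a[k], a[p] = a[p], a[k]
            d = -d
        d *= a[k][k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k + 1, n):
                a[r][c] -= f * a[k][c]
    return d

def kernel(s, t, scale=1.0):
    """Assumed covariance of the field between sites s and t (exponential model)."""
    return math.exp(-abs(s - t) / scale)

def log_det_conditional(interest, sampled, noise_var=0.1):
    """log-determinant of the conditional covariance of the field on the sites
    of interest given the sampled observations, via the quotient in (4)."""
    sites = list(interest) + list(sampled)
    n_x = len(interest)
    joint = [[kernel(a, b) for b in sites] for a in sites]
    for i in range(n_x, len(sites)):
        joint[i][i] += noise_var  # observation error only affects the sampled variable
    c_y = [row[n_x:] for row in joint[n_x:]]
    return math.log(det(joint) / det(c_y))

interest = [0.5, 1.5, 2.5]               # sites where the field is of interest
candidates = [0.0, 1.0, 2.0, 3.0, 6.0]   # potentially observable sites

# Exhaustive search over all designs with exactly two sampled sites.
best = min(itertools.combinations(candidates, 2),
           key=lambda s: log_det_conditional(interest, s))
print(best, log_det_conditional(interest, best))
```

Exhaustive enumeration is only feasible for very small candidate sets; the number of designs grows combinatorially, which is the motivation for the search and exact algorithms discussed in the text.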
Note that this criterion, as well as certain other related criteria (in the simple case
where $X$ and $Y$ are equal and sets $\Lambda$ and $S$ correspond to the non-observed sites and
the observed sites, respectively), has been used by other authors for spatial sampling
applications (e.g. Mardia and Goodall, 1993). Our purpose is to give a formal definition
and justification of these criteria from an entropy point of view and in a state-space
framework. The mutual information between two Gaussian processes can also be
expressed in terms of the canonical correlation coefficients, an approach considered
by Caselton and Zidek (1984).
The computation of $C_{X_\Lambda \mid Y_S}$ in (5) can be difficult, for it involves the inversion of a
matrix that may be of large dimension. The matrix inversion problem can be approached
using an orthogonalization procedure (e.g. a Cholesky decomposition).
In practice, search algorithms are commonly used to find suboptimal solutions
which, in general, approximate the optimal solutions well, or which may be used
as a starting point for the exact optimum entropy sampling design
algorithm (see Ko et al., 1995, and sections 2.3 and 3 below).
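One common search heuristic of this kind is greedy forward selection, which adds at each step the site giving the largest reduction of the criterion. The sketch below is only an illustration under assumed ingredients (exponential covariance, observation equal to the field plus independent error, made-up site coordinates); `criterion` and `greedy_design` are hypothetical helper names, and the result is not guaranteed to be optimal.

```python
import math

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n = len(a)
    d = 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[p][k] == 0.0:
            return 0.0
        if p != k:
            a[k], a[p] = a[p], a[k]
            d = -d
        d *= a[k][k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k + 1, n):
                a[r][c] -= f * a[k][c]
    return d

def criterion(interest, sampled, noise_var=0.1):
    """log-determinant of the conditional covariance, as a quotient of determinants."""
    sites = list(interest) + list(sampled)
    n_x = len(interest)
    joint = [[math.exp(-abs(a - b)) for b in sites] for a in sites]  # assumed covariance
    for i in range(n_x, len(sites)):
        joint[i][i] += noise_var  # independent observation error
    c_y = [row[n_x:] for row in joint[n_x:]]
    return math.log(det(joint) / det(c_y))

def greedy_design(interest, candidates, size):
    """Add, one at a time, the candidate site that minimizes the criterion."""
    design, path = [], []
    remaining = list(candidates)
    for _ in range(size):
        site = min(remaining, key=lambda c: criterion(interest, design + [c]))
        design.append(site)
        remaining.remove(site)
        path.append(criterion(interest, design))
    return design, path

design, path = greedy_design([0.5, 1.5, 2.5], [0.0, 1.0, 2.0, 3.0, 6.0], 3)
print(design, path)
```

The list `path` records the criterion after each addition; it decreases at every step because conditioning on an additional correlated observation can only reduce the conditional determinant.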
2.2 Adaptation to some pre-existing sampling network redesign
problems
In this section we consider certain problems common in practice that are related to the
extension or reduction of an assumed pre-existing observation network. Again, a variety
of particular problems may be obtained from this general idea. In the paragraphs below
we describe two simple important cases: adding to and deleting from the network a certain
number of sampling locations. Both problems are solved using an adaptation of the
approach introduced in section 2.1.
We assume the same elements defined at the beginning of section 2.1.
2.2.1 Extending a pre-existing network
Let us assume that the pre-existing network is composed of $n$ sites in a set $S \subset \Psi$, and
that we want to add some new sampling locations, this set being denoted by $S_a$. $(S_a)$
denotes the class of all the subsets of $\Psi \setminus S$ to be considered as potential candidates for
the extension. For example, a common choice for $(S_a)$ is the class of all the subsets of
$\Psi \setminus S$ having a certain fixed number of elements. $Y_{S \cup S_a}$ represents the final sample
observation vector.
The amount of information on $X_\Lambda$ in $Y_{S \cup S_a}$ is given by
$$I(Y_{S \cup S_a}, X_\Lambda) = H(X_\Lambda) - H(X_\Lambda \mid Y_{S \cup S_a}).$$
Then, an optimum design minimizes
$$H(X_\Lambda \mid Y_{S \cup S_a}) = H(X_\Lambda, Y_{S \cup S_a}) - H(Y_{S \cup S_a}).$$
In the joint multivariate Gaussian case, the mean conditional entropy to be minimized
takes the form
$$H(X_\Lambda \mid Y_{S \cup S_a}) = \frac{N}{2}(1 + \log(2\pi)) + \frac{1}{2}\log |C_{X_\Lambda \mid Y_{S \cup S_a}}|,$$
where $N$ is the cardinality of $X_\Lambda$, and $C_{X_\Lambda \mid Y_{S \cup S_a}}$ is the conditional covariance of $X_\Lambda$ given
$Y_{S \cup S_a}$. As $N$ is fixed, the problem is reduced to finding
$$\min_{(S_a)} \log |C_{X_\Lambda \mid Y_{S \cup S_a}}|.$$
Now, the $Y_S$ part in the conditioning subvector $Y_{S \cup S_a}$ is common to all the choices
$S_a \in (S_a)$. So, for computational purposes, it is advantageous to express $C_{X_\Lambda \mid Y_{S \cup S_a}}$ in
terms of $C_{X_\Lambda \mid Y_S}$. This is given by the following relation:
$$C_{X_\Lambda \mid Y_{S \cup S_a}} = C_{X_\Lambda \mid Y_S} - C_{X_\Lambda, Y_{S_a} \mid Y_S}\, C^{-1}_{Y_{S_a} \mid Y_S}\, C_{Y_{S_a}, X_\Lambda \mid Y_S}, \tag{7}$$
where
$$C_{X_\Lambda, Y_{S_a} \mid Y_S} = C_{X_\Lambda, Y_{S_a}} - C_{X_\Lambda, Y_S}\, C^{-1}_{Y_S}\, C_{Y_S, Y_{S_a}},$$
$$C_{Y_{S_a} \mid Y_S} = C_{Y_{S_a}} - C_{Y_{S_a}, Y_S}\, C^{-1}_{Y_S}\, C_{Y_S, Y_{S_a}},$$
$$C_{Y_{S_a}, X_\Lambda \mid Y_S} = (C_{X_\Lambda, Y_{S_a} \mid Y_S})'$$
(here $C_{U,V}$ denotes the cross-covariance matrix of $U$ and $V$, and $C_{U,V \mid W}$ the conditional
cross-covariance matrix of $U$ and $V$ given $W$). Thus, assuming that $C_{X_\Lambda}$, $C^{-1}_{Y_S}$, and $C_{X_\Lambda, Y_S}$
are available from previous computations, we only need the additional computation of
the matrix
$$C_{Y_{S_a}, (X_\Lambda, Y_{S \cup S_a})}$$
for each $S_a \in (S_a)$, where $(X_\Lambda, Y_{S \cup S_a})$ is the joint vector of $X_\Lambda$, $Y_S$, and $Y_{S_a}$. (As mentioned
in section 2.1, the parameter estimates are here assumed to be the true parameter
values, and the problem of sensitivity of the latter matrices as estimated from the
data is not considered.)
As proposed in section 2.1, the inversion of $C_{Y_{S_a} \mid Y_S}$ can be made easier by orthogonalizing
$Y_{S \cup S_a}$, which can be done using a fixed orthogonalization of $Y_S$ for all $S_a$.
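As a worked special case of relation (7), consider adding a single site $a$, so that the conditional covariance of the new observation given the existing network is a scalar. The shorthand $c$ and $v$ is introduced here for illustration only, and the conditional covariance of $X_\Lambda$ given $Y_S$ is assumed to be nonsingular. The last line follows from the matrix determinant lemma and shows that the criterion can then be updated without recomputing a determinant.

```latex
\begin{aligned}
c &= C_{X_\Lambda, Y_a \mid Y_S} \in \mathbb{R}^{N \times 1}, \qquad
v = C_{Y_a \mid Y_S} > 0, \\
C_{X_\Lambda \mid Y_{S \cup \{a\}}} &= C_{X_\Lambda \mid Y_S} - \frac{1}{v}\, c\, c', \\
\log \bigl| C_{X_\Lambda \mid Y_{S \cup \{a\}}} \bigr|
  &= \log \bigl| C_{X_\Lambda \mid Y_S} \bigr|
   + \log \Bigl( 1 - \frac{1}{v}\, c'\, C^{-1}_{X_\Lambda \mid Y_S}\, c \Bigr).
\end{aligned}
```

Under these assumptions, the best single site to add is the one maximizing $c' C^{-1}_{X_\Lambda \mid Y_S} c / v$, which only requires the fixed inverse of the current conditional covariance.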
2.2.2 Reducing a pre-existing network
Let us now assume that we are interested in finding an optimal reduction of a pre-existing
sampling network $S \subset \Psi$, by deleting some of the locations in $S$. Let $S_d$ be
the set of sites to be removed from $S$, and let $(S_d)$ be the class of all the possible
admissible choices for $S_d$. For example, $(S_d)$ might consist of the subsets of $S$ with a
certain fixed number of elements. $Y_{S \setminus S_d}$ represents the final sample observation vector.
The structure of the problem is basically similar to that presented in section 2.2.1.
Except for particular specifications of $(S_d)$, it would seem that the advantage of having
fixed common information might not be applicable in this case. However, as we show
below, this is indeed possible by substituting computations involving inverted matrices
of dimension $n - d$ by the computation of a single inverted matrix of dimension $n$ and
inverted matrices of dimension $d$, where $n$ and $d$ are the cardinalities of $S$ and $S_d$
(note that in practice usually $d \ll n$).
The amount of information on $X_\Lambda$ in $Y_{S \setminus S_d}$ is given by
$$I(Y_{S \setminus S_d}, X_\Lambda) = H(X_\Lambda) - H(X_\Lambda \mid Y_{S \setminus S_d}).$$
Then, an optimum design for the reduced network minimizes
$$H(X_\Lambda \mid Y_{S \setminus S_d}) = H(X_\Lambda, Y_{S \setminus S_d}) - H(Y_{S \setminus S_d}).$$
In the joint multivariate Gaussian case, the problem is reduced to finding
$$\min_{(S_d)} \log |C_{X_\Lambda \mid Y_{S \setminus S_d}}|.$$
Again, $C_{X_\Lambda \mid Y_{S \setminus S_d}}$ can be related to $C_{X_\Lambda \mid Y_S}$, which is given by the following expression:
$$C_{X_\Lambda \mid Y_{S \setminus S_d}} = C_{X_\Lambda \mid Y_S} + C_{X_\Lambda, Y_{S_d} \mid Y_{S \setminus S_d}}\, C^{-1}_{Y_{S_d} \mid Y_{S \setminus S_d}}\, C_{Y_{S_d}, X_\Lambda \mid Y_{S \setminus S_d}}, \tag{8}$$
where
$$C_{X_\Lambda, Y_{S_d} \mid Y_{S \setminus S_d}} = C_{X_\Lambda, Y_{S_d}} - C_{X_\Lambda, Y_{S \setminus S_d}}\, C^{-1}_{Y_{S \setminus S_d}}\, C_{Y_{S \setminus S_d}, Y_{S_d}},$$
$$C_{Y_{S_d} \mid Y_{S \setminus S_d}} = C_{Y_{S_d}} - C_{Y_{S_d}, Y_{S \setminus S_d}}\, C^{-1}_{Y_{S \setminus S_d}}\, C_{Y_{S \setminus S_d}, Y_{S_d}},$$
$$C_{Y_{S_d}, X_\Lambda \mid Y_{S \setminus S_d}} = (C_{X_\Lambda, Y_{S_d} \mid Y_{S \setminus S_d}})'.$$
In this case, matrices $C_{X_\Lambda, Y_{S_d}}$ and $C_{X_\Lambda, Y_{S \setminus S_d}}$ are (complementary) blocks in $C_{X_\Lambda, Y_S}$. The
same is true for $C_{Y_{S_d}}$, $C_{Y_{S \setminus S_d}}$, $C_{Y_{S_d}, Y_{S \setminus S_d}}$, and $C_{Y_{S \setminus S_d}, Y_{S_d}}$ with respect to $C_{Y_S}$. With regard
to the inverse matrix $C^{-1}_{Y_{S_d} \mid Y_{S \setminus S_d}}$ and the matrix $C^{-1}_{Y_{S \setminus S_d}} C_{Y_{S \setminus S_d}, Y_{S_d}}$, they can be obtained by
easy manipulations of $C^{-1}_{Y_S}$. In fact, $C^{-1}_{Y_{S_d} \mid Y_{S \setminus S_d}}$ is the matrix obtained by deleting the rows
and columns corresponding to $S \setminus S_d$ in $C^{-1}_{Y_S}$. In addition, $C^{-1}_{Y_{S \setminus S_d}} C_{Y_{S \setminus S_d}, Y_{S_d}}$ can be
obtained as
$$C^{-1}_{Y_{S \setminus S_d}}\, C_{Y_{S \setminus S_d}, Y_{S_d}} = -\,[C^{-1}_{Y_S}]_{S \setminus S_d,\, S_d}\, \bigl([C^{-1}_{Y_S}]_{S_d}\bigr)^{-1}, \tag{9}$$
where both factors in the right member are matrices corresponding to the indicated
blocks in $C^{-1}_{Y_S}$, obtained by deleting the respective complementary indexes in $S$.
Usually, $S_d$ has a small cardinality in comparison to that of $S$, so that the inversion
required in the last term of (9) for obtaining $C^{-1}_{Y_{S \setminus S_d}} C_{Y_{S \setminus S_d}, Y_{S_d}}$ is relatively easier than
directly computing $C^{-1}_{Y_{S \setminus S_d}}$ for each $S_d$. For example, if $n = 200$ and $d = 10$, instead of
inverting as many matrices of dimension 190 as different sets of locations to be checked,
we only need to invert one matrix of dimension 200 and the same number of matrices
of dimension 10, which obviously means a drastic reduction in computational burden.
(If observations $Y_S$ have been previously orthogonalized, it is necessary in this case
to recover the original $C_{Y_S}$ by using the inverse basis change matrix, for the orthogonalized
variables are not individually associated with single locations.)
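The block manipulations behind relation (9) can be checked numerically. The sketch below is illustrative only: the covariance matrix of the current network is built from an assumed exponential model with a small error variance on made-up site coordinates, and `inverse`, `matmul` and `block` are hypothetical helpers written for this example. It compares the direct computation, which inverts the block of retained sites, with the shortcut that only uses blocks of the inverse covariance of the full network.

```python
import math

def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(m)
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[p] = a[p], a[k]
        pivot = a[k][k]
        a[k] = [v / pivot for v in a[k]]
        for r in range(n):
            if r != k:
                f = a[r][k]
                a[r] = [x - f * y for x, y in zip(a[r], a[k])]
    return [row[n:] for row in a]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def block(m, rows, cols):
    return [[m[i][j] for j in cols] for i in rows]

# Assumed covariance of the observations on the current network (5 sites).
sites = [0.0, 0.7, 1.5, 2.1, 3.0]
c_y = [[math.exp(-abs(a - b)) + (0.1 if i == j else 0.0)
        for j, b in enumerate(sites)] for i, a in enumerate(sites)]

drop = [1, 3]     # indexes of the sites to delete
keep = [0, 2, 4]  # indexes of the retained sites

# Direct computation: invert the covariance block of the retained sites.
direct = matmul(inverse(block(c_y, keep, keep)), block(c_y, keep, drop))

# Shortcut: use only blocks of the inverse covariance of the full network.
p = inverse(c_y)
cond_cov = inverse(block(p, drop, drop))  # covariance of deleted given retained
shortcut = [[-v for v in row] for row in matmul(block(p, keep, drop), cond_cov)]

error = max(abs(x - y) for r1, r2 in zip(direct, shortcut) for x, y in zip(r1, r2))

# The conditional covariance also agrees with its Schur-complement definition.
schur = [[block(c_y, drop, drop)[i][j] - matmul(block(c_y, drop, keep), direct)[i][j]
          for j in range(len(drop))] for i in range(len(drop))]
cond_error = max(abs(x - y) for r1, r2 in zip(schur, cond_cov) for x, y in zip(r1, r2))
print(error, cond_error)
```

In a search over many deletion sets, the inverse of the full covariance is computed once, and only the small block indexed by the deleted sites has to be inverted for each candidate set.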
2.3 Exact optimum design algorithm
In the Gaussian case, Ko et al. (1995) have provided an upper bound for the entropy.
Based on this bound they have developed an exact algorithm for the maximum entropy
sampling design. This algorithm is not directly applicable to a spatial state-space framework
since the covariance matrix involved is conditional. In this section we adapt this
algorithm to the minimization problem considered. A lower bound for the minimum is
given by the following expression:
$$\min_{S:\, |S| = s,\; F \subset S \subset F \cup E} |C_{X_\Lambda \mid Y_S}| \;\ge\; d(C_{Y_{F \cup E}}, F, E, s) := |C_{X_\Lambda \mid Y_F}| \prod_{i=1}^{s-f} \frac{\lambda_{e-s+i+f}(C_{Y_E \mid (X_\Lambda, Y_F)})}{\lambda_i(C_{Y_E \mid Y_F})},$$
where $\lambda_i(C)$ is the $i$th eigenvalue of $C$ (eigenvalues are taken in decreasing order),
$F$ is the set of sites forced into the design, $E$ is the set of sites still eligible, and
$f$ and $e$ denote the cardinalities of $F$ and $E$, respectively.
In the algorithm proposed here we start with an initial location set $S_c$ (for example,
the solution obtained by the greedy algorithm) and the upper bound $UB := |C_{X_\Lambda \mid Y_{S_c}}|$.
Consider the initial set of active subproblems $L = \{(C_{Y_{F \cup E}}, F, E, s)\}$ and the lower
bound $LB := d(C_{Y_{F \cup E}}, F, E, s)$. The algorithm consists of the following steps:
1. If $LB < UB$, remove an active subproblem $(C_{Y_{F \cup E}}, F', E', s)$ from $L$ and select
   $i \in E'$ as a branching index.
   (a) Consider subproblem $(C_{Y_{F \cup E}}, F', E' \setminus i, s)$.
       i. If $|F'| + |E'| - 1 > s$, append $(C_{Y_{F \cup E}}, F', E' \setminus i, s)$ to $L$ and calculate
          $d(C_{Y_{F \cup E}}, F', E' \setminus i, s)$.
       ii. If $|F'| + |E'| - 1 = s$, define $S := F' \cup E' \setminus i$. If $|C_{X_\Lambda \mid Y_S}| < UB$, replace $S_c$ with
           $S$ and set $UB := |C_{X_\Lambda \mid Y_S}|$.
   (b) Consider subproblem $(C_{Y_{F \cup E}}, F' \cup i, E' \setminus i, s)$.
       i. If $|F'| + 1 < s$, append $(C_{Y_{F \cup E}}, F' \cup i, E' \setminus i, s)$ to $L$ and calculate
          $d(C_{Y_{F \cup E}}, F' \cup i, E' \setminus i, s)$.
       ii. If $|F'| + 1 = s$, define $S := F' \cup i$. If $|C_{X_\Lambda \mid Y_S}| < UB$, replace $S_c$ with $S$ and
           set $UB := |C_{X_\Lambda \mid Y_S}|$.
   (c) Update $LB := \min_{P \in L} d(P)$ and return to step 1.
2. Otherwise, $S_c$ is an optimal solution.
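The lower bound used in the algorithm can be motivated by the following sketch of the argument. Here $F$ denotes the sites already forced into the design, $E$ the sites still eligible, $f$ and $e$ their cardinalities, and $s$ the required design size; $T = S \setminus F$ and $t = s - f$ are shorthand introduced for this note. The first line factorizes the determinants of the joint covariance matrices through conditional covariances; the second line applies the standard Cauchy interlacing inequalities to principal submatrices of a symmetric positive semidefinite matrix, with eigenvalues in decreasing order.

```latex
\begin{aligned}
|C_{X_\Lambda \mid Y_S}|
  &= \frac{|C_{X_\Lambda Y_S}|}{|C_{Y_S}|}
   = \frac{|C_{X_\Lambda Y_F}|\, |C_{Y_T \mid (X_\Lambda, Y_F)}|}
          {|C_{Y_F}|\, |C_{Y_T \mid Y_F}|}
   = |C_{X_\Lambda \mid Y_F}|\,
     \frac{|C_{Y_T \mid (X_\Lambda, Y_F)}|}{|C_{Y_T \mid Y_F}|},
  \qquad S = F \cup T,\; T \subset E,\; |T| = t, \\
|C_{Y_T \mid (X_\Lambda, Y_F)}|
  &\ge \prod_{i=1}^{t} \lambda_{e-t+i}\bigl(C_{Y_E \mid (X_\Lambda, Y_F)}\bigr),
  \qquad
|C_{Y_T \mid Y_F}|
  \le \prod_{i=1}^{t} \lambda_{i}\bigl(C_{Y_E \mid Y_F}\bigr).
\end{aligned}
```

Since $e - t + i = e - s + i + f$, bounding the numerator from below by the $t$ smallest eigenvalues and the denominator from above by the $t$ largest ones gives a quantity that no design containing $F$ and contained in $F \cup E$ can undercut.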
3. An application to piezometric data
We illustrate the method described above by an application using piezometric data
from the Vélez aquifer (Málaga, Spain), consisting of observations from 66 wells. The
data have been collected by the Instituto del Agua (Water Institute) at the University
of Granada (Spain). The observations represent water heights in metres above sea level
and are shown in Table 1. The $s_1$ and $s_2$ coordinates are in UTM (Universal Transverse
Mercator). Figure 1 shows a contour-level plot of piezometric heads obtained by ordinary
isotropic kriging with a linear variogram.
We assume that the random component of the piezometric random field $X$ satisfies a
stochastic partial differential equation given by the following expression:
$$\frac{\partial^2 x(s)}{\partial s_1^2} + \frac{\partial^2 x(s)}{\partial s_2^2} - \alpha\, x(s) = \epsilon(s), \tag{10}$$
where $s = (s_1, s_2)'$ is the continuous coordinate vector, $\alpha$ is a positive parameter and
$\epsilon(s)$ is white noise with variance $\sigma^2_\epsilon$. Equation (10) is the stochastic Laplace equation
considered by Whittle (1954) for data observed on a complete grid, and by Jones (1989)
and Angulo et al. (1994) for irregularly observed data (extensions of this equation have
been proposed by Vecchia, 1988, and by Jones and Vecchia, 1993). A justification of
the meaning and use of model (10) for the representation of piezometric head data is
provided, for example, in Jones (1989).
Process $x(s)$ in equation (10) has an isotropic correlation structure in space given by
(Whittle, 1954)
$$\rho(r) = r\sqrt{\alpha}\, K_1(r\sqrt{\alpha}),$$
where $r$ is the distance between points and $K_1$ is the modified Bessel function of the
second kind of order 1. This equation establishes the meaning of parameter $\alpha$ with regard
to spatial dependence in model (10).
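To give a feeling for the spatial scale implied by this correlation function, the sketch below evaluates it at a few distances. It assumes the form in which the argument of the Bessel function is the distance multiplied by the square root of the model parameter, uses the parameter value 0.00000280 reported with the estimates in this section, and treats distances as metres (UTM units). The Bessel function is obtained from its standard integral representation with a simple trapezoidal rule; `bessel_k1` and `whittle_correlation` are hypothetical helper names.

```python
import math

def bessel_k1(z, steps=20000):
    """K_1(z) for z > 0 from K_1(z) = int_0^inf exp(-z cosh t) cosh t dt."""
    # Truncate where the integrand is negligible (z * cosh(t_max) >= 50).
    t_max = math.acosh(max(50.0 / z, 2.0))
    h = t_max / steps

    def f(t):
        return math.exp(-z * math.cosh(t)) * math.cosh(t)

    total = 0.5 * (f(0.0) + f(t_max)) + sum(f(k * h) for k in range(1, steps))
    return total * h

def whittle_correlation(r, alpha):
    """Correlation at distance r for the model parameter alpha (limit 1 at r = 0)."""
    z = r * math.sqrt(alpha)
    if z == 0.0:
        return 1.0
    return z * bessel_k1(z)

alpha = 0.00000280  # parameter value reported in this section
for r in (0.0, 250.0, 500.0, 1000.0, 2000.0):
    print(r, whittle_correlation(r, alpha))
```

In production code a library routine for the modified Bessel function would normally replace the numerical integral; the version above only keeps the example free of external dependencies.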
We consider the sample information to be given by observations at $n$ locations
$s_1, \ldots, s_n$, of a process, $y(s)$, related to process $x(s)$ by the observation equation
$$y(s_i) = x(s_i) + \eta(s_i), \quad i = 1, \ldots, n, \tag{11}$$
where $\eta(s_i)$ ($i = 1, \ldots, n$) are the measurement errors with zero mean and variance $\sigma^2_\eta$.
The unknown parameters are $\alpha$, $\sigma^2_\epsilon$ and $\sigma^2_\eta$. We assume that $x(s)$ and $y(s)$ are Gaussian
processes.
$s_1$ $s_2$ Water height (m)
395000 4075788 71.509
397028 4074353 44.853
396920 4074295 44.793
397028 4074318 43.334
397065 4074220 43.161
397235 4074390 41.921
397823 4073703 36.06
398108 4073688 33.845
398978 4072778 25.39
399434 4075893 47.545
399452 4075898 46.189
399359 4075228 37.063
399378 4074843 34.163
399438 4074685 32.81
399465 4074235 30.298
399540 4073320 27.369
399890 4072990 26.554
399470 4072265 23.215
399448 4072303 23.265
399640 4072320 23.117
399600 4072375 23.972
399930 4071465 16.849
399875 4070850 15.781
400423 4071088 16.069
400780 4071150 16.06
400925 4071040 17.254
400050 4070760 15.722
400710 4070098 12.976
400713 4069963 12.758
400210 4069970 12.319
400453 4069543 11.166
400458 4069448 10.476
400478 4069303 9.93
400530 4069555 11.585
400530 4069670 12.066
400578 4069333 10.093
400713 4069058 8.892
400888 4068802 7.064
401164 4068658 6.339
400865 4068540 6.27
400938 4068353 5.358
400878 4068118 4.015
400715 4068020 2.938
400978 4067983 3.191
400815 4067903 1.204
400793 4067302 0.226
400560 4067100 0.81
400726 4066843 0.043
400778 4066678 0.396
400545 4066480 2.839
400620 4066360 0.387
400753 4066213 0.268
400830 4066275 0.275
401100 4066365 0.325
401002 4065793 0.039
400790 4065330 0.051
400925 4065515 0.005
400635 4065820 0.181
400710 4065650 0.073
400662 4065490 0.205
400690 4065325 0.076
400425 4065535 0.463
400220 4065655 0.191
401645 4066125 0.254
401412 4065890 0.28
401420 4065715 0.203
Table 1. Piezometric data from 66 wells in the Vélez aquifer (Málaga, Spain)
In order to apply the method described in section 2, we need to compute the conditional
covariance matrix of $x(s)$ given $y(s_i)$, $i = 1, \ldots, n$ (for $s$ in $\Lambda$), according to equation
(8). We estimate the covariance matrix using the maximum-likelihood estimates
obtained for the parameters from observations at the 66 wells. For the purposes of illustration,
we consider reducing the pre-existing network. First, a deterministic trend is
removed from the original data by fitting a quadratic surface, the residuals obtained
then being considered as the values for $y(s)$ in the observation equation (11). We estimate
the values of the unknown parameters using the approach in Jones (1989), which results in
values of $\alpha = 0.00000280$, $\sigma_\eta = 0.46202$, and $\kappa = 0.1068$, with $\kappa = \sigma^2_\eta / \sigma^2_x$. Assuming these
estimates to be true values for the parameters, a reduction of the network is performed.
To that end, a FORTRAN 77 program has been developed.
In all the cases studied here, a certain subregion of the aquifer domain defined by a
discrete mesh (see Fig. 1) is considered as set $\Lambda$. First, we consider a sequential (optimal
in each step) reduction of the network. The initial network and the prediction error
standard deviations are shown in Figs 2 and 3, respectively. The resulting network after
deleting 44 sites and the corresponding prediction error standard deviations are displayed
in Figs 4 and 5, respectively. In Fig. 6, the resulting conditional entropy
(except for constant terms) and the rate of information $I(X_\Lambda, Y_{S \setminus S_d}) / I(X_\Lambda, Y_S)$ are
represented with respect to the number of deleted sites. By using this ratio, we can
determine the minimum number of locations to be observed to maintain a certain
rate of information. For example, to retain 80% of the amount of information contained
in the initial network on the region of interest, we require at least 22 locations to
be observed. Finally, in Fig. 7, we compare the entropy-based criterion with other
related criteria based on alternative measurements of the covariance matrix such as
the trace, the maximum eigenvalue, or the maximum element of the diagonal (see,
for example, Mardia and Goodall, 1993). In our context, these criteria are respectively
formulated as follows:
$$\min_{(S_d)} \operatorname{tr} C_{X_\Lambda \mid Y_{S \setminus S_d}}, \qquad \min_{(S_d)} \Bigl( \max_i \lambda_i^{S \setminus S_d} \Bigr), \qquad \min_{(S_d)} \Bigl( \max_i u_i^{S \setminus S_d} \Bigr),$$
with $\lambda_i^{S \setminus S_d}$ and $u_i^{S \setminus S_d}$, for $i = 1, \ldots, N$, being the eigenvalues and the elements of the
diagonal of $C_{X_\Lambda \mid Y_{S \setminus S_d}}$, respectively.
Figure 1. Contour-level plot of piezometric heads in the Vélez aquifer (Málaga, Spain),
and subregion of interest (coordinates are in UTM).
Figure 2. Initial network for 66 observed locations in the Vélez aquifer (Málaga, Spain).
Figure 3. Contour-level map of prediction error standard deviations for the initial
network.
Figure 4. Resulting network after (sequentially) deleting 44 sites.
Figure 5. Contour-level map of prediction error standard deviations after (sequentially)
deleting 44 sites.
Figure 6. Conditional entropy (except for constant terms) and rate of information vs.
number of (sequentially) deleted sites.
Figure 7. Comparison of the conditional entropy (except for constant terms) in sequential
reduction of network obtained for the entropy-based criterion and other related
criteria based on alternative measurements of the covariance matrix (trace, maximum
eigenvalue and maximum element of the diagonal).
In the second example studied, we consider only 45 of the 66 available sites as potential
locations to be observed. The non-included sites are located in the north of the
aquifer, far away from the region of interest, and their influence is negligible. We
force 25 predetermined sites to be in the network (see Fig. 8), completing it by addition
of 15 more sites. The optimal design has been achieved using the exact algorithm presented
in section 2.3. The sequential solution has been taken as the starting solution for
the algorithm, which in the end turned out to be optimal. The results of this example
are shown in Fig. 8.
Figure 8. Map containing the locations: non-included sites, forced sites, non-selected
sites, and selected sites in the network.
4. Conclusion
The objective of this work is to present a methodology to design or redesign a spatial
network when the underlying process of interest and the observations are defined by
means of a state-space model. The main advantage of working within this framework
is that in many practical situations the available data may not correspond
to the variable of interest. In addition, the potentially observable locations may
be different from the set of sites of interest for the variable X.
The entropy-based approach to spatial sampling design has been extensively applied
in the literature. In the state-space-model context, the procedure consists in minimizing
the conditional entropy. In the Gaussian case, this is equivalent to minimizing the logarithm
of the determinant of a conditional covariance matrix.
A detailed formulation for extending or reducing a pre-existing network is shown.
When the set of locations is increased (decreased) by sequentially adding (deleting)
locations, the procedure becomes quite simple if blocks of a certain
covariance matrix are handled appropriately.
When the problem is of a large dimension, the achievement of an optimal network is
costly in computing time. An exact algorithm for finding an optimal design, adapting
the algorithm proposed by Ko et al. (1995) to the state-space-model framework, is presented.
In the example studied here the optimal solution obtained coincides with the
sequential one.
The model proposed for the observed variable in the examples treated in this paper
consists in adding an observation error to the variable of interest. More complex
models may be considered as the procedure only requires, in the Gaussian case, the
covariance structure between the involved variables.
Acknowledgements
We thank the editor and the three referees for their helpful comments and suggestions,
which have significantly improved this paper.
This work has been supported in part by the Plan Nacional de I+D (Project AMB93-0932)
of the Comisión Interministerial de Ciencia y Tecnología, Ministerio de
Educación y Ciencia, Spain.
We are also grateful to the Instituto del Agua (Water Institute), Universidad de
Granada (Spain), in particular to José L. García-Aróstegui for support in preparing
the piezometric data of the Vélez aquifer and graphics.
References
Angulo, J.M., Azari, A.S., Shumway, R.H., and Yucel, Z.T. (1994) Fourier approximations for
estimation and smoothing of irregularly observed spatial processes. Stochastic and
Statistical Methods in Hydrology and Environmental Engineering, 2, 353–65.
Aspie, D. and Barnes, R.J. (1990) Infill-sampling design and the cost of classification errors.
Mathematical Geology, 22(8), 915–32.
Bogárdi, I., Bárdossy, A., and Duckstein, L. (1985) Multicriterion network design using geostatistics.
Water Resources Research, 21(2), 199–208.
Bras, R.L. and Rodríguez-Iturbe, I. (1976) Network design for the estimation of areal mean of
rainfall events. Water Resources Research, 12(6), 1185–95.
Caselton, W.F. and Hussian, T. (1980) Hydrologic networks: Information transmission. Journal of
the Water Resources Planning and Management Division, A.S.C.E., 106 (WR2), 503–20.
Caselton, W.F., Kan, L., and Zidek, J.V. (1991) Quality data network designs based on entropy. In
Statistics in the Environmental and Earth Sciences, P. Guttorp and A. Walden (eds), Griffin,
London.
Caselton, W.F. and Zidek, J.V. (1984) Optimal monitoring network designs. Statistics and
Probability Letters, 2, 223–7.
Christakos, G. (1992) Random Field Models in Earth Sciences. Academic Press, San Diego.
Cressie, N.A.C. (1991) Statistics for Spatial Data. Wiley, New York.
De Gruijter, J.J. and Ter Braak, C.J.F. (1990) Model-free estimation from spatial samples: A
reappraisal of classical sampling theory. Mathematical Geology, 22(4), 407–15.
Guttorp, P., Le, N.D., Sampson, P.D., and Zidek, J.V. (1993) Using entropy in the redesign of an
environmental monitoring network. In Multivariate Environmental Statistics, G.P. Patil and
C.R. Rao (eds), Elsevier, New York, pp. 175–202.
Haas, T.C. (1992) Redesigning continental-scale monitoring networks. Atmospheric Environment,
26A(18), 3323–33.
Jones, R.H. (1989) Fitting a stochastic partial differential equation to aquifer data. Stochastic
Hydrology and Hydraulics, 3, 85–96.
Jones, R.H. and Vecchia, A.V. (1993) Fitting continuous ARMA models to unequally spaced
spatial data. Journal of the American Statistical Association, 88, 947–54.
Journel, A.G. (1994) Resampling from stochastic simulations. Environmental and Ecological
Statistics, 1, 63–91.
Ko, C.-W., Lee, J., and Queyranne, M. (1995) An exact algorithm for maximum entropy
sampling. Operations Research, 43, 684–91.
Mardia, K.V. and Goodall, C.R. (1993) Spatial-temporal analysis of multivariate environmental
monitoring data. In Multivariate Environmental Statistics, G.P. Patil and C.R. Rao (eds),
Elsevier, New York, pp. 347–86.
Samper, F.J. and Carrera, J. (1990) Geoestadística. Aplicaciones a la hidrogeología subterránea.
Gráficas Torres, Barcelona.
Shannon, C.E. (1948) A mathematical theory of communication. Bell System Technical Journal,
27, 379–423.
Trujillo-Ventura, A. and Ellis, J.H. (1991) Multiobjective air pollution monitoring network
design. Atmospheric Environment, 25A(2), 469–79.
Vecchia, A.V. (1988) Estimation and model identification for continuous spatial processes.
Journal of the Royal Statistical Society B, 50, 292–312.
Whittle, P. (1954) On stationary processes in the plane. Biometrika, 41, 434–49.
Wu, S. and Zidek, J.V. (1992) An entropy-based analysis of data from selected NADP/NTN
network sites for 1983–1986. Atmospheric Environment, 26A(11), 2089–103.
Biographical sketches
The authors are members of the Departamento de Estadística e Investigación Operativa
of Universidad de Granada, Spain, and collaborate on a regular basis with the Instituto
del Agua of this university on stochastic modelling and applications in hydrology, currently
under project AMB93-0932, of Environment and Natural Resources Planning, of
the Comisión Interministerial de Ciencia y Tecnología, Ministerio de Educación y
Ciencia, Spain. José M. Angulo, who is Associate Professor, is the person responsible
for the above-mentioned project, and heads the research group on space-time stochastic
modelling. Francisco J. Alonso is Assistant Professor, and did his Ph.D. on estimation
and prediction of spatial processes. Maria C. Bueso is Assistant Professor and her
research is related to spatial sampling design problems.
