
Nicolas Bez - IRD

Geostatistics
for fish data
Nicolas Bez
IRD - Sète
nicolas.bez@ird.fr

With contributions (slides) from the Centre de Géostatistique in Fontainebleau, France,
and the use of RgeoS, a comprehensive free R package (http://rgeos.free.fr/)


Geostatistics: When? Where? Why? How?
In the 60s.
In South African gold mines.
Because recoverable gold reserves were systematically underestimated.
Traditional (non-spatial) statistical methods were not appropriate.
Prof. Krige (South Africa) first analysed the problem.
Prof. Matheron (France) developed the theory.
Fisheries geostatistics: When? Where? Why? How?
In the 80s.
In Europe.
Because acoustic surveys provide autocorrelated samples, which cannot be handled properly by traditional (non-spatial) statistical methods.

First use in fisheries : Laurec 1977 (fishing power).
Some precursors: Conan (snow crab), Laloë (survey design), Gohin (survey design) in 1985.
1990 : Petitgas/FooteSimmonds ICES workshop & recommendation
1990-2000 : regular spreading of the method

Some key references
In geostatistics:
Chilès J.P. and Delfiner P., 1999. Geostatistics: Modeling Spatial Uncertainty. Wiley.
Complete and in-depth presentation of the geostatistical theory.

Arnaud M. and Emery X., 2000. Estimation et interpolation spatiale. Paris: Hermes Sciences, 221 p.
Emery X., 2008. Apunte de geoestadística.
Comprehensive presentation of the geostatistical theory. Exists in French, English and Spanish.
Some key references
In fisheries geostatistics:
Rivoirard J., Simmonds J., Foote K., Fernandes P. and Bez N., 2000. Geostatistics for estimating fish abundance. Blackwell Science.
Theory and practice of geostatistics applied to fish stock assessment from survey data.

Petitgas P., 2001. Geostatistics in fisheries survey design and stock assessment: models, variances and applications. Fish and Fisheries, 2, 231-249.
Possible complement to the previous book. No applications, but references to applications.

Bez N., 2002. Global fish abundance estimation from regular sampling: the geostatistical transitive method. Can. J. Fish. Aquat. Sci., 59: 1921-1931.
Presentation of the transitive theory with 2 case studies.
Contents
Some history
Spatial aspects : why should we bother ?
Support and dispersion (variance)
Back to interpolation techniques
Spatial aspects : associated problems
Spatial structure
Kriging
Dispersion and support

Some basic concepts, before any modelling

Regionalised variable
Regionalised phenomenon:
fish distribution
sea bottom
etc.
Regionalised (georeferenced) variable $z(x)$, with $x$ a point in 2D or 3D:
altitude (m, 2D)
fish density, NASC (m² nmi⁻²; 1D, 2D, 3D)
etc.


Support & field
Support:
the area over which the measurements are made:
a point, z(x),
a small volume, surface, UBM or trawl duration, z(v),
or a large volume, surface, UBM or trawl duration, z(V).

Field:
the geographical area where the variable is positive,
assumed known.
Additive variable
For equal $v_i$ forming a partition of $V$:

$$z(V) = \frac{1}{N} \sum_i z(v_i)$$

or

$$z(V) = \frac{1}{V} \int_V z(x)\,dx$$

overall mean = mean of the local means
Reminders

Values: $z_1, z_2, \ldots, z_i, \ldots, z_n$

Mean: $m = \frac{1}{n} \sum_{i=1}^{n} z_i$

Variance: $\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (z_i - m)^2 = \frac{1}{n} \sum_{i=1}^{n} z_i^2 - m^2$

Standard deviation: $\sigma = \sqrt{\sigma^2}$
Dispersion variance
Dispersion of v in V:

$$s^2(v|V) = \frac{1}{N} \sum_i [z(v_i) - z(V)]^2 \quad \text{with} \quad z(V) = \frac{1}{N} \sum_i z(v_i)$$
Exercise: dispersion variances
Additive variable z over a field V made of 6 identical cells:


1 3 2 4 2 6




a) Mean of z over V?
Variance (of the cell values) over V?

b) The same field is now divided into 3 blocks of 2 cells.
Mean of z in each block, $z(v_1)$, $z(v_2)$, $z(v_3)$?
Variance of the blocks in the field?

c) Variance of the cells in each of the three blocks?
Mean of these variances (this is called the variance of the cells in the blocks)?
Solution:
Mean of z over V:
(1+3+2+4+2+6)/6 = 3

Variance (of the cell values) over V:
((1-3)²+(3-3)²+(2-3)²+(4-3)²+(2-3)²+(6-3)²)/6
= (4+0+1+1+1+9)/6 = 16/6 = 8/3

3 blocks of 2 consecutive cells, denoted $v_i$, i = 1, 2, 3:


1 3 2 4 2 6




2 3 4




Mean of the $z(v_i)$ over the field V:
(2+3+4)/3 = 3

Variance of the block values in the field:
((2-3)²+(3-3)²+(4-3)²)/3 = (1+0+1)/3 = 2/3

Remark: changing the UBM does not change the mean, but it changes the variance.


1 3 2 4 2 6



2 3 4


Variance of the cells in each of the three blocks:
first block: ((1-2)²+(3-2)²)/2 = 1
second block: ((2-3)²+(4-3)²)/2 = 1
third block: ((2-4)²+(6-4)²)/2 = 4

Mean of these variances (also called variance of the cells in the blocks):
(1+1+4)/3 = 2

Remark: this is the amount of variance we lose when increasing the UBM size.


Dispersion variances are additive:

8/3 = 2 + 2/3

total variance = variance of the cells in the blocks + variance of the blocks in the field

$$s^2(O|V) = s^2(O|v) + s^2(v|V)$$

Redo the exercise:
with 2 blocks of three cells;
with ordered values (i.e. with a stronger spatial structure).

Compare the results.
Exercise: dispersion variances (2)


1 3 2 4 2 6



2 4


Variance of the cells in each of the two big blocks:
first block: ((1-2)²+(3-2)²+(2-2)²)/3 = 2/3
second block: ((4-4)²+(2-4)²+(6-4)²)/3 = 8/3
So the variance of the cells in the big blocks is (2/3+8/3)/2 = 5/3

Variance of the big blocks: ((2-3)²+(4-3)²)/2 = 1


For blocks of 3 cells:
8/3 = 5/3 + 1

For blocks of 2 cells:
8/3 = 2 + 2/3

Variance decreases when the support (UBM or ESDU in acoustic terms; trawl size in fishing terms) increases.
This is called REGULARISATION: things are less variable, i.e. more regular, when the support is large.

Additivity:

$$s^2(O|V) = s^2(O|v) + s^2(v|V)$$



1 2 2 3 4 6



3/2 5/2 5



Overall mean: 3. Overall variance: 8/3.
Variance of the cells in each of the three blocks:
first block: ((1-3/2)²+(2-3/2)²)/2 = (1/4+1/4)/2 = 1/4
second block: ((2-5/2)²+(3-5/2)²)/2 = (1/4+1/4)/2 = 1/4
third block: ((4-5)²+(6-5)²)/2 = 1

Variance of the cells in the blocks:
(1/4+1/4+1)/3 = 1/2

Variance of the blocks: ((3/2-3)²+(5/2-3)²+(5-3)²)/3 = (9/4+1/4+4)/3 = (26/4)/3 = 13/6



For blocks of 2 cells:
8/3 = 2 + 2/3

For blocks of 2 cells with ordered values:
8/3 = 1/2 + 13/6

The mean and the variance are not affected by the location of the data. They are not spatial statistics.
Dispersion variances are.

The amount of variance we lose by regularisation, $s^2(O|v)$, depends on the spatial structure.
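
The three versions of the exercise can be checked with a few lines of code. This is a minimal sketch, not part of the original course material: the function names are hypothetical, cells are grouped into consecutive blocks of equal size, and exact fractions are used so that the results match the hand calculations.

```python
from fractions import Fraction


def mean(values):
    """Arithmetic mean, kept as an exact fraction."""
    return sum(values, Fraction(0)) / len(values)


def variance(values):
    """Variance with divisor n, as in the slides (not n - 1)."""
    m = mean(values)
    return sum((Fraction(v) - m) ** 2 for v in values) / len(values)


def dispersion_variances(cells, block_size):
    """Return (s2(O|V), s2(O|v), s2(v|V)) for consecutive blocks of equal size."""
    blocks = [cells[i:i + block_size] for i in range(0, len(cells), block_size)]
    total = variance(cells)                         # cells in the field
    within = mean([variance(b) for b in blocks])    # cells in the blocks
    between = variance([mean(b) for b in blocks])   # blocks in the field
    return total, within, between


for cells, size in [([1, 3, 2, 4, 2, 6], 2),
                    ([1, 3, 2, 4, 2, 6], 3),
                    ([1, 2, 2, 3, 4, 6], 2)]:
    total, within, between = dispersion_variances(cells, size)
    print(cells, "blocks of", size, ":", total, "=", within, "+", between)
```

Ordering the values changes neither the mean nor the total variance, only the way the total is split between the two terms.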

Anchovy, stage I eggs, and other stage I eggs:
var₁ = 5854, var₂ = 774 (coefficient = 0.129)

Same data with a larger support (resolution):
var₁ = 2722, var₂ = 283 (coefficient = 0.134)
Acoustic density of Scottish herring
1 ping (3 m), regularised over 10 pings (30 m), 100 pings (300 m) and 1500 pings (4500 m); distances in pings.

Support       Variance   CV
1 ping        100 %      17.4
10 pings      27.6 %     9.15
100 pings     3.9 %      3.42
1500 pings    0.7 %      1.46

Geostatistics for estimating fish abundance, Rivoirard et al., Blackwell, 2000
Marine Lab, Aberdeen
Up-scaling
Going from small support to larger ones with
progressive modification of the spatial structure

Down-scaling
Not possible to move from the support of the
observations to smaller ones without assumptions
about the spatial structure at this smaller support.

Deconvolution.

Simulation at small support conditional on the values
at large ones.
From Dungan et al., 2002. Ecography.
Phenomenon Observations Analysis Fuzziness
Extent X X X high
Grain X X X high
Resolution X X medium
Lag X X medium
Support X low
Cartographic ratio X low
Scale X X X high
Contents
Some history
Spatial aspects : why should we bother ?
Support and dispersion (variance)
Back to interpolation techniques
Spatial aspects : associated problems
Spatial structure
Kriging
Spatial interpolations
And why is it worth paying attention to spatial
structure for interpolation
What do we want to estimate?
Point estimation: from the samples, estimate the value at the nodes of a grid for further graphical representation.
From the samples, estimate the mean density over a block
Estimate the mean density over an
irregular internal polygon
Samples
Samples
Estimate the entire
population density
Pros and cons of various interpolators
Moving average
Polygons of influence (e.g. Voronoï)
Inverse distance
Polynomial
etc.
Each one estimates the target from the samples as a weighted sum: $Z^* = \sum_i \lambda_i Z_i$

Moving average
Same weight for each data point in the neighbourhood (here 5 samples):

$$Z^* = \frac{1}{5} \sum_{i=1}^{5} Z_i, \qquad \lambda_i = 20\%$$

No account is taken of:
the distance between the target and the samples;
redundant data (distances between data points).

The shape and size of the neighbourhood are unknown a priori.

Polygons of influence (Voronoi)
All the weight goes to the single closest data point:

$$Z^* = Z_1, \qquad \lambda_1 = 100\%, \qquad \lambda_i = 0\% \ (i \neq 1)$$

Influence is limited to one data point.
The limits of the polygons of border samples may be questionable.

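Inverse distance
Weights are functions of the distances $d_i$ between the data points and the target:

$$Z^* = \frac{\sum_i Z_i / f(d_i)}{\sum_i 1 / f(d_i)}, \qquad f(d) = d \ \text{or} \ d^2$$

In the illustrated example with five samples, the weights are 37%, 7%, 15%, 21% and 20%.
Redundant data have the same weight.
The degree has to be chosen.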
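
The three weighting schemes above can be compared on a toy configuration. This is a minimal sketch under stated assumptions: the function names, the three distances and the three data values are hypothetical, and no sample is assumed to sit exactly on the target (a zero distance would need special handling in the inverse-distance case).

```python
def moving_average_weights(distances):
    """Same weight for every sample in the neighbourhood."""
    n = len(distances)
    return [1.0 / n] * n


def nearest_neighbour_weights(distances):
    """Polygon of influence: all the weight on the closest sample."""
    nearest = distances.index(min(distances))
    return [1.0 if i == nearest else 0.0 for i in range(len(distances))]


def inverse_distance_weights(distances, power=2):
    """Weights proportional to 1 / d**power, normalised to sum to 1."""
    inverse = [1.0 / d ** power for d in distances]
    total = sum(inverse)
    return [w / total for w in inverse]


def estimate(weights, values):
    """Weighted sum Z* = sum_i lambda_i Z_i."""
    return sum(w * z for w, z in zip(weights, values))


# Hypothetical distances from three samples to the target, and their values.
distances = [1.0, 2.0, 4.0]
values = [10.0, 20.0, 40.0]

for name, weights in [("moving average", moving_average_weights(distances)),
                      ("nearest neighbour", nearest_neighbour_weights(distances)),
                      ("inverse distance d^2", inverse_distance_weights(distances, 2))]:
    print(name, [round(w, 3) for w in weights], round(estimate(weights, values), 3))
```

None of these schemes looks at the distances between the samples themselves, which is the redundancy issue raised in the slides.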
Inverse distances: inverse $d$ and inverse $d^2$
Trend surface, least squares
Estimation by a polynomial of order k:

$$m(x) = \sum_{l=1}^{k} a_l f^l(x)$$

The coefficients minimise the distance between the data and the estimate:

$$\sum_{i=1}^{n} \left( \sum_{l=1}^{k} a_l f^l(x_i) - z(x_i) \right)^2$$

The surface does not pass through the data points.
Trend surface, least squares (weights for k = 1)
In the illustrated example with five samples, the weights are 30%, 8%, 26%, 19% and 17%.
Does not honour the values at the data points.
The degree of the polynomial has to be chosen.
Least squares: order 2 and order 6
We look for a method that will:

weight sample data according to their distances to the target and to their mutual distances;
deduce the weights objectively from the characteristics of the data;
allow the quality of the interpolation to be quantified.
Geostatistics - KRIGING
Kriging
Linear variogram - unique neighbourhood
Contents
Some history
Spatial aspects : why should we bother ?
Spatial aspects : associated problems
What is the classic way of doing ?
What to do with the zeroes ?
Summarising spatial structures
Spatial structure
Kriging
Estimation

Limits of the traditional approach
The main question:

How many individuals are there in the sea?
i.e. what is the quantity (and spatial distribution) of eggs during a spawning season?
Histogram of field data:
the mean (m)
the variance ($s^2$)
Can the mean m be used to estimate the real quantity?
If yes, what is the quality of this estimate?
The quality of an estimation is quantified
by the estimation variance (variance of the error).
Estimation variance of the mean in statistics

$$\sigma_E^2 = \frac{s^2}{N}$$

Data variability ↑ ⇒ estimation variance ↑ ⇒ quality of the estimation ↓ ($s^2 \uparrow$, $\sigma_E^2 \uparrow$)
Number of samples ↑ ⇒ estimation variance ↓ ⇒ quality of the estimation ↑ ($N \uparrow$, $\sigma_E^2 \downarrow$)
$$\sigma_E^2 = \operatorname{var}(m^*) = \operatorname{var}\left(\frac{\sum_{i=1}^{N} Z_i}{N}\right) = \frac{1}{N^2} \operatorname{var}\left(\sum_{i=1}^{N} Z_i\right) = \frac{1}{N^2} \sum_{i=1}^{N} \operatorname{var}(Z_i) = \frac{1}{N^2} N s^2 = \frac{s^2}{N}$$

Reminder:
Independence ⇒ no correlation

$$\operatorname{var}(Z_1 + Z_2) = \operatorname{var}(Z_1) + \operatorname{var}(Z_2) + 2\operatorname{cov}(Z_1, Z_2)$$

So:
If the $Z_i$ are independent, there is no covariance term and the variance of the sum is the sum of the variances.
If they have the same law, the variance is the same irrespective of the index i.

The formula is relevant when we have
N samples
of N random variables
independent
with the same law ...
=
N independent and identically distributed
random variables
=
N i.i.d. variables
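
To see why the i.i.d. assumption matters, the variance of the sample mean can be written for any covariance matrix as the average of all its terms, var(m*) = (1/N²) Σᵢ Σⱼ cov(Zᵢ, Zⱼ). The sketch below is illustrative only: the exponential covariance, its scale parameter and the sample size are assumptions chosen for the example, not values from the course.

```python
import math


def variance_of_mean(cov):
    """var(mean) = (1 / N^2) * sum of all terms of the covariance matrix."""
    n = len(cov)
    return sum(sum(row) for row in cov) / n ** 2


def exponential_covariance_matrix(n, sill, scale):
    """Covariance between samples i and j on a regular transect (assumed model)."""
    return [[sill * math.exp(-abs(i - j) / scale) for j in range(n)]
            for i in range(n)]


n, sill = 10, 4.0

# Independent samples: only the diagonal (the variances) is non-zero.
independent = [[sill if i == j else 0.0 for j in range(n)] for i in range(n)]
var_iid = variance_of_mean(independent)

# Autocorrelated samples: the covariance terms add to the variance of the mean.
correlated = exponential_covariance_matrix(n, sill, scale=3.0)
var_corr = variance_of_mean(correlated)

print("s^2 / N          :", sill / n)
print("independent case :", var_iid)
print("correlated case  :", var_corr)
```

With positively correlated samples, applying s²/N would understate the variance of the mean of these N random variables.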
N independent
and
identically distributed
random variables
Number of eggs per m²
Random variables
Model: random variable $Z_i$; distribution; expected value $E(Z_i)$; variance $\operatorname{var}(Z_i) = \sigma^2$
Experimental: value $z_i$; histogram; mean $m$; variance $s^2$
The use of random variables is a choice. More or less sensible, but a choice. Random variables do not exist in nature.

Same law
We assume that the fish density has the same distribution whatever the location in space.

==> Implicit reference to spatial aspects
==> Spatial homogeneity of the distribution

Example of a 1D distribution
No spatial structure versus strong spatial structure, with, however, the same histogram.
Possible benefit of spatial structures
No spatial structure: interpolation is blind, based on N data values.
Spatial structure: interpolation is helped, based on N data values and the spatial structure.
Reminder: independence allows
simplifying the formulae (no covariance terms):
$$\operatorname{var}(Z_1 + Z_2) = \operatorname{var}(Z_1) + \operatorname{var}(Z_2) + 2\operatorname{cov}(Z_1, Z_2)$$
avoiding redundancy between observations, and thus loss of information.

Independence (statistically speaking) can be achieved:
by pure random sampling (classical statistics, model based technique). What about in practice?
Anchovy density
2000
2002
2003
2001
PelGas Survey
Courtesy of Ifremer-France
IBTS surveys;
Courtesy of ICES
Note :
One point per strata.
What is the meaning of the
variance of one point ?
Barents Sea Bottom trawl survey
Cod, 1993
Courtesy of IMR-Norway
CalCOFI Scripps Inst. Of Oceanography - USA
Triennial Egg Surveys 1998, ICES,
Mackerel eggs
Anchovy eggs, BIOMAN Survey 1998
(AZTI, Spain)
Gulf of St. Lawrence, snow crab trawl survey, 2004
Pêches et Océans Canada
Cephalopod trawl survey, 2004
INRH, Morocco
Reminder: independence allows
simplifying the formulae (no covariance terms):
$$\operatorname{var}(Z_1 + Z_2) = \operatorname{var}(Z_1) + \operatorname{var}(Z_2) + 2\operatorname{cov}(Z_1, Z_2)$$
avoiding redundancy between observations, and thus loss of information.

Independence (statistically speaking) can be achieved:
by pure random sampling (classical statistics, model based technique). What about in practice?
when the studied variable has no spatial structure, i.e. when what happens in one location is independent of what happens in another location. This is biologically unsatisfactory.

Geostatistics
Autocorrelation = redundancy: loss of information, worse conditions for estimation.
Autocorrelation = spatial structure: gain of information, better conditions for estimation.

The compensation between the two is not known a priori.
It depends on the spatial structure and on the geometry of the sample and target points.

Contents
Some history
Spatial aspects : why should we bother ?
Spatial aspects : associated problems
What is the classic way of doing ?
What to do with the zeroes ?
Summarising spatial structures
Spatial structure
Kriging
The zeroes:
a necessary problem
To survey a population, one has to cover its area of presence.
That is, to detect its geographical limits.
That is, to sample empty areas.
That is, to record some zeroes.
m = 1.22, $s^2$ = 3.72, Q = 34.2
m = 2.05, $s^2$ = 5.15, Q = 30.7
However, zeroes have an impact on all statistics based on arithmetic means:
mean, variance, histogram, correlation, etc.
Field
Field: domain where the variable is not 0.
(Figure: map of the field, longitude against latitude.)
This must be defined prior to any statistical analysis using averages.
Practical problems associated with the field delineation
true / false zeroes
highly skewed distributions
Two examples of possible field delineations
Number of situations to handle
can be large
One survey :
nb of legs x nbr of species x nb of stages
4/5 x 4/5 x 2
=> 32 different cases
Habitats are species specific

Fields are species specific
Pêches et Océans Canada
Yellowtail flounder:
Compact in space.
No significant year effect.
Roughhead grenadier:
Field delineation highly questionable.
At least different from year to year.
The field itself can be considered as a structural element of the population
(poor borders, rich heart).

It may then be relevant to include the field in the structural analysis.
Two possible approaches

Intrinsic geostatistics
Use of means
Field delineation required
Variogram, kriging, etc.

Transitive geostatistics
Use of sums, or of weighted averages where zero data get a zero weight
No field delineation required
Centres of gravity, inertia, Voronoï, covariogram
Contents
Some history
Spatial aspects : why should we bother ?
Spatial aspects : associated problems
What is the classic way of doing ?
What to do with the zeroes ?
Summarising spatial structures
Spatial structure
Kriging
Centre of gravity

$$\bar{x} = \left( \frac{\sum_i u_i z_i}{\sum_i z_i}, \ \frac{\sum_i v_i z_i}{\sum_i z_i} \right)$$

$z_i$: observed value at the sample location $x_i = (u_i, v_i)$ (longitude, latitude)
Acoustic survey
Anchovy IMARPE Peru - 1994-2000
Evolution through time (years) of the longitude component of the centre of gravity, as a distance (in n.mi.) from the coast, for some commercial species (anchovy, sardine, mackerel, horse mackerel) and for the samples.
Inertia

$$\text{Inertia} = \frac{\sum_i (x_i - \bar{x})^2 z_i}{\sum_i z_i}$$

Expressed in surface units, typically square nautical miles.
It can be decomposed along the directions that explain most and least of the spatial dispersion,
like a PCA with 2 variables, the longitude and the latitude.
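
Both statistics are weighted moments of the sample locations, so they take only a few lines to compute. The sketch below is a hedged illustration: the function names and the toy data are hypothetical, equal areas of influence are assumed for all samples, and coordinates are assumed to be already expressed in distance units (for example nautical miles) rather than in degrees.

```python
def centre_of_gravity(points, values):
    """Mean location of the samples, weighted by the observed values."""
    total = sum(values)
    u = sum(p[0] * z for p, z in zip(points, values)) / total
    v = sum(p[1] * z for p, z in zip(points, values)) / total
    return (u, v)


def inertia(points, values):
    """Mean squared distance to the centre of gravity, weighted by the values."""
    cu, cv = centre_of_gravity(points, values)
    total = sum(values)
    return sum(((p[0] - cu) ** 2 + (p[1] - cv) ** 2) * z
               for p, z in zip(points, values)) / total


# Four positive samples on the corners of a square, plus one distant zero.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (10.0, 10.0)]
values = [1.0, 1.0, 1.0, 1.0, 0.0]

print(centre_of_gravity(points, values))  # (1.0, 1.0)
print(inertia(points, values))            # 2.0
```

The zero sample at (10, 10) gets a zero weight, so adding or removing it changes neither result.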
Centre of gravity and Inertia in some different situations
Inertia=24
Inertia=2
Inertia=6
Inertia=17
The distribution is summarised by the
centre of gravity and the ellipse of inertia
centered on the centre of gravity and with
radii given by the factors of the inertia.
Anchovy
Sardine
Horse Mack.
Nov. 1990 Jun. 1991 Jan. 1992 Jan. 1993 Jan. 1994 Feb. 1995 Feb. 1996 Nov. 1996 Sept. 1997 Mar. 1998 May. 1998 Aug. 1998 Jan. 2000
Time series of the distribution of anchovy, sardine and horse mackerel
IMARPE, Peru
Centre of gravity and inertia:

are spatial statistics, i.e. they change when the data locations change;

are not sensitive to zero data;

summarise distributions.
Global Index of Collocation: GIC
d: mean distance between
2 individuals
of each species
D: mean distance between
2 individuals
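$$GIC = 1 - \frac{d^2}{D^2}, \qquad d^2 = \Delta CG^2, \qquad D^2 = \Delta CG^2 + I_1 + I_2$$

where $\Delta CG$ is the distance between the two centres of gravity and $I_1$, $I_2$ are the inertias of the two species.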
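
The index can be computed directly from the centres of gravity and inertias of the two populations, using GIC = 1 - ΔCG² / (ΔCG² + I₁ + I₂). This is a minimal sketch with hypothetical function names and toy data; it assumes equal areas of influence, coordinates in distance units, and at least one of the two populations spread over more than one location (otherwise the denominator can be zero).

```python
def centre_of_gravity(points, values):
    """Mean location weighted by the observed values."""
    total = sum(values)
    return (sum(p[0] * z for p, z in zip(points, values)) / total,
            sum(p[1] * z for p, z in zip(points, values)) / total)


def inertia(points, values):
    """Weighted mean squared distance to the centre of gravity."""
    cu, cv = centre_of_gravity(points, values)
    return sum(((p[0] - cu) ** 2 + (p[1] - cv) ** 2) * z
               for p, z in zip(points, values)) / sum(values)


def global_index_of_collocation(points_1, values_1, points_2, values_2):
    """GIC = 1 - dCG^2 / (dCG^2 + I1 + I2)."""
    cg_1 = centre_of_gravity(points_1, values_1)
    cg_2 = centre_of_gravity(points_2, values_2)
    delta_cg_sq = (cg_1[0] - cg_2[0]) ** 2 + (cg_1[1] - cg_2[1]) ** 2
    big_d_sq = delta_cg_sq + inertia(points_1, values_1) + inertia(points_2, values_2)
    return 1.0 - delta_cg_sq / big_d_sq


# Two toy populations on a line: centres of gravity 4 units apart, inertia 1 each.
species_a = ([(0.0, 0.0), (2.0, 0.0)], [1.0, 1.0])
species_b = ([(4.0, 0.0), (6.0, 0.0)], [1.0, 1.0])

print(global_index_of_collocation(*species_a, *species_b))  # 1 - 16/18
print(global_index_of_collocation(*species_a, *species_a))  # 1.0 for identical distributions
```

The index is 1 when the two centres of gravity coincide and tends to 0 when they are far apart relative to the spread of each population.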
Global Index of Collocation through time:
Sardine / Horse Mackerel (mean: 0.9; CV: 11 %)
Sardine / Anchovy (mean: 0.66; CV: 38 %)
Mean value of GIC

                  Anchovy   Sardine   Mackerel
Sardine           0.66
Mackerel          0.75      0.74
Horse Mackerel    0.81      0.90      0.85

Coefficient of variation (%)

                  Anchovy   Sardine   Mackerel
Sardine           38
Mackerel          25        28
Horse Mackerel    31        11        22
Next steps

Statistics within a field
Intrinsic geostatistics
Variogram, kriging, simulations
Contents
Some history
Spatial aspects : why should we bother ?
Spatial aspects : associated problem
Spatial structure
From the variance to the variogram
Variogram models
Variograms : a processor to compute
variances
Kriging
From the variance to the variogram
Basic concepts, before any modelling
Four data points $(x_1, z_1)$, $(x_2, z_2)$, $(x_3, z_3)$, $(x_4, z_4)$, with values $z_1, z_2, z_3, z_4$:

$$m = \frac{z_1 + z_2 + z_3 + z_4}{4}, \qquad s^2 = \frac{1}{4} \sum_{i=1}^{4} (z_i - m)^2$$
Pairs of points: for each i = 1, 2, 3, 4, the point $(x_i, z_i)$ is paired with every point $(x_j, z_j)$, j = 1, 2, 3, 4.
Number of pairs: total = 16 = $4^2$
$$s^2 = \frac{1}{4} \sum_{i=1}^{4} (z_i - m)^2 \quad \Longrightarrow \quad s^2 = \frac{1}{4^2} \sum_{i=1}^{4} \sum_{j=1}^{4} \frac{(z_i - z_j)^2}{2}$$

Distances between the paired points:
zero distance: 4 pairs
distance = 1 unit: 8 pairs
distance = $\sqrt{2}$ units: 4 pairs
$$s^2 = \frac{1}{4^2} \sum_{i=1}^{4} \sum_{j=1}^{4} \frac{(z_i - z_j)^2}{2} \quad \Longrightarrow \quad \gamma(h) = \frac{\sum_{x_i - x_j = h} (z_i - z_j)^2}{2\,N(h)}$$
Behind the variance, the variogram
Pairs are grouped by distance classes:

distance h   number of pairs N(h)   variance γ(h)
0            N(0) = 4               γ(0) = 0
1            N(1) = 8               γ(1)
√2           N(√2) = 4              γ(√2)
Variance:

$$s^2 = \frac{\sum_h N(h)\,\gamma(h)}{\sum_h N(h)}$$

Variogram:

$$\gamma(h) = \frac{\sum_{x_i - x_j = h} (z_i - z_j)^2}{2\,N(h)}$$

Half the mean squared difference between points a distance h apart.
Variogram: example
Values along a transect, 10 m apart:
1 2 0 0 1 0 0 1 2 3

$$\gamma(0) = 0$$

$$\gamma(10) = \frac{1}{2} \cdot \frac{1}{9} \left[ (1-2)^2 + (2-0)^2 + (0-0)^2 + (0-1)^2 + (1-0)^2 + (0-0)^2 + (0-1)^2 + (1-2)^2 + (2-3)^2 \right] = 0.56$$

$$\gamma(20) = \frac{1}{2} \cdot \frac{1}{8} \left[ (1-0)^2 + (2-0)^2 + (0-1)^2 + (0-0)^2 + (1-0)^2 + (0-1)^2 + (0-2)^2 + (1-3)^2 \right] = 1.00$$

$$\gamma(30) = 1.07$$

$$\gamma(90) = 2.00$$
Empirical variogram
Nb pairs Distance Variogram
9 1 0.556
8 2 1
7 3 1.071
6 4 1.25
5 5 1.4
4 6 1.875
3 7 1.5
2 8 0.5
1 9 2
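$$s^2 = \frac{\sum_h N(h)\,\gamma(h)}{N_{total}} = \frac{10 \times 0 + 2 \times 9 \times 0.556 + 2 \times 8 \times 1 + 2 \times 7 \times 1.071 + \ldots}{10 + 2 \times 9 + 2 \times 8 + 2 \times 7 + 2 \times 6 + \ldots} = \frac{100}{100} = 1$$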
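
The table and the variance check can be reproduced with a short function. This is a minimal sketch for regularly spaced 1D data, with lags expressed in numbers of sampling intervals; the function name is hypothetical, and the convention of marking missing values with NaN so that incomplete pairs are skipped is an assumption of this example.

```python
import math


def empirical_variogram_1d(values, max_lag=None):
    """Return {lag: (number of pairs, gamma)} for regularly spaced 1D data.

    Pairs involving a missing value (NaN) are skipped.
    """
    n = len(values)
    if max_lag is None:
        max_lag = n - 1
    result = {}
    for lag in range(1, max_lag + 1):
        squared = [(values[i] - values[i + lag]) ** 2
                   for i in range(n - lag)
                   if not (math.isnan(values[i]) or math.isnan(values[i + lag]))]
        if squared:
            # Half the mean squared difference between points `lag` apart.
            result[lag] = (len(squared), sum(squared) / (2 * len(squared)))
    return result


transect = [1, 2, 0, 0, 1, 0, 0, 1, 2, 3]
variogram = empirical_variogram_1d(transect)
for lag, (pairs, gamma) in variogram.items():
    print(pairs, lag, round(gamma, 3))

# Variance recovered from the variogram: ordered pairs count twice,
# and the n pairs at distance zero contribute nothing.
numerator = sum(2 * pairs * gamma for pairs, gamma in variogram.values())
denominator = len(transect) + sum(2 * pairs for pairs, _ in variogram.values())
print(numerator / denominator)
```

The last line gives back the ordinary variance of the ten values, which is the point of the formula above.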
Exercise: 1D computation, regular data
Experimental computation, 1D domain, regular support (spacing a)

Complete data:
+1 +2 0 -1 0 1 1 -1 1 2

Missing data (two values missing, marked with a dot):
+1 . 0 -1 0 . 1 -1 1 2

Presence of an extreme value:
+1 +2 0 -1 0 8 1 -1 1 2

Results:
Lag                           1       2        3       4
Variogram, complete data      0.944   1.750    1.071   0.417
Variogram, missing data       1.100   1.100    1.250   0.375
Variogram, extreme value      7.167   11.375   9.071   6.250
Exercise: 2D computation, regular data
Experimental computation, 2D domain, regular support (grid spacing a in both directions)
Notion of orientation

 1  0  2 -1  1
-1 -2  1  2  0
-2  0  2  1 -1
 0 -1  1  0  2
 1  0  0 -1  1

Results:
Lag   Variogram along X (rows)   Variogram along Y (columns)
1     1.550                      1.200
2     1.867                      1.600
3     2.100                      1.200
4     0.600                      0.400
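
The directional results of this exercise can be checked with a small function. This is a sketch with a hypothetical function name; it assumes a complete regular grid stored row by row, and computes the variogram along the rows ("x") or along the columns ("y") for lags expressed in grid steps.

```python
def directional_variogram(grid, lag, direction):
    """Experimental variogram on a regular grid, along rows ("x") or columns ("y")."""
    rows, cols = len(grid), len(grid[0])
    d_row, d_col = (0, lag) if direction == "x" else (lag, 0)
    squared = [(grid[r][c] - grid[r + d_row][c + d_col]) ** 2
               for r in range(rows - d_row)
               for c in range(cols - d_col)]
    # Half the mean squared difference over all pairs in that direction.
    return sum(squared) / (2 * len(squared))


grid = [
    [1, 0, 2, -1, 1],
    [-1, -2, 1, 2, 0],
    [-2, 0, 2, 1, -1],
    [0, -1, 1, 0, 2],
    [1, 0, 0, -1, 1],
]

for lag in range(1, 5):
    print(lag,
          round(directional_variogram(grid, lag, "x"), 3),
          round(directional_variogram(grid, lag, "y"), 3))
```

The two directions give different values at the same lag, which is why the orientation has to be specified in 2D.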



Distances (n. mi.)
Variograms in the East-West direction
(direction of the transect)
Experimental variogram on irregular data
Irregular 2D data: depth of a horizon, P
Variogram cloud: $(P(x_1) - P(x_2))^2$ against distance$(x_1, x_2)$, omnidirectional computation
Computation parameters: lag, tolerance
Experimental variogram against distance, shown with the variance of the data and the number of pairs

Irregular 2D sampling
Distance classes:
variogram lag
number of lags
tolerance on the lag

Direction classes:
reference angle
number of sectors
angular tolerance
Main steps of a geostatistical analysis
Data representation
Field delineation
Experimental variogram
Model definition
Use of the model:
interpolations, kriging (local & global)
estimation variance
scale, support
simulations
etc.
Contents
Some history
Spatial aspects : why should we bother ?
Spatial aspects : associated problem
Spatial structure
From the variance to the variogram
Variogram models
Variograms : a processor to compute
variances
Kriging
From experimental to model
Why?
The correlations between the known points and the target are not known, because the distances involved differ from those available in the sampling.
How?
Intrinsic Random Functions: we are going to use a probabilistic framework where mean squares take the value of variances.
From the variance to the variogram
Moving to the model
Phase 1: choice of a methodological framework
Reminder
In general: $\operatorname{var}(Z) = E(Z^2) - E(Z)^2$
When $E(Z) = 0$, then $\operatorname{var}(Z) = E(Z^2)$

The experimental variogram

$$\gamma(h) = \frac{\sum_{x_i - x_j = h} (z_i - z_j)^2}{2\,N(h)}$$

becomes

$$\gamma(h) = \frac{1}{2} \operatorname{var}(Z_{x+h} - Z_x)$$

IF $z_i \to Z_x$ and $E(Z_{x+h} - Z_x) = 0$,
i.e. if the $z(x_i)$ can be considered as outcomes of an intrinsic random function $Z(x)$.
Intrinsic Random Function (stationarity of increments)

$$E(Z_{x+h} - Z_x) = 0 \quad \text{so that} \quad \operatorname{var}(Z_{x+h} - Z_x) = E\left[(Z_{x+h} - Z_x)^2\right] = 2\gamma(h)$$
Random Function
A random function (RF) is an infinite family of random variables, $Z = (Z(x), x \in R)$.

A random variable is attached to every location. So a RF is defined by
the characteristics of the random variables: $E(Z_x)$, $\operatorname{var}(Z_x)$
the correlations between them, 2 by 2: $\operatorname{cov}(Z_x, Z_y)$
3 by 3:
etc ...

Spatial distribution: distributions of all possible finite combinations:

$$F_{x_1,\ldots,x_n}(z_1,\ldots,z_n) = P\{Z(x_1) < z_1, \ldots, Z(x_n) < z_n\} \quad \forall n, \ x_1,\ldots,x_n \ \text{and} \ z_1,\ldots,z_n$$
Stationarity
A random function is stationary when its spatial distribution is independent of the location.
In particular:

$$E(Z_x) = E(Z_y) = m$$
$$\operatorname{var}(Z_x) = \operatorname{var}(Z_y) = \sigma^2$$
$$\operatorname{cov}(Z_x, Z_y) = C(x - y) = C(h)$$
...

These first three conditions alone define order 2 stationarity.
Statistics over all the realisations of the RF
on known points (one or two).
E(Z(x))
var(Z(x))
E(Z(y))
var(Z(y))
cov(Z(x),Z(y))
stationarity
Spatial statistics over one realisation of the RF
From the variance to the variogram
Moving to the model
Phase 2: choice of a model
Some practical considerations
Manual or automatic fitting; number of pairs.

The most important part of the model is the behavior at the origin.

Inputs from physical knowledge with regard to variogram properties:
measurement errors
short scale structures

All directions must be modelled together.

The quality criteria for the fit depend on the use of the model; in particular, one has to know which distances are going to be used (small/medium/long).

Only use allowed models, i.e. models offered by the software.
Variogram characteristics
Sill & range
Anisotropy
behavior near the origin (a tool to describe structures)
variances (a tool to compute variances)
basic structures
nested structures
Sill & range
Sill
Range (of autocorrelation): distance above which there is no more correlation.
Range, sill
Range: $C(h) = 0$ if $|h| > a$

Practical range: $C(h) < \varepsilon$ if $|h| > a$

Sill: if $C(h) = 0$ then $\gamma(h) = C(0)$
(Figures: variogram and covariance, with range a and sill C(0).)
Modelling
Behavior at the origin
General principle: the behavior of the variogram at the origin is directly linked to the degree of spatial continuity of the studied variable.
From highly smooth to highly heterogeneous:
differentiable and continuous variable
continuous but non-differentiable variable
discontinuous variable
white noise
Interpretation of variograms
Piezometric level of an aquifer measured from July to December.

Korhogo Basin (Ivory Coast)
(Figures: rainfalls and flows against time; piezometric levels against time, July to December.)
Panels (horizontal axis in days): rain, runoff, piezometer no. 3, piezometer no. 4, piezometer no. 33, piezometer no. 18
Geometrical anisotropy
Variogram computed in 2 directions
Range 1 Range 2
Range 1
Range 2
Ellipsoid of anisotropy
Nested structures (1)
Nugget effect component with sill = 10.5
+ spherical component with sill = 43 and range = 8
= nested model
Three parameters are required here.
Nested structures (2)
Short range
Long range
Nested structure
+
Unsystematic measurement errors
Consider:
We want to study Y(x), with known variogram $\gamma_Y(h)$.
We record Z(x) = Y(x) + R(x), a measure of Y(x) with some unsystematic and random errors R(x).
If we can assume that this error is:
on average 0
with variance $s^2$
without spatial correlation
without correlation with Y(x)

Then

$$\gamma_R(h) = s^2 \cdot \text{nugget}(h)$$
$$\gamma_Z(h) = s^2 \cdot \text{nugget}(h) + \gamma_Y(h)$$
Contents
Some history
Spatial aspects : why should we bother ?
Spatial aspects : associated problem
Spatial structure
From the variance to the variogram
Variogram models
Variograms : a processor to compute
variances
Kriging
The variogram to compute variances (1)
Reminder on the definition:

$$2\gamma(h) = \operatorname{var}(Z_{x+h} - Z_x)$$

Generalisation: the variogram can be used to compute the variance of any linear combination of the variable, provided that the sum of the weights is 0:

$$\operatorname{var}\left(\sum_i \lambda_i Z(x_i)\right) = -\sum_i \sum_j \lambda_i \lambda_j \gamma(x_i - x_j) \quad \text{provided that} \quad \sum_i \lambda_i = 0$$

Check:
For $\lambda_1 = 1$ and $\lambda_2 = -1$, $\sum_i \lambda_i = 1 - 1 = 0$ and

$$\operatorname{var}\left(\sum_i \lambda_i Z(x_i)\right) = \operatorname{var}(Z(x_1) - Z(x_2)) = -\sum_i \sum_j \lambda_i \lambda_j \gamma(x_i - x_j) = 2\gamma(x_1 - x_2)$$
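
The rule can be checked numerically when a covariance exists, using γ(h) = C(0) - C(h): the variance computed from the covariance and the one computed from the variogram agree as soon as the weights sum to 0. The sketch below is illustrative only; the exponential variogram, its parameters, the locations and the function names are assumptions chosen for the example.

```python
import math


def gamma_exponential(h, sill=1.0, a=2.0):
    """Exponential variogram, used here as an example of an allowed model."""
    return sill * (1.0 - math.exp(-abs(h) / a))


def variance_from_variogram(weights, locations, gamma):
    """-sum_i sum_j l_i l_j gamma(x_i - x_j); valid only when the weights sum to 0."""
    if abs(sum(weights)) > 1e-12:
        raise ValueError("the weights must sum to 0")
    return -sum(wi * wj * gamma(xi - xj)
                for wi, xi in zip(weights, locations)
                for wj, xj in zip(weights, locations))


def variance_from_covariance(weights, locations, gamma, sill):
    """sum_i sum_j l_i l_j C(x_i - x_j), with C(h) = sill - gamma(h)."""
    return sum(wi * wj * (sill - gamma(xi - xj))
               for wi, xi in zip(weights, locations)
               for wj, xj in zip(weights, locations))


# The check from the slide: weights +1 and -1 give 2 * gamma(x1 - x2).
print(variance_from_variogram([1.0, -1.0], [0.0, 3.0], gamma_exponential),
      2 * gamma_exponential(3.0))

# A three-point combination with weights summing to 0.
weights, locations = [1.0, -2.0, 1.0], [0.0, 1.0, 2.0]
print(variance_from_variogram(weights, locations, gamma_exponential),
      variance_from_covariance(weights, locations, gamma_exponential, 1.0))
```

The sill cancels out of the covariance expression precisely because the weights sum to 0, which is why the variogram alone is enough for such combinations.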


The variogram to compute variances (2)
The variogram must be such that no variance calculation ever gives a negative result.

Not all mathematical functions fulfil this requirement.

So we must use allowed variogram functions.
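Spherical variogram

$$\gamma(h) = C \left( \frac{3}{2} \frac{|h|}{a} - \frac{1}{2} \frac{|h|^3}{a^3} \right) \ \text{for} \ |h| \le a, \qquad \gamma(h) = C \ \text{for} \ |h| > a$$

with range a and sill C.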
Exponential variogram

$$\gamma(h) = C \left( 1 - e^{-|h|/a} \right)$$

with sill C.
Gaussian variogram

$$\gamma(h) = C \left( 1 - e^{-h^2/a^2} \right)$$

with sill C.
Linear variogram

$$\gamma(h) = \frac{|h|}{a}$$

Nugget effect variogram

$$\gamma(h) = C \cdot 1_{h \neq 0} = \begin{cases} 0 & \text{if } h = 0 \\ C & \text{if } h \neq 0 \end{cases}$$

with sill C.
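
These basic structures are simple functions of the distance and can be added to form nested models. The sketch below is illustrative: the function names are hypothetical, the parametrisation (sill C and scale or range a) follows the usual textbook forms, and the nested example reuses the nugget (sill 10.5) plus spherical (sill 43, range 8) model of the nested structures slide.

```python
import math


def nugget(h, sill):
    """Nugget effect: 0 at the origin, sill everywhere else."""
    return 0.0 if h == 0 else sill


def spherical(h, sill, a):
    """Spherical model: reaches the sill exactly at the range a."""
    h = abs(h)
    if h >= a:
        return sill
    return sill * (1.5 * h / a - 0.5 * (h / a) ** 3)


def exponential(h, sill, a):
    """Exponential model: linear near the origin, sill reached asymptotically."""
    return sill * (1.0 - math.exp(-abs(h) / a))


def gaussian(h, sill, a):
    """Gaussian model: parabolic near the origin (very smooth variable)."""
    return sill * (1.0 - math.exp(-(h / a) ** 2))


def nested(h):
    """Nested model: nugget (sill 10.5) + spherical (sill 43, range 8)."""
    return nugget(h, 10.5) + spherical(h, 43.0, 8.0)


for h in [0, 1, 4, 8, 12]:
    print(h, round(nested(h), 3))
```

Sums of allowed models with positive sills are themselves allowed models, which is what makes nested structures usable in practice.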
Contents
Some history
Spatial aspects : why should we bother ?
Spatial aspects : associated problem
Spatial structure
From the variance to the variogram
Variogram models
Variograms : a processor to compute
variances
Kriging
Point kriging (mapping)
What do we have? Data points (here 1 to 5).
What do we want? Values at target points.
What do we need? A neighbourhood, a variogram, and an algorithm to determine the weights.
Nicolas Bez - IRD
Main steps of a kriging procedure
1- Estimator's form (linear)
$Z_0^* = \sum_{i \in \text{neighbourhood}} \lambda_i Z_i$
The data $Z_i$ are known; the unknowns are the $\lambda_i$. They are chosen with the objective of no bias and minimum estimation variance.
2- No bias
$E(Z_0 - Z_0^*) = 0$
Fulfilled if: $\sum_{i \in \text{neighbourhood}} \lambda_i = 1$
3- Determine weights that minimise the estimation variance
$\sigma_E^2 = \mathrm{var}(Z_0 - Z_0^*) = f(\lambda_1, \lambda_2, ..., \lambda_N)$, where $N$ is the number of data points in the neighbourhood

Nicolas Bez - IRD


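$\sigma_E^2 = f(\lambda_1, \lambda_2, ..., \lambda_N)$
The $\lambda_i$ are then chosen so that $\frac{\partial f}{\partial \lambda_i} = 0$ for each $i$
Hint
For one particular $\lambda_i$, the minimum of $f(\lambda_i)$ is reached where $\frac{df}{d\lambda_i} = f'(\lambda_i) = 0$
(figure: curve of $f(\lambda_i)$ against $\lambda_i$)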
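The minimisation can be written out in a few lines. This is a sketch under stated assumptions: the no-bias condition $\sum_i \lambda_i = 1$ is imposed with a Lagrange parameter, written $\mu$ here (the symbol is a choice made for this sketch), and $\gamma_{ij}$, $\gamma_{i0}$ denote the variogram values between data points $i$ and $j$, and between data point $i$ and the target.

```latex
\begin{align*}
\sigma_E^2 &= \operatorname{var}\Big(Z_0 - \sum_i \lambda_i Z_i\Big)
            = 2\sum_i \lambda_i \gamma_{i0} - \sum_i\sum_j \lambda_i\lambda_j\gamma_{ij}
            && \text{(the weights } 1, -\lambda_1, \dots, -\lambda_N \text{ sum to 0)}\\
L(\lambda,\mu) &= \sigma_E^2 - 2\mu\Big(\sum_i \lambda_i - 1\Big)\\
\frac{\partial L}{\partial \lambda_i} = 0 &\;\Longrightarrow\; \sum_j \lambda_j\gamma_{ij} + \mu = \gamma_{i0}, \qquad i = 1,\dots,N\\
\frac{\partial L}{\partial \mu} = 0 &\;\Longrightarrow\; \sum_i \lambda_i = 1\\
\sigma_K^2 &= \sum_i \lambda_i\gamma_{i0} + \mu
\end{align*}
```

The last line follows by substituting the system back into the first line and using $\sum_i \lambda_i = 1$.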
Nicolas Bez - IRD
After development, this leads to a linear system whose main part is
$\Gamma \cdot \Lambda = \Gamma_0$, with $\Gamma = [\gamma_{ij}]$, $\Lambda = [\lambda_i]$ and $\Gamma_0 = [\gamma_{i0}]$
$\Gamma$: matrix of variogram values between data points. It represents the amount of information provided by a set of data with a given geometry.
$\Lambda$: vector of unknowns.
$\Gamma_0$: vector of variogram values between data points and the target. It represents the capacity of a given geometry of data points to provide information to a target with a given location relative to these points.
Nicolas Bez - IRD
The solution, i.e. the vector of the unknowns, is then given by
$\Lambda^K = (\lambda_1^K, ..., \lambda_N^K) = \Gamma^{-1} \cdot \Gamma_0$
Once we know the spatial structure, i.e. the variogram model, and the neighbourhood, to krige the value at a given location we then need:
1. To build the matrices $\Gamma$ and $\Gamma_0$
2. To invert the matrix $\Gamma$ in order to get $\Gamma^{-1}$
3. To multiply $\Gamma^{-1}$ and $\Gamma_0$ in order to get the kriging weights $\lambda_i^K$
4. To build the weighted average of the neighbouring data points
$Z_0^* = \sum_{i \in \text{neighbourhood}} \lambda_i^K Z_i$
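The four steps can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the RGeoS implementation: the function name `krige_point` is hypothetical, the no-bias condition is handled by adding one row and one column for a Lagrange parameter, and the system is solved directly rather than by forming the inverse explicitly.

```python
import numpy as np

def spherical(h, sill=1.0, a=1.0):
    r = np.minimum(np.abs(h) / a, 1.0)
    return sill * (1.5 * r - 0.5 * r ** 3)

def krige_point(xy, z, target, gamma):
    """Ordinary point kriging of `target` from the data (xy, z) with variogram `gamma`."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    n = len(z)
    # Step 1: variogram values between data points, and between data and target
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    g0 = gamma(np.linalg.norm(xy - np.asarray(target, dtype=float), axis=-1))
    A = np.ones((n + 1, n + 1))      # last row and column carry the condition sum(lam) = 1
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    # Steps 2 and 3: solve the system to get the weights and the Lagrange parameter
    solution = np.linalg.solve(A, np.append(g0, 1.0))
    lam, mu = solution[:n], solution[n]
    # Step 4: weighted average of the data, plus the kriging variance
    return float(lam @ z), float(lam @ g0 + mu), lam

xy = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
z = [1.0, 2.0, 3.0, 4.0]
gamma = lambda h: spherical(h, sill=1.0, a=2.0)
estimate, variance, weights = krige_point(xy, z, [0.3, 0.6], gamma)
print(estimate, variance, np.round(weights, 3))
```

Kriging at a data location returns the data value with zero variance, which is a convenient sanity check for any implementation.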

Nicolas Bez - IRD


A kriging map, considered as a grid of points to be interpolated, is then generated by repeating this procedure for every grid node.
(figure: data points numbered 1 to 5, target points and their neighbourhoods, with three cases)
This point is kriged with one point in the neighbourhood
This point is kriged with two points in the neighbourhood
This point is not kriged as there is no point in the neighbourhood
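The three cases of the figure can be reproduced with a moving circular neighbourhood. This is a sketch with hypothetical names (`krige_nodes`, `radius`) and made-up coordinates and values; nodes with no data point in their neighbourhood are left as NaN.

```python
import numpy as np

def spherical(h, sill=1.0, a=3.0):
    r = np.minimum(np.abs(h) / a, 1.0)
    return sill * (1.5 * r - 0.5 * r ** 3)

def krige_point(xy, z, target, gamma):
    """Ordinary kriging estimate at `target` (weights constrained to sum to 1)."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    g0 = gamma(np.linalg.norm(xy - target, axis=-1))
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    lam = np.linalg.solve(A, np.append(g0, 1.0))[:n]
    return float(lam @ z)

def krige_nodes(xy, z, nodes, gamma, radius):
    """Krige every node using only the data within `radius` of that node."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    nodes = np.asarray(nodes, dtype=float)
    values = np.full(len(nodes), np.nan)
    counts = np.zeros(len(nodes), dtype=int)
    for k, node in enumerate(nodes):
        inside = np.linalg.norm(xy - node, axis=1) <= radius
        counts[k] = inside.sum()
        if counts[k] > 0:                      # no data in the neighbourhood: not kriged
            values[k] = krige_point(xy[inside], z[inside], node, gamma)
    return values, counts

xy = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
z = [10.0, 12.0, 20.0]
nodes = [[0.5, 0.0], [5.0, 4.0], [10.0, 10.0]]
values, counts = krige_nodes(xy, z, nodes, spherical, radius=1.5)
print(values, counts)
```

With a single data point in the neighbourhood the weight is forced to 1, so the node simply takes that data value.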
Nicolas Bez - IRD
Kriging - Definition
Kriging estimates the value $Z_0$ of the variable on a given support (point, block, polygon, ...) by $Z_0^*$.

We are concerned with the estimation error:
$\varepsilon = Z_0 - Z_0^*$
Nicolas Bez - IRD
Kriging - Construction
Linear combination of data
$Z^* = \sum_\alpha \lambda_\alpha Z_\alpha + \lambda_0$
Like for any kind of regression
So the estimation error
$\varepsilon = Z_0 - Z^*$
is a linear combination whose weights $\lambda_\alpha$ are unknown, and whose variance could be computed with the variogram if the weights were given.


Nicolas Bez - IRD
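Kriging - Construction
Case of point estimation:
$\varepsilon = Z_0 - \sum_\alpha \lambda_\alpha Z_\alpha - \lambda_0$
Case of estimation over a polygon:
$\varepsilon = \frac{1}{v}\int_v Z(x)\,dx - \sum_\alpha \lambda_\alpha Z_\alpha - \lambda_0$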
Nicolas Bez - IRD
(Simple) Point Kriging
The weights that minimise the estimation variance are the solutions of the following system:
$[\gamma_{\alpha,\beta}]\,[\lambda_\alpha] = [\gamma_{\alpha,0}]$
The kriging weights are thus:
$[\lambda_\alpha] = [\gamma_{\alpha,\beta}]^{-1}\,[\gamma_{\alpha,0}]$
So that the estimate is:
$Z^* = [\lambda_\alpha]^t\,[Z_\alpha]$
And the estimation variance:
$\mathrm{var}(\varepsilon) = [\lambda_\alpha]^t\,[\gamma_{\alpha,0}]$

Nicolas Bez - IRD
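Variogram model and kriging weights
(figure: kriging weights of four data points for three variogram models; L is the length scale shown on the figure)
Nugget: weights 25%, 25%, 25%, 25%; $\sigma^2 = 1.25$
Spherical (range = 2L): weights 40.6%, 40.6%, 9.4%, 9.4%; $\sigma^2 = 0.84$
Gaussian (range = 1.5L): weights 49.8%, 49.8%, 0.2%, 0.2%; $\sigma^2 = 0.30$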
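The nugget case can be worked out by hand. The lines below are a sketch assuming ordinary kriging (weights summing to 1, with a Lagrange parameter written $\mu$ here) and a unit nugget sill $C = 1$; neither assumption is stated on the slide, but together they reproduce the 25% weights and $\sigma^2 = 1.25$ for $N = 4$ data points, whatever their positions.

```latex
\begin{align*}
\gamma_{ij} &= C \ (i \neq j), \qquad \gamma_{ii} = 0, \qquad \gamma_{i0} = C\\
\sum_j \lambda_j \gamma_{ij} + \mu = \gamma_{i0}
  &\;\Longrightarrow\; C(1 - \lambda_i) + \mu = C
  \;\Longrightarrow\; \lambda_i = \mu / C\\
\sum_i \lambda_i = 1 &\;\Longrightarrow\; \lambda_i = \frac{1}{N}, \qquad \mu = \frac{C}{N}\\
\sigma_K^2 = \sum_i \lambda_i \gamma_{i0} + \mu &= C\Big(1 + \frac{1}{N}\Big) = 1.25 \quad \text{for } C = 1,\ N = 4
\end{align*}
```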
Nicolas Bez - IRD
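Kriging weights in case of anisotropy
(figure: kriging weights of four data points; L is the length scale shown on the figure)
Spherical, isotropic, range = 1.5L: weights 25%, 25%, 25%, 25%
Spherical, anisotropic, range_x = 1.5L, range_y = L: weights 17.6%, 32.4%, 32.4%, 17.6%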
Nicolas Bez - IRD
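Localisation of the points
Spherical, range = 3 * radius
(figure: four configurations of data points, with their kriging weights and estimation variance)
Weights 37.0%, 37.0%, 26.0%; $\sigma^2 = 0.48$
Weights 33.3%, 33.3%, 33.3%; $\sigma^2 = 0.45$
Weights 50.0%, 50.0%; $\sigma^2 = 0.537$
Weights 25.7%, 25.7%, 48.7%; $\sigma^2 = 0.526$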
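Two of these configurations can be checked numerically under a plausible reading of the figure. The geometry is an assumption: the data points are taken to lie on a circle of radius 1 centred on the target, either three points evenly spaced or two diametrically opposed points, with a spherical variogram of unit sill and range 3 (three times the radius), and ordinary kriging. The helper names are hypothetical.

```python
import numpy as np

def spherical(h, sill=1.0, a=3.0):
    r = np.minimum(np.abs(h) / a, 1.0)
    return sill * (1.5 * r - 0.5 * r ** 3)

def ordinary_kriging(xy, target, gamma):
    """Return the kriging weights and the kriging variance at `target`."""
    n = len(xy)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    g0 = gamma(np.linalg.norm(xy - target, axis=-1))
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    solution = np.linalg.solve(A, np.append(g0, 1.0))
    lam, mu = solution[:n], solution[n]
    return lam, float(lam @ g0 + mu)

def on_circle(angles_deg, radius=1.0):
    """Points on a circle centred on the origin (assumed geometry)."""
    t = np.radians(angles_deg)
    return np.column_stack([radius * np.cos(t), radius * np.sin(t)])

centre = np.zeros(2)
lam3, var3 = ordinary_kriging(on_circle([90.0, 210.0, 330.0]), centre, spherical)
lam2, var2 = ordinary_kriging(on_circle([0.0, 180.0]), centre, spherical)
print(np.round(lam3, 3), round(var3, 3))   # equal weights, variance close to 0.45
print(np.round(lam2, 3), round(var2, 3))   # equal weights, variance close to 0.537
```

Under these assumptions the sketch gives the 33.3% weights with a variance near 0.45 and the 50% weights with a variance near 0.537, which supports the pairing of weights and variances for those two cases.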
Nicolas Bez - IRD
Sensitivity to model parameters: the range (1)
1-D kriging with 7 data points
Spherical variogram, sill = 100
Ranges = 5, 10, 15, 20, 25, 30
Nicolas Bez - IRD
Sensitivity to model parameters: the type of structure
Exponential, Spherical, Cubic, Gaussian
Sill = 100 & range = 10
Nicolas Bez - IRD
Sensitivity to model parameters: the nugget effect
Spherical model (range = 10) + nugget effect
Sph: 100; Nugget: 0
Sph: 75; Nugget: 25
Sph: 50; Nugget: 50
Sph: 25; Nugget: 75
Sph: 0; Nugget: 100
Nicolas Bez - IRD
Sensitivity to model parameters: the range (2)
Spherical, sill = 100
Ranges = 5, 10, 20, 30
(red & blue: ± 1 kriging standard deviation)
Nicolas Bez - IRD
The average of the kriged values over a polygon is equivalent to the kriged value of the mean density over the polygon. However, this does not hold for the estimation variance.
The $CV_E$ might be larger when computed with geostatistics than when computed with classical statistics.
Interpolating, e.g. by kriging, induces smoothing. Interpolated maps do not have the same characteristics as the original data. In particular, the variogram of the interpolated data is different from that of the data that were used to generate the map. This must be kept in mind for any post-processing of kriged maps.
The choice of a neighbourhood is not straightforward. Its characteristics have an influence on the output (as for a moving average).
Nicolas Bez - IRD
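Exercise: Kriging with 2 points
Spherical variogram (range = 250 m; sill = 2)
(figure: two data points, marked * with values 10 and 12, and one target point, marked o with value "?"; the distances shown are 200 m and 400 m)
$\gamma(200) = 0.89$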
Nicolas Bez - IRD
Block-Polygon kriging
The estimate remains a linear combination of the available data:
$Z_v^* = \sum_\alpha \lambda_\alpha Z_\alpha + \lambda_0$
And the error is now
$\varepsilon = \frac{1}{v}\int_v Z(x)\,dx - \sum_\alpha \lambda_\alpha Z_\alpha - \lambda_0$
Nicolas Bez - IRD
Block-Polygon Kriging
The weights that minimise the estimation variance are the solutions of the following system:
$[\gamma_{\alpha,\beta}]\,[\lambda_\alpha] = [\gamma_{\alpha,v}]$
The kriging weights are thus:
$[\lambda_\alpha] = [\gamma_{\alpha,\beta}]^{-1}\,[\gamma_{\alpha,v}]$
So that the estimate is:
$Z^* = [\lambda_\alpha]^t\,[Z_\alpha]$
And the estimation variance:
$\mathrm{var}(\varepsilon) = [\lambda_\alpha]^t\,[\gamma_{\alpha,v}] - [\gamma_{vv}]$

Nicolas Bez - IRD
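One must choose the level of discretization of the block $v$ used to compute $\gamma_{\alpha v}$ (between the data point $x_\alpha$ and $v$) and $\gamma_{vv}$ (within $v$).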

Nicolas Bez - IRD


Between point and block kriging, only the right-hand side of the kriging system changes.

One can show that the average of the point values estimated by kriging over a polygon equals the kriged value of the polygon.

But this does not hold for the estimation variance.
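Both statements can be illustrated with a discretised block. This is a sketch under assumptions: ordinary kriging with a unique neighbourhood (all data used for every node), a unit square block discretised into 5 x 5 nodes, a spherical variogram, and made-up data; all names are hypothetical.

```python
import numpy as np

def spherical(h, sill=1.0, a=2.0):
    r = np.minimum(np.abs(h) / a, 1.0)
    return sill * (1.5 * r - 0.5 * r ** 3)

def pairwise(a, b):
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def solve_system(xy, rhs, gamma):
    """Ordinary kriging system for a given right-hand side of variogram values."""
    n = len(xy)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(pairwise(xy, xy))
    A[n, n] = 0.0
    solution = np.linalg.solve(A, np.append(rhs, 1.0))
    return solution[:n], solution[n]

xy = np.array([[-0.5, 0.2], [0.3, 1.4], [1.6, 0.8], [0.7, -0.6], [0.4, 0.5]])
z = np.array([3.0, 5.0, 4.0, 6.0, 5.0])

# Discretisation of the block v = [0, 1] x [0, 1]
g = (np.arange(5) + 0.5) / 5
nodes = np.array([(a, b) for a in g for b in g])
g_nodes = spherical(pairwise(xy, nodes))          # one column of gamma_{alpha,0} per node

# Point kriging at every node: only the right-hand side changes from node to node
point = [solve_system(xy, g_nodes[:, k], spherical) for k in range(len(nodes))]
point_estimates = np.array([lam @ z for lam, _ in point])
point_variances = np.array([lam @ g_nodes[:, k] + mu for k, (lam, mu) in enumerate(point)])

# Block kriging: the right-hand side is the mean variogram between each datum and v
g_av = g_nodes.mean(axis=1)
lam_v, mu_v = solve_system(xy, g_av, spherical)
block_estimate = lam_v @ z
block_variance = lam_v @ g_av + mu_v - spherical(pairwise(nodes, nodes)).mean()

print(block_estimate, point_estimates.mean())     # identical up to rounding
print(block_variance, point_variances.mean())     # different
```

The estimates match because the weights are linear in the right-hand side; the variances differ because the block variance also involves the mean variogram within the block.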

Nicolas Bez - IRD
Kriging - Properties
The kriging system involves, through the structure of the variable ($C$ or $\gamma$):

the distances between data points: $C_{\alpha\beta}$ or $\gamma_{\alpha\beta}$

the distances between the data and the target: $C_{\alpha 0}$ or $\gamma_{\alpha 0}$

the geometry of the target: $C_{vv}$ or $\gamma_{vv}$

Nicolas Bez - IRD


Kriging - Properties
Neither the kriging system nor the variance of the estimation error involves the data values.

The weights remain unchanged when the sill of the variogram is multiplied by a constant. The Lagrange parameter is multiplied by this constant.

The variance of the estimation error is directly proportional to the sill of the variogram.
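These properties are easy to verify numerically. The sketch below assumes ordinary kriging with a spherical variogram and made-up coordinates; the function names are hypothetical. The same configuration is kriged with a sill of 1 and a sill of 3.

```python
import numpy as np

def spherical(h, sill, a):
    r = np.minimum(np.abs(h) / a, 1.0)
    return sill * (1.5 * r - 0.5 * r ** 3)

def ordinary_kriging(xy, target, gamma):
    """Return weights, Lagrange parameter and kriging variance; no data values are needed."""
    n = len(xy)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    g0 = gamma(np.linalg.norm(xy - target, axis=-1))
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    solution = np.linalg.solve(A, np.append(g0, 1.0))
    lam, mu = solution[:n], solution[n]
    return lam, float(mu), float(lam @ g0 + mu)

xy = np.array([[0.0, 0.0], [1.0, 0.2], [0.3, 1.1], [1.4, 1.3]])
target = np.array([0.6, 0.5])

lam1, mu1, var1 = ordinary_kriging(xy, target, lambda h: spherical(h, 1.0, 2.0))
lam3, mu3, var3 = ordinary_kriging(xy, target, lambda h: spherical(h, 3.0, 2.0))

print(np.round(lam1, 4), np.round(lam3, 4))   # same weights
print(mu3 / mu1, var3 / var1)                 # both ratios equal the sill ratio
```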
Nicolas Bez - IRD
Global estimation variance for the simple mean
$Z^*(S) = \frac{1}{4}\sum_{i=1,..,4} Z(x_i)$
$\sigma_E^2 = 2\bar\gamma_{i,S} - \bar\gamma_{S,S} - \bar\gamma_{i,i}$
$\bar\gamma_{i,S}$: mean variogram between a point in S and a sampling point
$\bar\gamma_{S,S}$: mean variogram between a point in S and another point in S
$\bar\gamma_{i,i}$: mean variogram between a sampling point and another sampling point
Nicolas Bez - IRD
$\bar\gamma_{i,S} = \frac{1}{N|S|} \times$ (sum of the variogram values for all possible distances between a sampling point and a point in S)
$\bar\gamma_{S,S} = \frac{1}{|S|^2} \times$ (sum of the variogram values for all possible distances between two points in S)
$\bar\gamma_{i,i} = \frac{1}{N^2} \times$ (sum of the variogram values for all possible distances between two sampling points)
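The three mean variograms can be approximated by discretising S. This is a sketch under assumptions: S is taken to be the unit square, discretised into 20 x 20 nodes, with a spherical variogram of unit sill and range 1; the two sampling designs (regular and clustered) are made up for illustration and the function names are hypothetical.

```python
import numpy as np

def spherical(h, sill=1.0, a=1.0):
    r = np.minimum(np.abs(h) / a, 1.0)
    return sill * (1.5 * r - 0.5 * r ** 3)

def pairwise(a, b):
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def global_estimation_variance(samples, s_nodes, gamma):
    """sigma_E^2 = 2 * mean gamma(i, S) - mean gamma(S, S) - mean gamma(i, i)."""
    g_is = gamma(pairwise(samples, s_nodes)).mean()   # sampling point to point in S
    g_ss = gamma(pairwise(s_nodes, s_nodes)).mean()   # point in S to point in S
    g_ii = gamma(pairwise(samples, samples)).mean()   # sampling point to sampling point
    return float(2.0 * g_is - g_ss - g_ii)

# Discretisation of S (assumed to be the unit square)
g = (np.arange(20) + 0.5) / 20
s_nodes = np.array([(x, y) for x in g for y in g])

regular = np.array([[0.25, 0.25], [0.75, 0.25], [0.25, 0.75], [0.75, 0.75]])
clustered = np.array([[0.05, 0.05], [0.10, 0.05], [0.05, 0.10], [0.10, 0.10]])

var_regular = global_estimation_variance(regular, s_nodes, spherical)
var_clustered = global_estimation_variance(clustered, s_nodes, spherical)
print(var_regular, var_clustered)   # the clustered design gives the larger variance
```

The variance depends only on the variogram and on the geometry of the samples relative to S, not on the sampled values, so designs can be compared before any data are collected.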
Nicolas Bez - IRD
Sampling optimisation

Consistency along regularisation (ping UBM)
Nicolas Bez - IRD
Simulations
Intrinsic model with a linear variogram:
10 simulations, kriging ± one standard deviation
Nicolas Bez - IRD
What do we have? Data points
What do we want? Target points
What do we need? Neighbourhood, variogram, algorithm to determine weights
(figure: data points numbered 1 to 5, target points and neighbourhood)
Nicolas Bez - IRD
(figure: two panels, Salinity ( ) and Chlorophyll (mole.l^-1), each plotted against Distances (n. mi.))
Same estimation
if unique neighbourhood
used for the local estimation

Different estimation
different estimation variance

Nicolas Bez - IRD
Local estimation
Kriging (map)
Global estimation : OK
Estimation variance
Spatial
integration
Global kriging :
Weighted average
Global estimation
Estimation variance
Arithmetic mean :
Weights = 1/N
Global estimate
Estimation variance
Nicolas Bez - IRD
Co kriging
Kriging with external drift
