R. A. MacMillan
LandMapper Environmental Solutions Inc.
Outline
Unifying DSM Framework: Universal Model of Variation
Z(s) = Z*(s) + ε'(s) + ε''
Introduction
Deterministic part of
the predictive model
Predicted spatial
pattern of some soil
property or class
including uncertainty
of the estimate
EOR Series
DYD Series
KLM Series
FMN Series
COR Series
Statistical Models
Scorpan relate soils/soil
properties to covariates
Explain spatial distribution
of soils in terms of known
soil forming factors as
represented by covariates
100 x 100 m grid
Layer weightings
Landscape curvature
2 x
Vegetation
1 x
Rainfall
2 x
Geology
1 x
Soils
3 x
Land surface
Deterministic Prediction
Mental and Statistical Models
Not perfect: often lack suitable
covariates to predict the target variable
Lack covariates at finer resolution
Geostatistical Prediction
Insufficient point input data
Cannot predict at less than the
smallest spacing of input point data
[Figure: semivariogram of semi-variance against lag (distance) at lags d1 to d4, showing the nugget, sill and range]
Past
Deterministic
Stochastic
Soil Classes
Soil Classes
Soil Properties
Soil Properties
Simonson (1959)
Process Model of additions,
removals, translocations,
transformations
Ruhe (1975)
Erosional -Depositional
surfaces, open/closed basins
Topography
Organisms
Parent Material
Soil
Time
http://solim.geography.wisc.edu/index.htm
Neural Networks
Zhu, 2000
100 x 100 m grid
Layer weightings
Landscape curvature
2 x
Vegetation
1 x
Rainfall
2 x
Geology
1 x
Soils
3 x
Land surface
Regression Trees
Moran and Bui, 2002; Bui and Moran, 2003
CART
Breiman et al., 1984
JMP (SAS)
http://www.jmp.com/
R
http://www.r-project.org/
Fuzzy Logic
SoLIM
Zhu et al., 1996, 1997
LandMapR, FuzME
Bayesian Logic
Prospector
Duda et al., 1978
Expector
Skidmore et al., 1991
Netica
Norsys.com/netica
O = Organisms
Manual Maps
Land Use
Vegetation
R = Relief (topography)
Primary Attributes
Slope, aspect, curvatures
Slope Position, roughness
Secondary Attributes
CTI, WI, SPI, STC
P = Parent Material
Published geology maps
Gamma radiometrics
Thermal IR, RS Ratios
A = Age
Profile Curvature
Plan (Contour) Curvature
Slope Gradient (& Aspect)
CTI or Wetness Index
Profile Curvature
Plan Curvature
Slope Gradient
Wetness Index
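As a rough illustration of how two of these terrain attributes can be derived, the sketch below computes slope gradient by central differences and the compound topographic (wetness) index as ln(a / tan β). The tiny DEM, the 25 m cell size and the specific catchment area of 100 are hypothetical values chosen only for the example; real workflows would take them from a terrain analysis package.

```python
import math

def slope_gradient(dem, row, col, cell_size):
    """Slope as rise over run (tan of slope angle) at an interior cell,
    estimated with central differences in x and y."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2.0 * cell_size)
    return math.hypot(dz_dx, dz_dy)

def wetness_index(specific_catchment_area, slope, min_slope=0.001):
    """CTI = ln(a / tan(beta)); slope is tan(beta).
    min_slope (an assumed floor) avoids division by zero on flat cells."""
    return math.log(specific_catchment_area / max(slope, min_slope))

# Hypothetical 3 x 3 DEM (elevations in m) sloping uniformly in y.
dem = [[10.0, 10.0, 10.0],
       [9.0, 9.0, 9.0],
       [8.0, 8.0, 8.0]]

slope = slope_gradient(dem, 1, 1, cell_size=25.0)
cti = wetness_index(100.0, slope)  # 100.0 is an assumed catchment area
print(slope, cti)
```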
Divide 2 Channel
Extrapolation
Uncertainty of prediction
Bui and Moran (2003)
Geoderma 111:21-44
Source: Bui and Moran, 2003
Key Papers
Moore et al., 1993
Linear regression
Predictor (covariate)
Slope position as expressed by
length of slope from shoulder
Others held
Steady
Regression tree
[Figure: regression tree with splits on Text (C versus S, LS, L, CL, LiC), BD (BD<1.43 versus BD>1.43; BD<1.42 versus BD>1.42) and Clay (Clay<46.5 versus Clay>46.5)]
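The core step of a regression tree can be sketched as a search for the threshold on one predictor that minimises the summed squared error of the two resulting groups. This is a minimal illustration of that single step, not the procedure used in the cited studies; the predictor and response values are invented for the example.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, sse_after_split) for the single split on x that
    minimises the combined SSE of y in the two child nodes."""
    pairs = sorted(zip(x, y))
    best = (None, sse(y))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical predictor values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best

# Hypothetical data: a predictor and a soil property at six sites.
predictor = [1.20, 1.30, 1.40, 1.45, 1.50, 1.60]
response = [3.1, 2.9, 3.0, 1.2, 1.0, 1.1]

threshold, error = best_split(predictor, response)
print(threshold, error)
```

A full tree repeats this search over all predictors and then recursively within each child node until a stopping rule is met.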
Main Developments
Integration of single models
into multi-purpose software
ArcGIS, ArcSIE, ArcView
SAGA, Whitebox, IDRISI
Deterministic
Stochastic
Soil Classes
Soil Classes
Soil Properties
Soil Properties
Matheron (1971)
Theory of regionalized variables
Compute semi-variance
at different lag distances
[Figure: sample values 7, 7, 8 and 7 around a location x]
Collect point sample observations
[Figure: grid of point sample values, in extraction order: 6.1, 5.7, 5.3, 5.8, 7.0, 6.5, 6.0, 5.2, 7.6, 7.0, 6.0, 5.7, 7.2, 7.0, 6.2, 5.5]
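Computing semi-variance at different lag distances can be sketched as follows: every pair of observations is assigned to a distance bin, and the semi-variance of a bin is half the mean squared difference of its pairs. The sketch reuses the sixteen sample values from the slide, but their arrangement as a 4 x 4 grid read row by row and the 30 m spacing are assumptions made only for illustration.

```python
import math
from itertools import combinations

def empirical_semivariogram(points, lag_width, n_lags):
    """points: list of (x, y, z).
    Returns (mean_distance, semivariance, n_pairs) for each non-empty lag bin,
    where semivariance = sum((z_i - z_j)^2) / (2 * n_pairs)."""
    sq_diff = [0.0] * n_lags
    dist_sum = [0.0] * n_lags
    counts = [0] * n_lags
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        d = math.hypot(x2 - x1, y2 - y1)
        k = int(d // lag_width)  # bin k covers [k * lag_width, (k + 1) * lag_width)
        if k < n_lags:
            sq_diff[k] += (z1 - z2) ** 2
            dist_sum[k] += d
            counts[k] += 1
    return [(dist_sum[k] / counts[k], sq_diff[k] / (2 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k] > 0]

# Slide values; the 4 x 4 row-wise layout and 30 m spacing are assumed.
values = [6.1, 5.7, 5.3, 5.8, 7.0, 6.5, 6.0, 5.2,
          7.6, 7.0, 6.0, 5.7, 7.2, 7.0, 6.2, 5.5]
spacing = 30.0
points = [((i % 4) * spacing, (i // 4) * spacing, z) for i, z in enumerate(values)]

for distance, gamma, n_pairs in empirical_semivariogram(points, 30.0, 5):
    print(round(distance, 1), round(gamma, 3), n_pairs)
```

A model (nugget, sill, range) would then be fitted to these binned values before kriging.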
Pc-Geostat (PC-Raster)
Early version of GSTAT
VESPER
Variogram estimation and
spatial prediction with error
Minasny et al., 2005
http://sydney.edu.au/agriculture/pal/software/vesper
GSTAT
Pebesma and Wesseling, 1998
Incorporated into IDRISI
Now incorporated into R and
S-Plus packages
Pebesma, 2004
http://www.gstat.org/index.html
ArcGIS
Geostatistical Analyst
[Figure: semivariogram; horizontal axis: lag (1 lag = 30 m), ticks 11 to 19; vertical axis ticks 0 to 160]
Source: http://sydney.edu.au/agriculture/pal/software/vesper.shtml
Relies on presence of
Sufficient point samples
Spatial structure over
distances longer than the
smallest sampling
interval
Source: Yasribi et al., 2009
a) HASM
b) Kriging
c) IDW
d) Splines
Concepts
Regression-kriging evolves
to include a separate part for
regression prediction
Models
Understanding and use of
universal model grows
Directional, local variograms
Main Developments
Software
From stand alone and single
purpose to integrated software
Improvements in
Visualization
Capacity to process large
data sets
Automated variogram fitting
Ease of use
Inputs
Developments in sampling
designs and sampling theory
Deterministic
Stochastic
Soil Classes
Soil Classes
Soil Properties
Soil Properties
Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation
But
Scorpan elaboration
highlights importance of
the spatial component (n)
and of spatially correlated
error (s)
Stochastic Part
Same underlying theory
Still based on theory of
regionalized variables
But
Increasing realization that
the structural part of
variation (non-stationary
mean or drift) can be better
modelled by a deterministic
function than by purely
spatial calculations
Factors as predictors
Factors explicitly seen as
quantitative predictors in
prediction function
Deterministic Part
Improvements in Data
Mining and Knowledge
Extraction
Supervised Classification
Training data obtained
from both points and maps
Sample maps at points
Ensemble or multiple
realization models (100 x)
Boosting, bagging
Random Forests
ANN, Regression tree
Improvements in Data
Mining and Knowledge
Extraction
Expert Knowledge Extraction
Unsupervised classification
Fuzzy k-means, c-means
Stochastic Part
Regression Kriging
Odeh et al., 1995
McBratney et al., 2003
Hengl et al., 2003, 2004, 2007
Heuvelink, 2006
Hengl how-to books
http://spatialanalyst.net/book/
http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf
S-Plus, Matlab
Used by soil researchers
Netica (Bayesian)
Norsys.com/netica
Improvements
Better visualization
Better interfaces
Non-commercial Software
Fuzzy Logic
SoLIM Zhu et al., 1996, 1997
ArcSIE Shi, FuzME
Bayesian Logic
Full Range of Options
R
http://www.r-project.org
Regression Kriging
Random Forests
Regression Trees
GLMs
GSTAT (in R)
[Figure: flatness, bottomness and valley bottom flatness at the original 25 m resolution and at generalised 75 m and 675 m resolutions]
Terrain Series
Terrain Classes
[Figure legend: classes 10 to 16, ordered from steep to gentle and grouped as fine texture/high convexity, fine texture/low convexity, coarse texture/high convexity and coarse texture/low convexity]
Modal profile
Fit mass-preserving spline
Fitted spline
Estimate averages for spline at standardised depth ranges, e.g., GlobalSoilMap depth ranges
Spline averages at specified depth ranges
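The final averaging step can be illustrated with a much simpler stand-in for the spline. The sketch below treats each horizon as constant with depth and takes thickness-weighted means over the standard depth ranges; it is not a mass-preserving spline, only a piecewise-constant approximation of the same resampling idea. The depth ranges follow the GlobalSoilMap specification (0-5, 5-15, 15-30, 30-60, 60-100, 100-200 cm); the profile values are hypothetical.

```python
# Standard GlobalSoilMap depth ranges in cm.
GSM_DEPTHS_CM = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]

def resample_profile(horizons, targets=GSM_DEPTHS_CM):
    """horizons: list of (top_cm, bottom_cm, value).
    Returns (top, bottom, mean) per target range, where mean is the
    thickness-weighted average over the part of the range covered by data,
    or None if no horizon overlaps the range."""
    result = []
    for top, bottom in targets:
        weighted = 0.0
        covered = 0.0
        for h_top, h_bottom, value in horizons:
            overlap = min(bottom, h_bottom) - max(top, h_top)
            if overlap > 0:
                weighted += overlap * value
                covered += overlap
        mean = weighted / covered if covered > 0 else None
        result.append((top, bottom, mean))
    return result

# Hypothetical modal profile: three horizons with one property value each.
profile = [(0, 20, 2.0), (20, 50, 1.0), (50, 120, 0.5)]
resampled = resample_profile(profile)
for top, bottom, mean in resampled:
    print(top, bottom, mean)
```

Note that a range only partly covered by the profile (here 100-200 cm) is averaged over the covered part alone, which a real workflow may prefer to flag or mask.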
Recent Models
DEM
TOPAZ
TAPES-G
Predicted
soil series
LandMapR
TRAINING DATA
MODELLING
(NETICA)
OUTPUTS
Point Data
Detailed soil maps
Covariates
Accuracy
assessment
Expert
knowledge
[Figure legend: monr_comppct, values from 7 (low) to 100 (high); panels (a) and (b)]
Recent Models
a) Point locations
b) Soil Map only
c) Ordinary Kriging
d) Plain Regression
e) Regression-kriging
Evidence supports RK
Assemble
field data
Linear Model
OC = f(x) + e
Predictors
Elevation
Aspect
Landsat band 6
NDVI
Land-use
Soil-Landscape
Unit
Predict both
property value
and standard
error over the
entire area
Residuals
Add interpolated
residuals to the
prediction from
regression
Final Prediction
Std. err. of final prediction = [(Std. err. of regression)^2 + (Std. err. of kriging)^2]^(1/2) = (Total variance)^(1/2)
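The regression-kriging steps above can be sketched in outline: fit a regression, compute residuals at the sampled locations, interpolate those residuals, then add them back and combine the two standard errors. The sketch below covers only the regression and the final combination; kriging of the residuals is assumed to be done in a geostatistics package, and adding the two variances assumes the regression and kriging errors are independent. All data values and the single-predictor model are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def combine(regression_pred, kriged_residual, se_regression, se_kriging):
    """Final prediction = regression + interpolated residual;
    total std. err. = sqrt(se_regression^2 + se_kriging^2),
    assuming the two error sources are independent."""
    prediction = regression_pred + kriged_residual
    total_se = math.sqrt(se_regression ** 2 + se_kriging ** 2)
    return prediction, total_se

# Hypothetical field data: one predictor and the measured property.
elevation = [100.0, 150.0, 200.0, 250.0]
organic_carbon = [70.0, 66.0, 58.0, 56.0]

intercept, slope = fit_line(elevation, organic_carbon)
residuals = [oc - (intercept + slope * el)
             for el, oc in zip(elevation, organic_carbon)]

# The residuals would be kriged to each grid cell by external software;
# the numbers below stand in for one cell's results.
prediction, total_se = combine(60.0, 4.0, 3.0, 4.0)
print(prediction, total_se)
```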
C predicted for
sampled locations
Residuals
Regression model:
C = 100 - 1.2 EC - 5.2 REF - 0.6 REF^2 - 2.1 EL
C predicted for
all grid locations
[Colour scale: Mg C/ha, 15 to 95]
Kriging
Final C map
Mean   Min    Max    CV%    RMSE   RI (%)
64.0   27.0   87.9   18.4   9.8    19.7
Future Trends
Lots of things
qualify as regression!
Regression depends on
having enough point data
Collaborative and
open modelling
on an interactive,
web-based server-side platform
Everything is
accessible,
transparent and
repeatable
Possibility to
develop and use
global models (even
for local mapping)
Possibility to assess
error and correct for
it everywhere
Possibility of
rescuing, sharing,
harmonizing and
archiving soil
profile point data
needed for soil
prediction anywhere
Possibility to
develop and use
multi-scale and
multi-resolution
hierarchical models
Global Models
inform and
improve local
mapping