What is it?
"The purpose of geographic inquiry is to
examine relationships between geographic
features collectively and to use the relationships to
describe the real-world phenomena that map
features represent" (Clarke 2001, 182).
One Definition: the quantitative procedures
employed in the study of the spatial arrangement
of features (points, lines, polygons and surfaces)
Spatial Analysis:
What is it?
What types of relationships exist between
geographic features, and how do we express
them?
Properties of spatial features and/or
relationships between them: size,
distribution, pattern, contiguity,
neighborhood, shape, scale, orientation
3 Fundamental Questions
Regarding Spatial Relationships
How can two (or more) spatial distributions be
compared with each other?
How can variations in geographic properties over
a single area or data set be described and/or
analyzed?
How can we use what we have learned from an
analysis(es) to predict future spatial distributions?
Spatial Analysis can cover the spectrum implied by these
questions!
Data acquisition
Preprocessing
Database Management
Manipulation/Analysis
Final product output
Representation of
vector spatial objects
Hierarchical nature of objects (points, lines,
polygons)
Points: different types
Entity, label, area, node
Lines:
Line, arc, link, etc.
Polygons:
Area, polygon, complex polygon
Basic elements of
spatial information required
to undertake spatial analysis
Location
X,Y coordinate or locational reference
Attribute data
Describing the (aspatial) characteristics of
locations
Topology
Describing the spatial relationships between
spatial features
Measurement of Location:
GIS Issues
A GIS suitable for spatial analysis must
have the necessary functions dealing with
coordinate systems
What are these functions?
Basic measurement of spatial features:
Points are defined by x,y coordinates
Lines are represented by an ordered sequence of points; they
can be decomposed into sections of straight line segments
The distance between two points on a Cartesian plane is
derived through Euclidean distance; the length of a line
segment is the sum total of the Euclidean distances of all
segments that compose it (p. 105 Chou)
The area of any feature represented as a polygon can be
computed by constructing a trapezoid from every line segment
delineating the polygon, then systematically aggregating the
trapezoid areas (both positive and negative) (p. 106 Chou)
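The three measurements above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular GIS; the function names and the sample coordinates are assumptions made for the example.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two (x, y) points on a Cartesian plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def line_length(vertices):
    """Length of a line: the sum of the Euclidean lengths of its segments."""
    return sum(euclidean_distance(a, b) for a, b in zip(vertices, vertices[1:]))

def polygon_area(ring):
    """Area of a polygon from the signed trapezoid under each boundary segment.

    `ring` is a list of (x, y) vertices; the closing segment back to the
    first vertex is added automatically. The sign of the raw sum depends on
    the vertex order, so the absolute value is returned.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Trapezoid between the segment and the x-axis: width * mean height.
        total += (x2 - x1) * (y1 + y2) / 2.0
    return abs(total)

# Hypothetical sample features.
road = [(0, 0), (3, 4), (3, 10)]
parcel = [(0, 0), (4, 0), (4, 3), (0, 3)]

print(line_length(road))     # 5 + 6 = 11.0
print(polygon_area(parcel))  # 4 x 3 rectangle = 12.0
```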
3 Major vector-based
datasets used in ArcGIS:
Shapefiles, Coverages, Geodatabases
ESRI Shapefiles:
ESRI ARC/INFO Coverages:
Spatial data is stored in binary files
Topological and attribute tables are stored in INFO tables
Contain topological features classes that define line or
polygon topology
Topology is built for lines and polygons - lines: arcs,
nodes and routes; polygons: arcs, label points, polygons,
regions
Primary coverage feature classes are: point, arc, polygon,
and node; secondary: tic, link, annotation; compound:
region, route
ARC/INFO Coverages
ARC coverage files: defined by header files, index
files, ARC, PAL, LAB, CNT, PRJ, LOG, TOL
ARC: arc definitions and vertices; PAL: contains
polygon definitions; LAB: contains label point
records; CNT: contains polygon centroid
information; PRJ: contains projection information;
TOL: contains the tolerance values to use when
processing a polygon coverage
ESRI Geodatabase
All spatial, topological, and attribute data is stored in tables in
a relational database
A feature dataset in a geodatabase can contain simple or
topological feature classes
Many feature classes can be associated with a topological role
within the geodatabase
User-defined associations can be created between features in
different feature classes
Types of feature classes: point, line, polygon, annotation,
simple junction, complex junction, simple edge, complex edge
Stochastic processes:
Processes whose outcome is subject to variation that cannot be
given precisely by a mathematical formula
Introduction of a random (stochastic) element to model the range
of potential solutions
Chance process with well-defined mechanisms (p. 58)
Predicting Patterns:
Expected Results
Assumptions
Example: independent random process (IRP) (or complete spatial
randomness (CSR))
Math used to predict frequency distribution under assumed
randomness
Observed vs. expected
What is this assumption called in the scientific method?
Density-based measures
Quadrat Analysis based on the frequency of
occurrence of points within quadrat units
Requires overlaying quadrats onto a layer of point
features
Once quadrats are overlaid onto the point layer,
frequencies of points per quadrat can be counted
All quadrats are classified according to observed
frequency of points
Null hypothesis: point features are randomly distributed
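A rough sketch of the quadrat counting step, assuming square quadrats on a regular grid. The variance-to-mean ratio and the Poisson expected frequencies are one common way to compare observed counts with the random null hypothesis; the function names, grid parameters, and sample points are assumptions for illustration.

```python
import math
from collections import Counter

def quadrat_counts(points, xmin, ymin, cell, ncols, nrows):
    """Count the points that fall in each square quadrat of a regular grid."""
    counts = Counter()
    for x, y in points:
        col = int((x - xmin) // cell)
        row = int((y - ymin) // cell)
        if 0 <= col < ncols and 0 <= row < nrows:
            counts[(row, col)] += 1
    # Empty quadrats are included so every cell contributes to the statistics.
    return [counts[(r, c)] for r in range(nrows) for c in range(ncols)]

def variance_mean_ratio(counts):
    """Near 1 suggests random, above 1 clustered, below 1 uniform."""
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return variance / mean

def expected_poisson_frequencies(counts):
    """Expected number of quadrats holding k points if points are random."""
    n = len(counts)
    lam = sum(counts) / n  # mean points per quadrat
    return {k: n * math.exp(-lam) * lam ** k / math.factorial(k)
            for k in range(max(counts) + 1)}

# Hypothetical points, all falling in one quadrat of a 2 x 2 grid.
pts = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.4), (0.3, 0.2)]
observed = quadrat_counts(pts, xmin=0, ymin=0, cell=1, ncols=2, nrows=2)
print(observed)                       # [4, 0, 0, 0]
print(variance_mean_ratio(observed))  # 4.0, well above 1 (clustered)
```

The observed frequency table can then be compared with `expected_poisson_frequencies(observed)` using a goodness-of-fit test of the analyst's choice.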
Density-based Measures
Kernel Density Estimation
A pattern has a density at any location
Continuous densities are estimated for defined kernels to
create a continuous surface
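A minimal sketch of the idea, assuming a quartic kernel (one common choice; the slides do not name a kernel). The function name, the bandwidth, and the sample points are assumptions. Evaluating the function at every cell of a grid gives the continuous surface.

```python
import math

def kernel_density(points, x, y, bandwidth):
    """Estimated density at (x, y): the sum of quartic kernels centred on
    each point. Points at or beyond `bandwidth` contribute nothing."""
    total = 0.0
    for px, py in points:
        u = math.hypot(x - px, y - py) / bandwidth
        if u < 1.0:
            # Quartic kernel scaled so each point contributes a volume of 1.
            total += 3.0 / (math.pi * bandwidth ** 2) * (1.0 - u * u) ** 2
    return total

# Hypothetical point events.
events = [(1.0, 1.0), (1.2, 0.9), (3.0, 3.0)]

# Evaluate on a coarse 5 x 5 grid to form a density surface.
surface = [[kernel_density(events, col, row, bandwidth=1.0)
            for col in range(5)] for row in range(5)]
for row in surface:
    print([round(v, 3) for v in row])
```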
Distance-based
Point Pattern Measures
The Logic of Distance Measures
Point patterns can be described using types (categories):
Clustered: points are concentrated in one or more
groups/areas
Uniform: points are regularly spaced with
relatively large interpoint distance
Random: neither the clustered nor the uniform pattern
is prevalent
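One widely used distance measure is the nearest neighbor ratio, which compares the observed mean nearest-neighbor distance with the distance expected under a random pattern of the same density. The slides do not name a specific statistic, so treat this as an illustrative sketch; the function name, study area, and points are assumptions.

```python
import math

def nearest_neighbor_ratio(points, area):
    """Observed mean nearest-neighbor distance divided by the distance
    expected for a random pattern with the same density.

    Below 1 suggests clustering, near 1 randomness, above 1 uniformity.
    """
    n = len(points)
    nearest = [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
               for i, p in enumerate(points)]
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected

# Hypothetical regularly spaced points in a 2 x 2 study area.
grid_points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
print(nearest_neighbor_ratio(grid_points, area=4.0))  # 2.0 (uniform)
```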
Planar-enforced areas
GIS-context?
Shape
Comparison of a polygon to a known shape
Spatial pattern
Contact numbers
Fragmentation (FRAGSTATS)
Spatial Autocorrelation
Most common spatial autocorrelation statistic is
Moran's I coefficient
Similar to a traditional correlation coefficient
The I coefficient for the most part ranges between -1 and
+1; larger negative values indicate a scattered pattern,
positive values indicate a clustered pattern
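A small sketch of Moran's I, assuming a binary contiguity weights matrix (1 where two areas are neighbors, 0 otherwise). The function name and the four-zone example are assumptions for illustration.

```python
def morans_i(values, weights):
    """Moran's I for a list of attribute values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(d * d for d in dev))

# Four hypothetical zones in a row; each zone neighbors the next one.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

print(morans_i([1, 1, 5, 5], w))  # positive: similar values are adjacent
print(morans_i([1, 5, 1, 5], w))  # negative: neighbors are dissimilar
```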
Spatial Autocorrelation
Joins Count approach
Logic?
Centroid Method
Predominant Type
Most Important Type
Hierarchical
Spatial Interpolation
Control points are points with known
values; it is best if there is good
coverage of control points (how often does
this happen?)
Assumptions:
1. The surface of the Z variable is continuous
2. The Z variable is spatially dependent
Simple Spatial
Interpolation Techniques
Local Methods: The z value of an unknown
point location is estimated from known
local point neighbor locations
Interpolation procedures are used when we
have discontinuous datasets and we want
(or need) to process them into spatially
continuous datasets
Spatial Interpolation
Global (Statistical) Methods: The z value of an
unknown point location is estimated from all known
point data
Polynomial Trend Surface Analysis (Inexact,
Deterministic): approximates points with known values
with a polynomial equation
The equation is used as an interpolator to estimate
values at other points
Computed by the least squares method and a goodness
of fit can be computed for each control point
$z_{x,y} = b_0 + b_1 x + b_2 y$
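A first-order trend surface can be fitted with ordinary least squares. The sketch below solves the normal equations directly in plain Python; the function names and control points are assumptions, and a production workflow would more likely rely on a numerical library.

```python
def fit_first_order_trend(control_points):
    """Least-squares fit of z = b0 + b1*x + b2*y to (x, y, z) control points.

    Returns (b0, b1, b2). Requires at least three non-collinear points.
    """
    # Accumulate the normal equations (A^T A) b = A^T z for rows [1, x, y].
    ata = [[0.0] * 3 for _ in range(3)]
    atz = [0.0] * 3
    for x, y, z in control_points:
        row = (1.0, x, y)
        for i in range(3):
            atz[i] += row[i] * z
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    # Solve the 3 x 3 system by Gauss-Jordan elimination with pivoting.
    m = [ata[i] + [atz[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def trend_value(coefficients, x, y):
    """Use the fitted equation as an interpolator at an unknown location."""
    b0, b1, b2 = coefficients
    return b0 + b1 * x + b2 * y

# Hypothetical control points lying on the plane z = 2 + 0.5x - y.
controls = [(0, 0, 2.0), (1, 0, 2.5), (0, 1, 1.0), (1, 1, 1.5), (2, 1, 2.0)]
coefficients = fit_first_order_trend(controls)
print([round(b, 6) for b in coefficients])
print(round(trend_value(coefficients, 3, 2), 6))
```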
Spatial Interpolation
Local Methods
Inverse Distance Weighted (Exact,
Deterministic): enforces that the estimated
value of a point is influenced more by nearby
known points than those farther away
All predicted values are within the range of the
maximum and minimum values in the distribution
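A minimal inverse distance weighted estimator, assuming weights of 1 / distance^power with a power of 2 (a common default, not stated in the slides). The function name and control points are assumptions.

```python
import math

def idw(control_points, x, y, power=2):
    """Estimate z at (x, y) from (x, y, z) control points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for cx, cy, cz in control_points:
        d = math.hypot(x - cx, y - cy)
        if d == 0:
            return cz  # exact method: a control point keeps its own value
        w = 1.0 / d ** power
        weighted_sum += w * cz
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical control points.
known = [(0, 0, 10.0), (2, 0, 20.0)]
print(idw(known, 1, 0))    # midway between the two points: 15.0
print(idw(known, 0, 0))    # at a control point: 10.0
print(idw(known, 0.5, 0))  # closer to the first point, so closer to 10
```

Because the estimate is a weighted average with positive weights, it can never fall outside the range of the known values.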
Spatial Interpolation
Local Methods
Splines (Exact, Deterministic): create a surface
that passes through the control points and has
the least possible change in slope at all points
(minimum curvature surface)
Spatial Interpolation
Local Methods
Kriging (Exact, Stochastic): a geostatistical
method for spatial interpolation where the mean
is estimated from the best linear unbiased
estimator or best linear weighted moving
average
Assumes that the spatial variation of an attribute is
neither totally random nor totally deterministic (a
correlated component, a drift, a random error term)
Spatial Interpolation
Types of Kriging:
Ordinary:
the drift component is excluded
Focus on the degree of spatial dependence among sampled known
points (semivariance)
Semivariance: $\gamma(h) = \frac{1}{2n} \sum_{i=1}^{n} \left( z(x_i) - z(x_i + h) \right)^2$
Semivariance values are plotted on a semivariogram where the
semivariance is recorded on the Y-axis and the distance between
known points on the X-axis (nugget, range, sill)
The semivariogram is fitted to a mathematical model (spherical,
circular, exponential, linear, Gaussian)
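The empirical semivariance for one lag distance can be computed by averaging over all pairs of known points separated by roughly that distance. This is a sketch only; the lag tolerance, function name, and sample values are assumptions. Repeating it for several lags gives the points that are plotted on the semivariogram.

```python
import math

def semivariance(samples, lag, tolerance):
    """Half the mean squared difference in z over all pairs of (x, y, z)
    samples whose separation is within `tolerance` of `lag`.

    Returns None when no pair falls in the lag bin.
    """
    total = 0.0
    pairs = 0
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            xi, yi, zi = samples[i]
            xj, yj, zj = samples[j]
            h = math.hypot(xj - xi, yj - yi)
            if abs(h - lag) <= tolerance:
                total += (zi - zj) ** 2
                pairs += 1
    return total / (2 * pairs) if pairs else None

# Hypothetical samples along a transect.
obs = [(0, 0, 1.0), (1, 0, 3.0), (2, 0, 4.0)]
print(semivariance(obs, lag=1, tolerance=0.1))  # (4 + 1) / (2 * 2) = 1.25
print(semivariance(obs, lag=2, tolerance=0.1))  # 9 / (2 * 1) = 4.5
```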
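Equation for estimating Z:
$Z_0 = \sum_{i=1}^{s} Z_i W_i$
where $Z_i$ is the known value at control point $i$, $W_i$ is its
weight, and $s$ is the number of control points used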
Spatial Interpolation
Types of Kriging:
Universal Kriging: assumes that the spatial
variation in z values has a drift or trend in
addition to the spatial correlation between
known points
Co-Kriging: Can be used to improve spatial
predictions by incorporating secondary
variables, provided they are spatially correlated
with the primary variable
Semivariogram
Covariance
Concept of Cross-correlation
Feature Classification
What type of distribution, how do we
determine? Uniform (equal interval, equal
frequency); Normal (standard deviation);
Multiple Cluster (natural breaks)
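Two of the schemes named above can be sketched directly: equal interval splits the value range into equal-width classes, and equal frequency puts roughly the same number of features in each class. The function names and sample values are assumptions; natural breaks and standard deviation classing are not shown.

```python
def equal_interval_breaks(values, k):
    """Upper class limits for k classes of equal width."""
    low, high = min(values), max(values)
    step = (high - low) / k
    return [low + step * i for i in range(1, k + 1)]

def equal_frequency_breaks(values, k):
    """Upper class limits so each class holds about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[max(0, (n * i) // k - 1)] for i in range(1, k + 1)]

# Hypothetical, strongly skewed attribute values.
data = [1, 2, 3, 4, 10, 20, 50, 100]
print(equal_interval_breaks(data, 4))   # [25.75, 50.5, 75.25, 100.0]
print(equal_frequency_breaks(data, 4))  # [2, 4, 20, 100]
```

With skewed data like this, equal interval leaves most features in the first class, which is why the shape of the distribution should guide the choice of method.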
Proximity Analysis
ArcView: Buffer
ArcView cannot: Thiessen polygons
Map Overlay
(Multiple Layer) Operations
arguably, the most important feature of
any GIS is its ability to combine spatial
datasets (p. 285)
10 Possible types of Map Overlay
Erase (Coverage)
Identity Overlay
Intersect Overlay
Symmetrical Difference
Union Overlay
Update Overlay
Spatial Modeling
According to Chou (1997), a Spatial Model:
1. Analyzes phenomena by identifying
explanatory variables that are significant to the
distribution of the phenomenon and providing
information about the relative weight of each
variable
2. Is useful for predicting the probable impact
of a potential change in control factors
(independent variables)
Spatial Modeling:
Thinking About Models
Models can be:
Descriptive or Prescriptive
Deterministic or Stochastic
Static or Dynamic
Deductive or Inductive
Spatial Modeling
General Types of (Spatial) Models
Descriptive: characterization of the distribution of
spatial phenomena
Explanatory: deal with the variables impacting the
distribution of a phenomenon
Predictive: once explanatory variables are identified,
predictive models can be constructed
Normative: models that provide optimal solutions to
problems with quantifiable objective functions and
constraints
Spatial Modeling
More specific types of spatial models:
Binary models (descriptive): use logical expressions to identify or
select map features that do or do not meet certain criteria. How?
Index models (descriptive): use index values calculated for
variables to produce a ranked spatial surface. How?
Weighted Linear Combination Model
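As a rough answer to the two "How?" questions, here is a sketch on small raster grids. The binary model applies a logical test to every layer; the index model is a weighted linear combination of standardized scores. The layer names, scores, thresholds, and weights are all hypothetical.

```python
def binary_model(layers, criteria):
    """Grid of 1/0 values: 1 where every layer passes its logical test."""
    names = list(layers)
    first = layers[names[0]]
    return [[int(all(criteria[n](layers[n][r][c]) for n in names))
             for c in range(len(first[0]))]
            for r in range(len(first))]

def weighted_linear_combination(layers, weights):
    """Grid of index values: the weighted sum of standardized scores."""
    names = list(layers)
    first = layers[names[0]]
    return [[sum(weights[n] * layers[n][r][c] for n in names)
             for c in range(len(first[0]))]
            for r in range(len(first))]

# Hypothetical 2 x 2 layers, already standardized to a 0-1 score.
scores = {
    "slope": [[1.0, 0.0], [0.5, 0.5]],
    "soil":  [[0.0, 1.0], [0.5, 1.0]],
}

suitable = binary_model(scores, {"slope": lambda v: v >= 0.5,
                                 "soil": lambda v: v >= 0.5})
index = weighted_linear_combination(scores, {"slope": 0.6, "soil": 0.4})

print(suitable)  # [[0, 0], [1, 1]]
print([[round(v, 2) for v in row] for row in index])
```

The binary result only says yes or no for each cell, while the index result ranks the cells, which is the practical difference between the two model types.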
Spatial Modeling
Steps in the Modeling Process
Spatial Modeling
Important Issues in Conducting Spatial Analysis:
Delineation of geographic units of analysis
How do you choose geographic units of analysis so that spatial
analyses are valid?
Stormwater modeling
project logic
Based on TR-55
Stormwater modeling
project logic
TR-55
Network Analysis
Network analysis: the spatial analysis of linear (line)
features
Your text distinguishes between several different types of
lines
Network Analysis
Concepts:
Network
Line segment(s)/Links
Nodes (and vertices)
Impedance
Topology
Dynamic Segmentation
Network Analysis:
Network Structure
Evaluation of Network Structure:
Network Analysis:
Network Structure
Network Diameter: the maximum number of steps
required to move from any node to any other node
using shortest possible routes over a connected
network
Network Connectivity: an evaluation of nodal
connectivity over a network based on direct and
indirect connections (expressed through the
construction of matrices c1, c2, c3)
Network Analysis:
Network Structure
Network Accessibility: can be evaluated based on
nodes or the entire network; the accessibility
matrix is often called the T matrix
T matrix is the sum of all connectivity matrices up to
the level equal to the network diameter (i.e. c3 or c4)
Logically this makes sense if you are trying to evaluate
total connectivity of a node or the entire network
How do we read the matrix?
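A sketch of how the connectivity matrices, the network diameter, and the T matrix relate, assuming C1 is a binary matrix of direct links and each higher matrix is obtained by multiplying by C1 again. The function names and the four-node network are assumptions for illustration.

```python
def mat_mult(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def accessibility(c1):
    """Return (T, diameter) for a connected network.

    T is C1 + C2 + ... summed up to the network diameter, the first level
    at which every pair of distinct nodes is linked directly or indirectly.
    """
    n = len(c1)
    power = c1
    total = [row[:] for row in c1]
    diameter = 1
    while any(total[i][j] == 0 for i in range(n) for j in range(n) if i != j):
        if diameter >= n - 1:
            raise ValueError("network is not connected")
        power = mat_mult(power, c1)
        total = [[t + p for t, p in zip(t_row, p_row)]
                 for t_row, p_row in zip(total, power)]
        diameter += 1
    return total, diameter

# Hypothetical network: four nodes in a chain, A - B - C - D.
c1 = [[0, 1, 0, 0],
      [1, 0, 1, 0],
      [0, 1, 0, 1],
      [0, 0, 1, 0]]

t_matrix, diameter = accessibility(c1)
print(diameter)                       # 3 steps from A to D
print([sum(row) for row in t_matrix]) # [6, 10, 10, 6]
```

Reading the matrix: each row sum is the total accessibility of that node, so the two interior nodes (B and C) are more accessible than the end nodes, and the sum of all entries describes the network as a whole.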
Network Analysis:
Network Structure
Network Structure in a Valued Graph
The previously discussed measures of network
structure are based on counting links and/or
nodes. What element are we missing with these?
Q. What is a valued graph? A. A matrix is constructed
in which every link (line segment) in a network is
coded with an impedance measure (such as what?)
An often-used type of valued graph is the minimal spanning
tree, which satisfies 3 criteria:
Can a GIS construct a minimal spanning tree?
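A minimal spanning tree can be built programmatically from a list of links coded with impedance. The sketch below uses Kruskal's algorithm, which is one standard approach and not necessarily what a given GIS package does; the function name, node labels, and impedance values are assumptions.

```python
def minimal_spanning_tree(nodes, links):
    """Kruskal's algorithm: take links in order of increasing impedance and
    keep each one that joins two parts of the network not yet connected.

    `links` is a list of (node_a, node_b, impedance) tuples.
    """
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent pointers to the representative of the node's group.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for a, b, impedance in sorted(links, key=lambda link: link[2]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the link does not close a loop
            parent[root_a] = root_b
            tree.append((a, b, impedance))
    return tree

# Hypothetical valued graph.
valued_links = [("A", "B", 4), ("A", "C", 1), ("B", "C", 2),
                ("B", "D", 5), ("C", "D", 8)]
tree = minimal_spanning_tree("ABCD", valued_links)
print(tree)                              # three links connect four nodes
print(sum(link[2] for link in tree))     # total impedance: 8
```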
Network Analysis:
Normative Models of Network Flow
Normative models are those that are designed to
determine a best or optimal solution based on
specific criteria
Simple Shortest Path Algorithm:
Involves finding the path or route with the minimum
cumulative impedance between nodes on a network
Requires an impedance matrix (such as a valued graph)
and a set of iterative procedures:
GIS must know which nodes are connected to which; multistep evaluation of connectivity and least cumulative impedance
(distance, time, cost, etc.)
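The iterative search for least cumulative impedance can be illustrated with Dijkstra's algorithm, a standard shortest path method (the slides do not name the algorithm, so this is a swapped-in example). The network is stored as a dictionary of connected nodes with link impedances; all names and values are assumptions.

```python
import heapq

def shortest_path(network, origin, destination):
    """Return (path, cumulative impedance) from origin to destination.

    `network` maps each node to a dict of {connected node: impedance}.
    Returns (None, infinity) when the destination cannot be reached.
    """
    best = {origin: 0}
    previous = {}
    queue = [(0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale entry: a cheaper route was already found
        for neighbor, impedance in network[node].items():
            new_cost = cost + impedance
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if destination not in best:
        return None, float("inf")
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return path[::-1], best[destination]

# Hypothetical impedance values (for example, travel minutes).
roads = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 5},
    "C": {"A": 1, "B": 2, "D": 8},
    "D": {"B": 5, "C": 8},
}
print(shortest_path(roads, "A", "D"))  # (['A', 'C', 'B', 'D'], 8)
```

The direct-looking route A to B to D costs 9, so the path through C wins even though it uses more links, which is the point of weighting by impedance rather than counting steps.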
Network Analysis:
Normative Models of Network Flow
The Traveling Salesman Problem:
2 constraints 1) the salesman must stop at each location
once 2) the salesman must return to the origin of travel
(there can be variations)
The objective is to determine the path or route that the
salesman can take to minimize the total impedance value
of the trip
Often a heuristic method is used: beginning with an initial
random tour, a series of locally optimal solutions is run by
swapping stops that cause a reduction in cumulative
impedance (an iterative procedure is also described in your
book on pp. 236-244).
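A simple version of the swapping heuristic, assuming a symmetric impedance matrix and a tour that starts and ends at stop 0. It finds a locally optimal tour, not a guaranteed best one, and it is not the textbook's procedure; the function names and the four-stop example are assumptions. In practice the starting tour would be random, but a fixed one is used here so the result is repeatable.

```python
import itertools
import math

def tour_length(tour, impedance):
    """Total impedance of visiting the stops in order and returning home."""
    return sum(impedance[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

def improve_by_swapping(tour, impedance):
    """Swap pairs of stops for as long as a swap reduces total impedance.

    The origin (position 0) is never moved.
    """
    tour = tour[:]
    improved = True
    while improved:
        improved = False
        for i, j in itertools.combinations(range(1, len(tour)), 2):
            candidate = tour[:]
            candidate[i], candidate[j] = candidate[j], candidate[i]
            if tour_length(candidate, impedance) < tour_length(tour, impedance):
                tour, improved = candidate, True
    return tour

# Hypothetical stops at the corners of a unit square.
stops = [(0, 0), (1, 0), (1, 1), (0, 1)]
matrix = [[math.dist(a, b) for b in stops] for a in stops]

start = [0, 2, 1, 3]                   # crosses both diagonals
best = improve_by_swapping(start, matrix)
print(best)                            # [0, 1, 2, 3]
print(tour_length(best, matrix))       # 4.0, the perimeter of the square
```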
Network Analysis:
Normative Models of Network Flow
Various Types of Network Problems:
Shortest Path Analysis (Best Route)
Simple shortest path
Traveling Salesman
Closest Facility
Network Analysis:
Normative Models of Network Flow
Dynamic Segmentation Data Model: The ability to
derive the locations of events in relation to linear
features dynamically; not reliant upon the existing
topology of a network
Models linear features using routes and events
Routes: represent dynamic linear features
Events: phenomena that occur at locations along line
segments
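The core operation behind routes and events is locating an event from its measure (distance along the route) rather than from stored coordinates. A minimal sketch, assuming the route is a polyline and the measure is in the same units as the coordinates; the function name and sample route are hypothetical.

```python
import math

def locate_along_route(route, measure):
    """Return the (x, y) location at distance `measure` along a route.

    `route` is an ordered list of (x, y) vertices.
    """
    if measure < 0:
        raise ValueError("measure must not be negative")
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(route, route[1:]):
        segment = math.hypot(x2 - x1, y2 - y1)
        if segment > 0 and travelled + segment >= measure:
            # Interpolate within the segment that contains the measure.
            t = (measure - travelled) / segment
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        travelled += segment
    raise ValueError("measure is beyond the end of the route")

# Hypothetical route and point event (for example, an accident at measure 15).
highway = [(0, 0), (10, 0), (10, 10)]
print(locate_along_route(highway, 15))  # (10.0, 5.0)
```

Because the event table stores only a route identifier and a measure, events can be added or moved without splitting the underlying line segments.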
Spatial Interpolation
Mean= .01694
RMS = 2.862
Avg. Stan Error = 3.441
Mean Stan. = .004232
RMS Stan. = .8324
Without Anisotropy
Mean= .0002331
RMS = 2.857
Avg. Stan Error = 3.424
Mean Stan. = .0006747
RMS Stan. = .8347
Mean= .04253
RMS = 2.595
Avg. Stan Error = 2.354
Mean Stan. = .01806
RMS Stan. = 1.102
Without Anisotropy
Mean= .0001592
RMS = 3.054
Avg. Stan Error = .8181
Mean Stan. = .001031
RMS Stan. = 3.731
Regression Equations
TWOYR = -3.538 + 0.06031 * AVGCURV + 0.03331 * PERCIMPV
TENYR = -4.156 + 0.07806 * AVGCURV + 0.04368 * PERCIMPV
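The two equations can be applied directly as functions. The coefficients are copied from the slide; the function names and the sample input values (80 and 25) are hypothetical and chosen only to show the arithmetic.

```python
def two_year(avgcurv, percimpv):
    """TWOYR from the regression equation on the slide."""
    return -3.538 + 0.06031 * avgcurv + 0.03331 * percimpv

def ten_year(avgcurv, percimpv):
    """TENYR from the regression equation on the slide."""
    return -4.156 + 0.07806 * avgcurv + 0.04368 * percimpv

# Hypothetical inputs for one unit of analysis.
print(round(two_year(80, 25), 4))  # -3.538 + 4.8248 + 0.83275
print(round(ten_year(80, 25), 4))  # -4.156 + 6.2448 + 1.092
```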