
Key Concepts: Spatial Is Special

Distance
Magnitude of spatial separation. Euclidean (straight-line) distance is only an approximation.

Adjacency / Neighborhood
The nominal (binary, 0/1) equivalent of distance. Levels of adjacency: 1st, 2nd, 3rd nearest neighbor.

Interaction
Strength of the relationship between entities. An inverse function of distance.

Spatial Neighborhood Based on Adjacency

Square raster, hexagonal, or irregular

Basis: sharing a boundary or a point

1st- and 2nd-Order Adjacency

1st order: rook, hexagon, queen

2nd order
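First-order adjacency on a square raster can be sketched as follows. The slide names the rook and queen cases without defining them; the sketch assumes the usual meanings (rook neighbors share an edge, queen neighbors share an edge or a corner point). The function name and the (row, col) convention are illustrative assumptions.

```python
def first_order_neighbors(row, col, n_rows, n_cols, rule="rook"):
    """Return the first-order neighbors of a cell on a square raster.

    rule="rook": cells sharing an edge (up to 4 neighbors).
    rule="queen": cells sharing an edge or a corner (up to 8 neighbors).
    """
    rook_offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    diagonal_offsets = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    if rule == "rook":
        offsets = rook_offsets
    elif rule == "queen":
        offsets = rook_offsets + diagonal_offsets
    else:
        raise ValueError("rule must be 'rook' or 'queen'")

    neighbors = []
    for d_row, d_col in offsets:
        r, c = row + d_row, col + d_col
        # Keep only cells that fall inside the raster.
        if 0 <= r < n_rows and 0 <= c < n_cols:
            neighbors.append((r, c))
    return neighbors
```

For the center cell of a 3 x 3 raster this gives 4 rook neighbors and 8 queen neighbors; cells on the raster edge have fewer.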

Description & Analysis

Description
GIS is widely used by government and the private sector to describe the real world. Examples:
- Managing water utility (PDAM) pipes and water channels
- Managing land resources

GIS is ultimately designed to build a spatial database for describing reality and managing it.

Description & Analysis

Analysis
Trying to understand processes: understanding the processes that cause or create patterns in the real world.
- Helps in our work: making the right decisions
- Helps us understand phenomena: this is the role of science

Example: is the location of the software industry different from that of the telecommunications industry? In this case, centrographic statistics can be used to answer the question.

Spatial analysis aims to:
- Identify and describe patterns: the point pattern is clearly clustered (points in several groups)
- Identify and understand processes: transportation accessibility; agglomeration economies* from sharing ideas, access to skilled labor, and access to business services

*cost savings for firms at the same location

Process, Pattern & Analysis

Processes operating in a system produce patterns.

Spatial analysis aims to:
- Identify and describe patterns
- Identify and understand processes

[Diagram: Process creates (or causes) Patterns]

Process, Pattern & Analysis

Sometimes we cannot observe (see) the process, so we must infer (guess?) the process by observing the pattern.

[Diagram: a process creates (or causes) a pattern; the process is inferred from the observed pattern. Other labels: Yes, No.]

Levels of Spatial Analysis (by Level of Sophistication)

1. Spatial data description
2. Exploratory spatial data analysis (ESDA)
3. Spatial statistical analysis and hypothesis testing
4. Spatial modeling and prediction

Each level is more difficult, but more useful (more powerful)!

Spatial Analysis Level 1

1. Spatial Data Description

Focus is on describing the world and representing it in a digital format:
- computer map
- computer database

Uses classic GIS capabilities:
- buffering, map layer overlay
- spatial queries & measurement

Spatial Analysis Level 2

2. Exploratory Spatial Data Analysis (ESDA)

Looking for patterns and (possible) explanations:
- GeoVisualization through calculation and display
- Centrographic statistics

[Figure: Calculation of Centrographic Statistics]

Spatial Analysis Level 3

3. Spatial Statistical Analysis and Hypothesis Testing

Whether the data are expected or unexpected depends on a statistical model, usually of a random (probability) process.

[Figure: normal curve centered at 0 with critical values at -1.96 and 1.96; each tail holds 2.5%]

Hypothesis testing:
- Point patterns
- Also polygon data

Example: test whether the software and telecommunications industries have a clustered (patterned) or random (unpatterned) pattern.
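The 2.5% tails beyond -1.96 and 1.96 can be checked with a few lines of Python. This is a minimal sketch that assumes the test statistic follows a standard normal distribution under the random-process model; the function names are illustrative and not part of any GIS package.

```python
import math


def upper_tail_probability(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def is_significant(z, critical_value=1.96):
    """Two-tailed decision: is z outside the range the random model expects?"""
    return abs(z) > critical_value


# Each tail beyond +/-1.96 holds about 2.5% of the distribution,
# so the two tails together give a 5% significance level.
tail = upper_tail_probability(1.96)
```

A z-score outside the range -1.96 to 1.96 would lead us to reject the hypothesis of a random pattern at the 5% level.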

Spatial Analysis Level 4

4. Spatial Modeling: Prediction

Build models (of processes) to predict spatial outcomes (spatial patterns).

Notice how the density of points (number per square km) decreases as we move away from the highway. We can construct regression models to predict location patterns:

Density of points = f(distance from highway)

However, for spatial data we need special spatial regression models.

[Figure: density of points plotted against distance from highway]
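As a rough illustration of "density = f(distance from highway)", the sketch below fits an ordinary least-squares line. It is not a spatial regression model, which the slide says spatial data require, and the five data values are invented purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical values, invented for illustration only.
distance_km = [1.0, 2.0, 3.0, 4.0, 5.0]
density_per_km2 = [10.0, 8.0, 6.0, 4.0, 2.0]

slope, intercept = fit_line(distance_km, density_per_km2)
# A negative slope means density falls as distance from the highway grows.
```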

The first example of Spatial Analysis

John Snow's maps of cholera in 1850s London

Was it ESDA or hypothesis testing?
- Did he discover the association between water and cholera after drawing the map? That would be ESDA.
- Did he draw the map in order to prove the association? That would be using a map for hypothesis testing.

Maps are good, but more is needed!

A. Is this clustered?

B. Is this clustered?

We must test rigorously using spatial analysis methods, not just look and guess.

Source: R & Y, p. 5

Why is this important?

Is it clustered? We must measure and test, not just look and guess!

Because that is science! Because that is how earth management decisions must be made!

Descriptive Statistics for Spatial Distributions

- Review of standard descriptive statistics
- Centrographic statistics for spatial data: Mean Center, Centroid, Standard Distance Deviation, Standard Deviational Ellipse, Kernel Density Estimation, Mapping

Standard Statistical Analysis: A Quick Review

1. Descriptive Statistics
- Concerned with obtaining summary measures to describe a set of data
- Calculate a few numbers to represent all the data
- We begin by looking at one variable (univariate); later, we will look at two variables (bivariate)

Three types:
- Measures of central tendency
- Measures of dispersion or variability
- Frequency distributions

Standard Descriptive Statistics: Central Tendency

Central tendency: a single summary measure for one variable:

1. Mean (average)
   Formula for mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
2. Median (middle value)
   - 50% larger and 50% smaller
   - rank order data and select middle number
3. Mode (most frequently occurring)

These may be obtained in ArcGIS by:
- opening a table, right clicking on column heading, and selecting Statistics
- going to ArcToolbox>Analysis>Statistics>Summary Statistics

Calculation of mean and median

ADMIN_NAME     Illiteracy-Prcnt   Rank order
Beijing          3.11              1
Liaoning         3.48              2
Tianjin          3.52              3
Taiwan           3.9               4
Shanghai         3.97              5
Guangdong        4.02              6
Heilongjiang     4.16              7
Shanxi           4.42              8
Jilin            4.44              9
Xinjiang         4.64             10
Hebei            4.83             11
Guangxi          5.61             12
Hunan            5.87             13
Jiangxi          6.49             14
Hong Kong        6.5              15
Henan            7.36             16
Hubei            7.69             17
Chongqing        7.8              18
Shandong         7.96             19
Jiangsu          8.05             20
Nei Mongol       8.14             21
Shaanxi          8.19             22
Hainan           8.65             23
Macao            8.7              24
Zhejiang         9.36             25
Ningxia         10.09             26
Sichuan         10.24             27
Fujian          10.38             28
Yunnan          13.29             29
Anhui           14.49             30
Guizhou         14.58             31
Qinghai         16.68             32
Gansu           17.77             33
Xizang          37.77             34
Sum            296.15

Mean: 296.15 / 34 = 8.71
Median: (7.69 + 7.8) / 2 = 7.75 (there are 2 middle values)
Note: data for Taiwan is included
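The same mean and median can be reproduced outside ArcGIS. The sketch below uses Python's standard `statistics` module on the 34 illiteracy percentages from the table; the variable names are illustrative.

```python
import statistics

# Illiteracy percentages for the 34 provinces, in rank order (from the table).
illiteracy = [
    3.11, 3.48, 3.52, 3.9, 3.97, 4.02, 4.16, 4.42, 4.44, 4.64, 4.83, 5.61,
    5.87, 6.49, 6.5, 7.36, 7.69, 7.8, 7.96, 8.05, 8.14, 8.19, 8.65, 8.7,
    9.36, 10.09, 10.24, 10.38, 13.29, 14.49, 14.58, 16.68, 17.77, 37.77,
]

mean_value = statistics.mean(illiteracy)      # sum of values / number of values
median_value = statistics.median(illiteracy)  # average of the 2 middle values
```

With an even number of observations, `statistics.median` averages the two middle values, which matches the (7.69 + 7.8) / 2 calculation on the slide.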

Standard Descriptive Statistics: Variability or Dispersion

Variance
- the average of the squared deviations from the mean; a measure of how varied the data are
- the larger the variance, the more varied the data

Standard Deviation
- the square root of the variance
- the most widely used measure of dispersion

Formula for variance (population):

Definition formula: $\sigma^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{N}$

Computation formula: $\sigma^2 = \frac{\sum_{i=1}^{n} X_i^2 - \left[ \left( \sum_{i=1}^{n} X_i \right)^2 / N \right]}{N}$

These may be obtained in ArcGIS by:
- opening a table, right clicking on column heading, and selecting Statistics
- going to ArcToolbox>Analysis>Statistics>Summary Statistics

Calculation of Variance and Standard Deviation

ADMIN_NAME     Illiteracy-Prcnt   (X - Xmean)   (X - Xmean) squared
Anhui           14.49               5.780          33.40500009
Beijing          3.11              -5.600          31.3632942
Fujian          10.38               1.670           2.787917734
Gansu           17.77               9.060          82.07827067
Guangdong        4.02              -4.690          21.99885891
Guangxi          5.61              -3.100           9.611823616
Guizhou         14.58               5.870          34.45344715
Hainan           8.65              -0.060           0.003635381
Hebei            4.83              -3.880          15.05668244
Heilongjiang     4.16              -4.550          20.70517656
Henan            7.36              -1.350           1.823294204
Hubei            7.69              -1.020           1.041000087
Hunan            5.87              -2.840           8.067270675
Nei Mongol       8.14              -0.570           0.325235381
Jiangsu          8.05              -0.660           0.435988322
Jiangxi          6.49              -2.220           4.929705969
Jilin            4.44              -4.270          18.23541185
Liaoning         3.48              -5.230          27.35597656
Ningxia         10.09               1.380           1.903588322
Qinghai         16.68               7.970          63.51621185
Shaanxi          8.19              -0.520           0.270705969
Shandong         7.96              -0.750           0.562941263
Shanghai         3.97              -4.740          22.47038832
Shanxi           4.42              -4.290          18.40662362
Sichuan         10.24               1.530           2.340000087
Taiwan           3.9               -4.810          23.1389295
Tianjin          3.52              -5.190          26.93915303
Xizang          37.77              29.060         844.466506
Xinjiang         4.64              -4.070          16.5672942
Yunnan          13.29               4.580          20.97370597
Zhejiang         9.36               0.650           0.422117734
Chongqing        7.8               -0.910           0.828635381
Hong Kong        6.5               -2.210           4.885400087
Macao            8.7               -0.010           0.000105969
Sum            296.15               0.000        1361.370297
Mean             8.710294118
Variance                                           40.04030285
StanDev                                             6.3277

Variance from definition formula: 1361.370 / 34 = 40.04
Variance from computation formula: [3940.924 - (296.15 * 296.15) / 34] / 34 = 40.04
Standard deviation = sqrt(40.04) = 6.33
Note: data for Taiwan is included
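Both variance formulas can be checked against each other in a few lines. The sketch below computes the population variance (dividing by N, as on the slide) in both ways for the 34 illiteracy percentages; the variable names are illustrative.

```python
import math

# Illiteracy percentages for the 34 provinces (from the table).
illiteracy = [
    14.49, 3.11, 10.38, 17.77, 4.02, 5.61, 14.58, 8.65, 4.83, 4.16, 7.36,
    7.69, 5.87, 8.14, 8.05, 6.49, 4.44, 3.48, 10.09, 16.68, 8.19, 7.96,
    3.97, 4.42, 10.24, 3.9, 3.52, 37.77, 4.64, 13.29, 9.36, 7.8, 6.5, 8.7,
]

n = len(illiteracy)
mean_value = sum(illiteracy) / n

# Definition formula: average of squared deviations from the mean.
variance_definition = sum((x - mean_value) ** 2 for x in illiteracy) / n

# Computation formula: [sum of X^2 - (sum of X)^2 / N] / N.
sum_x = sum(illiteracy)
sum_x_squared = sum(x ** 2 for x in illiteracy)
variance_computation = (sum_x_squared - sum_x ** 2 / n) / n

standard_deviation = math.sqrt(variance_definition)
```

The two formulas agree up to floating-point rounding; the computation formula avoids a separate pass to compute deviations, which mattered more for hand calculation than it does in code.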

Classic Descriptive Statistics: Univariate Frequency Distributions

[Figure: US population by age group (data for 2000); 50 million people are age 45-59]
Source: http://www.census.gov/compendia/statab/ US Bureau of the Census: Statistical Abstract of the US

Often represented by the area under a frequency curve. This area represents 100% of the data.

In ArcGIS, you may obtain frequency counts on a categorical variable via: ArcToolbox>Analysis>Statistics>Frequency

Caution: these values are incorrect!

Why? It is incorrect to calculate an unweighted mean for percentages:
- Each percentage has a different base population
- Should calculate a weighted mean, with $w_i$ = population of each province
- This is a very common error in GIS because we frequently use aggregated data

$\bar{X} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$

Correct values!
Unweighted mean = 8.7
Weighted mean = 7.75

The weighted mean is smaller. Why? The largest provinces have lower illiteracy, and the highest rates are in small provinces.

Large provinces, low illiteracy:

ADMIN_NAME       Illiteracy-Prcnt   Pop2008
Guangdong         4.02              95,440,000
Henan             7.36              94,290,000
Shandong          7.96              94,172,300

Small provinces, high illiteracy:

ADMIN_NAME       Illiteracy-Prcnt   Pop2008
Ningxia          10.09               6,176,900
Qinghai          16.68               5,543,000
Xizang (Tibet)   37.77               2,870,000

ADMIN_NAME     Illiteracy-Prcnt   Pop2008          x*w
Anhui           14.49             61,350,000       888,961,500
Beijing          3.11             22,000,000        68,420,000
Fujian          10.38             36,040,000       374,095,200
Gansu           17.77             26,281,200       467,016,924
Guangdong        4.02             95,440,000       383,668,800
Guangxi          5.61             48,160,000       270,177,600
Guizhou         14.58             37,927,300       552,980,034
Hainan           8.65              8,540,000        73,871,000
Hebei            4.83             69,888,200       337,560,006
Heilongjiang     4.16             38,253,900       159,136,224
Henan            7.36             94,290,000       693,974,400
Hubei            7.69             57,110,000       439,175,900
Hunan            5.87             63,800,000       374,506,000
Nei Mongol       8.14             24,137,300       196,477,622
Jiangsu          8.05             76,773,000       618,022,650
Jiangxi          6.49             44,000,000       285,560,000
Jilin            4.44             27,340,000       121,389,600
Liaoning         3.48             43,147,000       150,151,560
Ningxia         10.09              6,176,900        62,324,921
Qinghai         16.68              5,543,000        92,457,240
Shaanxi          8.19             37,620,000       308,107,800
Shandong         7.96             94,172,300       749,611,508
Shanghai         3.97             19,210,000        76,263,700
Shanxi           4.42             34,106,100       150,748,962
Sichuan         10.24             81,380,000       833,331,200
Taiwan           3.9              23,140,000        90,246,000
Tianjin          3.52             11,760,000        41,395,200
Xizang          37.77              2,870,000       108,399,900
Xinjiang         4.64             21,308,000        98,869,120
Yunnan          13.29             45,430,000       603,764,700
Zhejiang         9.36             51,200,000       479,232,000
Chongqing        7.8              31,442,300       245,249,940
Hong Kong        6.5               7,003,700        45,524,050
Macao            8.7                 542,400         4,718,880
Sum            296.15          1,347,382,600    10,445,390,141

Calculation of weighted mean

Unweighted mean: 296.15 / 34 = 8.71
Weighted mean: 10,445,390,141 / 1,347,382,600 = 7.75
Note: we should also calculate a weighted standard deviation
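The effect of weighting can be seen on a small subset. The sketch below applies the weighted-mean formula to only the six provinces singled out earlier (three large, three small), so its results differ from the 34-province figures above; the function name is an illustrative assumption.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of w_i."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# (illiteracy percent, 2008 population) for six provinces from the tables.
provinces = {
    "Guangdong": (4.02, 95_440_000),
    "Henan": (7.36, 94_290_000),
    "Shandong": (7.96, 94_172_300),
    "Ningxia": (10.09, 6_176_900),
    "Qinghai": (16.68, 5_543_000),
    "Xizang": (37.77, 2_870_000),
}

rates = [rate for rate, _ in provinces.values()]
populations = [pop for _, pop in provinces.values()]

unweighted = sum(rates) / len(rates)
weighted = weighted_mean(rates, populations)
# The weighted mean is lower because the large provinces have low rates.
```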

Centrographic Statistics
Descriptive statistics for spatial distributions:
- Mean Center
- Centroid
- Standard Distance Deviation
- Standard Deviational Ellipse
- Kernel Density Estimation

Centrographic Statistics

Measures of centrality:
- Mean center
- Centroid
- Weighted mean center
- Center of minimum distance

Measures of dispersion:
- Standard distance
- Standard deviational ellipse

Two-dimensional (spatial) equivalents of standard descriptive statistics for a single variable (univariate).
- Used for point data
- May be used for polygons by first obtaining the centroid of each polygon
- Best used to compare two distributions with each other, e.g. 1990 with 2000, or males with females

Mean Center

- Simply the mean of the X and the mean of the Y coordinates for a set of points
- Sum of differences between the mean X and all other Xs is zero (same for Y)
- Minimizes the sum of squared distances between itself and all points: $\min \sum_{i=1}^{n} d_{iC}^2$
- Distant points have a large effect: values for Xinjiang will have a larger effect
- Provides a single-point summary measure for the location of a set of points

Centroid

- The equivalent for polygons of the mean center for a point distribution
- The center of gravity or balancing point of a polygon
- If the polygon is composed of straight line segments between nodes, the centroid is given by the average X and average Y of the nodes (there is an example later). Strictly, the node average only approximates the area-based center of gravity.
- Calculation is sometimes approximated as the center of the bounding box: not good
- By calculating the centroids for a set of polygons, we can apply centrographic statistics to polygons

Centroids for Provinces of China

[Map: centroids for provinces of China]

Warning: a centroid may not be inside its polygon.
- For Gansu Province, China, the centroid is within the neighboring province of Qinghai
- The problem arises with crescent-shaped polygons

Weighted Mean Center

- Produced by weighting each X and Y coordinate by another variable ($w_i$)
- Centroids derived from polygons can be weighted by any characteristic of the polygon, for example the population of a province

$\bar{X} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}, \quad \bar{Y} = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}$

Calculating the centroid of a polygon or the mean center of a set of points
(same example data as for area of polygon)

ID            X     Y
1             2     3
2             4     7
3             7     7
4             7     3
5             6     2
sum           26    22
Centroid/MC   5.2   4.4

$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}, \quad \bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n}$

[Figure: the five points (2,3), (4,7), (7,7), (7,3), (6,2) plotted on axes from 0 to 10]
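The mean center of the five example points can be computed directly. This is a minimal sketch; the function name is an illustrative assumption.

```python
def mean_center(points):
    """Return (mean X, mean Y) for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return mean_x, mean_y


# The five example points from the slide.
points = [(2, 3), (4, 7), (7, 7), (7, 3), (6, 2)]
xc, yc = mean_center(points)
```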

Calculating the weighted mean center. Note how it is pulled toward the high-weight point.

i      X    Y    weight   wX       wY
1      2    3    3,000    6,000    9,000
2      4    7      500    2,000    3,500
3      7    7      400    2,800    2,800
4      7    3      100      700      300
5      6    2      300    1,800      600
sum    26   22   4,300    13,300   16,200
w MC                      3.09     3.77

$\bar{X} = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}, \quad \bar{Y} = \frac{\sum_{i=1}^{n} w_i Y_i}{\sum_{i=1}^{n} w_i}$

[Figure: the five points (2,3), (4,7), (7,7), (7,3), (6,2) plotted on axes from 0 to 10]
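The weighted mean center for the same five points can be sketched as follows, using the weights from the table. The function name is an illustrative assumption.

```python
def weighted_mean_center(points, weights):
    """Return the weighted mean center of (x, y) points."""
    total_weight = sum(weights)
    x = sum(w * px for (px, _), w in zip(points, weights)) / total_weight
    y = sum(w * py for (_, py), w in zip(points, weights)) / total_weight
    return x, y


points = [(2, 3), (4, 7), (7, 7), (7, 3), (6, 2)]
weights = [3000, 500, 400, 100, 300]

wx, wy = weighted_mean_center(points, weights)
# The result is pulled toward (2, 3), which carries most of the weight.
```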

Center of Minimum Distance or Median Center

- Also called the point of minimum aggregate travel
- That point (MD) which minimizes the sum of distances between itself and all other points (i): $\min \sum_{i} d_{iMD}$
- No direct solution. Can only be derived by approximation
- Not a determinate solution. Multiple points may meet this criterion (see next bullet)
- Same as median center:
  - Intersection of two orthogonal lines (at right angles to each other), such that each line has half of the points to its left and half to its right
  - Because the orientation of the axes for the lines is arbitrary, multiple points may meet this criterion

Source: Neft, 1966
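One common way to approximate the point that minimizes the sum of distances is Weiszfeld's iterative algorithm. The slide does not say which approximation method is intended, so this choice is an assumption, and the function names are illustrative.

```python
import math


def total_distance(center, points):
    """Sum of straight-line distances from center to every point."""
    cx, cy = center
    return sum(math.hypot(px - cx, py - cy) for px, py in points)


def center_of_minimum_distance(points, tolerance=1e-9, max_iterations=1000):
    """Approximate the center of minimum distance (Weiszfeld iteration)."""
    # Start from the mean center.
    x = sum(px for px, _ in points) / len(points)
    y = sum(py for _, py in points) / len(points)
    for _ in range(max_iterations):
        weight_sum = x_sum = y_sum = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < tolerance:
                # Simplification: skip a point that coincides with the
                # current estimate to avoid dividing by zero.
                continue
            weight_sum += 1.0 / d
            x_sum += px / d
            y_sum += py / d
        if weight_sum == 0.0:
            break
        new_x, new_y = x_sum / weight_sum, y_sum / weight_sum
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tolerance:
            break
    return x, y


points = [(2, 3), (4, 7), (7, 7), (7, 3), (6, 2)]
md = center_of_minimum_distance(points)
```

Each iteration re-weights the points by the inverse of their distance from the current estimate, so nearby points pull harder than distant ones. This is why the result is less sensitive to outliers than the mean center.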

Median and Mean Centers for US Population

- Median center: intersection of a north/south and an east/west line drawn so that half of the population lives above and half below the e/w line, and half lives to the left and half to the right of the n/s line
- Mean center: balancing point of a weightless map, if equal weights were placed on it at the residence of every person on census day

Source: US Statistical Abstract 2003

Standard Distance Deviation

- Represents the standard deviation of the distance of each point from the mean center
- Is the two-dimensional equivalent of standard deviation for a single variable

Formula for standard deviation of a single variable:

$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{N}}$

The standard distance deviation is given by:

$SDD = \sqrt{\frac{\sum_{i=1}^{n} (X_i - X_c)^2 + \sum_{i=1}^{n} (Y_i - Y_c)^2}{N}}$

which by Pythagoras reduces to:

$SDD = \sqrt{\frac{\sum_{i=1}^{n} d_{iC}^2}{N}}$

Or, with weights:

$SDD_w = \sqrt{\frac{\sum_{i=1}^{n} w_i (X_i - X_c)^2 + \sum_{i=1}^{n} w_i (Y_i - Y_c)^2}{\sum_{i=1}^{n} w_i}}$

- Essentially the average distance of points from the center
- Provides a single unit measure of the spread or dispersion of a distribution
- We can also calculate a weighted standard distance analogous to the weighted mean center

Standard Distance Deviation Example

i          X     Y     (X - Xc)^2   (Y - Yc)^2
1          2     3     10.2         2.0
2          4     7      1.4         6.8
3          7     7      3.2         6.8
4          7     3      3.2         2.0
5          6     2      0.6         5.8
sum        26    22    18.8         23.2
Centroid   5.2   4.4

Sum of sums: 18.8 + 23.2 = 42.0
Divide by N: 42.0 / 5 = 8.4
Square root: sqrt(8.4) = 2.90

$sdd = \sqrt{\frac{\sum_{i=1}^{n} (X_i - X_c)^2 + \sum_{i=1}^{n} (Y_i - Y_c)^2}{N}}$

[Figure: the five points (2,3), (4,7), (7,7), (7,3), (6,2) plotted on axes from 0 to 10, with a circle of radius = SDD = 2.9]
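The worked example can be reproduced in code. The sketch below follows the same steps (squared deviations in X and Y, sum, divide by N, square root); the function name is an illustrative assumption.

```python
import math


def standard_distance_deviation(points):
    """Standard distance deviation of (x, y) points about their mean center."""
    n = len(points)
    xc = sum(x for x, _ in points) / n
    yc = sum(y for _, y in points) / n
    sum_sq_x = sum((x - xc) ** 2 for x, _ in points)
    sum_sq_y = sum((y - yc) ** 2 for _, y in points)
    return math.sqrt((sum_sq_x + sum_sq_y) / n)


points = [(2, 3), (4, 7), (7, 7), (7, 3), (6, 2)]
sdd = standard_distance_deviation(points)
```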

Standard Deviational Ellipse: concept

Standard distance deviation is a good single measure of the dispersion of points around the mean center, but:
- it does not capture any directional bias
- it does not capture the shape of the distribution

The standard deviational ellipse gives dispersion in two dimensions. It is defined by 3 parameters:
- Angle of rotation
- Dispersion (spread) along major axis
- Dispersion (spread) along minor axis

The major axis defines the direction of maximum spread of the distribution. The minor axis is perpendicular to it and defines the minimum spread.

Standard Deviational Ellipse: calculation

Basic concept is to:
- Find the axis through maximum dispersion (thus deriving the angle of rotation)
- Calculate the standard deviation of the points along this axis (thus deriving the length (radius) of the major axis)
- Calculate the standard deviation of the points along the axis perpendicular to the major axis (thus deriving the length (radius) of the minor axis)
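The three steps above can be sketched with the 2 x 2 covariance matrix of the coordinates: its principal axis gives the angle of rotation, and the square roots of its eigenvalues give the standard deviations along the two axes. This is one possible formulation, assumed here for illustration; GIS packages may measure the angle and scale the axes differently. The function name is hypothetical.

```python
import math


def standard_deviational_ellipse(points):
    """Return (angle_degrees, major_sd, minor_sd) for a set of (x, y) points.

    The angle is measured counterclockwise from the X axis to the major axis.
    Variances use N in the denominator, as in the standard distance deviation.
    """
    n = len(points)
    xc = sum(x for x, _ in points) / n
    yc = sum(y for _, y in points) / n
    var_x = sum((x - xc) ** 2 for x, _ in points) / n
    var_y = sum((y - yc) ** 2 for _, y in points) / n
    cov_xy = sum((x - xc) * (y - yc) for x, y in points) / n

    # Step 1: axis of maximum dispersion (angle of rotation).
    theta = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)

    # Steps 2 and 3: standard deviations along the major and minor axes,
    # from the eigenvalues of the covariance matrix.
    half_sum = (var_x + var_y) / 2.0
    half_diff = math.hypot((var_x - var_y) / 2.0, cov_xy)
    major_sd = math.sqrt(half_sum + half_diff)
    minor_sd = math.sqrt(max(half_sum - half_diff, 0.0))
    return math.degrees(theta), major_sd, minor_sd


points = [(2, 3), (4, 7), (7, 7), (7, 3), (6, 2)]
angle, major_sd, minor_sd = standard_deviational_ellipse(points)
```

In this formulation the squared axis lengths add up to the squared standard distance deviation (8.4 for the example points), so the ellipse splits the same total dispersion into two directions.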

Mean Center & Standard Deviational Ellipse: example

There appears to be no major difference between the locations of the software and telecommunications industries in North Texas.

Implementation in ArcGIS

In ArcToolbox

To calculate centroids for a set of polygons:
- Using GeoDa: Tools>Shape>Polygons to Centroids
- Using ArcGIS: ArcToolbox>Data Management Tools>Features>Feature to Point (requires ArcInfo)

Kernel Density Estimation

- Commonly used to "visually enhance" a point pattern
- Is an example of exploratory spatial data analysis (ESDA)

[Figure: density surfaces for Kernel=10,000 and Kernel=5,000, each shaded from low to high]

SIMPLE kernel option (see example above)
- A "neighborhood" or kernel is defined around each grid cell, consisting of all grid cells with centers within the specified kernel (search) radius
- The number of points that fall within the neighborhood is totaled
- The point total is divided by the area of the neighborhood to give the grid cell's value
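The simple option can be sketched as a count-and-divide calculation. This sketch simplifies the description in one respect: it counts points lying within the search radius of each cell center and treats the neighborhood as a circle, rather than assembling the neighborhood from whole grid cells. The function name is an illustrative assumption.

```python
import math


def simple_point_density(points, cell_centers, search_radius):
    """Points per unit area within search_radius of each cell center."""
    # Simplification: the neighborhood is treated as a circle.
    neighborhood_area = math.pi * search_radius ** 2
    densities = []
    for cx, cy in cell_centers:
        count = sum(
            1
            for px, py in points
            if math.hypot(px - cx, py - cy) <= search_radius
        )
        densities.append(count / neighborhood_area)
    return densities


points = [(0, 0), (1, 0), (5, 5)]
cell_centers = [(0, 0), (10, 10)]
densities = simple_point_density(points, cell_centers, search_radius=2.0)
```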

Density KERNEL option

- A smoothly curved surface is fitted over each point
- The surface value is highest at the location of the point and decreases with increasing distance from the point, reaching zero at the kernel distance from the point
- The volume under the surface equals 1 (or the population value if a population variable is used)
- Uses a quadratic kernel function
- The density at each output grid cell is calculated by adding the values of all the kernel surfaces where they overlay the grid cell center
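The kernel option can be sketched as follows. The slide does not give the exact kernel formula, so the sketch assumes a simple quadratic form, K(d) = (2 / (pi r^2)) (1 - (d/r)^2) for d < r, which peaks at the point, reaches zero at the kernel distance r, and has a volume of 1. The software's actual kernel may differ, and the function names are illustrative.

```python
import math


def kernel_value(distance, radius):
    """Assumed quadratic kernel: highest at the point, zero at the radius."""
    if distance >= radius:
        return 0.0
    return (2.0 / (math.pi * radius ** 2)) * (1.0 - (distance / radius) ** 2)


def kernel_density(points, cell_centers, radius, populations=None):
    """Sum the kernel surfaces of all points at each grid cell center."""
    if populations is None:
        # Without a population field, each point counts once.
        populations = [1.0] * len(points)
    densities = []
    for cx, cy in cell_centers:
        total = 0.0
        for (px, py), population in zip(points, populations):
            distance = math.hypot(px - cx, py - cy)
            # Scaling by population makes the volume equal the population.
            total += population * kernel_value(distance, radius)
        densities.append(total)
    return densities


densities = kernel_density([(0, 0)], [(0, 0), (1, 0)], radius=1.0)
```

Unlike the simple option, a point contributes more to nearby cells than to distant ones, which produces a smoother surface.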

Implementation in ArcGIS

- Population field: if specified, the software calculates as if there are that number of points at that location
- Search radius: the size of the neighborhood or kernel, which is successively defined around every cell (simple kernel) or each point (density kernel)
- Output cell size: the size of each raster cell
- Search radius and output cell size are based on the measurement units of the data (here it is feet). It is good to round them (e.g. to 10,000 and 1,000)

Thank You