
Statistics

Lecture and exercises (WS 2012/2013)

Dr. Olaf Lenz, Institut für Angewandte Geowissenschaften, Angewandte Sedimentgeologie, Technische Universität Darmstadt

22. Oktober 2012 | Fachbereich 11 | Angewandte Geowissenschaften | Dr. Olaf Lenz | 1

Structure
Basics (3 lectures with exercises)
17.10.2012 Introduction to Statistics
24.10.2012 Data Presentation
31.10.2012 Requirements of Data for Statistical Analysis

Elementary Statistics (6 lectures with exercises)
07.11.2012 t-tests and F-tests
14.11.2012 Analysis of Variance
21.11.2012 Correlation and Regression
28.11.2012 Chi-square Tests
05.12.2012 Non-parametric Tests
12.12.2012 Multivariate ANOVA/Repeated Measures

Analysis of Multivariate Data (3 lectures with exercises)
16.01.2013 Cluster Analysis
23.01.2013 Principal Component Analysis
30.01.2013 (Detrended) Correspondence Analysis

Time Series Analysis (1 lecture with exercises)
06.02.2013, 13.02.2013
Analysis of stationary data: Spectral Analysis
Analysis of non-stationary data: Wavelet Analysis

Final exam


Summary Statistics
Measures of location (location of the center of the distribution and of its other parts):
Mean
Median
Mode
Quartiles

Measures of spread (variability of the data values):
Variance
Standard deviation
Interquartile range

Measures of shape (symmetry, length of the tail):
Coefficient of skewness
Coefficient of variation
Kurtosis


Exercise 1

Exercise 1: Find the median and the mean of these three data sets. Use the PAST software.

a) Sorted data: 5 6 6 7 8 8 10 11 11
mean: 72/9 = 8
median: 8

b) Sorted data: 2 3 6 7 8 9 12 15
mean: 62/8 = 7.75
median: (7+8)/2 = 7.5

c) Sorted data: 2 3 6 7 8 9 12 100
mean: 147/8 = 18.375
median: (7+8)/2 = 7.5
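The effect of the single extreme value in data set (c) can be checked with a few lines of Python. This is only a sketch using the standard `statistics` module, not the PAST workflow the exercise asks for. The variable names are illustrative, and the sorted values of (b) and (c) are taken from the sums and the median formula given in the solutions.

```python
from statistics import mean, median

# Data sets of Exercise 1 (sorted values; 7 and 8 in (b) and (c) follow
# from the median formula (7+8)/2 and the sums 62 and 147 in the solution)
data_a = [5, 6, 6, 7, 8, 8, 10, 11, 11]
data_b = [2, 3, 6, 7, 8, 9, 12, 15]
data_c = [2, 3, 6, 7, 8, 9, 12, 100]  # same as (b), but 15 replaced by 100

summary = {}
for name, values in [("a", data_a), ("b", data_b), ("c", data_c)]:
    summary[name] = (mean(values), median(values))
    print(f"{name}) mean = {summary[name][0]}, median = {summary[name][1]}")
```

Replacing 15 by 100 more than doubles the mean of the data set, while the median stays at 7.5.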


Exercise 2
Exercise 2: Given are the incomes of five people in a small village. Calculate the mean and the median. Which measure is better? Why?

Person      Income ($)
Sam         4 785 320
Harvey         32 190
Fred           31 870
Jill           26 500
Adrienne       24 200
Mean          980 016
Median         31 870

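The median is the better measure here, because the mean is very sensitive to erratic high values (outliers): a single very high income pulls the mean far above what the other four people earn.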
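A short Python sketch of this comparison, using the incomes from the table above. Recomputing both measures without the highest income is an illustrative extra step, not part of the original exercise, and the variable names are made up for this example.

```python
from statistics import mean, median

# Incomes ($) from Exercise 2
incomes = {
    "Sam": 4_785_320,
    "Harvey": 32_190,
    "Fred": 31_870,
    "Jill": 26_500,
    "Adrienne": 24_200,
}

all_values = list(incomes.values())
mean_all, median_all = mean(all_values), median(all_values)

# Illustrative extra step: leave out the single very high income
rest = [v for name, v in incomes.items() if name != "Sam"]
mean_rest, median_rest = mean(rest), median(rest)

print(f"all five people:  mean = {mean_all}, median = {median_all}")
print(f"without Sam:      mean = {mean_rest}, median = {median_rest}")
```

Dropping one value changes the mean from 980 016 to 28 690, whereas the median only moves from 31 870 to 29 185.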

Exercise 3
Exercise 3: Given are the results of 31 vocabulary tests.
20 31 33 23 30 30 34 32 28 33 30 23 30 30 36 33 20 37 32 23 36 25 23 35 23 22 24 23 26 27 31

mean: 883/31 = 28.48
median: 30
mode: 23
Order along the score axis: mode < mean < median
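The three measures for the 31 test results can be reproduced as follows. This is a minimal sketch with the Python standard library; the exercise itself is meant to be solved in PAST, and the variable names are illustrative.

```python
from collections import Counter
from statistics import mean, median

# The 31 vocabulary test results of Exercise 3
scores = [20, 31, 33, 23, 30, 30, 34, 32, 28, 33, 30, 23, 30, 30, 36, 33,
          20, 37, 32, 23, 36, 25, 23, 35, 23, 22, 24, 23, 26, 27, 31]

score_mean = mean(scores)
score_median = median(scores)
# most_common(1) returns the most frequent value together with its count
mode_value, mode_count = Counter(scores).most_common(1)[0]

print(f"n = {len(scores)}, sum = {sum(scores)}")
print(f"mean = {score_mean:.2f}, median = {score_median}, "
      f"mode = {mode_value} (occurs {mode_count} times)")
```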


Exercise 4
Exercise 4: Suppose that the following scores were obtained when an English language test was given to ten non-native speakers who had previously taken a language course to brush up their knowledge, and to ten otherwise similar people who had not taken such a course:

without language course: 15 22 62 17 31 58 45 9 76 43
mean: 37.8
standard deviation: 22.63

with language course: 36 34 47 41 28 54 63 38 54 32
mean: 42.7
standard deviation: 11.36
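The values above correspond to the sample standard deviation (divisor n - 1). A minimal Python sketch that reproduces them, with illustrative variable names:

```python
from statistics import mean, stdev

# Scores of Exercise 4
without_course = [15, 22, 62, 17, 31, 58, 45, 9, 76, 43]
with_course = [36, 34, 47, 41, 28, 54, 63, 38, 54, 32]

results = {}
for label, scores in [("without", without_course), ("with", with_course)]:
    # stdev() is the sample standard deviation (divides by n - 1)
    results[label] = (mean(scores), stdev(scores))
    print(f"{label} course: mean = {results[label][0]:.1f}, "
          f"standard deviation = {results[label][1]:.2f}")
```

The group with the language course has a somewhat higher mean and only about half the spread of the group without it.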


Data Presentation


Introduction to data presentation

Graphs help to verify that it is valid to use a particular test.
Graphs may reveal unexpected patterns in the data.
Graphs quickly reveal any mistakes in our data.

Exploratory data analysis: does adding the antipollution treatment lead to an increase in invertebrates?

Antipollution treatment   Insects
0 mg                      100
1 mg                      120
2 mg                      140
3 mg                      2500
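Even a crude plot makes the value 2500 stand out as something to check before any test is run. The sketch below draws a text-only bar chart of the four counts; the function `text_bars` and the bar width of 50 characters are illustrative choices, not part of the lecture.

```python
doses = ["0 mg", "1 mg", "2 mg", "3 mg"]
insects = [100, 120, 140, 2500]

def text_bars(values, width=50):
    """Return one bar length (in characters) per value, scaled to the largest value."""
    largest = max(values)
    return [round(v / largest * width) for v in values]

bar_lengths = text_bars(insects)
for dose, count, length in zip(doses, insects, bar_lengths):
    print(f"{dose:>4} | {'#' * length} {count}")
```

The first three bars are only two or three characters long, while the last one fills the full width.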



Column graphs (bar graphs)


Townend (2002)

Column graphs (bar graphs)


Mean nitrogen contents of plants in different experimental treatments: (a) grouped by weedkiller treatment, (b) grouped by soil type, (c) grouped by presence or absence of clover during the period between crops.


Townend (2002)

Histogram
In a column graph, values are represented by the height of the columns; in a histogram, values are represented by the area of the columns.
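The difference matters as soon as the classes have unequal widths: the column height is then the count divided by the class width, so that the area (height times width) gives back the count. The sketch below illustrates this with hypothetical measurements and hypothetical class boundaries; neither comes from the lecture.

```python
# Hypothetical measurements and hypothetical, unequal class boundaries
values = [1, 2, 2, 3, 4, 6, 7, 9, 12, 18]
edges = [0, 2, 4, 8, 20]  # classes: [0,2), [2,4), [4,8), [8,20)

def histogram(values, edges):
    """Return (count, width, height) per class, with height = count / width."""
    result = []
    for lower, upper in zip(edges[:-1], edges[1:]):
        count = sum(lower <= v < upper for v in values)
        width = upper - lower
        result.append((count, width, count / width))
    return result

classes = histogram(values, edges)
for (count, width, height), lower in zip(classes, edges):
    print(f"class starting at {lower:>2}: count = {count}, "
          f"width = {width:>2}, height = {height}")

# The total area of all columns equals the number of observations
total_area = sum(width * height for _, width, height in classes)
```

The last three classes each contain three values, but their columns have different heights because the classes are 2, 4 and 12 units wide.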


Davis (2002)

PAST column graph



PAST histogram

Line graph

Changes in mean stomatal conductance of two tree species over the course of a day

Townend (2002)

Line graph

Mean concentrations of insecticide at different distances from fish cages at a fish farm

Townend (2002)

PAST Line graph



Scatter graph

Relationship between the mean mass per seed and the number of seeds produced, for a range of plant specimens collected on Mossely Heath. The circled point appears to be an unusual observation, which could warrant further investigation.

Townend (2002)

Scatter graph: Independent/dependent variable


Example: How fast individual lions can run might be controlled by how long their legs are, but the length of their legs is not controlled by how fast they run.

Y-axis: response (dependent) variable, here running speed
X-axis: controlling (independent) variable, here leg length


Scatter graph


Townend (2002)

PAST Scatter graph



Ternary plot


source: wikipedia

PAST Ternary plot


Example compositions (in %):
1) A: 10; B: 80; C: 10
2) A: 30; B: 60; C: 10
3) A: 40; B: 40; C: 20

Reading point 1 in the plot: 10% A, 80% B, 10% C.
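To see how three percentages that sum to 100 end up as one point in a triangle, the compositions can be converted to ordinary x/y coordinates. The sketch below assumes a particular corner layout (A bottom left, B bottom right, C at the top), which need not match the orientation PAST uses; the function name `ternary_to_xy` is illustrative.

```python
import math

def ternary_to_xy(a, b, c):
    """Map a composition (a, b, c) to 2-D coordinates.

    Assumed corner layout: A at (0, 0), B at (1, 0), C at (0.5, sqrt(3)/2).
    The three components are normalised by their sum first.
    """
    total = a + b + c
    b_frac, c_frac = b / total, c / total
    return (b_frac + 0.5 * c_frac, c_frac * math.sqrt(3) / 2)

# The three example compositions (A, B, C in %)
samples = {1: (10, 80, 10), 2: (30, 60, 10), 3: (40, 40, 20)}
points = {label: ternary_to_xy(*abc) for label, abc in samples.items()}

for label, (x, y) in points.items():
    print(f"point {label}: x = {x:.3f}, y = {y:.3f}")
```

Points 1 and 2 share the same C content (10%) and therefore lie at the same height above the A-B edge; point 3, with 20% C, lies twice as high.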

Tables

There are sometimes situations where a table might be better than a graph:

When it is important that the reader can obtain the values in your results accurately, or when readers are likely to want to use or compare your actual figures elsewhere, rather than just compare them with other populations in your experiment or survey.

When you need to present a lot of data and a graph of them appears messy.

When the data can be presented far more compactly in a table.


Standard error and error bars in graphs


[Figure: graph with error bars; axis label: Tree height (m)]

Townend (2002)
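Error bars are often drawn as the mean plus and minus one standard error, SE = s / sqrt(n), where s is the sample standard deviation and n the number of observations. The sketch below computes this for the two groups of test scores from Exercise 4; whether the figures in Townend (2002) or PAST use exactly this convention is not stated on the slides, so treat it as an assumption.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))

# Scores of Exercise 4
without_course = [15, 22, 62, 17, 31, 58, 45, 9, 76, 43]
with_course = [36, 34, 47, 41, 28, 54, 63, 38, 54, 32]

se_without = standard_error(without_course)
se_with = standard_error(with_course)

for label, scores, se in [("without", without_course, se_without),
                          ("with", with_course, se_with)]:
    m = mean(scores)
    print(f"{label} course: mean = {m:.1f}, SE = {se:.2f}, "
          f"error bar from {m - se:.2f} to {m + se:.2f}")
```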

Standard error and error bars in tables


Townend (2002)

PAST error bars


PAST error bars


Box-and-whisker plots

[Figure panels: histogram; box-and-whisker plot]


Davis (2002)

PAST Box-and-whisker plot

Outlier
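A box-and-whisker plot is built from the quartiles, and a common convention marks values more than 1.5 interquartile ranges beyond the box as outliers. The sketch below applies that convention to data set (c) of Exercise 1. Both the 1.5 x IQR rule and the quartile method (the default of Python's `statistics.quantiles`) are assumptions here; PAST may compute quartiles and flag outliers differently.

```python
from statistics import quantiles

# Data set (c) of Exercise 1
data = [2, 3, 6, 7, 8, 9, 12, 100]

# Quartiles with the default "exclusive" method of statistics.quantiles
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the box are outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in data if v < lower_fence or v > upper_fence]

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
print(f"fences: {lower_fence} to {upper_fence}, outliers: {outliers}")
```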


Next week
Basics (3 lectures with exercises)
17.10.2012 Introduction to Statistics
24.10.2012 Data Presentation
31.10.2012 Requirements of Data for Statistical Analysis

Elementary Statistics (6 lectures with exercises)
07.11.2012 t-tests and F-tests
14.11.2012 Analysis of Variance
21.11.2012 Correlation and Regression
28.11.2012 Chi-square Tests
05.12.2012 Non-parametric Tests
12.12.2012 Multivariate ANOVA/Repeated Measures

Analysis of Multivariate Data (3 lectures with exercises)
16.01.2013 Cluster Analysis
23.01.2013 Principal Component Analysis
30.01.2013 (Detrended) Correspondence Analysis

Time Series Analysis (1 lecture with exercises)
06.02.2013, 13.02.2013
Analysis of stationary data: Spectral Analysis
Analysis of non-stationary data: Wavelet Analysis

Final exam
