Lecture 8
Introduction to Inferential Statistics
Topics
The z Test: What It Is and What It
Does
Confidence Intervals Based on the
z Distribution
The t Test: What It Is and What It
Does
Confidence Intervals Based on the
t Distribution
Topics
The Chi-Square (χ²) Goodness-of-Fit Test: What It Is and What It
Does
Correlation Coefficients and
Statistical Significance
Warning
Pay attention! The material covered
in this lecture is not easy and you
will need to go over it several times
to get the full meaning.
The z Test:
What It Is and What It Does
The z test compares the mean of a sample
to the mean of the population
z test: a parametric inferential statistical
test of the null hypothesis for a single
sample where the population variance is
known
Sampling distribution: a distribution of
sample means based on random
samples of a fixed size from a population
Standard error of the mean: the standard
deviation of the sampling distribution
The z Test
The key idea here is that we are
comparing the distribution of
individual scores which make up
the population with the statistics of
a sample of N
Because we have a sample of N we
expect the variation to be smoothed
out compared to the population.
This is described by the Central Limit Theorem.
The z Test
Central limit theorem
States that for any population with a
mean $\mu$ and a standard deviation $\sigma$,
the distribution of sample means for
sample size N:
Will have a mean of $\mu$
Will have a standard deviation of
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}$
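The two claims of the theorem can be checked with a small simulation. This is only a sketch: the uniform population, the sample size of 25, the 5,000 repetitions and the random seed are all arbitrary choices made for illustration, not values from the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 scores spread uniformly between 0 and 100.
population = [random.uniform(0, 100) for _ in range(100_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

N = 25  # sample size (arbitrary, for illustration)

# Draw many random samples of size N and keep each sample mean.
sample_means = [
    statistics.fmean(random.sample(population, N)) for _ in range(5_000)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
predicted_se = sigma / N ** 0.5  # sigma / sqrt(N)

print(f"population mean = {mu:.2f}, mean of sample means = {mean_of_means:.2f}")
print(f"sigma / sqrt(N) = {predicted_se:.2f}, SD of sample means = {sd_of_means:.2f}")
```

The mean of the sample means should sit close to the population mean, and their standard deviation close to sigma divided by the square root of N.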
The z Test
Formula for z:
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
Example data - 1
The population statistic for IQ is a
mean of 100 with a SD of 15
Suppose that we test a class of 75
students and find that their mean IQ
is 103.5. Are they a special class
with an above-average IQ?
This is a one tailed z test
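The IQ example can be worked through as a short sketch. The function name `z_statistic` is made up for illustration, and the significance level of 0.05 is an assumption (it matches the p<0.05 convention used elsewhere in the lecture).

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """z = (sample mean - population mean) / standard error of the mean."""
    standard_error = pop_sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error


# Values from the IQ example: population mean 100, SD 15, class of 75, mean 103.5.
z_obt = z_statistic(sample_mean=103.5, pop_mean=100, pop_sd=15, n=75)

# One-tailed critical value, assuming alpha = 0.05.
z_crit = NormalDist().inv_cdf(0.95)

print(f"z_obt = {z_obt:.2f}, z_crit = {z_crit:.3f}")
print("reject H0" if z_obt > z_crit else "fail to reject H0")
```

Under these assumptions the obtained z lies beyond the one-tailed critical value, so the class mean would be judged significantly above the population mean.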
Region of rejection
The area of a sampling distribution that lies
beyond the test statistic's critical value
When a score falls within this region, H0 is
rejected
Example data - 2
Students within a certain age range
have a mean weight of 90 pounds
(USA) with a SD of 17 pounds
We select a group of 50 who are
taking part in an exercise
programme and their mean weight
is 86 pounds. Is the programme
having any effect on their weight?
This is a two tailed z test
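A minimal sketch of the weight example as a two-tailed test. The significance level of 0.05 is an assumption, and the variable names are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Values from the weight example.
pop_mean, pop_sd = 90, 17
n, sample_mean = 50, 86

standard_error = pop_sd / sqrt(n)
z_obt = (sample_mean - pop_mean) / standard_error

# Two-tailed test: alpha = 0.05 (assumed) is split between the two tails.
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)

print(f"standard error = {standard_error:.2f}")
print(f"z_obt = {z_obt:.2f}, critical values = +/-{z_crit:.2f}")
print("reject H0" if abs(z_obt) > z_crit else "fail to reject H0")
```

With the standard error rounded to 2.40 before dividing, the same calculation gives z = -1.67. Either way the obtained z does not reach the two-tailed critical value, so H0 is not rejected at this level.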
Statistical Power
One-tailed test: statistically a more
powerful test than a two-tailed test
Statistical power: the probability of
correctly rejecting a false H0
With a one-tailed test, we are more
likely to reject H0 when the effect is in
the predicted direction
$z_{obt}$ does not have to be as large to be
considered significantly different from
the population mean
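The difference in critical values can be seen directly. This sketch assumes a significance level of 0.05 and uses Python's standard-library normal distribution.

```python
from statistics import NormalDist

alpha = 0.05  # assumed significance level

# One-tailed: all of alpha sits in one tail.
one_tailed_crit = NormalDist().inv_cdf(1 - alpha)

# Two-tailed: alpha is split, half in each tail.
two_tailed_crit = NormalDist().inv_cdf(1 - alpha / 2)

print(f"one-tailed critical z = {one_tailed_crit:.3f}")
print(f"two-tailed critical z = {two_tailed_crit:.3f}")
```

The one-tailed critical value is the smaller of the two, which is why a smaller obtained z is enough for significance in a one-tailed test.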
The z Test
As the sample size increases:
The standard error of the mean
decreases
This increases the statistical power
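How the standard error shrinks with sample size can be sketched as follows. The population SD of 15 is taken from the IQ example; the sample sizes are arbitrary choices for illustration.

```python
from math import sqrt


def standard_error(pop_sd, n):
    """Standard error of the mean: sigma / sqrt(N)."""
    return pop_sd / sqrt(n)


pop_sd = 15  # population SD from the IQ example

for n in (10, 25, 75, 300):  # arbitrary sample sizes
    print(f"N = {n:3d}: standard error = {standard_error(pop_sd, n):.2f}")
```

Because the standard error is the denominator of z, the same difference between sample mean and population mean gives a larger z as N grows.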
Notation
One tail test: Ha: $\mu_0 < \mu_1$ or $\mu_0 > \mu_1$
Two tail test: Ha: $\mu_0 \neq \mu_1$
Null hypothesis: H0: $\mu_0 = \mu_1$
z (N=50) = -1.67, p<0.05 (one
tailed)
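The p value attached to a reported z can be recovered from the normal distribution. This sketch takes the reported z of -1.67 as given and compares the one-tailed and two-tailed probabilities.

```python
from statistics import NormalDist

z_obt = -1.67  # the reported value from the weight example

# One-tailed p: area in the lower tail beyond z_obt.
p_one_tailed = NormalDist().cdf(z_obt)

# Two-tailed p: the same area counted in both tails.
p_two_tailed = 2 * NormalDist().cdf(-abs(z_obt))

print(f"one-tailed p = {p_one_tailed:.3f}")
print(f"two-tailed p = {p_two_tailed:.3f}")
```

The one-tailed p falls below 0.05 while the two-tailed p does not, so the choice of test has to be stated alongside the result.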
Confidence Intervals
We can estimate the population mean from a
sample within a certain degree of
confidence
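Confidence interval: an interval of a certain
width that we feel confident will contain $\mu$
Statisticians recommend a 95% or a 99%
confidence interval
Formula for the confidence interval:
$CI = \bar{X} \pm z_{cv}\,\sigma_{\bar{X}}$
with $z_{cv}$ the two-tailed critical value of z (1.96 for 95%, 2.58 for 99%)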
Example
Using our previous example with the weights
of students, we found a mean of 86 pounds
from a sample of 50, with a population SD of
17 pounds
We found:
$\sigma_{\bar{X}} = 2.40$
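A sketch of the interval for this example, assuming a 95% confidence level and the usual form "sample mean plus or minus critical z times standard error". The function name `z_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, pop_sd, n, confidence=0.95):
    """Return (lower, upper) limits of a z-based confidence interval."""
    standard_error = pop_sd / sqrt(n)
    # Two-tailed critical value for the requested confidence level.
    z_cv = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_cv * standard_error
    return sample_mean - margin, sample_mean + margin


# Weight example: sample mean 86, population SD 17, N = 50.
lower, upper = z_confidence_interval(sample_mean=86, pop_sd=17, n=50)
print(f"95% CI: {lower:.2f} to {upper:.2f} pounds")
```

Passing `confidence=0.99` widens the interval, since more confidence requires a larger critical value.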
The t Test:
What It Is and What It Does
t test
A parametric inferential statistical test of the
null hypothesis for a single sample where the
population variance is not known
The t Test
Since we do not know the population
statistics we need to estimate them
from our data. The steps are:
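Estimated standard deviation of the population:
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$
Estimated standard error of the mean:
$s_{\bar{X}} = \frac{s}{\sqrt{N}}$
The t statistic:
$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}$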
Example
$s = 131.80$
$s_{\bar{X}} = 41.71$
$t = \frac{1176 - 1090}{41.71} = +2.06$
From tables for 95% confidence with df = 9
we need a t of 1.833 (one tail)
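The steps of the one-sample t test can be sketched as below. The raw scores behind this example are not given, so the helper `one_sample_t` (a hypothetical name) is shown for completeness, while the example itself is reproduced from the summary values on the slide.

```python
from math import sqrt
from statistics import fmean, stdev


def one_sample_t(scores, pop_mean):
    """t for a single sample, estimating the population SD from the data."""
    n = len(scores)
    s = stdev(scores)               # estimated SD, N - 1 in the denominator
    estimated_se = s / sqrt(n)      # estimated standard error of the mean
    return (fmean(scores) - pop_mean) / estimated_se


# Summary values from the example (sample mean 1176, mu 1090, s_X = 41.71).
t_obt = (1176 - 1090) / 41.71
t_crit = 1.833  # one-tailed critical value for df = 9, taken from the slide

print(f"t_obt = {t_obt:.2f}, t_crit = {t_crit}")
print("reject H0" if t_obt > t_crit else "fail to reject H0")
```

Since the obtained t exceeds the tabled critical value, H0 would be rejected in this one-tailed test.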
The t Test
Assumptions of the t test
Data are interval or ratio
Population distribution of scores is
symmetrical
Confidence Intervals
Based on the t Distribution
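For a one-sample t test, the
confidence interval is determined
by:
$CI = \bar{X} \pm t_{cv}\,s_{\bar{X}}$
with $t_{cv}$ the two-tailed critical value of t for df = N - 1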
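As an illustration, the interval can be sketched for the earlier one-sample t example (sample mean 1176, estimated standard error 41.71, df = 9), using the usual form "sample mean plus or minus critical t times estimated standard error". The critical value 2.262 is an assumption not stated in the lecture: it is the standard two-tailed table value for 95% confidence with 9 degrees of freedom.

```python
# Summary values from the one-sample t example.
sample_mean = 1176
estimated_se = 41.71

# Assumed: two-tailed critical t for 95% confidence, df = 9 (standard t table).
t_cv = 2.262

margin = t_cv * estimated_se
lower = sample_mean - margin
upper = sample_mean + margin

print(f"95% CI: {lower:.1f} to {upper:.1f}")
```

The interval is wider than a z-based interval with the same standard error would be, because the t critical value is larger for small samples.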
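The Chi-Square (χ²) Goodness-of-Fit Test:
What It Is and What It Does
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where
O is the observed frequency
E is the expected frequency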
Example
In the USA 17% of teenagers at
High School get pregnant
In a certain school 7 girls were
pregnant out of 80
Does this school have a
significantly lower incidence of
teenage pregnancy?
Example
Frequency   Pregnant   Not pregnant
Observed    7          73
Expected    14         66
In this example:
$\chi^2 = 4.24$
From tables the critical value at 95% is
3.84 and so the null hypothesis is rejected
With 2 categories (pregnant and not
pregnant) we have df=1
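The calculation for this example can be sketched as follows, summing (O - E)² / E over the two categories. The expected counts follow the slide, which rounds 0.17 × 80 = 13.6 up to 14.

```python
# Observed and expected frequencies: [pregnant, not pregnant].
observed = [7, 73]
expected = [14, 66]

# Chi-square: sum over categories of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1   # number of categories minus one
chi_crit = 3.84          # critical value at 95% for df = 1, from the slide

print(f"chi-square = {chi_square:.2f}, df = {df}, critical value = {chi_crit}")
print("reject H0" if chi_square > chi_crit else "fail to reject H0")
```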
Summary
Parametric tests: the z test and the t test
The distributions should be bell-shaped
Certain parameters should be known
Data should be interval or ratio
Memory test
Population average is 7
http://faculty.washington.edu/chudler/stm0.html