Research Methods: Introduction To Inferential Statistics

Research Methods
Lecture 8
Introduction to Inferential Statistics
Topics
The z Test: What It Is and What It Does
Confidence Intervals Based on the z Distribution
The t Test: What It Is and What It Does
Confidence Intervals Based on the t Distribution
Topics
The Chi-Square (χ²) Goodness-of-Fit Test: What It Is and What It Does
Correlation Coefficients and Statistical Significance
Warning
Pay attention! The material covered in this lecture is not easy,
and you will need to go over it several times to get the full meaning
The z Test:
What It Is and What It Does
The z test compares the mean of a sample
to the mean of the population
z test: a parametric inferential statistical
test of the null hypothesis for a single
sample where the population variance is
known
Sampling distribution: a distribution of
sample means based on random
samples of a fixed size from a population
Standard error of the mean: the standard
deviation of the sampling distribution
The z Test
The key idea here is that we are
comparing the distribution of
individual scores which make up
the population with the statistics of
a sample of N
Because we have a sample of N, we expect the variation in the
sample mean to be smoothed out compared to the variation of
individual scores in the population
This is described by the Central Limit Theorem
The z Test
Central limit theorem
States that for any population with a mean μ and a standard deviation σ,
the distribution of sample means for sample size N:
Will have a mean of μ
Will have a standard deviation of $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{N}}$
Will approach a normal distribution as N approaches infinity
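The theorem can be checked with a small simulation. The sketch below is only an illustration: it assumes a uniform (deliberately non-normal) population, borrows the IQ values of mean 100 and SD 15 with N = 75 from the example that follows, and the number of simulated samples is an arbitrary choice.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 100, 15   # population mean and SD (borrowed from the IQ example)
N = 75                # size of each sample
NUM_SAMPLES = 5000    # number of simulated samples (arbitrary choice)

# Assumption: a uniform population, which is clearly not normal.
# uniform(a, b) has SD (b - a) / sqrt(12), so a half-width of
# SIGMA * sqrt(3) gives a population SD of exactly SIGMA.
half_width = SIGMA * 3 ** 0.5


def draw_sample_mean():
    """Draw one random sample of size N and return its mean."""
    sample = [random.uniform(MU - half_width, MU + half_width) for _ in range(N)]
    return statistics.fmean(sample)


sample_means = [draw_sample_mean() for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_se = SIGMA / N ** 0.5

print("mean of sample means:", round(mean_of_means, 2))
print("SD of sample means:  ", round(sd_of_means, 2))
print("sigma / sqrt(N):     ", round(predicted_se, 2))
```

The mean of the sample means should sit close to μ, and their standard deviation close to σ/√N, even though the individual scores are not normally distributed.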
The z Test
Formula for z:
$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
Example data - 1
The population statistic for IQ is a mean of 100 with an SD of 15
Suppose that we test a class of 75 students and find that their
mean IQ is 103.5. Are they a special class with an above-average IQ?
This is a one-tailed z test
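Calculations for the One-Tailed z Test
We can calculate the standard error of the mean:
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{N}} = \dfrac{15}{\sqrt{75}} = 1.73$
We now use 1.73 in the z-test formula:
$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{103.5 - 100}{1.73} = 2.02$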
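The same arithmetic can be sketched in Python. The variable names are illustrative; the numbers are those of the IQ example.

```python
import math

mu = 100             # population mean IQ
sigma = 15           # population standard deviation
n = 75               # class size
sample_mean = 103.5  # mean IQ of the class

# Standard error of the mean: sigma / sqrt(N)
standard_error = sigma / math.sqrt(n)

# z: how many standard errors the sample mean lies from the population mean
z = (sample_mean - mu) / standard_error

print(f"standard error = {standard_error:.2f}")  # 1.73
print(f"z = {z:.2f}")                            # 2.02
```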
Interpreting the One-Tailed z Test

How do we now interpret z = 2.02?
Critical value
The value of a test statistic that marks the
edge of the region of rejection in a sampling
distribution
Values equal to it or beyond it fall in the
region of rejection
Region of rejection
The area of a sampling distribution that lies
beyond the test statistic's critical value
When a score falls within this region, H0 is
rejected
Interpreting the One-Tailed z Test
From standard tables, the critical value of z for a one-tailed test
at the 5% significance level is 1.645
Our z of 2.02 exceeds this, so we reject the null hypothesis
with 95% confidence: the class is clever!
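Instead of printed tables, the critical value and the area beyond the obtained z can be looked up with Python's standard library. This is a minimal sketch, assuming a significance level of 0.05 and the z of 2.02 from the IQ example.

```python
from statistics import NormalDist

alpha = 0.05        # significance level (5%)
z_obtained = 2.02   # z from the IQ example

standard_normal = NormalDist()  # mean 0, SD 1

# One-tailed test: all of alpha sits in the upper tail
critical_z = standard_normal.inv_cdf(1 - alpha)

# p value: area of the distribution at or beyond the obtained z
p_value = 1 - standard_normal.cdf(z_obtained)

reject_h0 = z_obtained >= critical_z

print(f"critical z = {critical_z:.3f}")  # 1.645
print(f"one-tailed p = {p_value:.4f}")
print("reject H0:", reject_h0)           # True
```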
Example data - 2
Students within a certain age range
have a mean weight of 90 pounds
(USA) with a SD of 17 pounds
We select a group of 50 who are taking part in an exercise
programme, and their mean weight is 86 pounds. Is the programme
having any effect on their weight?
This is a two-tailed z test
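Calculations for the Two-Tailed z Test
The calculations follow exactly the same steps as the one-tailed z test:
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{N}} = \dfrac{17}{\sqrt{50}} = 2.40$
$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{86 - 90}{2.40} = -1.67$
Is the z of -1.67 significant? We need to know the critical z value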
Interpreting the Two-Tailed z Test
For a two-tailed test, the 5% significance level is effectively
split into 2 parts: 2.5% in each tail, giving critical values of z = ±1.96
The magnitude of our result, 1.67, is less than 1.96, and so
we cannot reject the null hypothesis
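A sketch of the two-tailed calculation for the weight example follows. Note that carrying the standard error at full precision gives z ≈ -1.66 rather than the -1.67 obtained with the rounded value 2.40; the conclusion is the same.

```python
import math
from statistics import NormalDist

mu, sigma = 90, 17        # population mean and SD of weight (pounds)
n, sample_mean = 50, 86   # exercise group size and mean weight

standard_error = sigma / math.sqrt(n)
z = (sample_mean - mu) / standard_error

alpha = 0.05
standard_normal = NormalDist()

# Two-tailed test: alpha is split, 2.5% in each tail
critical_z = standard_normal.inv_cdf(1 - alpha / 2)

# Two-tailed p value: area beyond |z| in both tails
p_two_tailed = 2 * (1 - standard_normal.cdf(abs(z)))

reject_h0 = abs(z) >= critical_z

print(f"z = {z:.2f}, critical z = ±{critical_z:.2f}")
print(f"two-tailed p = {p_two_tailed:.3f}")
print("reject H0:", reject_h0)  # False
```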
Statistical Power
One-tailed test: statistically a more
powerful test than a two-tailed test
Statistical power: the probability of
correctly rejecting a false H0
With a one-tailed test, we are more likely to reject H0
when the effect is in the predicted direction
The obtained z (z_obt) does not have to be as large to be
considered significantly different from the population mean
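The difference in power can be made concrete with a short calculation. This sketch rests on a hypothetical scenario: it assumes the true population mean really does lie 2.02 standard errors above the null-hypothesis mean (the value observed in the IQ example, treated here as if it were the real effect).

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05

# Hypothetical true effect, expressed in standard-error units
effect_in_se_units = 2.02

z_crit_one = std_normal.inv_cdf(1 - alpha)       # 1.645
z_crit_two = std_normal.inv_cdf(1 - alpha / 2)   # 1.96

# Power = probability that the obtained z falls in the region of rejection
# when the true effect is as assumed above.
power_one_tailed = 1 - std_normal.cdf(z_crit_one - effect_in_se_units)
power_two_tailed = (
    1 - std_normal.cdf(z_crit_two - effect_in_se_units)   # upper tail
    + std_normal.cdf(-z_crit_two - effect_in_se_units)    # lower tail
)

print(f"one-tailed power = {power_one_tailed:.2f}")
print(f"two-tailed power = {power_two_tailed:.2f}")
```

Under this assumption the one-tailed test rejects a false H0 more often than the two-tailed test, because its critical value is smaller.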
The z Test
As the sample size increases:
The standard error of the mean
decreases
This increases the statistical power
The z test is appropriate to use:
If the parameters, such as μ and σ, are known
With interval or ratio data
Otherwise the t test is appropriate to use:
In cases where the sample size is small or
Where σ is not known
Notation
One-tailed test: Ha: μ0 < μ1 or μ0 > μ1
Two-tailed test: Ha: μ0 ≠ μ1
Null hypothesis: H0: μ0 = μ1
z (N = 50) = -1.67, p < 0.05 (one-tailed)
Confidence Intervals
We can estimate the population mean from a
sample within a certain degree of
confidence
Confidence interval: an interval of a certain
width that we feel confident will contain μ
Statisticians recommend a 95% or a 99%
confidence interval
Formula for the confidence interval:
$CI = \bar{X} \pm z \, \sigma_{\bar{X}}$
Example
Using our previous example with the weights of students, we found
a sample mean of 86 pounds from a sample of 50, with a population SD
of 17 pounds
We found:
$\sigma_{\bar{X}} = 2.40$
In a z distribution (two-tailed), 95% of the results
are found within z = ±1.96 of the mean
We therefore estimate the population mean
as 86.0 ± 4.7 (2.4 × 1.96) with 95% confidence
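The interval for the weight example can be computed as follows. This is a sketch using the standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

sample_mean = 86.0   # mean weight of the exercise group (pounds)
sigma = 17.0         # population SD
n = 50               # sample size
confidence = 0.95    # desired confidence level

standard_error = sigma / math.sqrt(n)

# Two-tailed critical z: leave (1 - confidence) / 2 in each tail
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_crit * standard_error
lower = sample_mean - margin
upper = sample_mean + margin

print(f"{sample_mean:.1f} ± {margin:.1f} -> ({lower:.1f}, {upper:.1f})")
```

Changing `confidence` to 0.99 widens the interval, since a larger critical z is needed to capture 99% of the distribution.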
The t Test:
What It Is and What It Does
t test
A parametric inferential statistical test of the
null hypothesis for a single sample where the
population variance is not known
Student's t distribution: a set of distributions that, although
symmetrical and bell-shaped, are not normally distributed
Degrees of freedom (df): the number of scores in a sample
that are free to vary; generally df = N - 1
The t Test
Since we do not know the population
statistics we need to estimate them
from our data. The steps are:
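Estimated standard deviation of the population:
$s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{N - 1}}$
Estimated standard error of the mean:
$s_{\bar{X}} = \dfrac{s}{\sqrt{N}}$
The t statistic:
$t = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$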
Calculations for the t Test

Calculations for the t test follow exactly the same steps as for
the z test, including the choice of a one- or two-tailed test
Different tables (t rather than z, indexed by df) are used to find
the critical values to see whether the null hypothesis can be rejected
Example: one-tailed t test

SAT Score
1010
1200
1310
1075
1149
1078
1129
1069
1350
1390
The mean SAT score of students entering a US university is 1090
Biology majors (10) have a mean SAT score of 1176.
Are they cleverer than average?
Example
s = 131.80
$s_{\bar{X}} = 41.71$
$t = \dfrac{1176 - 1090}{41.71} = +2.06$
From tables, for 95% confidence with df = 9
we need a t of 1.833 (one-tailed)
2.06 > 1.833, and so we reject the null hypothesis
t(9) = 2.06, p < 0.05 (one-tailed)
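The SAT example can be reproduced from the raw scores. In this sketch the critical value 1.833 is simply the table value quoted above, not something the code derives. At full precision the estimated standard error comes out at about 41.68; the 41.71 above appears to result from rounding √10 to 3.16, and t is 2.06 either way.

```python
import math
import statistics

sat_scores = [1010, 1200, 1310, 1075, 1149, 1078, 1129, 1069, 1350, 1390]
population_mean = 1090
critical_t = 1.833  # one-tailed, alpha = 0.05, df = 9 (table value from the text)

n = len(sat_scores)
df = n - 1
sample_mean = statistics.mean(sat_scores)

# statistics.stdev divides by N - 1, matching the formula for s
s = statistics.stdev(sat_scores)

estimated_se = s / math.sqrt(n)
t = (sample_mean - population_mean) / estimated_se

print(f"mean = {sample_mean}, s = {s:.2f}, estimated SE = {estimated_se:.2f}")
print(f"t({df}) = {t:.2f}")
print("reject H0:", t >= critical_t)
```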
The t Test
Assumptions of the t test
Data are interval or ratio
Population distribution of scores is
symmetrical
The t test is used in situations in which:
Population mean is known
But the population standard deviation (σ) is not known
If these criteria are not met:
A non-parametric test is more appropriate
Confidence Intervals
Based on the t Distribution
For a one-sample t test, the confidence interval is determined by:
$CI = \bar{X} \pm t \, s_{\bar{X}}$
where t is the two-tailed critical value of t for df = N - 1
Typically, statisticians recommend using either the 95% or 99%
confidence interval
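As an illustration, a 95% interval for the SAT data from the t test example can be sketched as below. The critical value 2.262 (two-tailed, df = 9) is a standard table value supplied here as an assumption; it does not appear in the lecture.

```python
import math
import statistics

sat_scores = [1010, 1200, 1310, 1075, 1149, 1078, 1129, 1069, 1350, 1390]

# Assumption: two-tailed critical t for 95% confidence with df = 9,
# taken from a standard t table rather than from the lecture.
t_critical = 2.262

n = len(sat_scores)
sample_mean = statistics.mean(sat_scores)
estimated_se = statistics.stdev(sat_scores) / math.sqrt(n)

margin = t_critical * estimated_se
lower = sample_mean - margin
upper = sample_mean + margin

print(f"{sample_mean} ± {margin:.1f} -> ({lower:.1f}, {upper:.1f})")
```

Because this interval is two-sided, it is not directly comparable with the one-tailed test of the same data: the two-tailed critical value (2.262) is larger than the one-tailed value (1.833).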
The Chi-Square (χ²) Goodness-of-Fit Test
Chi-square (χ²) goodness-of-fit test
A nonparametric inferential procedure that
determines how well an observed frequency
distribution fits an expected distribution
Observed frequency: the frequency with

which participants fall into a category
Expected frequency: the frequency
expected in a category if the sample data
represent the population
The Chi-Square (χ²)
Formula for chi-square:
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
where
O is the observed frequency
E is the expected frequency
In approved style, the result is reported as:
χ²(df, N = sample size) = obtained value, p < significance level
The Chi-Square (χ²)
Assumptions and appropriate use
Appropriate for nominal (categorical)
data
The frequencies in each expected
frequency cell should not be too small
(not less than 5)
The sample should be randomly
selected and the observations must be
independent
Example
In the USA 17% of teenagers at
High School get pregnant
In a certain school 7 girls were
pregnant out of 80
Does this school have a
significantly lower incidence of
teenage pregnancy?
Example
Frequency    Pregnant    Not pregnant
Observed     7           73
Expected     14          66
In this example:
$\chi^2 = \dfrac{(7 - 14)^2}{14} + \dfrac{(73 - 66)^2}{66} = 4.24$
With 2 categories (pregnant and not pregnant) we have df = 1
From tables, the critical value at 95% is 3.84,
and so the null hypothesis is rejected
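The chi-square statistic for this example can be computed as sketched below, using the rounded expected frequencies (14 and 66). For comparison the code also uses the unrounded expectations (17% of 80 = 13.6 and 66.4), which give a smaller statistic that is still just above the critical value.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [7, 73]            # pregnant, not pregnant (out of 80)
expected_rounded = [14, 66]   # expected frequencies as used in the example
critical_value = 3.84         # table value for df = 1 at the 5% level

chi_sq = chi_square(observed, expected_rounded)

# Unrounded expected frequencies: 17% and 83% of 80
n = sum(observed)
expected_exact = [0.17 * n, 0.83 * n]
chi_sq_exact = chi_square(observed, expected_exact)

print(f"chi-square (rounded E)   = {chi_sq:.2f}")
print(f"chi-square (unrounded E) = {chi_sq_exact:.2f}")
print("reject H0:", chi_sq >= critical_value)
```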
Correlation Coefficients and

Statistical Significance
We can have confidence levels for the correlation
coefficients that we met earlier
A one-tailed test of a correlation coefficient
Means that we have predicted the expected
direction of the correlation coefficient
A two-tailed test
Means that we have not predicted the
direction of the correlation coefficient
Degrees of freedom for the Pearson product-moment correlation:
N - 2, where N represents the total number of
pairs of observations
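One common way to test a Pearson correlation, not spelled out in the lecture, is to convert r to a t statistic with df = N - 2 using t = r√(N - 2) / √(1 - r²). The sketch below uses that conversion with hypothetical values (r = 0.70 from 10 pairs) and a two-tailed critical t of 2.306 for df = 8 taken from a standard table; all three numbers are assumptions for illustration only.

```python
import math


def t_from_r(r, n_pairs):
    """Convert a Pearson r based on n_pairs observations to a t statistic."""
    df = n_pairs - 2
    return r * math.sqrt(df) / math.sqrt(1 - r ** 2)


# Hypothetical values, chosen only for illustration
r = 0.70
n_pairs = 10
df = n_pairs - 2

# Assumption: two-tailed critical t at the 5% level for df = 8 (table value)
critical_t = 2.306

t = t_from_r(r, n_pairs)
significant = abs(t) >= critical_t

print(f"t({df}) = {t:.2f}")
print("significant (two-tailed):", significant)
```

For a one-tailed test the same t would be compared with the smaller one-tailed critical value, provided the correlation is in the predicted direction.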
Summary
Parametric tests: the z test and the t test
The distributions should be bell-shaped
Certain parameters should be known
Data should be interval or ratio
Nonparametric test: chi-square test

Population parameters are not needed
The underlying distribution of scores is not
assumed to be normal
Data are most commonly nominal or ordinal
Memory test
Population average is 7
http://faculty.washington.edu/chudler/stm0.html
