Biostatistics for Public Health

Chapter 16 - Inference About a Proportion

Kevin Brooks MSc., PhD.


Objective

Describe a proportion and its sampling distribution, and test a hypothesis about a proportion using the Normal approximation and exact binomial methods.
✓ Proportions
✓ The Sampling Distribution of a Proportion
✓ Hypothesis Test, Normal Approximation
✓ Hypothesis Test, Exact Binomial Method



Our data analysis journey continues …

Structure of the Book

• Chapters 1 – 10 focus on statistical concepts and practices
• After completing selected chapters on statistical concepts, you may cover Chapters 11 – 19 in any order
• Chapters 11 – 15 focus on the analysis of quantitative response variables
• Chapters 16 – 19 focus on the analysis of categorical response variables

Summarizing data

This figure illustrates a basic difference between quantitative and categorical data analysis.

Binary Response Variable
Examples of binary response variables
• Classification of a respondent as a current smoker:
“yes” or “no”
• Gender: “male” or “female”
• Whether a patient survives five or more years:
“survived” or “did not survive”
• Whether the subject developed a blood clot: “case”
or “non-case”

Binary Response Variable, cont

• One outcome is arbitrarily labeled a “success” and the other a “failure”
• If the process of selection is random, the number of successes in a sample will follow a binomial distribution with parameters n and p
• Notation: Let X ~ b(n, p) represent a binomial random variable with parameters n and p (Chapter 6)
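As a minimal sketch of what X ~ b(n, p) means in practice, the following Python snippet computes binomial probabilities directly from the binomial formula. The function name `binomial_pmf` and the values n = 10 and p = 0.2 are illustrative assumptions, not taken from the slides.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Pr(X = x) for X ~ b(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Assumed illustrative values: a sample of n = 10 with success probability p = 0.2
n, p = 10, 0.2
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(pmf):
    print(f"Pr(X = {x:2d}) = {prob:.4f}")
```

The probabilities over x = 0, 1, ..., n sum to 1, which is a quick sanity check on any binomial calculation.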

Proportions
• Conditions:
– Single SRS
– Response variable is binary.
• Describe the proportion of successes in the sample, denoted “p-hat”:

$$\hat{p} = \frac{x}{n}$$

• where x = no. of successes and n = sample size
Proportion, cont

• Two of ten individuals in the sample have a risk factor for disease X.
• Therefore, the prevalence of this risk factor in the sample is:

$$\hat{p} = \frac{x}{n} = \frac{2}{10} = 0.2 \text{ (or 20\%)}$$
A Proportion is an Average of 0s and 1s

Here are the data in tabular form with the variable coded 1 = risk factor present and 0 = risk factor absent. Note that n = 10 and ∑x = 2, so the sample mean is

$$\bar{x} = \frac{\sum x}{n} = \frac{2}{10} = 0.2$$

The sample mean and sample proportion are equivalent when we think of binary responses in this way.

Observation   X
1             1
2             0
3             0
4             0
5             0
6             0
7             0
8             0
9             1
10            0
              ∑x = 2
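The equivalence between the sample mean and the sample proportion can be checked numerically. This is a small sketch using the ten 0/1 values from the table; the variable names are illustrative.

```python
from statistics import mean

# Data from the table: 1 = risk factor present, 0 = risk factor absent
x = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

n = len(x)             # sample size
successes = sum(x)     # count of 1s, i.e. the sum of x
p_hat = successes / n  # sample proportion
x_bar = mean(x)        # sample mean of the 0/1 codes

print(f"n = {n}, sum of x = {successes}")
print(f"p-hat = {p_hat}, x-bar = {x_bar}")
```

Because each observation contributes either 0 or 1, summing the codes counts the successes, so dividing by n gives the same number either way.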
Incidence and Prevalence

• Prevalences and some types of incidences are proportions
• Incidence proportion (average risk) is the proportion of individuals who develop a specified condition over a set period of time
• Prevalence is the proportion with the characteristic at a particular point in time

Illustrative Example: Prevalence of Smoking

• An SRS of 57 adults reveals 17 current smokers.
• Thus, the prevalence of smokers in the sample is:

$$\hat{p} = \frac{17}{57} = 0.2982$$

• Calculations should carry at least 4 significant digits. For reporting purposes, the APA publication guide (2001) recommends that proportions be converted to percentages and reported with one-decimal accuracy (e.g., 29.8%).
Inference about a Proportion

• How good is sample proportion p̂ at estimating population proportion p?
• To answer this question, consider what would happen if we took repeated samples, each of size n, from the population.
• How would the sample proportions be distributed?

Simulation of a sampling distribution of proportions showing
superimposed Normal approximation.

Sampling Distribution of a Proportion

• In SRSs, the random number of successes X in such samples follows a binomial distribution with parameters n and p (Chapter 6)
• Sample proportion p̂ is a mathematical transformation of the count of successes (divide the count by n)
• When n is moderate to large, a Normal approximation to the binomial can be used (§8.3) to describe the sampling distribution of p̂

Sampling distribution of a proportion, Normal approximation

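$$\hat{p} \sim N\!\left(p,\ \sqrt{\frac{pq}{n}}\right)$$

where q = 1 − p and the second parameter is the standard deviation (standard error) of p̂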
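The Normal approximation can be explored with a small simulation. The sketch below draws repeated samples of size n from a population with success probability p and compares the mean and standard deviation of the simulated sample proportions with p and the square root of pq/n. The values n = 57 and p = 0.25, the number of replications, the seed, and the function name `simulate_p_hats` are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def simulate_p_hats(n: int, p: float, reps: int, seed: int = 1) -> list:
    """Draw `reps` samples of size n and return the sample proportion of each."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    p_hats = []
    for _ in range(reps):
        # Each observation is a success with probability p
        x = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(x / n)
    return p_hats


# Assumed illustrative values
n, p = 57, 0.25
q = 1 - p

p_hats = simulate_p_hats(n, p, reps=5000)
se_theory = sqrt(p * q / n)  # standard deviation from the Normal approximation

print(f"mean of simulated p-hats: {mean(p_hats):.4f} (theory: {p})")
print(f"SD of simulated p-hats:   {pstdev(p_hats):.4f} (theory: {se_theory:.4f})")
```

With a few thousand replications, the simulated mean and standard deviation should land close to the theoretical values, though the exact figures depend on the seed.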

Hypothesis Test, Normal Approximation Method

A. Hypotheses. H0: p = p0 vs. Ha: p ≠ p0, where p0 represents the proportion specified by the null hypothesis
B. Test statistic.

$$z_{\text{stat}} = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}, \quad \text{where } q_0 = 1 - p_0$$

C. P-value. Convert zstat to a P-value [e.g., using Table F]. Interpret results.
D. Significance level (optional).

Illustration: Hypothesis Test

An SRS of n = 57 finds 17 smokers (p-hat = 17 / 57 = 0.2982). The national average for smoking prevalence is 0.25. Is the proportion in the sample significantly different from the national average?
A. H0: p = 0.25 vs. Ha: p ≠ 0.25
B. Test statistic:

$$z_{\text{stat}} = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{0.2982 - 0.25}{\sqrt{0.25 \cdot 0.75 / 57}} = 0.84$$

C. P = 0.4010 [via Table F]. Weak evidence against H0.

The sample proportion is not significantly different from the national average.
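The calculation in this illustration can be reproduced in software. The following is a minimal sketch of the two-sided z test using only the Python standard library; the function name `one_sample_z_test` is an assumption for illustration, and `statistics.NormalDist` stands in for the Table F lookup.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(x: int, n: int, p0: float):
    """Two-sided z test of H0: p = p0 using the Normal approximation."""
    p_hat = x / n
    se0 = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se0
    # Two-sided P-value: area in both tails beyond |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_value


# Data from the illustration: 17 smokers in an SRS of 57, H0: p = 0.25
p_hat, z, p_value = one_sample_z_test(x=17, n=57, p0=0.25)
print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, P = {p_value:.4f}")
```

The z statistic rounds to 0.84, matching the slide. The P-value comes out close to 0.40 and may differ slightly from the table-based 0.4010, because the table lookup uses z rounded to two decimals while the code uses the unrounded value.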
Hypothesis Test, Exact Binomial Method

• When n is small (e.g., fewer than 5 successes expected), binomial distributions do not resemble Normal distributions and z procedures cannot be used.
• Instead, an exact binomial procedure (e.g., “Fisher’s method”) should be used

Exact Binomial Method

A. Hypotheses. H0: p = p0 vs. Ha: p ≠ p0, where p0 represents the proportion under the null hypothesis
B. Test statistic. Observed number of successes, x.
C. P-value. Use a software program to calculate the P-value. Interpret the results. The theory of the test assumes X ~ b(n, p0). For right-sided tests, the P-value = Pr(X ≥ x) from the binomial distribution. (See text for additional details.)
D. Significance level (optional).
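For the right-sided case described in step C, the P-value Pr(X ≥ x) under X ~ b(n, p0) can be summed directly from the binomial formula. The sketch below covers only that right-sided case; it does not implement a two-sided exact P-value, for which the slides defer to the text. The function names and the data (3 successes in a sample of 8, with p0 = 0.1) are hypothetical and are not from the slides.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr(X = k) for X ~ b(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_right_sided_p(x: int, n: int, p0: float) -> float:
    """Right-sided exact P-value: Pr(X >= x) when X ~ b(n, p0)."""
    return sum(binomial_pmf(k, n, p0) for k in range(x, n + 1))


# Hypothetical small-sample data (an assumption, not from the slides):
# x = 3 successes observed in n = 8, testing H0: p = 0.1 against Ha: p > 0.1
x, n, p0 = 3, 8, 0.1
p_value = exact_right_sided_p(x, n, p0)
print(f"expected successes under H0: {n * p0:.1f}")
print(f"right-sided exact P-value:   {p_value:.4f}")
```

Here fewer than 5 successes are expected under H0, which is the situation in which the slides recommend an exact procedure over the z test.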
Conditions for Inference

• Sampling independence (SRS or facsimile)


• Valid information
• The z test of H0: p = p0 requires np0q0 ≥ 5
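The sample-size condition for the z test is easy to check before choosing a method. This is a small sketch; the function name `z_test_condition_met` and its `threshold` parameter are illustrative assumptions, with the default of 5 taken from the condition stated above.

```python
def z_test_condition_met(n: int, p0: float, threshold: float = 5.0) -> bool:
    """Return True when n * p0 * q0 >= threshold, where q0 = 1 - p0."""
    q0 = 1 - p0
    return n * p0 * q0 >= threshold


# Smoking illustration from the slides: n = 57, p0 = 0.25
# 57 * 0.25 * 0.75 = 10.6875, so the condition holds
print(z_test_condition_met(57, 0.25))

# Hypothetical small sample: 10 * 0.1 * 0.9 = 0.9, so the condition fails
print(z_test_condition_met(10, 0.1))
```

When the check fails, the exact binomial method is the alternative the slides describe.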

Thank You

For Viewing

Kevin Brooks MSc., PhD.


MPH Program,
Division of Public Health
College of Human Medicine
Michigan State University
brooks52@msu.edu
