
Sampling Distributions and

Confidence Intervals for Proportions


Statistics and parameters
Sampling distribution of a proportion
Confidence intervals for proportions
Margin of error and critical values
Assumptions and conditions
Choosing the sample size


Statistical Experiment
Data are fixed
Experiment is random
Many possible samples
Survey
Every group of 1000 people is equally likely
Experiment
Each assignment to control or treatment could
happen
Statistics and Parameters
Statistics - calculated from the sample
Sample mean $\bar{x}$
Sample standard deviation $s$
Sample proportion $\hat{p}$
Parameters - unknown
Population proportion $p$
Inference
Use statistic to describe the parameter
Parameter is the goal
Statistic is the tool
Sampling variability
Each time we take a random sample from a population, we are
likely to get a different set of individuals and calculate a
different statistic. This is called sampling variability. If we take a
lot of random samples of the same size from a given
population, the variation from sample to sample (the
sampling distribution) will follow a predictable pattern.
Sampling Distribution
Distribution of the statistic obtained from
repeated samples (or repeated trials of an
experiment) using the same number of
observations.
Changes with parameter
Random Variable
Imagine the data are random variables
Statistic = function of data
Statistic is also random

Survey
Simple Random Sample
78% agree with statement
Repeat survey
77% agree
73% agree

Sample is random
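
A minimal sketch of the repeated-survey idea above, assuming a hypothetical true proportion of 0.75 and samples of 1000 people drawn independently (both values are illustrative and not taken from the slides). Each run of the survey gives a slightly different sample proportion.

```python
import random

def sample_proportion(p, n, rng):
    """Simulate one random sample of size n and return the proportion of successes."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n

# Assumed values for illustration only: true proportion 0.75, sample size 1000.
rng = random.Random(0)
p_true, n = 0.75, 1000

# Repeat the "survey" five times; the statistic changes from sample to sample.
p_hats = [sample_proportion(p_true, n, rng) for _ in range(5)]
print(p_hats)
```

The parameter `p_true` stays fixed throughout; only the statistic varies, which is the sense in which the statistic is itself random.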
Sample proportions
The proportion of successes can be more informative than
the count. In statistical sampling the sample proportion of
successes, $\hat{p}$, is used to estimate the proportion $p$ of
successes in a population.

For any SRS of size $n$, the sample proportion of successes is:
$$\hat{p} = \frac{\text{count of successes in the sample}}{n} = \frac{X}{n}$$

In an SRS of 50 students in an undergrad class, 10 are Hispanic:
$\hat{p} = 10/50 = 0.2$ (proportion of Hispanics in sample)

The 30 subjects in an SRS are asked to taste an unmarked brand of coffee and
rate it "would buy" or "would not buy." Eighteen subjects rated the coffee
"would buy." $\hat{p} = 18/30 = 0.6$ (proportion of "would buy")
If the sample size is much smaller than the size of a population with
proportion $p$ of successes, then the mean and standard deviation of
$\hat{p}$ are:
$$\mu_{\hat{p}} = p, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
Because the mean is p, we say that the sample proportion in an SRS is an
unbiased estimator of the population proportion p.

The variability decreases as the sample size increases. So larger samples
usually give closer estimates of the population proportion p.
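
To see how quickly the variability shrinks, the sketch below evaluates the standard deviation $\sqrt{p(1-p)/n}$ for a few sample sizes. The function name `sd_p_hat` and the choice p = 0.5 are assumptions made for illustration.

```python
from math import sqrt

def sd_p_hat(p, n):
    """Standard deviation of the sample proportion for an SRS of size n."""
    return sqrt(p * (1 - p) / n)

# Each time n is multiplied by 4, the standard deviation is cut in half.
for n in (25, 100, 400, 1600):
    print(n, round(sd_p_hat(0.5, n), 4))
```

Because n sits under a square root, halving the spread requires four times as many observations.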
Sampling distribution of the sample proportion

The sampling distribution of $\hat{p}$ is never exactly normal. But as the sample
size increases, the sampling distribution of $\hat{p}$ becomes approximately
normal.
The normal approximation is most accurate for a large fixed n when p is close
to 0.5, and least accurate when p is near 0 or near 1.

Rules for Sample Proportion


1. A population with a fixed proportion p
2. Random Sample
Independent
Equal chance
3. Sample size is large
np > 9
n(1-p) > 9
Describes a binomial
experiment with normal
approximation
then
$\hat{p}$ is approximately Normal with mean $p$ and
standard deviation $\sqrt{p(1-p)/n}$
How does this help?
What is p?


$p(1-p)$ is largest when $p = 1/2$, so plug in $p = 1/2$ (worst case)
Or estimate $p$ with the sample proportion $\hat{p}$
Example
Suppose that a soda bottler claims
that only 5% of the soda cans are underfilled.
A quality control technician randomly samples 200
cans of soda. What is the probability that more than
10% of the cans are underfilled?
$$P(\hat{p} > .10) = P\left(z > \frac{.10 - .05}{\sqrt{.05(.95)/200}}\right) = P(z > 3.24) = 1 - .9994 = .0006$$
n = 200
S: underfilled can
p = P(S) = .05
q = .95
np = 10 nq = 190
OK to use the normal
approximation
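
The soda example can be checked numerically. This is a sketch using the normal approximation only, with Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.05            # sample size and claimed proportion of underfilled cans
threshold = 0.10            # we want P(p_hat > 0.10)

# Success/failure check before trusting the normal approximation.
ok_to_approximate = n * p >= 10 and n * (1 - p) >= 10

sd = sqrt(p * (1 - p) / n)          # standard deviation of p_hat
z = (threshold - p) / sd            # standardized threshold
prob = 1 - NormalDist().cdf(z)      # upper-tail probability

print(ok_to_approximate, round(z, 2), round(prob, 4))
```

If the bottler's claim is true, a sample with more than 10% underfilled cans would be very unusual.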
Confidence intervals for proportions
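According to the empirical rule
95% of sample proportions fall between $p - 2\,SD(\hat{p})$ and $p + 2\,SD(\hat{p})$
500 people in Washington surveyed
120 agree with recent changes to bankruptcy laws

Sample proportion $\hat{p}$ = 120/500 = 0.24

Estimate: $SE(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n} = \sqrt{(0.24)(0.76)/500} \approx 0.019$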
Can we say?
24% of people in Washington agree with the
recent changes to bankruptcy laws. (No)
It is probably true that 24% of people in
Washington agree with the recent changes to
bankruptcy laws. (No)
We are 95% confident that between 24%-2*1.9%
and 24%+2*1.9% of people in Washington agree
with the recent changes to bankruptcy laws. (Yes)


95% Confidence
What does the term 95% confidence really
mean?
95% of samples of this size will produce
confidence intervals that capture the
true proportion


95% Confidence (cont.)
What does the term 95% confidence really mean?
We expect 5% of our samples to produce intervals that
fail to capture the true proportion
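
One way to make this interpretation concrete is a small simulation: fix a true proportion, draw many samples, build an interval from each, and count how often the interval captures the truth. The sketch below assumes a true proportion of 0.24 and samples of 500 (values chosen only for illustration) and uses the interval $\hat{p} \pm 1.96\,SE(\hat{p})$.

```python
import random
from math import sqrt

def interval_covers(p_true, n, rng, z_star=1.96):
    """Draw one sample, build the interval, and report whether it contains p_true."""
    x = sum(1 for _ in range(n) if rng.random() < p_true)
    p_hat = x / n
    me = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me <= p_true <= p_hat + me

rng = random.Random(1)
trials = 2000
hits = sum(interval_covers(0.24, 500, rng) for _ in range(trials))
coverage = hits / trials
print(coverage)  # expected to land near 0.95, not exactly on it
```

The confidence level describes the long-run success rate of the procedure, not the chance that any one computed interval is right.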


Margin of Error
A 95% CI for a population proportion p can be
written as:
$$\hat{p} \pm 2\,SE(\hat{p})$$
Half of the width of the CI is called the margin
of error, so the CI can be written as:
estimate ± ME.
Critical Values

Confidence level 95%: we've been using 2 from the
empirical rule, but if we use software or a table, we get z* =
1.96, so this is the critical value and

95% CI is $\hat{p} \pm 1.96\,SE(\hat{p})$.

Confidence level 90%: z* = 1.645, so

90% CI is $\hat{p} \pm 1.645\,SE(\hat{p})$.
90% Confidence Interval
Critical Values
Confidence Level (%)    Critical Value z*
90                      1.645
95                      1.96
98                      2.326
99                      2.576
99.9                    3.29
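
The critical values in this table can be reproduced with software. A minimal sketch using the standard normal quantile function from Python's `statistics` module; the helper name `critical_value` is an assumption for illustration.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Two-sided critical value z* for a confidence level given as a fraction."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.98, 0.99, 0.999):
    print(f"{level:.1%}: z* = {critical_value(level):.3f}")
```

Higher confidence leaves less area in the tails, so z* grows and the interval widens.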
Confidence Intervals
Point estimate ± critical value × standard error
Generally
$$\hat{p} \pm z^* \, SE(\hat{p})$$
95% Confidence Interval
$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
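Example: Give a 95% confidence interval for
p.
Suppose that 43% of 1006 people surveyed in
Washington are in favor of a financial reform.
Standard error
$SE(\hat{p}) = \sqrt{(0.43)(0.57)/1006} \approx 0.0156$
Margin of error
$ME = 1.96 \times 0.0156 \approx 0.031$
Upper limit
$0.43 + 0.031 = 0.461$
Lower limit
$0.43 - 0.031 = 0.399$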
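
The four quantities in this example follow mechanically from the sample proportion, the sample size, and the critical value. A short sketch, with the helper `proportion_ci` being an illustrative name rather than anything from the slides:

```python
from math import sqrt

def proportion_ci(p_hat, n, z_star):
    """Return (standard error, margin of error, lower limit, upper limit)."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    me = z_star * se
    return se, me, p_hat - me, p_hat + me

# 43% of 1006 people in favor, 95% confidence (z* = 1.96).
se, me, lower, upper = proportion_ci(0.43, 1006, 1.96)
print(round(se, 4), round(me, 3), round(lower, 3), round(upper, 3))
```

Read as a percentage, the interval runs from roughly 40% to 46%.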

BLS Household survey
60,000 people surveyed. 6.8% unemployed.
90% confidence interval
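
A worked version of this interval, under the simplifying assumption that the survey can be treated as a simple random sample and using z* = 1.645 for 90% confidence:

```latex
SE(\hat{p}) = \sqrt{\frac{(0.068)(0.932)}{60000}} \approx 0.00103
\qquad
ME = 1.645 \times 0.00103 \approx 0.0017
\qquad
\hat{p} \pm ME = 0.068 \pm 0.0017 \;\Rightarrow\; (0.0663,\ 0.0697)
```

With a sample this large the interval is narrow: roughly 6.6% to 7.0%.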

Assumptions and Conditions
(1) Independence Assumption: Are sample observations
independent of each other?
    (a) Randomization Condition: Was the sample randomly
    generated?
    (b) 10% Condition: If sampling is done without replacement,
    then the sample size, n, must be no larger than 10% of
    the population.

(2) Success/Failure Condition: The sample size must be large
enough so that both np and n(1-p) are at least 10.
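
The two numeric conditions are easy to check in code. In the sketch below, the function name and the population size of 1,000,000 are placeholders invented for illustration; the randomization condition is about how the data were collected and cannot be checked from the numbers alone.

```python
def check_conditions(n, p, population_size):
    """Check the 10% condition and the success/failure condition."""
    return {
        "ten_percent": n <= 0.10 * population_size,
        "success_failure": n * p >= 10 and n * (1 - p) >= 10,
    }

# population_size is a hypothetical placeholder, not a figure from the slides.
print(check_conditions(n=1006, p=0.43, population_size=1_000_000))

# A small sample with a rare outcome fails the success/failure condition.
print(check_conditions(n=50, p=0.05, population_size=10_000))
```

In practice p is unknown, so the success/failure check is usually run with the sample proportion in its place.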


Questions
Let's think about the 95% confidence interval we just computed
for the proportion of people in Washington in favor of a
financial reform.

1. If we wanted to be 99% confident, would our confidence
interval need to be wider or narrower?
2. Our margin of error was about 3%. If we wanted to reduce it
to 2%, would our level of confidence be higher or lower?
3. If we had polled more people, would the interval's margin of
error have been larger or smaller?
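
One way to explore these questions is to recompute the margin of error while changing one input at a time. This sketch reuses the survey figures (43% of 1006) and the formula ME = z* × SE; the function name and the choice to quadruple the sample are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence):
    """Margin of error z* * sqrt(p_hat(1 - p_hat)/n) for a two-sided interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

base = margin_of_error(0.43, 1006, 0.95)          # the interval computed earlier
wider = margin_of_error(0.43, 1006, 0.99)         # question 1: raise confidence
bigger_n = margin_of_error(0.43, 4 * 1006, 0.95)  # question 3: poll more people

# Question 2: confidence level implied by a 2% margin with the same sample.
se = sqrt(0.43 * 0.57 / 1006)
conf_for_2pct = 2 * NormalDist().cdf(0.02 / se) - 1

print(round(base, 4), round(wider, 4), round(bigger_n, 4), round(conf_for_2pct, 2))
```

With the sample fixed, confidence and precision trade off against each other; only a larger sample improves both.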
Sample size for a desired margin of error
You may need to choose a sample size large enough to achieve a
specified margin of error. However, because the sampling distribution
of $\hat{p}$ is a function of the population proportion p, this process
requires that you guess a likely value for p: p*.
$$\hat{p} \sim N\left(p, \sqrt{p(1-p)/n}\right) \quad\Rightarrow\quad n = \left(\frac{z^*}{m}\right)^2 p^*(1-p^*)$$
The margin of error will be less than or equal to m if p* is chosen to be 0.5.

Remember, though, that sample size is not always stretchable at will. There are
typically costs and constraints associated with large samples.
What sample size...?
How large a sample would be necessary to
estimate the true proportion of defective
products in a large population of products
within 3%, with 95% confidence?
(Assume a pilot sample yields an estimate of
p of 0.12)
What sample size...? (continued)
Solution:
For 95% confidence, use z* = 1.96
m = 0.03
p* = 0.12, so use this to estimate p
$$n = \frac{(z^*)^2\, p^*(1-p^*)}{m^2} = \frac{(1.96)^2 (0.12)(1-0.12)}{(0.03)^2} \approx 450.75$$
So use n = 451
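
The same calculation as a small function, rounding up because a sample size must be a whole number. The name `required_sample_size` is illustrative; the second call shows the conservative choice p* = 0.5 for comparison.

```python
from math import ceil

def required_sample_size(z_star, m, p_star):
    """Smallest n with margin of error at most m, given a guessed proportion p_star."""
    return ceil((z_star / m) ** 2 * p_star * (1 - p_star))

# Pilot estimate p* = 0.12, margin 3%, 95% confidence.
print(required_sample_size(1.96, 0.03, 0.12))

# Worst case p* = 0.5 needs a larger sample for the same margin.
print(required_sample_size(1.96, 0.03, 0.5))
```

A good pilot estimate far from 0.5 can cut the required sample substantially compared with the worst case.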
