SAMPLING DISTRIBUTIONS
POPULATION AND
SAMPLING DISTRIBUTIONS
Population Distribution
Sampling Distribution
2
Population Distribution
Definition
The population distribution is the
probability distribution of the
population data.
3
Population Distribution
cont.
Suppose there are only five students
in an advanced statistics class and
the midterm scores of these five
students are
70 78 80 80 95
Let x denote the score of a student
4
Table 7.1 Population Frequency and Relative
Frequency Distributions
x f Relative Frequency
70 1 1/5 = .20
78 1 1/5 = .20
80 2 2/5 = .40
95 1 1/5 = .20
N=5 Sum = 1.00
5
Table 7.2 Population Probability Distribution
x P (x)
70 .20
78 .20
80 .40
95 .20
ΣP (x) = 1.00
6
Sampling Distribution
Definition
The probability distributionxof is
called its sampling distribution. It lists
the variousx values that can assume
x
and the probability of each value of .
In general, the probability distribution
of a sample statistic is called its
sampling distribution.
7
Sampling Distribution
cont.
Reconsider the population of midterm
scores of five students given in Table 7.1
Consider all possible samples of three
scores each that can be selected, without
replacement, from that population.
The total number of possible samples is
5! 5 ⋅ 4 ⋅ 3 ⋅ 2 ⋅1
5 C3 = = = 10
3!(5 − 3)! 3 ⋅ 2 ⋅ 1 ⋅ 2 ⋅ 1
8
Sampling Distribution
cont.
Suppose we assign the letters A, B, C, D,
and E to the scores of the five students so
that
A = 70, B = 78, C = 80, D = 80, E = 95
Then, the 10 possible samples of three scores
each are
ABC, ABD, ABE, ACD, ACE,
ADE, BCD, BCE, BDE, CDE
9
Table 7.3 All Possible Samples and Their Means
When the Sample Size Is 3
x P( x )
76.00 .20
76.67 .10
79.33 .10
81.00 .10
81.67 .20
84.33 .20
85.00 .10
ΣP ( x ) = 1.00
12
SAMPLING AND
NONSAMPLING ERRORS
Definition
Sampling error is the difference between the
value of a sample statistic and the value of the
corresponding population parameter. In the
case of the mean,
Sampling error =
µ
assuming that the samplexis−random and no
nonsampling error has been made.
13
SAMPLING AND
NONSAMPLING ERRORS
cont.
Definition
The errors that occur in the collection,
recording, and tabulation of data are
called nonsampling errors.
14
Example 7-1
Reconsider the population of five
scores given in Table 7.1. Suppose one
sample of three scores is selected from
this population, and this sample
includes the scores 70, 80, and 95.
Find the sampling error.
15
Solution 7-1
70 + 78 + 80 + 80 + 95
µ= = 80.60
5
70 + 80 + 95
x= = 81.67
3
Sampling error = x − µ = 81.67 − 80.60 = 1.07
17
SAMPLING AND
NONSAMPLING ERRORS
cont.
The difference between this sample
mean and the population mean is
x − µ = 82.33 − 80.60 = 1.73
This difference does not represent
the sampling error.
Only 1.07 of this difference is due to the
sampling error
18
SAMPLING AND
NONSAMPLING ERRORS
cont.
The remaining portion represents the
nonsampling error
It is equal to 1.70 – 1.07 = .66
It occurred due to the error we made in
recording the second score in the
sample
Also,
Nonsampling error = Incorrect x − Correct x
= 82.33 − 81.67 = .66
19
Figure 7.1 Sampling and nonsampling errors.
20
MEAN AND STANDARD
DEVIATION OF x
Definition
The mean and standard deviation of
x of are
the sampling distribution
called the mean and standard x
deviation of µ x andσare x denoted by
and , respectively.
21
MEAN AND STANDARD
DEVIATION OF x cont.
x of
Mean of the Sampling Distribution
The mean of the sampling
x
distribution of is always equal to
the mean of the population. Thus,
µx = µ
22
MEAN AND STANDARD
DEVIATION OF x cont.
Standard Deviation of the Sampling Distribution of x
The standard deviation of the sampling
distribution of is
x
σ
σx =
n
where σ is the standard deviation of the population
and n is the sample size. This formula is used when
n /N ≤ .05, where N is the population size.
23
Standard Deviation of the
Sampling Distribution of
cont. x
If the condition n /N ≤ .05 is not
satisfied, we use the following
σ
formula to calculate :
x
σ N −n
σx =
n N −1
N −n
N −1
where the factor is called the
finite population correction factor
24
Two Important
Observations
1. The spread of the sampling
x
distribution of is smaller than the
spread of the corresponding
σ x < σ distribution,
population x
i.e.
2. The standard deviation of the
x
sampling distribution of
decreases as the sample size
increases
25
Example 7-2
The mean wage for all 5000 employees
who work at a large company is $17.50
and the standard deviation is $2.90. Letx
be the mean wage per hour for a
random sample of certain employees
selected from this company. Find the
mean and standard deviation of for a
sample size of x
a) 30
b) c) 200
75
26
Solution 7-2
a)
µ x = µ = $17.50
σ 2.90
σx = = = $.529
n 30
27
Solution 7-2
b)
µ x = µ = $17.50
σ 2.90
σx = = = $.335
n 75
28
Solution 7-2
c)
µ x = µ = $17.50
σ 2.90
σx = = = $.205
n 200
29
SHAPE OF THE SAMPLING
DISTRIBUTION OF x
Sampling from a Normally
Distributed Population
Sampling from a Population That Is
Not Normally Distributed
30
Sampling From a Normally
Distributed Population
If the population from which the samples
are drawn is normally distributed with
mean μ and standard deviation σ , then the
sampling distribution of the sample mean,
,will also be normally distributed with
x
the following mean and standard
deviation, irrespective of the sample size:
σ
µ x = µ and σ x =
n
31
Figure 7.2 Population distribution and
sampling x
distributions of .
Normal distribution
32
Figure 7.2 Population distribution and
sampling x
distributions of .
Normal distribution
33
Figure 7.2 Population distribution and
sampling x
distributions of .
Normal distribution
34
Figure 7.2 Population distribution and
sampling x
distributions of .
Normal distribution
x 35
Figure 7.2 Population distribution and
sampling x
distributions of .
Normal distribution
x 36
Example 7-3
In a recent SAT, the mean score for all
examinees was 1020. Assume that the
distribution of SAT scores of all examinees is
normal with the mean of 1020 and a
standard deviation ofx 153. Let be the mean
SAT score of a random sample of certain
examinees. Calculate the mean and standard
deviation of and describe the shape of its
x
sampling distribution when the sample size is
a) 16
b) 50 c) 1000
37
Solution 7-3
a)
µ x = µ = 1020
σ 153
σx = = = 38.250
n 16
38
Figure 7.3
Sampling
σ x = 38.250 distribution ofx
for n = 16
Population
distribution
σ = 153
39
Solution 7-3
b)
µ x = µ = 1020
σ 153
σx = = = 21.637
n 50
40
Figure 7.4
Sampling
σ x = 21.637 distribution ofx
for n = 50
Population
distribution
σ = 153
41
Solution 7-3
c)
µ x = µ = 1020
σ 153
σx = = = 4.838
n 1000
42
Figure 7.5
Sampling
σ x = 4.838 distribution ofx
for n = 1000
Population
σ = 153 distribution
43
µ x = μ = 1020 SAT scores
Sampling From a Population
That Is Not Normally
Distributed
Central Limit Theorem
According to the central limit theorem, for a
large sample size, the sampling distribution xof
is approximately normal, irrespective of the
shape of the population distribution. The mean
and standard deviation of the samplingx
distribution of are σ
µ x = µ and σ x =
n
The sample size is usually considered to be
large if n ≥ 30.
44
Figure 7.6 Population distribution and
sampling xdistributions of .
45
Figure 7.6 Population distribution and
sampling xdistributions of .
46
Figure 7.6 Population distribution and
sampling xdistributions of .
47
Figure 7.6 Population distribution and
sampling xdistributions of .
48
Figure 7.6 Population distribution and
sampling xdistributions of .
x
49
Example 7-4
The mean rent paid by all tenants in a large
city is $1550 with a standard deviation of
$225. However, the population distribution of
rents for all tenants in this city is skewed to
the right. Calculate the mean and standard
deviation of and describe the shape of its
x
sampling distribution when the sample size is
a) 30
b) 100
50
Solution 7-4
a) Let x be the mean rent paid by a sample
of 30 tenants
µ x = µ = $1550
σ 225
σx = = = $41.079
n 30
51
Figure 7.7
σ = $225
μ = $1550 x
52
Figure 7.7
σ x = $41.079
µx = $1550 x
53
Solution 7-4
b) Let x be the mean rent paid by a sample
of 100 tenants
µ x = µ = $1550
σ 225
σx = = = $22.500
n 100
54
Figure 7.8
σ = $225
μ = $1550 x
55
Figure 7.8
(b) Sampling distribution x
of for n =
100.
σ x = $22.500
µx = $1550 x 56
APPLICATIONS OF THE
SAMPLING DISTRIBUTION OF
x
1. If we make all possible samples of
the same (large) size from a
population and calculate the mean
for each of these samples, then
about 68.26% of the sample means
will be within one standard
deviation of the population mean.
57
Figure 7.9µ P ( − 1σ x ≤ x ≤ µ + 1σ x )
Shaded
area is .
6826
.3413 .3413
µ − 1σ x µ µ + 1σ x x
58
APPLICATIONS OF THE
SAMPLING DISTRIBUTION OF x
cont.
59
Figure 7.10 µ P( − 2σ x ≤ x ≤ µ + 2σ x )
Shaded
area is .
9544
.4772 .4772
µ − 2σ x µ µ + 2σ x x
60
APPLICATIONS OF THE
SAMPLING DISTRIBUTION OF x
cont.
61
Figure 7.11 µ P( − 3σ x ≤ x ≤ µ + 3σ x )
Shaded
area is .
9974
.4987 .4987
µ − 3σ x µ µ + 3σ x x
62
Example 7-5
Assume that the weights of all packages
of a certain brand of cookies are normally
distributed with a mean of 32 ounces and
a standard deviation of .3 ounce. Find the
probability that the mean weight,x ,of a
random sample of 20 packages of this
brand of cookies will be between 31.8 and
31.9 ounces.
63
Solution 7-5
µ x = µ = 32 ounces
σ .3
σx = = = .06708204 ounce
n 20
64
z Value for a Value of x
x
The z value for a value of is
calculated as
x −µ
z=
σx
65
Solution 7-5
31.8 − 32
For x = 31.8: z = = −2.98
.06708204
31.9 − 32
Forx = 31.9: z = = −1.49
.06708204
Shaded are
is .0667
31.8 31.9 µ x = 32 x
-2.98 -1.49 0
z
67
Example 7-6
According to the College Board’s report, the
average tuition and fees at four-year private
colleges and universities in the United States was
$18,273 for the academic year 2002-2003 (The
Hartford Courant, October 22, 2002). Suppose
that the probability distribution of the 2002-2003
tuition and fees at all four-year private colleges
in the United States was unknown, but its mean
was $18,273 and the standard deviation was
$2100. Let be the mean tuition and fees for
x
2002-2003 for a random sample of 49 four-year
private U.S. colleges. Assume that n /N ≤ .05.
68
Example 7-6
a) What is the probability that the 2002-
2003 mean tuition and fees for this
sample was within $550 of the
population mean?
b) What is the probability that the 2002-
2003 mean tuition and fees for this
sample was lower than the
population mean by $400 or more?
69
Solution 7-6
µ x = µ = $18,273
σ 2100
σx = = = $300
n 49
70
Solution 7-6
a)
For x = $17,723:z = x − µ = 17,723 − 18,273 = −1.83
σx 300
For x = $18,823: z = x − µ = 18,823 − 18,273 = 1.83
σx 300
P (17,723 ≤x ≤ 18,823)
= P (-1.83 ≤ z ≤ 1.83)
= .4664 + .4664 =.
9328
71
Solution 7-6
a)
Therefore, the probability that the
2002-2003 mean tuition and fees for
this sample of 49 four-year private
U.S. colleges was within $550 of the
population mean is .9328.
72
Figure 7.13 P($17,723 ≤ x ≤ $18,823)
Shaded
area is .
9328
.4664 .4664
P ( x ≤ $17,873) = P (z ≤ -1.33)
= .5 - .4082
= .0918
74
Solution 7-6
b)
Therefore, the probability that the
2002-2003 mean tuition and fees for
this sample of 49 four-year private
U.S. colleges was lower than the
population mean by $400 or more is
.0918.
75
Figure 7.14 P ( x ≤ $17,873)
The
required
probability
is .0918
.4082
$17,87 µx= x
3 $18,273
-1.33 0 z
76
POPULATION AND SAMPLE
PROPORTIONS
The population and sample
proportions, denoted by p and p̂ ,
respectively, are calculated as
X x
p= and pˆ =
N n
77
POPULATION AND SAMPLE
PROPORTIONS cont.
where
N = total number of elements in the population
n = total number of elements in the sample
X = number of elements in the population that
possess a specific characteristic
x = number of elements in the sample that possess
a specific characteristic
78
Example 7-7
Suppose a total of 789,654 families
live in a city and 563,282 of them own
homes. A sample of 240 families is
selected from this city, and 158 of
them own homes. Find the proportion
of families who own homes in the
population and in the sample. Find the
sampling error.
79
Solution 7-7
X 563,282
p= = = .71
N 789,654
x 158
pˆ = = = .66
n 240
Sampling error = pˆ − p = .66 − .71 = −.05
80
MEAN, STANDARD DEVIATION, AND
SHAPE OF THE SAMPLING
p̂
DISTRIBUTION OF
p̂
Sampling Distribution of
Mean and Standard Deviationp̂ of
Shape of the Sampling Distributionp̂ of
81
Sampling Distribution p̂of
Definition
The probability distribution of the
p̂
sample proportion, , is called its
sampling distribution. It gives
p̂
various values that can assume
and their probabilities.
82
Example 7-8
Boe Consultant Associates has five
employees. Table 7.6 gives the names
of these five employees and
information concerning their
knowledge of statistics.
83
Table 7.6 Information on the Five Employees of
Boe Consultant Associates
Name Knows
Statistics
Ally yes
John no
Susan no
Lee yes
Tom yes
84
Example 7-8
If we define the population
proportion, p, as the proportion of
employees who know statistics, then
p = 3 / 5 = .60
85
Example 7-8
Now, suppose we draw all possible
samples of three employees each and
compute the proportion of
employees, for each sample, who
know statistics.
5! 5 ⋅ 4 ⋅ 3 ⋅ 2 ⋅1
Total number of samples = 5 C 3 = = = 10
3!(5 − 3)! 3 ⋅ 2 ⋅ 1 ⋅ 2 ⋅ 1
86
Table 7.7 All Possible Samples of Size 3 and the
Value of p̂ for Each Sample
f Relative
p̂ Frequency
.33 3 3/10 = .30
.67 6 6/10 = .60
1.00 1 1/10 = .10
Σf = 10 Sum = 1.00
88
Table 7.9 p̂ of
Sampling Distribution When the
Sample Size is 3
p̂ P ( p̂ )
.33 .30
.67 .60
1.00 .10
ΣP ( p̂ ) = 1.00
89
Mean and Standard
Deviation
p̂ of
Mean of the Sample Proportion
The mean of the sample p̂
proportion,µ p̂ , is denoted by and is
equal to the population proportion, p.
Thus,
µ pˆ = p
90
Mean and Standard
Deviation
p̂ of cont.
Standard Deviation of the Sample Proportion
The standard deviation of the sample
proportion, p̂ , is denoted byσ p̂ and is given
by the formula
pq
σ pˆ =
n
Where p is the population proportion, q = 1 –p ,
and n is the sample size. This formula is used
when n /N ≤ .05, where N is the population size.
91
Mean and Standard
Deviation
p̂ of cont.
If the n /N ≤ .05 condition is not
satisfied, we use the following
formula
σ p̂ to calculate :
pq N −n
σ pˆ =
n N −1
N −n
where the factorN − 1is called the
finite population correction factor
92
Shape of the Sampling
Distribution ofp̂
Central Limit Theorem for Sample Proportion
According to the central limit theorem, the
sampling distribution ofp̂ is approximately
normal for sufficiently large sample size. In
the case of proportion, the sample size is
considered to be sufficiently large if np and nq
are both greater than 5 – that is if
np > 5 and nq >5
93
Example 7-9
The National Survey of Student Engagement
shows about 87% of freshmen and seniors rate
their college experience as “good” or “excellent”
(USA TODAY, November 12, 2002). Assume this
result is true for the current population of all
freshmen and seniors. Let be the proportion of
p̂
freshmen and seniors in a random sample of 900
who hold this view. Find the mean and standard
deviation of and describe the shape of its
sampling distribution. p̂
94
Solution 7-9
95
Solution 7-9
np and nq are both greater then 5
Therefore, the sampling distribution
p̂
of is approximately normal with a
mean of .87 and a standard deviation
of .001, as shown in Figure 7.15
96
Figure 7.15
Approximate
ly normal σ p̂ =.011
µ p̂ = .87 p̂
97