
LESSON 5 SAMPLING DISTRIBUTION OF SAMPLE MEANS

We've been examining Z-scores and the probability of obtaining individual scores within a normal
distribution, but inferential statistics involve samples of more than one score. To transition into
inferential statistics, it is important that we understand how probability relates to sample means,
not just individual scores.
Inferential statistics: use a sample to infer something about the population
Often it is not possible to measure everyone in a population. Samples are convenient
representations of the population. If you take multiple samples of the same size from a population,
they are likely to give different results. Samples vary!
It is quite likely that a particular sample won't reflect the population exactly
Discrepancy between sample & population = sampling error
The term "sampling error" does not mean a sampling mistake; rather, it indicates that means drawn
from multiple samples taken from a population will vary from each other due to random chance and
therefore may deviate from the population mean.
What is a distribution of Sample Means (Sampling Distribution of the Mean)?
A distribution of sample means ($\bar{X}$): a distribution of a statistic [in this case a sample mean] over
repeated sampling from a specified population


Based on all possible random samples of size n, from a population
Can inform us of the degree of sample-to-sample variability we should expect due to chance
EXAMPLE: Suppose we have a population:
6, 7, 8, 9
$\mu = 7.5$
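Let's take all possible samples of size n = 2 (drawn with replacement) from this population:
 1)  6, 6   $\bar{X}$ = 6.0
 2)  6, 7   $\bar{X}$ = 6.5
 3)  6, 8   $\bar{X}$ = 7.0
 4)  6, 9   $\bar{X}$ = 7.5
 5)  7, 6   $\bar{X}$ = 6.5
 6)  7, 7   $\bar{X}$ = 7.0
 7)  7, 8   $\bar{X}$ = 7.5
 8)  7, 9   $\bar{X}$ = 8.0
 9)  8, 6   $\bar{X}$ = 7.0
10)  8, 7   $\bar{X}$ = 7.5
11)  8, 8   $\bar{X}$ = 8.0
12)  8, 9   $\bar{X}$ = 8.5
13)  9, 6   $\bar{X}$ = 7.5
14)  9, 7   $\bar{X}$ = 8.0
15)  9, 8   $\bar{X}$ = 8.5
16)  9, 9   $\bar{X}$ = 9.0

What do you notice?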
$\bar{X}$ is rarely exactly $\mu$, but most $\bar{X}$ are close to (or cluster around) $\mu$
Extreme values of $\bar{X}$ are rare
You can determine the exact probability of obtaining a particular $\bar{X}$
p($\bar{X}$ < 7)? = 3/16
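The enumeration for the population 6, 7, 8, 9 can be checked with a short script. This is a minimal sketch, assuming the 16 samples are ordered pairs drawn with replacement; the variable names are illustrative and not part of the lesson.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [6, 7, 8, 9]
n = 2  # sample size

# All ordered samples of size n drawn with replacement: 4**2 = 16 samples.
samples = list(product(population, repeat=n))

# Exact sample means, kept as fractions to avoid rounding.
sample_means = [Fraction(sum(s), n) for s in samples]

# How often each distinct sample mean occurs.
distribution = Counter(sample_means)

# Exact probability that a sample mean is below 7.
p_below_7 = Fraction(sum(1 for m in sample_means if m < 7), len(sample_means))

# Mean of all sample means, to compare with the population mean.
mean_of_means = sum(sample_means) / len(sample_means)

for mean, count in sorted(distribution.items()):
    print(float(mean), count)
print("p(mean < 7) =", p_below_7)
print("mean of sample means =", float(mean_of_means))
```

The frequency table shows the clustering around the population mean, and `mean_of_means` equals the population mean of 7.5.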
Important properties of the sampling distribution of means:
1. Mean
2. Standard Deviation
3. Shape

1. The Mean
The mean of the distribution of sample means is the mean of the population: $\mu_{\bar{X}} = \mu$


The mean of the distribution of sample means is called the expected value of $\bar{X}$
$\bar{X}$ is an unbiased estimate of $\mu$: on average, the sample mean produces a value that exactly
matches the population mean


2. The Standard Deviation of the Distribution of Sample Means
$\sigma_{\bar{X}}$ = Standard Error of the Mean
Variability of $\bar{X}$ around $\mu$
Special type of standard deviation, type of "error"
Average amount by which $\bar{X}$ deviates from $\mu$
Less error = better, more reliable estimate of the population parameter
$\sigma_{\bar{X}}$ is influenced by two things:
(1) Sample size (n)
Larger n = smaller standard errors
Note: when n = 1, $\sigma_{\bar{X}} = \sigma$; with $\sigma$ as the starting point for $\sigma_{\bar{X}}$,
$\sigma_{\bar{X}}$ gets smaller as n increases
(2) Variability in population ($\sigma$)
Larger $\sigma$ = larger standard errors
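Both influences follow from the standard error formula used later in this lesson, the population standard deviation divided by the square root of n. A minimal sketch of how the standard error shrinks as n grows; the function name `standard_error` and the values of sigma are illustrative assumptions.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

# Effect of sample size: same population spread, increasing n.
sigma = 15.0  # illustrative population standard deviation
for n in (1, 4, 25, 100):
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n):.2f}")

# Effect of population variability: same n, increasing sigma.
for s in (5.0, 10.0, 20.0):
    print(f"sigma = {s:4.1f}  standard error = {standard_error(s, 25):.2f}")
```

With n = 1 the function returns sigma itself, which matches the note that sigma is the starting point.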
3. The Shape
Central Limit Theorem = Distribution of sample means will approach a normal distribution as n
approaches infinity. Very important! True even when raw scores NOT normal!
What about sample size?
(1) If raw scores ARE normal, any n will do
(2) If raw scores are not normal but are symmetrically distributed, a small n will usually suffice
(3) If the raw scores are severely skewed, n must be sufficiently large
For most distributions, $n \ge 30$ is sufficient
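The effect of sample size on shape can be illustrated with a small simulation. This is only a sketch under stated assumptions: the raw scores are drawn from an exponential population (strongly right-skewed, chosen here purely for illustration), and skewness is used as a rough indicator of departure from the symmetric normal shape. The function names and the seed are hypothetical choices.

```python
import random
import statistics

def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric shape)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def simulate_sample_means(n, replications=5000, seed=0):
    """Draw `replications` samples of size n from a skewed (exponential)
    population and return the list of their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replications)
    ]

for n in (2, 30):
    means = simulate_sample_means(n)
    print(f"n = {n:2d}  skewness of sample means = {skewness(means):.2f}")
```

The skewness of the sample means should be noticeably closer to zero for n = 30 than for n = 2, which is the pattern the Central Limit Theorem describes.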
Why are Sampling Distributions Important?
Tell us the probability of getting a particular $\bar{X}$, given $\mu$ & $\sigma$
Critical for inferential statistics!


Allow us to estimate population parameters
Allow us to determine if a sample mean differs from a known population mean just because of
chance
Allow us to determine whether differences between sample means are due to chance or to
experimental treatment
Sampling distribution is the most fundamental concept underlying all statistical tests
Working with the Distribution of Sample Means

We can use the Normal Curve & Table

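If we assume the distribution of sample means (DSM) is normal (again, we can do this if raw scores are
normally distributed or n is at least 30) AND if we know $\mu$ & $\sigma$:

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \quad \text{where: } \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$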

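Example: Suppose you take a sample of 25 high-school students, and measure their IQ. Assuming that
IQ is normally distributed with $\mu = 100$ and $\sigma = 15$, what is the probability that your sample's
mean IQ will be 105 or greater?

Step 1: Convert to Z-score:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{25}} = \frac{15}{5} = 3$$

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{105 - 100}{3} = 1.67$$

Step 2: See the normal curve table

The probability of the sample having a mean of 105 or greater is: 0.0475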
Example: Repeat the same problem in the previous example, but assume your sample size is 64

Step 1: Convert to Z-score:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{64}} = \frac{15}{8} = 1.875$$

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{105 - 100}{1.875} = 2.67$$

Step 2: See the normal curve table

The probability of the sample having a mean of 105 or greater is: 0.0038
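The two IQ examples can be checked numerically. A minimal sketch using Python's standard-library `statistics.NormalDist` in place of the printed normal curve table; the helper name `prob_mean_at_least` is a hypothetical choice, not something from the lesson.

```python
import math
from statistics import NormalDist

def prob_mean_at_least(x_bar, mu, sigma, n):
    """Return (z, P(sample mean >= x_bar)), assuming the sampling
    distribution of the mean is normal."""
    standard_error = sigma / math.sqrt(n)
    z = (x_bar - mu) / standard_error
    # Upper-tail area of the standard normal curve beyond z.
    return z, 1.0 - NormalDist().cdf(z)

for n in (25, 64):
    z, p = prob_mean_at_least(105, 100, 15, n)
    print(f"n = {n}: z = {z:.2f}, p = {p:.4f}")
```

Because the script keeps z unrounded, the tail area for n = 25 may differ from the table value of 0.0475 in the third or fourth decimal place; the table answer comes from rounding z to 1.67 first.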
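Example: What $\bar{X}$ marks the point above which sample means are likely to occur only 15% of the
time, if n = 36?

Step 1: Find $\sigma_{\bar{X}}$:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{36}} = \frac{15}{6} = 2.5$$

Step 2: See the normal curve table: Z = ?

Step 3: Solve for $\bar{X}$: $\bar{X} = \mu + Z\sigma_{\bar{X}}$

$$\bar{X} = 100 + (?)(2.5) = 102.6$$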
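The reverse lookup in this example (from a tail probability back to a sample mean) can be sketched with the inverse normal CDF. This assumes the same IQ population as the earlier examples (mean 100, standard deviation 15); `NormalDist.inv_cdf` stands in for reading the table backwards, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 100, 15, 36
upper_tail = 0.15  # sample means above the cutoff occur 15% of the time

# Step 1: standard error of the mean.
standard_error = sigma / math.sqrt(n)

# Step 2: z-score with 85% of the area below it (15% above).
z = NormalDist().inv_cdf(1.0 - upper_tail)

# Step 3: convert the z-score back to the sample-mean scale.
cutoff = mu + z * standard_error

print(f"standard error = {standard_error}")
print(f"z = {z:.2f}")
print(f"cutoff sample mean = {cutoff:.1f}")
```

Rounded to one decimal place, the cutoff agrees with the 102.6 given in the example.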

Summary of some important properties of the sampling distribution of the mean from the
selection with replacement of random samples of size n from a population having a mean $\mu$ and a
standard deviation $\sigma$:
1. The mean of the sampling distribution of $\bar{X}$ is equal to the mean of the population. That is:
$\mu_{\bar{X}} = \mu$
2. The standard error of $\bar{X}$ is equal to the population standard deviation divided by the square
root of the sample size. That is: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
3. The sampling distribution of $\bar{X}$ is approximately normally distributed (when n is sufficiently
large or the raw scores are normal).

EXERCISE:
1. There are five workers in an RTW factory. Their wages per hour are as follows:
WORKER    HOURLY WAGE (Php)
A         10
B         11
C         12
D         13
E         14
a) Compute the population mean and the population standard deviation.
b) Compute the means of all possible samples of two workers each, when sampling is without
replacement.
c) Compute the mean of the sample means and the standard error.
d) Draw histogram for both the original data and the data for sample mean.
e) Compare the population mean and the sample mean.
f) Compare the standard error and the population standard deviation.
Solution:
a) Construct the following table for easy computation:

Table 1.a
Workers   $X_i$ (wages/hr)   $f_i$   $X_i f_i$   $X_i - \mu$   $(X_i - \mu) f_i$   $(X_i - \mu)^2 f_i$
A         10
B         11
C         12
D         13
E         14
TOTAL     60

Population mean:

$$\mu = \frac{\sum X_i f_i}{N}$$

Population standard deviation:

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2 f_i}{N}}$$

b) Possible combinations of 2 workers each taken from 5 workers when sampling is without
replacement and their corresponding means:

Table 1.b
Possible Samples   Sample Means ($\bar{X}$)
AB                 (10+11)/2 = 10.5
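c) Mean of the sample means and the standard error:

Table 1.c
$\bar{X}_i$ (sample means)   $f_i$   $\bar{X}_i f_i$   $\bar{X}_i - \mu_{\bar{X}}$   $(\bar{X}_i - \mu_{\bar{X}})^2$   $(\bar{X}_i - \mu_{\bar{X}})^2 f_i$   $P(\bar{X}_i) = f_i / \sum f_i$
10.5                         1       10.5              -1.5                          2.25                              2.25                                  1/10

The mean of the sample means is:

$$\mu_{\bar{X}} = \frac{\sum \bar{X}_i f_i}{\sum f_i}$$

The standard deviation of the sample means is:

$$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2 f_i}{\sum f_i}}$$

where $\sum f_i$ is the total number of possible samples.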
d) Draw the two (2) histograms using data from Table 1.a and Table 1.c

e) Comparison of the population mean and the sample mean.


f) Comparison of the standard error and the population standard deviation.
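The table computations in this exercise can be cross-checked with a short script. This is a sketch, not the worked solution: it enumerates the samples with `itertools.combinations` (sampling without replacement, order ignored) and compares the result with the finite-population standard error formula given later in this handout. Variable names are illustrative.

```python
import math
from itertools import combinations
from statistics import fmean, pstdev

wages = {"A": 10, "B": 11, "C": 12, "D": 13, "E": 14}
values = list(wages.values())
N = len(values)  # population size
n = 2            # sample size

# a) Population mean and population standard deviation.
mu = fmean(values)
sigma = pstdev(values)

# b) Means of all possible samples of n workers, without replacement.
sample_means = {
    "".join(pair): fmean(wages[w] for w in pair)
    for pair in combinations(wages, n)
}

# c) Mean of the sample means and the standard error (their standard deviation).
mean_of_means = fmean(sample_means.values())
se_empirical = pstdev(sample_means.values())

# Standard error from the formula with the finite population correction.
se_formula = (sigma / math.sqrt(n)) * math.sqrt((N - n) / (N - 1))

print("population mean:", mu, " population sd:", round(sigma, 4))
for label, mean in sample_means.items():
    print(label, mean)
print("mean of sample means:", mean_of_means)
print("standard error:", round(se_empirical, 4), "formula:", round(se_formula, 4))
```

Changing `n` to 3 gives the corresponding check for Exercise 2.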
EXERCISE 2. USING THE SAME DATA FROM EXERCISE 1
a) Compute the means of all possible samples of three (3) workers each, when sampling is
without replacement.
b) Compute the mean of the sample means and the standard error.
c) Draw histogram for both the original data and the data for sample mean.
d) Compare the population mean and the sample mean.
e) Compare the standard error and the population standard deviation.
ADDITIONAL LESSON: CENTRAL LIMIT THEOREM

Standard Error of the Mean:

From the previous lesson: the mean of the sampling distribution of $\bar{X}$ equals the mean of the
population sampled, and the standard deviation of the sampling distribution is smaller than the standard
deviation of the population.
Remember:
1. The concept of the standard error of the mean is significant because it measures the degree of
accuracy of the sample mean $\bar{X}$ as an estimate of the population mean ($\mu$).
2. We have a good estimate if the standard error $\sigma_{\bar{X}}$ is small or close to 0, and a poor
estimate if $\sigma_{\bar{X}}$ is large.
3. The value of $\sigma_{\bar{X}}$ depends on the size (n) of the sample. As n increases, $\sigma_{\bar{X}}$
decreases. Thus, in order to obtain a relatively good estimate of $\mu$, n must be sufficiently large.
4. When random samples of size n are drawn without replacement from a population with a mean
$\mu$ and a standard deviation $\sigma$, the sampling distribution of $\bar{X}$ has the mean
$\mu_{\bar{X}} = \mu$ and the standard deviation:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \quad \text{for a finite population of size } N$$

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \quad \text{for an infinite population}$$

$\sqrt{\dfrac{N - n}{N - 1}}$ = finite population correction factor
Used whenever we sample from a population which is finite and
sampling is done without replacement.
When N is large relative to n, this correction factor will be very
close to one (1) and may be omitted.
Rule of Thumb: Ignore the correction factor when the sample size is no more than 5% of the population
size, that is, when $n \le 0.05N$.
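The size of the correction factor, and why it can be dropped for small sampling fractions, can be seen in a short sketch. The function names and the population and sample sizes below are illustrative assumptions, not values from the lesson.

```python
import math

def finite_population_correction(N, n):
    """Correction factor sqrt((N - n) / (N - 1)) for sampling without
    replacement from a finite population of size N."""
    return math.sqrt((N - n) / (N - 1))

def standard_error(sigma, n, N=None):
    """Standard error of the mean; applies the finite population
    correction only when a population size N is supplied."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= finite_population_correction(N, n)
    return se

# The factor approaches 1 as the sampling fraction n / N shrinks.
for N, n in ((500, 250), (500, 25), (10_000, 100)):
    fraction = n / N
    factor = finite_population_correction(N, n)
    print(f"N = {N:6d}  n = {n:4d}  n/N = {fraction:.2f}  factor = {factor:.4f}")
```

For sampling fractions at or below 5% the factor stays close to 1, so omitting it changes the standard error only slightly.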
Central Limit Theorem
o States that: If random samples of size n are drawn from a population (finite or infinite), then as n
becomes larger, the sampling distribution of the mean approaches the normal distribution, regardless
of the form of the population distribution.
o It is an important justification for the use of large samples. The central limit theorem assures us that
no matter what the shape of the population distribution is, the sampling distribution of the mean is
approximately normally distributed whenever n is large.
o It justifies the use of the formula:

$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \quad \text{OR} \quad Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

The formula is used when computing the probability that $\bar{X}$ will take on a value within a
given range in the sampling distribution of $\bar{X}$.
o A sample is sufficiently large, and the sampling distribution approximately normal, whenever $n \ge 30$.
o If the population distribution is a normal distribution, the sampling distribution will always be
normally distributed, no matter how small n is.

EXAMPLE: A company manufactures electrical components that have a length of life that is
approximately normally distributed with a mean equal to 800 hours and a standard deviation of 60
hours. Find the probability that a random sample of 36 units will yield an average life that is more than
820 hours.
SOLUTION:
Given:
$\mu$ = 800 hours
n = 36
$\sigma$ = 60 hours
$\bar{X}$ = 820 hours
Required: P($\bar{X}$ > 820)
Formula:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{820 - 800}{60 / \sqrt{36}} = \frac{20}{10} = 2$$

Area under the normal curve between 0 and Z = 2: .4772
P($\bar{X}$ > 820) = P(z > 2) = .5 - .4772 = .0228
Thus: There is a 2.28% probability that a random sample of n = 36 units will yield an average life of
more than 820 hrs.
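The component-life example can also be verified by modelling the sampling distribution of the mean directly as a normal distribution with mean 800 and standard deviation equal to the standard error. A minimal sketch using the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu = 800      # population mean life in hours
sigma = 60    # population standard deviation in hours
n = 36        # sample size

# Standard error of the mean and the z-score for a sample mean of 820 hours.
standard_error = sigma / math.sqrt(n)
z = (820 - mu) / standard_error

# Sampling distribution of the mean, treated as normal.
sampling_distribution = NormalDist(mu=mu, sigma=standard_error)

# Probability that the sample mean exceeds 820 hours (upper tail).
p_more_than_820 = 1.0 - sampling_distribution.cdf(820)

print(f"standard error = {standard_error}")
print(f"z = {z}")
print(f"P(sample mean > 820) = {p_more_than_820:.4f}")
```

Working on the original scale with `NormalDist(mu, standard_error)` is equivalent to converting to a z-score first; both routes give the same tail area as the table-based solution.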


EXERCISE:
1. Find the finite population factor in each of the following:
a. N = 50, n = 10
b. N = 200, n = 80
c. N = 1000, n = 150
2. A random variable x is normally distributed with mean $\mu = 50$ and standard deviation $\sigma = 20$.
Assuming that the size of the universe is extremely large, what is the probability that the sample
mean takes on the following values if the sample size n = 100?
a. Less than 45
b. Greater than 54
c. Between 48 and 53
d. Between -10 and 55
3. The daily sales volume of brand X soy sauce at a grocery store is normally distributed with the mean
= Php50.00 and the standard deviation = Php10.00. What is the probability that the sample mean
takes on the following values if the sales record of 64 out of 100 days is taken as a random sample?
a. Greater than Php51.50
b. Less than Php40.25
4. Find the standard error of $\bar{X}$ when a random sample of size 20 is chosen from a population of
size 100 having a mean of 80 and a standard deviation of 4.
5. A random sample of size 64 is drawn from a normal population having a mean of 120 and a standard
deviation of 18. Find the probability that the mean of this sample will be:
a) More than 130
b) Between 120 and 125
c) Between 118 and 128
6. The average age of the 1000 employees of a manufacturing company is known to be 32.5 years and the
standard deviation is 5 years. Find the probability that a random sample of 36 employees will yield
an average age of less than 30 years.
7. An instructor gives his present class of 36 an examination which, as he knows from years of
experience, yields a mean of 65 and a standard deviation of 8. What is the probability that his class
will obtain a mean of more than 70?
8. The daily sales volume of a fruit gum sold by a sidewalk vendor is normally distributed with the
mean = 30 and the standard deviation = 10. What is the probability that the sample mean takes on the
following values if a sample size n = 64 is taken from the fairly large universe?
a) Less than 28 pesos
b) Greater than 31 pesos
c) Between 35 pesos and 100 pesos
9. The weights of machine parts in inventory are normally distributed with the mean = 75 grams and the
standard deviation = 10 grams. What is the probability that the sample mean takes on the following
values if a sample of size n = 400 is taken out of the universe N = 1000?
a) Less than 74 grams
b) Between 74.8 grams and 80 grams
c) Greater than 75.5 grams
d) Between 74.2 grams and 75.3 grams
