Sampling theory:
We look at sample characteristics to gain
understanding of the population at hand.
Recognize random variables and random
samples.
Important Statistics
- Mean: the location of a sample.
- Variance: the spread of a sample.
Sample mean
\[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \sum_{i=1}^{n} \frac{X_i}{n} \]
Sample variance
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \qquad (*) \]
or
\[ S^2 = \frac{n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2}{n(n - 1)} \qquad (**) \]
\[ \bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right), \qquad \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \]
Definition 5.1.1
If $X_i \sim N(\mu, \sigma^2)$, then
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has a t-distribution with $\nu = n - 1$ degrees of freedom.
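Cases
- CLT: $\bar{X}$ is approximately normal; standardization $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
- Normal: $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
- t-dist: $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$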
Example 5.1.1:
A factory produces bulbs whose lifetimes are
approximately normally distributed with
mean 600 hours and standard deviation
18 hours. Find the probability that the average
lifetime of a sample of n = 9 bulbs is less
than 585 hours.
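The standardization of the sample mean can be applied directly to this example. The sketch below uses only the Python standard library; the variable names are illustrative and not part of the original notes.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600.0, 18.0, 9   # population mean, population sd, sample size
x_bar = 585.0                   # threshold for the sample mean

# Standardize the sample mean: Z = (x_bar - mu) / (sigma / sqrt(n))
z = (x_bar - mu) / (sigma / sqrt(n))

# P(X_bar < 585) = P(Z < z) under the standard normal distribution
prob = NormalDist().cdf(z)

print(f"z = {z:.2f}, P(X_bar < 585) = {prob:.4f}")
```

With these numbers the standardized value is z = -2.5, so the probability is small (about 0.006).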
Example 5.1.2:
Suppose a manufacturer is interested in the
average daily production of a machine; more
specifically, in the probability that the
machine produces on average more than
100 items per day. It is known that the production
follows a normal distribution with mean $\mu$ and
variance $\sigma^2$. The manufacturer measured the
production of 11 machines, yielding the following
data:
115 82 98 126 109 143 136 92 103 127 150
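As a first step for this example, the sample mean and sample variance of the 11 observations can be computed. The sketch below evaluates the variance with both formula (*) and formula (**) to show that they agree; the variable names are illustrative only.

```python
data = [115, 82, 98, 126, 109, 143, 136, 92, 103, 127, 150]
n = len(data)

# Sample mean: sum of the observations divided by n
x_bar = sum(data) / n

# Formula (*): sum of squared deviations from the mean, divided by n - 1
s2_def = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Formula (**): computational form based on the raw sums
s2_comp = (n * sum(x * x for x in data) - sum(data) ** 2) / (n * (n - 1))

print(f"n = {n}, sample mean = {x_bar:.4f}")
print(f"S^2 via (*) = {s2_def:.4f}, S^2 via (**) = {s2_comp:.4f}")
```

Formula (**) avoids computing the mean first, which is convenient by hand, while formula (*) is usually the more numerically stable of the two in floating-point arithmetic.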
Statistical Inference
Divided into
- Estimation
  - Point estimator
  - Interval estimator
- Tests of hypothesis
Point Estimator
Interval Estimates:
Confidence Interval
An interval within which the true value of the
parameter falls with some level of
confidence.
Example:
A 95% confidence interval for $\mu$ means
that we are 95% confident that the value of
$\mu$ lies in that interval.
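\[ \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
where $z_{\alpha/2}$ is the z-value leaving an area of $\alpha/2$ to the right.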
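Confidence intervals for $\mu$
- $\sigma$ known: $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
- $\sigma$ unknown, normal population, $n \geq 30$: $\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$
- $\sigma$ unknown, $n < 30$: $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$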
Example 5.2.1.2
Measurements of the weights of a random
sample of 200 containers made by a
certain machine showed a mean of 0.21
kilograms and it is known that $\sigma = 0.002$
kilograms. Find the 95% confidence
interval for the mean weight of all the
containers.
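Because the population standard deviation is given, the z-based interval applies here. A minimal sketch, using the standard library's `NormalDist` to obtain the critical value rather than a printed table (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 0.21, 0.002, 200   # sample mean, known sd, sample size
conf_level = 0.95
alpha = 1 - conf_level

# z-value leaving an area of alpha/2 to the right (about 1.96 for 95%)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Margin of error: z_{alpha/2} * sigma / sqrt(n)
margin = z_crit * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"{conf_level:.0%} CI for the mean: ({lower:.5f}, {upper:.5f})")
```

The interval is narrow because the sample is large and the standard deviation is small relative to the mean.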
Example
The average calcium content of 36
samples taken from different locations is
found to be 1.6 grams per millilitre. Find
the 90% confidence interval for the mean
calcium content of the river. Assume that
the sample standard deviation is 0.2.
Example 5.2.1.5
A machine produces containers that are
cylindrical in shape. Nine containers are
randomly chosen and the diameters are
10.01, 9.97, 10.03, 10.04, 9.99, 9.98, 9.99,
10.01, and 10.03 centimetres. Find a 99%
confidence interval for the mean diameter
of containers from this machine, assuming
an approximate normal distribution.
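With only nine observations and an unknown standard deviation, the t-based interval is the natural choice. The sketch below hard-codes the critical value $t_{0.005}$ for 8 degrees of freedom (3.355, read from a standard t table), since the Python standard library has no t-distribution; that constant is an assumption of this sketch, not a computed value.

```python
from math import sqrt
from statistics import mean, stdev

diameters = [10.01, 9.97, 10.03, 10.04, 9.99, 9.98, 9.99, 10.01, 10.03]
n = len(diameters)

x_bar = mean(diameters)   # sample mean
s = stdev(diameters)      # sample standard deviation (divisor n - 1)

# Assumption: t_{alpha/2} for alpha = 0.01 and n - 1 = 8 degrees of freedom,
# taken from a standard t table rather than computed.
T_CRIT = 3.355

# Margin of error: t_{alpha/2} * s / sqrt(n)
margin = T_CRIT * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"x_bar = {x_bar:.4f}, s = {s:.4f}")
print(f"99% CI for the mean diameter: ({lower:.4f}, {upper:.4f})")
```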
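\[ \hat{p} - z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \]
where $\hat{q} = 1 - \hat{p}$.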
Example 5.2.1.6
In a random sample of n = 600 families
owning television sets in a city, it is found
that x = 240 subscribed to ASTRO. Find a
95% confidence interval for the actual
proportion of families in this city who
subscribe to ASTRO.
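The large-sample interval for a proportion can be evaluated for this example as follows. This is a sketch with illustrative variable names, using the standard library for the normal critical value.

```python
from math import sqrt
from statistics import NormalDist

n, x = 600, 240          # sample size and number of subscribers
p_hat = x / n            # sample proportion
q_hat = 1 - p_hat

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

# Margin of error: z_{alpha/2} * sqrt(p_hat * q_hat / n)
margin = z_crit * sqrt(p_hat * q_hat / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat = {p_hat:.3f}")
print(f"95% CI for p: ({lower:.4f}, {upper:.4f})")
```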
\[ \frac{(n - 1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}} \]
Example 5.2.1.7
The following are the weights, in grams, of
10 packages of sugar packed by a worker:
454, 451, 458, 450, 451, 459, 458, 459,
452 and 450.
Find a 95% confidence interval for the
variance of all such packages of sugar
packed by this worker, assuming a normal
population.
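The chi-square interval for the variance can be sketched for this data set as shown below. The two critical values for 9 degrees of freedom (19.023 and 2.700) are read from a standard chi-square table and hard-coded, because the Python standard library has no chi-square distribution; they are assumptions of this sketch.

```python
from statistics import variance

weights = [454, 451, 458, 450, 451, 459, 458, 459, 452, 450]
n = len(weights)

s2 = variance(weights)   # sample variance (divisor n - 1)

# Assumption: table values for 9 degrees of freedom and alpha = 0.05.
CHI2_UPPER = 19.023      # chi^2_{alpha/2}: area 0.025 to the right
CHI2_LOWER = 2.700       # chi^2_{1 - alpha/2}: area 0.975 to the right

# (n - 1) s^2 / chi^2_{alpha/2} < sigma^2 < (n - 1) s^2 / chi^2_{1 - alpha/2}
lower = (n - 1) * s2 / CHI2_UPPER
upper = (n - 1) * s2 / CHI2_LOWER

print(f"s^2 = {s2:.4f}")
print(f"95% CI for the variance: ({lower:.3f}, {upper:.3f})")
```

Note that the larger critical value gives the lower limit, since it appears in the denominator.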
Hypothesis Testing
Example 5.2.2.1
Suppose a manufacturer observes that the
existing procedure gives about 4%
defective products. The engineer would
like to implement a new procedure to
reduce the number of defective products.
It was agreed that n=100 products would
be produced using the new procedure. Let
X equal the number of these 100 products
that are defective. State H0 and H1.
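One plausible way to state the hypotheses, assuming $p$ denotes the true proportion of defective products under the new procedure (a symbol introduced here for illustration), is as a left-tailed test, since the aim is to show a reduction:

```latex
\[ H_0: p = 0.04 \qquad \text{versus} \qquad H_1: p < 0.04 \]
```

Under $H_0$, and assuming the 100 products are independent, $X$ would follow a binomial distribution with $n = 100$ and $p = 0.04$.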
P(type I error)
= P(reject H0 when H0 is true) = $\alpha$.
P(type II error)
= P(accept H0 when H0 is false) = $\beta$.
             H0 true             H0 false
Accept H0    Correct decision    Type II error
Reject H0    Type I error        Correct decision
                   TWO SIDED     ONE SIDED (LEFT SIDE)   ONE SIDED (RIGHT SIDE)
Symbol in H0       =             =                       =
Symbol in H1       ≠             <                       >
Rejection region   Both tails    Left tail               Right tail
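Test on the mean, $\sigma$ known. Test statistic: $z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
- $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$: reject $H_0$ if $z > z_{\alpha}$
- $H_0: \mu = \mu_0$ vs $H_1: \mu < \mu_0$: reject $H_0$ if $z < -z_{\alpha}$
- $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$: reject $H_0$ if $|z| > z_{\alpha/2}$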
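Test on the mean, $\sigma$ unknown, $n \geq 30$. Test statistic: $z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
- $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$: reject $H_0$ if $z > z_{\alpha}$
- $H_0: \mu = \mu_0$ vs $H_1: \mu < \mu_0$: reject $H_0$ if $z < -z_{\alpha}$
- $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$: reject $H_0$ if $|z| > z_{\alpha/2}$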
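Test on the mean, $\sigma$ unknown, $n < 30$. Test statistic: $t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$, with $n - 1$ degrees of freedom
- $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$: reject $H_0$ if $t > t_{\alpha}$
- $H_0: \mu = \mu_0$ vs $H_1: \mu < \mu_0$: reject $H_0$ if $t < -t_{\alpha}$
- $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$: reject $H_0$ if $|t| > t_{\alpha/2}$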
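Decision rule (p-value approach, $\sigma$ known): reject $H_0$ if the p-value is less than $\alpha$, where $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ is the observed test statistic.
- $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$: p-value $= P(Z > z)$
- $H_0: \mu = \mu_0$ vs $H_1: \mu < \mu_0$: p-value $= P(Z < z)$
- $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$: p-value $= 2P(Z > |z|)$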
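Decision rule (p-value approach, $\sigma$ unknown, $n \geq 30$): reject $H_0$ if the p-value is less than $\alpha$, where $z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ is the observed test statistic.
- $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$: p-value $= P(Z > z)$
- $H_0: \mu = \mu_0$ vs $H_1: \mu < \mu_0$: p-value $= P(Z < z)$
- $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$: p-value $= 2P(Z > |z|)$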
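Decision rule (p-value approach, $\sigma$ unknown, $n < 30$): reject $H_0$ if the p-value is less than $\alpha$, where $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ is the observed test statistic and $T$ has a t-distribution with $n - 1$ degrees of freedom.
- $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$: p-value $= P(T > t)$
- $H_0: \mu = \mu_0$ vs $H_1: \mu < \mu_0$: p-value $= P(T < t)$
- $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$: p-value $= 2P(T > |t|)$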
Example 5.2.2.6
A random sample of 100 electronic chips
showed an average lifetime of 2.8 years.
Assuming a population standard deviation
of 0.5 years, does this seem to indicate
that the mean lifetime is greater than 2.7
years? Use a 0.05 level of significance.
Carry out the test using both the test
statistic approach and the p-value approach.
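Both approaches can be sketched for this example as a right-tailed z-test of $H_0: \mu = 2.7$ against $H_1: \mu > 2.7$. The hypotheses are this sketch's reading of the question, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma = 100, 2.8, 0.5   # sample size, sample mean, population sd
mu0, alpha = 2.7, 0.05            # null value and significance level

# Test statistic: z = (x_bar - mu0) / (sigma / sqrt(n))
z = (x_bar - mu0) / (sigma / sqrt(n))

std_normal = NormalDist()

# Test statistic approach: reject H0 if z > z_alpha (right tail)
z_crit = std_normal.inv_cdf(1 - alpha)
reject_by_statistic = z > z_crit

# p-value approach: reject H0 if P(Z > z) < alpha
p_value = 1 - std_normal.cdf(z)
reject_by_p_value = p_value < alpha

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_by_statistic else "Do not reject H0")
```

The two approaches always lead to the same decision; here both reject $H_0$ at the 0.05 level.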
Example
Suppose that 150 MMU students were
tested for their Intelligence Quotient (IQ).
From the data, the average IQ was 120
with a standard deviation of 11.3. An MMU
professor claims that the overall
student IQ is different from 118. Using a
0.05 level of significance, determine whether the
professor is correct by using both the test
statistic and the p-value approach.
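Since the claim is that the mean differs from 118, a two-sided test is appropriate, and with n = 150 the sample standard deviation can stand in for $\sigma$ in the z statistic. A minimal sketch under those assumptions (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s = 150, 120.0, 11.3    # sample size, sample mean, sample sd
mu0, alpha = 118.0, 0.05          # null value and significance level

# Large-sample test statistic: z = (x_bar - mu0) / (s / sqrt(n))
z = (x_bar - mu0) / (s / sqrt(n))

std_normal = NormalDist()

# Test statistic approach: reject H0 if |z| > z_{alpha/2} (both tails)
z_crit = std_normal.inv_cdf(1 - alpha / 2)
reject_by_statistic = abs(z) > z_crit

# p-value approach: p-value = 2 * P(Z > |z|)
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_by_p_value = p_value < alpha

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_by_statistic else "Do not reject H0")
```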
Example 5.2.2.9
Test the hypothesis that the average
diameter of a certain type of battery
produced by a factory is 10 millimetres if
the diameters of a random sample of 10
batteries are 10.1, 9.8, 10.1, 10.5, 10.1,
9.7, 9.9, 10.4, 10.3 and 9.8 millimetres.
Use a 0.01 level of significance and
assume that the distribution of diameters
is normal. Carry out the test using
both the test statistic and p-value
approaches.
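The test statistic approach for this small-sample example can be sketched as a two-sided t-test of $H_0: \mu = 10$. The critical value $t_{0.005}$ for 9 degrees of freedom (3.250) is hard-coded from a standard t table, which is an assumption of this sketch since the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

diameters = [10.1, 9.8, 10.1, 10.5, 10.1, 9.7, 9.9, 10.4, 10.3, 9.8]
n = len(diameters)
mu0 = 10.0                # null value of the mean diameter

x_bar = mean(diameters)   # sample mean
s = stdev(diameters)      # sample standard deviation (divisor n - 1)

# Test statistic: t = (x_bar - mu0) / (s / sqrt(n)), with n - 1 = 9 df
t_stat = (x_bar - mu0) / (s / sqrt(n))

# Assumption: t_{alpha/2} for alpha = 0.01 and 9 df, from a standard t table.
T_CRIT = 3.250

# Two-sided rejection region: |t| > t_{alpha/2}
reject = abs(t_stat) > T_CRIT

print(f"x_bar = {x_bar:.3f}, s = {s:.4f}, t = {t_stat:.3f}")
print("Reject H0" if reject else "Do not reject H0")
```

The p-value approach needs the t-distribution's tail probability, $2P(T > |t|)$, which would have to come from a table or from a library such as SciPy (for example `scipy.stats.t.sf`).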
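Test on a proportion. Test statistic:
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} \]
- $H_1: p > p_0$: reject $H_0$ if $z > z_{\alpha}$
- $H_1: p < p_0$: reject $H_0$ if $z < -z_{\alpha}$
- $H_1: p \neq p_0$: reject $H_0$ if $|z| > z_{\alpha/2}$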
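Decision rule (p-value approach): reject $H_0$ if the p-value is less than $\alpha$, where $z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}$ is the observed test statistic.
- $H_0: p = p_0$ vs $H_1: p > p_0$: p-value $= P(Z > z)$
- $H_0: p = p_0$ vs $H_1: p < p_0$: p-value $= P(Z < z)$
- $H_0: p = p_0$ vs $H_1: p \neq p_0$: p-value $= 2P(Z > |z|)$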
Example 5.2.2.10:
A common medicine for relieving serious
pain is believed to be only 80% effective. A
new medicine is given to a random sample
of 100 adults who were suffering from
serious pain and it shows that 85 received
relief. Is this sufficient evidence to
conclude that the new medicine is superior
to the one commonly prescribed? Use a
0.05 level of significance.
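This example can be worked as a right-tailed test of $H_0: p = 0.8$ against $H_1: p > 0.8$, which is this sketch's reading of "superior". Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x = 100, 85            # sample size and number who received relief
p0, alpha = 0.80, 0.05    # null proportion and significance level
q0 = 1 - p0

p_hat = x / n             # sample proportion

# Test statistic: z = (p_hat - p0) / sqrt(p0 * q0 / n)
z = (p_hat - p0) / sqrt(p0 * q0 / n)

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha)   # right-tail critical value
p_value = 1 - std_normal.cdf(z)          # P(Z > z)

reject = z > z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

Here z falls short of the critical value, so the sample does not give sufficient evidence at the 0.05 level that the new medicine is superior.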
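Decision rule:
Test statistic:
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} \]
where $s^2$ = sample variance and $\sigma_0^2$ = null variance value, with $n - 1$ degrees of freedom.
- $H_0: \sigma^2 = \sigma_0^2$ vs $H_1: \sigma^2 > \sigma_0^2$: critical region $\chi^2 > \chi^2_{\alpha}$
- $H_0: \sigma^2 = \sigma_0^2$ vs $H_1: \sigma^2 < \sigma_0^2$: critical region $\chi^2 < \chi^2_{1-\alpha}$
- $H_0: \sigma^2 = \sigma_0^2$ vs $H_1: \sigma^2 \neq \sigma_0^2$: critical region $\chi^2 < \chi^2_{1-\alpha/2}$ or $\chi^2 > \chi^2_{\alpha/2}$
Reject H0 if the test statistic falls in the
critical region.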
Example 5.2.2.12
In paper manufacturing, a process is
considered out of control if the standard
deviation of the weight of a piece of paper
exceeds 1.25 grams. A random sample of
20 pieces of paper produced during a
routine check yields a standard deviation of
1.9 grams. At the 0.05 level of
significance, is the paper production
process out of control?
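This example can be sketched as a right-tailed chi-square test of $H_0: \sigma^2 = 1.25^2$ against $H_1: \sigma^2 > 1.25^2$. The critical value $\chi^2_{0.05}$ for 19 degrees of freedom (30.144) is hard-coded from a standard chi-square table, which is an assumption of this sketch.

```python
n = 20          # sample size
s = 1.9         # sample standard deviation (grams)
sigma0 = 1.25   # null value of the standard deviation (grams)

# Test statistic: chi^2 = (n - 1) s^2 / sigma0^2, with n - 1 = 19 df
chi2_stat = (n - 1) * s ** 2 / sigma0 ** 2

# Assumption: chi^2_{0.05} for 19 df, taken from a standard table.
CHI2_CRIT = 30.144

# Right-tailed critical region: chi^2 > chi^2_alpha
reject = chi2_stat > CHI2_CRIT

print(f"chi-square = {chi2_stat:.4f}, critical value = {CHI2_CRIT}")
print("Reject H0" if reject else "Do not reject H0")
```

Rejecting $H_0$ here corresponds to concluding that the process is out of control.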
Decision rule:
Reject $H_0$ if $\chi^2 > \chi^2_{\alpha}(k - 1)$, where $k$ is the number of classes.
Example 5.3.1.
Tossing a die 180 times yields 26
ones, 32 twos, 25 threes, 24 fours, 35
fives and 38 sixes. Test at the 0.01 level of
significance whether the data obtained
from the experiment has a discrete
uniform distribution.
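A goodness-of-fit calculation for this example can be sketched as follows, using the usual statistic $\chi^2 = \sum (o_i - e_i)^2 / e_i$ with expected frequency 180/6 = 30 for each face under a discrete uniform distribution. The critical value $\chi^2_{0.01}$ for k - 1 = 5 degrees of freedom (15.086) is hard-coded from a standard table, which is an assumption of this sketch.

```python
observed = [26, 32, 25, 24, 35, 38]   # counts of ones, twos, ..., sixes
k = len(observed)                     # number of classes
total = sum(observed)                 # 180 tosses

# Under H0 (discrete uniform), every face has the same expected frequency
expected = total / k

# Goodness-of-fit statistic: sum of (observed - expected)^2 / expected
chi2_stat = sum((o - expected) ** 2 / expected for o in observed)

# Assumption: chi^2_{0.01} for k - 1 = 5 df, taken from a standard table.
CHI2_CRIT = 15.086

# Decision rule: reject H0 if chi^2 > chi^2_alpha(k - 1)
reject = chi2_stat > CHI2_CRIT

print(f"expected = {expected:.1f}, chi-square = {chi2_stat:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

The statistic is well below the critical value, so at the 0.01 level the data are consistent with a discrete uniform distribution.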
Example 5.3.2.
Suppose the lifetimes X (in hours) of 40 bulbs are
recorded as follows:
Class boundaries (hours)    Observed frequencies
1.5 - 2.0
2.0 - 2.5
2.5 - 3.0                   11
3.0 - 3.5                   15
3.5 - 4.0
4.0 - 4.5