IEE 570 Advanced Quality Control


Instructor: Jing Li
Lecture notes #3

Chapter 3 Inference About Process Quality
The Need for Statistical Inference
In statistical quality control, a probability distribution is used to model a quality characteristic.
The parameters of the probability distribution are unknown and must be estimated:
  Point estimation
  Confidence interval estimation
The parameters of a process can vary over time. How do we identify a process change?
  Hypothesis testing

Point Estimation

Distribution   Parameter    Estimator
Normal         $\mu$        $\hat{\mu} = \bar{x}$
               $\sigma^2$   $\hat{\sigma}^2 = S^2$
               $\sigma$     $\hat{\sigma} = S/c_4$ (best) or $\hat{\sigma} = R/d_2$ (easy to compute); $c_4$ and $d_2$ are given in Appendix Table VI
Binomial       $p$          $\hat{p} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where each $x_i$ is either 1 or 0, corresponding to success and failure of the $i$-th Bernoulli trial, respectively
Poisson        $\lambda$    $\hat{\lambda} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

Interval Estimation
Estimate an interval between two statistics that includes the true value of the parameter with some probability.
Example: $\Pr\{L \le \mu \le U\} = 1-\alpha$
The interval $L \le \mu \le U$ is called a $100(1-\alpha)\%$ confidence interval (C.I.) for the unknown mean $\mu$.
Two-sided C.I. ($L$ is the lower confidence limit, $U$ is the upper confidence limit)
One-sided C.I.:
  lower $100(1-\alpha)\%$ C.I.: $L \le \mu$, $\Pr\{L \le \mu\} = 1-\alpha$
  upper $100(1-\alpha)\%$ C.I.: $\mu \le U$, $\Pr\{\mu \le U\} = 1-\alpha$
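
C.I. of Population Mean, Variance Known
If $x$ is a random variable with unknown mean $\mu$ and known variance $\sigma^2$, what is the confidence interval for the mean $\mu$?
Point estimator: $\bar{x} = \left(\sum_{i=1}^{n} x_i\right)/n$
The approximate distribution of $\bar{x}$ is $N(\mu, \sigma^2/n)$ regardless of the distribution of $x$, due to the central limit theorem.
Given significance level $\alpha$ (confidence level $1-\alpha$), then
$100(1-\alpha)\%$ two-sided confidence interval on $\mu$:
  $\bar{x} - Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$, where $\Pr\{z > Z_{\alpha/2}\} = \alpha/2$
$100(1-\alpha)\%$ upper confidence interval on $\mu$:
  $\mu \le \bar{x} + Z_{\alpha}\dfrac{\sigma}{\sqrt{n}}$
$100(1-\alpha)\%$ lower confidence interval on $\mu$:
  $\bar{x} - Z_{\alpha}\dfrac{\sigma}{\sqrt{n}} \le \mu$
$Z_{\alpha}$ is the percentage point of the $N(0,1)$ distribution such that $\Pr(z > Z_{\alpha}) = \alpha$.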

Example
The response time of a distributed computer system is an important quality characteristic. The system manager wants to estimate the mean response time to a specific type of command. From past experience, he knows that the standard deviation of response time is 8 millisec. The command is executed 25 times and the response time for each trial is recorded. The sample average response time is 79.25 millisec. Compute a 95% two-sided confidence interval for the mean response. Also compute a 95% lower confidence interval for the mean response.
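One way to check this example numerically is sketched below. It applies the known-variance formulas $\bar{x} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$ and $\bar{x} - Z_{\alpha}\,\sigma/\sqrt{n}$ using only the Python standard library; the function names `z_interval` and `z_lower_bound` are illustrative and not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, alpha=0.05):
    """Two-sided 100(1-alpha)% C.I. for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def z_lower_bound(xbar, sigma, n, alpha=0.05):
    """Lower 100(1-alpha)% confidence limit: xbar - Z_alpha * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha)  # Z_alpha
    return xbar - z * sigma / sqrt(n)


# Data from the example: xbar = 79.25, sigma = 8, n = 25.
lo, hi = z_interval(79.25, 8, 25)
lower = z_lower_bound(79.25, 8, 25)
print(f"95% two-sided C.I.: {lo:.2f} <= mu <= {hi:.2f}")
print(f"95% lower C.I.:     {lower:.2f} <= mu")
```

With these inputs the two-sided interval is roughly 76.11 to 82.39 millisec and the lower limit is roughly 76.62 millisec.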

Example
A chemical process converts lead to gold. However, the production varies due to the powers of the alchemist. It is known that the process is normally distributed, with a standard deviation of 2.5 g. How many samples must be taken to be 90% certain that an estimate of the mean process is within 1.5 g of the true but unknown mean yield?
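A short sketch of the sample-size calculation follows. It assumes the usual two-sided reading of "within 1.5 g", so the half-width $Z_{\alpha/2}\,\sigma/\sqrt{n}$ must not exceed the margin $E$, which gives $n \ge (Z_{\alpha/2}\,\sigma/E)^2$. The function name `sample_size_for_margin` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_margin(sigma, margin, confidence):
    """Smallest n with Z_{alpha/2} * sigma / sqrt(n) <= margin."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    return ceil((z * sigma / margin) ** 2)


# Data from the example: sigma = 2.5 g, margin = 1.5 g, 90% confidence.
n_required = sample_size_for_margin(sigma=2.5, margin=1.5, confidence=0.90)
print(n_required)
```

The unrounded value is about 7.5, so under this reading 8 samples are needed.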

Hypothesis Testing
Statistical hypothesis:
  A statement about the values of the parameters of a probability distribution, for example
  $H_0: \mu = 1.5$, $H_1: \mu \ne 1.5$
  $H_0: \mu = 1.5$, $H_1: \mu > 1.5$
  $H_0: \mu = 1.5$, $H_1: \mu < 1.5$
  $H_0$: Null hypothesis
  $H_1$: Alternative hypothesis (two-sided/one-sided)
Hypothesis testing:
  Make a hypothesis about what we believe to be true, then use sampled data to test it.
Conclusion:
  Compare the test statistic with a threshold value, then reject or fail to reject $H_0$.
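
Test Mean of a Population, Variance Known
$\alpha$: significance level / type I error
$\dfrac{\bar{x}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$
$H_0: \mu = \mu_0$
Test statistic: $Z_0 = \dfrac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}$
If $H_1: \mu \ne \mu_0$, reject $H_0$ when $|Z_0| > Z_{\alpha/2}$
If $H_1: \mu > \mu_0$, reject $H_0$ when $Z_0 > Z_{\alpha}$
If $H_1: \mu < \mu_0$, reject $H_0$ when $Z_0 < -Z_{\alpha}$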

Example
The response time of a distributed computer system is an important quality characteristic. The system manager wants to know whether the mean response time to a specific type of command exceeds 75 millisec. From past experience, he knows that the standard deviation of response time is 8 millisec. The command is executed 25 times and the response time for each trial is recorded. The sample average response time is 79.25 millisec. Formulate an appropriate hypothesis and test the hypothesis.
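The test can be sketched as follows, taking $H_0: \mu = 75$ against $H_1: \mu > 75$. The example does not state a significance level, so $\alpha = 0.05$ is an assumption made here for illustration; `z_statistic` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(xbar, mu0, sigma, n):
    """Z0 = (xbar - mu0) / (sigma / sqrt(n)) for a mean with known sigma."""
    return (xbar - mu0) / (sigma / sqrt(n))


alpha = 0.05  # assumed; not given in the example
z0 = z_statistic(xbar=79.25, mu0=75, sigma=8, n=25)
z_crit = NormalDist().inv_cdf(1 - alpha)  # Z_alpha for the upper-tailed test
reject_h0 = z0 > z_crit
print(f"Z0 = {z0:.3f}, Z_alpha = {z_crit:.3f}, reject H0: {reject_h0}")
```

Here $Z_0 \approx 2.66$ exceeds $Z_{0.05} \approx 1.645$, so under the assumed $\alpha$ the null hypothesis is rejected.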

The Use of P-Values in Hypothesis Testing
1. Traditional hypothesis testing:
  Given $\alpha$, determine whether the null hypothesis is rejected.
  Disadvantages:
    No information on how close to or far from the rejection region the test statistic is
    A predefined $\alpha$ may not reflect different decision makers' risk assessments
2. P-value approach:
  P-value: the smallest level of significance that would lead to rejection of the null hypothesis
  If the predefined $\alpha > P = \alpha_{\min}$, reject the null hypothesis.
[Figure: standard normal density $f(x)$ centered at $\mu = 0$, showing the tail area $\Phi(Z_0)$ for $Z_0 < 0$ and the tail area $1-\Phi(Z_0)$ for $Z_0 > 0$]

Use of P-Value for the Normal Distribution
$H_0: \mu = \mu_0$, standard normal statistic $Z_0 \sim N(0,1)$
$P = 2[1-\Phi(|Z_0|)]$ for two-sided $H_1: \mu \ne \mu_0$
$P = 1-\Phi(Z_0)$ for one-sided $H_1: \mu > \mu_0$
$P = \Phi(Z_0)$ for one-sided $H_1: \mu < \mu_0$

Example (Revisit)
The response time of a distributed computer system is an important quality characteristic. The system manager wants to know whether the mean response time to a specific type of command exceeds 75 millisec. From past experience, he knows that the standard deviation of response time is 8 millisec. The command is executed 25 times and the response time for each trial is recorded. The sample average response time is 79.25 millisec. Formulate an appropriate hypothesis and test the hypothesis. Compute the P-value.
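The P-value for this example can be sketched with the three standard-normal P-value formulas, $2[1-\Phi(|Z_0|)]$, $1-\Phi(Z_0)$ and $\Phi(Z_0)$. The helper `z_test_p_value` and its `alternative` labels are illustrative names, not notation from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(z0, alternative):
    """P-value of a Z test for H1 'two-sided', 'greater' or 'less'."""
    phi = NormalDist().cdf  # standard normal CDF, written Phi in the notes
    if alternative == "two-sided":
        return 2 * (1 - phi(abs(z0)))
    if alternative == "greater":
        return 1 - phi(z0)
    if alternative == "less":
        return phi(z0)
    raise ValueError(f"unknown alternative: {alternative!r}")


# H0: mu = 75 vs H1: mu > 75 with xbar = 79.25, sigma = 8, n = 25.
z0 = (79.25 - 75) / (8 / sqrt(25))
p_value = z_test_p_value(z0, "greater")
print(f"Z0 = {z0:.3f}, P = {p_value:.4f}")
```

The P-value comes out near 0.004, so the null hypothesis would be rejected at any predefined $\alpha$ larger than that.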

Some Useful Formulas for Normal Distribution
$Z_0 = \dfrac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}$
$Z_{\alpha}$ is defined such that, for $z \sim N(0,1)$, $P\{z > Z_{\alpha}\} = \alpha$
Therefore, $\Phi(Z_{\alpha}) = 1-\alpha$
$P\{z > a\} = 1 - P\{z \le a\}$
$P\{z \le -a\} = P\{z > a\}$
$P\{z > -a\} = P\{z \le a\}$
What is the P-value of a two-sided test on the population mean with $Z_0 = Z_{\alpha}$?
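
Inference on the Mean of a Normal Distribution
Unknown Variance
$\dfrac{\bar{x}-\mu}{s/\sqrt{n}} \sim t(n-1)$
$H_0: \mu = \mu_0$
Test statistic: $t_0 = \dfrac{\bar{x}-\mu_0}{s/\sqrt{n}}$
If $H_1: \mu \ne \mu_0$, reject $H_0$ when $|t_0| > t_{\alpha/2,\,n-1}$
If $H_1: \mu > \mu_0$, reject $H_0$ when $t_0 > t_{\alpha,\,n-1}$
If $H_1: \mu < \mu_0$, reject $H_0$ when $t_0 < -t_{\alpha,\,n-1}$
Two-sided C.I.:
  $\bar{x} - t_{\alpha/2,\,n-1}\dfrac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\dfrac{s}{\sqrt{n}}$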

Example
The mean time it takes a crew to restart an aluminum rolling mill after a failure is of interest. The crew was observed over 25 occasions, and the results were $\bar{x} = 26.42$ minutes and variance $S^2 = 12.28$ minutes$^2$. If repair time is normally distributed, find a 95% confidence interval on the true but unknown mean repair time. Test the hypothesis that the mean time equals 25 minutes. Use a two-sided alternative and $\alpha = 0.05$.
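A minimal sketch of this example follows. The Python standard library has no t quantile function, so the percentage point $t_{0.025,24} = 2.064$ is hard-coded from a standard t-table; treat that constant as an assumption to be checked against the table used in class.

```python
from math import sqrt

xbar, s2, n, mu0 = 26.42, 12.28, 25, 25.0
t_crit = 2.064  # t_{0.025,24}, read from a standard t-table (assumed value)

se = sqrt(s2 / n)                            # s / sqrt(n)
ci = (xbar - t_crit * se, xbar + t_crit * se)
t0 = (xbar - mu0) / se                       # test statistic for H0: mu = 25
reject_h0 = abs(t0) > t_crit

print(f"95% C.I.: {ci[0]:.2f} <= mu <= {ci[1]:.2f}")
print(f"t0 = {t0:.3f}, reject H0: {reject_h0}")
```

The interval is roughly 24.97 to 27.87 minutes and $t_0 \approx 2.03$ is just below 2.064, so the test fails to reject $H_0$. This agrees with the fact that 25 lies inside the interval.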

Confidence Interval and Hypothesis Testing
If the value of the parameter specified by the null hypothesis is contained in the $100(1-\alpha)\%$ interval, then the null hypothesis cannot be rejected at the $\alpha$ level.

If the value specified by the null hypothesis is not in the interval, then the null hypothesis can be rejected at the $\alpha$ level.

Understanding the Result of a Hypothesis Test
When we reject the null hypothesis, it is a strong conclusion: there is strong evidence that the null hypothesis is false.
When we fail to reject the null hypothesis, it is a weak conclusion: it does not mean that the null hypothesis is correct. It only means we do not have strong evidence to reject it.

Court System and Hypothesis Testing
Hypothesis testing in science is a lot like the criminal court system in the United States. How do we decide guilt?

Assume innocence until "proven" guilty.
Evidence is presented at a trial.
Proof has to be "beyond a reasonable doubt."

A jury's possible decisions:
guilty
not guilty

Note that a jury cannot declare somebody "innocent," just "not guilty." This is an important point.
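
Inference for a Difference in Means
Assume Known Population Variances
$\dfrac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}} \sim N(0,1)$
$H_0: \mu_1-\mu_2 = \Delta_0$
Test statistic: $Z_0 = \dfrac{\bar{x}_1-\bar{x}_2-\Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}}$
If $H_1: \mu_1-\mu_2 \ne \Delta_0$, reject $H_0$ when $|Z_0| > Z_{\alpha/2}$
If $H_1: \mu_1-\mu_2 > \Delta_0$, reject $H_0$ when $Z_0 > Z_{\alpha}$
If $H_1: \mu_1-\mu_2 < \Delta_0$, reject $H_0$ when $Z_0 < -Z_{\alpha}$
Two-sided C.I.:
  $\bar{x}_1-\bar{x}_2 - Z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}} \le \mu_1-\mu_2 \le \bar{x}_1-\bar{x}_2 + Z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}$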

Example
A bakery has a line making Binkies, a big-selling junk food. Another line has just been installed, and the plant manager wants to know if the output of the new line is greater than that of the old line, as promised by the bakery equipment firm. 12 days of data are selected at random from line 1 and 10 days of data are selected at random from line 2, with $\bar{x}_1 = 1124.3$ cases and $\bar{x}_2 = 1138.7$. It is known that $\sigma_1^2 = 52$ and $\sigma_2^2 = 60$. Test the appropriate hypotheses at $\alpha = 0.05$, given that the outputs are normally distributed. What is the P-value for this test?
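The sketch below works this example under one explicit assumption: line 2 is the newly installed line, so the hypotheses are $H_0: \mu_1 - \mu_2 = 0$ against $H_1: \mu_1 - \mu_2 < 0$. If the lines are labeled the other way round, the sign of $Z_0$ and the tail used for the P-value flip.

```python
from math import sqrt
from statistics import NormalDist

xbar1, n1, var1 = 1124.3, 12, 52  # line 1 (assumed to be the old line)
xbar2, n2, var2 = 1138.7, 10, 60  # line 2 (assumed to be the new line)
alpha = 0.05

# Z0 with Delta_0 = 0 and known population variances.
z0 = (xbar1 - xbar2) / sqrt(var1 / n1 + var2 / n2)
z_crit = NormalDist().inv_cdf(1 - alpha)  # Z_alpha
reject_h0 = z0 < -z_crit                  # lower-tailed rejection region
p_value = NormalDist().cdf(z0)            # P = Phi(Z0) for H1: mu1 - mu2 < 0

print(f"Z0 = {z0:.3f}, reject H0: {reject_h0}, P = {p_value:.2e}")
```

Under that assumption $Z_0 \approx -4.48$, well below $-Z_{0.05}$, and the P-value is on the order of $10^{-6}$.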
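
Inference for a Difference in Means of Two Normal Distributions
Assume Unknown Population Variances
Assume $\sigma_1^2 = \sigma_2^2 = \sigma^2$
$\dfrac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{s_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}} \sim t(n_1+n_2-2)$
Pooled variance: $S_p^2 = \dfrac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}$
$H_0: \mu_1-\mu_2 = \Delta_0$
Test statistic: $t_0 = \dfrac{\bar{x}_1-\bar{x}_2-\Delta_0}{s_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}}$
If $H_1: \mu_1-\mu_2 \ne \Delta_0$, reject $H_0$ when $|t_0| > t_{\alpha/2,\,n_1+n_2-2}$
If $H_1: \mu_1-\mu_2 > \Delta_0$, reject $H_0$ when $t_0 > t_{\alpha,\,n_1+n_2-2}$
If $H_1: \mu_1-\mu_2 < \Delta_0$, reject $H_0$ when $t_0 < -t_{\alpha,\,n_1+n_2-2}$
Two-sided C.I.:
  $\bar{x}_1-\bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}} \le \mu_1-\mu_2 \le \bar{x}_1-\bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}$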

Example
Textbook problem: Two quality-control technicians measured the surface finish of a metal part, obtaining the data shown below. Assume that the measurements are normally distributed.

Technician 1   Technician 2
1.45           1.54
1.37           1.41
1.21           1.56
1.54           1.37
1.48           1.20
1.29           1.31
1.34           1.27
               1.35

Assuming that the variances are equal, construct a 95% confidence interval on the mean difference in surface-finish measurements. Test the hypothesis that the mean surface-finish measurements made by the two technicians are equal. Use $\alpha = 0.05$.
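A sketch of the pooled-variance calculation for this data is given below. It reads the table as 7 measurements for Technician 1 and 8 for Technician 2, and hard-codes $t_{0.025,13} = 2.160$ from a standard t-table because the standard library has no t quantile function; both points are assumptions of this sketch.

```python
from math import sqrt
from statistics import mean, variance

tech1 = [1.45, 1.37, 1.21, 1.54, 1.48, 1.29, 1.34]
tech2 = [1.54, 1.41, 1.56, 1.37, 1.20, 1.31, 1.27, 1.35]
n1, n2 = len(tech1), len(tech2)
t_crit = 2.160  # t_{0.025,13}, read from a standard t-table (assumed value)

# Pooled variance S_p^2 and the standard error of xbar1 - xbar2.
sp2 = ((n1 - 1) * variance(tech1) + (n2 - 1) * variance(tech2)) / (n1 + n2 - 2)
se = sqrt(sp2) * sqrt(1 / n1 + 1 / n2)

diff = mean(tech1) - mean(tech2)
ci = (diff - t_crit * se, diff + t_crit * se)
t0 = diff / se                 # test statistic with Delta_0 = 0
reject_h0 = abs(t0) > t_crit

print(f"95% C.I.: {ci[0]:.3f} <= mu1 - mu2 <= {ci[1]:.3f}")
print(f"t0 = {t0:.3f}, reject H0: {reject_h0}")
```

The interval contains zero and $|t_0|$ is far below the critical value, so the data give no strong evidence that the two technicians' means differ.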

[Table: t-table, percentage points $t_{\alpha,\nu}$ of the t distribution]
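
Inference on the Variance of a Normal Distribution
$\dfrac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1)$
$H_0: \sigma^2 = \sigma_0^2$
Test statistic: $\chi_0^2 = \dfrac{(n-1)s^2}{\sigma_0^2}$
If $H_1: \sigma^2 \ne \sigma_0^2$, reject $H_0$ when $\chi_0^2 > \chi^2_{\alpha/2,\,n-1}$ or $\chi_0^2 < \chi^2_{1-\alpha/2,\,n-1}$
If $H_1: \sigma^2 > \sigma_0^2$, reject $H_0$ when $\chi_0^2 > \chi^2_{\alpha,\,n-1}$
If $H_1: \sigma^2 < \sigma_0^2$, reject $H_0$ when $\chi_0^2 < \chi^2_{1-\alpha,\,n-1}$
Two-sided C.I.:
  $\dfrac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \dfrac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$, where $\Pr\{\chi^2_{n-1} > \chi^2_{\alpha/2,\,n-1}\} = \alpha/2$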

Example
Data:
25.5   26.1
26.8   23.2
24.2   28.4
25.0   27.8
27.3   25.7

Consider the data in Exercise 3-3. Construct a 90% two-sided confidence interval on the variance of battery life. Convert this into a corresponding confidence interval on the standard deviation of battery life.
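A minimal sketch of the interval computation for this example follows. The chi-square percentage points $\chi^2_{0.05,9} = 16.92$ and $\chi^2_{0.95,9} = 3.33$ are hard-coded from a standard chi-square table (an assumption of this sketch, since the Python standard library has no chi-square quantile function).

```python
from math import sqrt
from statistics import variance

battery_life = [25.5, 26.1, 26.8, 23.2, 24.2, 28.4, 25.0, 27.8, 27.3, 25.7]
n = len(battery_life)
chi2_upper = 16.92  # chi^2_{0.05,9}, from a standard table (assumed value)
chi2_lower = 3.33   # chi^2_{0.95,9}, from a standard table (assumed value)

s2 = variance(battery_life)  # sample variance S^2
var_ci = ((n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower)
sd_ci = (sqrt(var_ci[0]), sqrt(var_ci[1]))  # square roots give the C.I. on sigma

print(f"S^2 = {s2:.2f}")
print(f"90% C.I. on variance: {var_ci[0]:.2f} to {var_ci[1]:.2f}")
print(f"90% C.I. on std dev:  {sd_ci[0]:.2f} to {sd_ci[1]:.2f}")
```

Taking square roots of the variance limits is valid here because the square root is monotone, so the coverage probability is unchanged.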

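
Inference on Variances of Two Normal Distributions
$\dfrac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1}$
$H_0: \sigma_1^2 = \sigma_2^2$
If $H_1: \sigma_1^2 \ne \sigma_2^2$, with $F_0 = \dfrac{s_1^2}{s_2^2}$, reject $H_0$ when $F_0 > F_{\alpha/2,\,n_1-1,\,n_2-1}$ or $F_0 < F_{1-\alpha/2,\,n_1-1,\,n_2-1}$
If $H_1: \sigma_1^2 > \sigma_2^2$, with $F_0 = \dfrac{s_1^2}{s_2^2}$, reject $H_0$ when $F_0 > F_{\alpha,\,n_1-1,\,n_2-1}$
If $H_1: \sigma_1^2 < \sigma_2^2$, with $F_0' = \dfrac{s_2^2}{s_1^2}$, reject $H_0$ when $F_0' > F_{\alpha,\,n_2-1,\,n_1-1}$
Two-sided C.I.:
  $\dfrac{S_1^2}{S_2^2}F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \dfrac{\sigma_1^2}{\sigma_2^2} \le \dfrac{S_1^2}{S_2^2}F_{\alpha/2,\,n_2-1,\,n_1-1}$, where $F_{1-\alpha/2,\,u,\,v} = 1/F_{\alpha/2,\,v,\,u}$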

Example
Textbook problem (revisit): Two quality-control technicians measured the surface finish of a metal part, obtaining the data shown below. Assume that the measurements are normally distributed.

Technician 1   Technician 2
1.45           1.54
1.37           1.41
1.21           1.56
1.54           1.37
1.48           1.20
1.29           1.31
1.34           1.27
               1.35

1. Construct a 95% confidence interval estimate of the ratio of the variances of technician measurement error.

2. Construct a 95% confidence interval on the variance of measurement error for Technician 2.
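Both parts can be sketched as below. The percentage points are hard-coded from standard F and chi-square tables and should be treated as assumptions: $F_{0.025,6,7} = 5.12$, $F_{0.025,7,6} = 5.70$, $\chi^2_{0.025,7} = 16.01$ and $\chi^2_{0.975,7} = 1.69$. The lower F point is obtained from the reciprocal relation $F_{1-\alpha/2,u,v} = 1/F_{\alpha/2,v,u}$.

```python
from statistics import variance

tech1 = [1.45, 1.37, 1.21, 1.54, 1.48, 1.29, 1.34]
tech2 = [1.54, 1.41, 1.56, 1.37, 1.20, 1.31, 1.27, 1.35]
n1, n2 = len(tech1), len(tech2)
s1_sq, s2_sq = variance(tech1), variance(tech2)

# Part 1: 95% C.I. on sigma1^2 / sigma2^2 (table values are assumptions).
f_upper = 5.70        # F_{0.025, n2-1, n1-1} = F_{0.025,7,6}
f_lower = 1 / 5.12    # F_{0.975,7,6} = 1 / F_{0.025,6,7}
ratio = s1_sq / s2_sq
ratio_ci = (ratio * f_lower, ratio * f_upper)

# Part 2: 95% C.I. on sigma2^2 (table values are assumptions).
chi2_upper = 16.01    # chi^2_{0.025,7}
chi2_lower = 1.69     # chi^2_{0.975,7}
var2_ci = ((n2 - 1) * s2_sq / chi2_upper, (n2 - 1) * s2_sq / chi2_lower)

print(f"ratio C.I.:    {ratio_ci[0]:.3f} to {ratio_ci[1]:.3f}")
print(f"variance C.I.: {var2_ci[0]:.4f} to {var2_ci[1]:.4f}")
```

The ratio interval contains 1, which is consistent with the equal-variance assumption used earlier for the pooled t procedure on this data.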
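
Testing on Binomial Parameters
To test whether the parameter $p$ of a binomial distribution equals a standard value $p_0$:
$H_0: p = p_0$
$H_1: p \ne p_0$
The test is based on the normal approximation to the binomial distribution:
$Z_0 = \dfrac{(x+0.5)-np_0}{\sqrt{np_0(1-p_0)}}$ if $x < np_0$
$Z_0 = \dfrac{(x-0.5)-np_0}{\sqrt{np_0(1-p_0)}}$ if $x > np_0$
The null hypothesis is rejected if $|Z_0| > Z_{\alpha/2}$.

To test whether two binomial parameters are equal:
$H_0: p_1 = p_2$
$H_1: p_1 \ne p_2$
$Z_0 = \dfrac{\hat{p}_1-\hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}$; $\hat{p} = \dfrac{n_1\hat{p}_1+n_2\hat{p}_2}{n_1+n_2}$ if $p_1 = p_2$
$H_0$ is rejected if $|Z_0| > Z_{\alpha/2}$.

Confidence interval on $p$:
$\hat{p} - Z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + Z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$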
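As an illustration of the one-sample test on $p$, the sketch below computes the continuity-corrected statistic $Z_0 = \big((x \pm 0.5) - np_0\big)/\sqrt{np_0(1-p_0)}$. The data (30 nonconforming units in a sample of 200, $p_0 = 0.10$) are hypothetical and not from the lecture, and returning 0 when $x = np_0$ is a choice made here because the slide does not define that case.

```python
from math import sqrt
from statistics import NormalDist


def binomial_z0(x, n, p0):
    """Continuity-corrected Z0 for H0: p = p0 (normal approximation)."""
    expected = n * p0
    sd = sqrt(n * p0 * (1 - p0))
    if x < expected:
        return (x + 0.5 - expected) / sd
    if x > expected:
        return (x - 0.5 - expected) / sd
    return 0.0  # assumption: the case x == n*p0 is not covered by the slide


# Hypothetical data for illustration only.
alpha = 0.05
z0 = binomial_z0(x=30, n=200, p0=0.10)
reject_h0 = abs(z0) > NormalDist().inv_cdf(1 - alpha / 2)
print(f"Z0 = {z0:.3f}, reject H0: {reject_h0}")
```

For these made-up numbers $Z_0 \approx 2.24$, which exceeds $Z_{0.025} \approx 1.96$, so $H_0: p = 0.10$ would be rejected.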

Test on Poisson Distribution
A random sample of $n$ observations is taken, say $x_1, x_2, \ldots, x_n$. Each $x_i$ is Poisson distributed with parameter $\lambda$. Then the sum $x = x_1 + x_2 + \cdots + x_n$ is Poisson distributed with parameter $n\lambda$.

If $n$ is large, $\bar{x} = x/n$ is approximately normal with mean $\lambda$ and variance $\lambda/n$.
Test hypothesis
$H_0: \lambda = \lambda_0$
$H_1: \lambda \ne \lambda_0$
Test statistic: $Z_0 = \dfrac{\bar{x}-\lambda_0}{\sqrt{\lambda_0/n}}$
The null hypothesis would be rejected if $|Z_0| > Z_{\alpha/2}$.
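The large-sample Poisson test can be sketched as follows, using $Z_0 = (\bar{x}-\lambda_0)/\sqrt{\lambda_0/n}$. The numbers (a total of 230 counts over 50 samples, $\lambda_0 = 4$) are hypothetical and only illustrate the arithmetic; `poisson_z0` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def poisson_z0(total_count, n, lam0):
    """Z0 = (xbar - lam0) / sqrt(lam0 / n), with xbar = total_count / n."""
    xbar = total_count / n
    return (xbar - lam0) / sqrt(lam0 / n)


# Hypothetical data for illustration only.
alpha = 0.05
z0 = poisson_z0(total_count=230, n=50, lam0=4.0)
reject_h0 = abs(z0) > NormalDist().inv_cdf(1 - alpha / 2)
print(f"Z0 = {z0:.3f}, reject H0: {reject_h0}")
```

For these made-up numbers $\bar{x} = 4.6$ and $Z_0 \approx 2.12$, so $H_0: \lambda = 4$ would be rejected at $\alpha = 0.05$.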

Two Types of Hypothesis Test Errors
Type I error (producer's risk, $\alpha$ error):
  $\alpha$ = P{type I error} = P{reject $H_0$ | $H_0$ is true}
  = P{product is rejected | product is good}

Type II error (consumer's risk, $\beta$ error):
  $\beta$ = P{type II error} = P{fail to reject $H_0$ | $H_0$ is false}
  = P{product is not rejected | product is bad}

Power of the test:
  Power = $1-\beta$ = P{reject $H_0$ | $H_0$ is false}
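
Probability of Type II Error
$\beta$ = P{type II error} = P{fail to reject $H_0$ | $H_0$ is false}
= Pr{within the acceptance region | has a mean shift}
$H_0: \mu = \mu_0$
$H_1: \mu = \mu_1 \ne \mu_0$ with known $\sigma^2$, where $\mu_1 = \mu_0 + \delta$, if $\delta > 0$
$\beta = \Pr\{\text{fail to reject } H_0 \mid H_1\}$
$= \Pr\{\mu_0 - Z_{\alpha/2}\,\sigma/\sqrt{n} \le \bar{x} \le \mu_0 + Z_{\alpha/2}\,\sigma/\sqrt{n} \mid H_1\}$
$= \Phi\left(Z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}\right) - \Phi\left(-Z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}\right)$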

Example
The mean contents of coffee cans filled on a particular production line are being studied. Standards specify that the mean contents must be 16.0 oz, and from past experience it is known that the standard deviation of the can contents is 0.1 oz. The hypotheses are
$H_0: \mu = 16.0$
$H_1: \mu \ne 16.0$
A random sample of nine cans is to be used, and the type I error probability is specified as $\alpha = 0.05$. What is the type II error if the true mean contents are $\mu_1 = 16.1$ oz?
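The type II error for this example can be sketched with the two-sided formula $\beta = \Phi(Z_{\alpha/2} - \delta\sqrt{n}/\sigma) - \Phi(-Z_{\alpha/2} - \delta\sqrt{n}/\sigma)$, where $\delta = \mu_1 - \mu_0$. The function name `type_ii_error` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(delta, sigma, n, alpha):
    """beta for a two-sided Z test on the mean with shift delta = mu1 - mu0."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)      # Z_{alpha/2}
    shift = delta * sqrt(n) / sigma    # delta * sqrt(n) / sigma
    return nd.cdf(z - shift) - nd.cdf(-z - shift)


# Data from the example: mu0 = 16.0, mu1 = 16.1, sigma = 0.1, n = 9.
beta = type_ii_error(delta=16.1 - 16.0, sigma=0.1, n=9, alpha=0.05)
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")
```

With nine cans the shift term $\delta\sqrt{n}/\sigma$ equals 3, giving $\beta$ of about 0.15, so a 0.1 oz shift would go undetected roughly 15% of the time.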
Properties of Type I & Type II Errors
Both types of errors can be reduced by
increasing the sample size at the price of
increased inspection costs.
For a given sample size, one risk can only
be reduced at the expense of increasing
the other risk.


OC Curves
[Figure: OC curves for $\alpha = 0.05$, one curve for each of $n = 1, 2, 3, 4$]
The larger the mean shift, the smaller the type II error
The larger the sample size, the smaller the type II error

Example
Suppose we wish to test the hypotheses
$H_0: \mu = 15$
$H_1: \mu \ne 15$
where we know that $\sigma^2 = 9.0$. If the true mean is really 20, what sample size must be used to ensure that the probability of type II error is no greater than 0.10? Assume that $\alpha = 0.05$.
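As a numerical alternative to reading a chart, the sketch below searches for the smallest $n$ whose two-sided type II error, $\beta = \Phi(Z_{\alpha/2} - \delta\sqrt{n}/\sigma) - \Phi(-Z_{\alpha/2} - \delta\sqrt{n}/\sigma)$, does not exceed the target. The helper names and the search cap `n_max` are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(delta, sigma, n, alpha):
    """beta for a two-sided Z test on the mean with shift delta = mu1 - mu0."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma
    return nd.cdf(z - shift) - nd.cdf(-z - shift)


def smallest_sample_size(delta, sigma, alpha, beta_max, n_max=10_000):
    """Smallest n (up to n_max) with type II error <= beta_max."""
    for n in range(1, n_max + 1):
        if type_ii_error(delta, sigma, n, alpha) <= beta_max:
            return n
    raise ValueError("no sample size up to n_max meets the target")


# Data from the example: mu0 = 15, mu1 = 20, sigma^2 = 9.0, alpha = 0.05.
n_needed = smallest_sample_size(delta=20 - 15, sigma=sqrt(9.0),
                                alpha=0.05, beta_max=0.10)
print(n_needed, round(type_ii_error(5, 3.0, n_needed, 0.05), 4))
```

The search stops at $n = 4$, where $\beta$ is about 0.085; with $n = 3$ it is still about 0.18.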
Use OC Curve
n=4
