
Strategies for Process Improvement (and Product Development)

CHAPTER 4
DESIGN/ANALYSIS OF SINGLE FACTOR EXPERIMENTS
Course Notes for ChE 425/622
Prof. Alexander Penlidis
Department of Chemical Engineering
University of Waterloo, Waterloo, Ontario, N2L 3G1
Tel: (519) 888-4567 x36634
E-mail: penlidis@uwaterloo.ca
A. Penlidis, 2015.
This copy is for individual use only in connection with this course.
It may not be resold or used to make additional copies.

OBJECTIVES

Introduce and discuss:

Randomization.
Replication.
Blocking.
Experimental design for comparative
studies.
Analysis of comparative studies.

DEPENDENT AND
INDEPENDENT VARIABLES
Independent variables or factors (settings, inputs,
regressors) are controlled by the experimenter.
Dependent variables or responses (outputs) are the
measured outcomes of an experiment and are
dependent upon the settings of the independent
variables.

Examples:
Independent Variables
Composition
Temperature
Water flowrate
% additive
Reaction time
Wood species
Plate gap
Supplier

Dependent Variables
Freeness
Tensile strength
Burst Index
Opacity
Paper breaks
Caliper
Tear Index
Cost

WHAT IS A SINGLE
FACTOR EXPERIMENT?
This is an experiment which studies the effect
of a single independent variable or factor.
The levels at which the factor is studied are
often called treatments. In an experiment
there will be two or more treatments.
The factor can be quantitative, like temperature
at 50, 60, and 70 °C, or qualitative, like
different wood species A, B, C, and D.
This experiment determines if there are
significant differences in the results of the
measured response variable, depending on
the level of the independent variable.

EXAMPLES OF SINGLE
FACTOR STUDIES
Analytical labs
  Method comparisons

Product development
  Comparing new products to the competition

Process optimization
  Comparing different operating conditions

Quality control
  Comparing different suppliers' raw materials

CONSIDERATIONS IN
DESIGNING A SINGLE
FACTOR STUDY
Guarantee the validity of the experiment with
randomization.

Obtain a measure of reproducibility.

Control sources of variability by blocking
(Block what you can and randomize what
you cannot!).

RANDOMIZATION
It is important that the sequence of trials in all experimentation
be assigned by some process of randomization.
1. To prevent personal bias on the part of the experimenter
or others from entering.
2. To eliminate biases in estimated effects caused by trends
in the errors or other independent variables not included
in the study. Under randomization, they are absorbed into
the error rather than the estimated effects.
3. To prevent time/order effects from masking the results.
4. To ensure that the observed effects were caused by the
changes to the factors made by the experimenter.
5. To make certain that the significance tests are based on
valid random variables.

RANDOMIZE, RANDOMIZE, RANDOMIZE!


If randomization is not possible (?!), at least be aware of the
possible problems.

REPRODUCIBILITY AND
EXPERIMENTAL ERROR

The analysis of these experiments essentially
consists of comparing the variability between
treatment levels with the background variability
(within treatment). The latter is the experimental
error.

Obtaining a reliable estimate of the experimental
error is key to the analysis.

To obtain an estimate of experimental error, you
must replicate your experiment.

What sources contribute to experimental error in an
experiment?

measurement error
uncontrollable or unknown errors
assignable sources of error.

REPLICATION

Question:

Suppose I want to compare four
manufacturing processes for the same
product. How many replicates should I
take for my study? (Replicate = repeat of
entire experiment from start to finish.)

Answer:

It depends!

You need to know:

What difference you want to detect.
Standard deviation of process (response).
What degree of confidence is desired.

SAMPLE SIZE TABLE

This table shows the sample sizes needed for a
comparative experiment with four treatments
using α = 0.05.

                            True difference as percent of the mean
True standard deviation
as a percent of mean        5     10     15     20     25     30
          2                 4      2      2      2      2      2
          3                 7      3      2      2      2      2
          4                11      4      3      2      2      2
          5                17      5      3      3      2      2
          6                24      7      4      3      2      2
          7                32      9      5      3      3      2
          8                41     11      6      4      3      3
          9                       14      7      5      3      3
         10                       17      8      5      4      3
         12                       24     11      7      5      4
         14                       32     15      9      6      5
         16                       41     19     11      7      6
         18                              24     14      9      7
         20                              29     17     11      8

Source: Cochran, W.G. and G.M. Cox, Experimental Designs, 2nd Edition, Wiley, 1957.
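The table entries come from the exact method in Cochran and Cox; as a rough cross-check, a two-sample normal approximation gives ballpark replicate counts. This is only a sketch: the function name is illustrative, and the 80% power value is an assumption not stated in the notes, so its answers will not match the table exactly.

```python
import math
from statistics import NormalDist

def replicates_needed(sigma_pct, diff_pct, alpha=0.05, power=0.80):
    """Approximate replicates per treatment to detect a true difference
    of diff_pct (as % of mean) when the true standard deviation is
    sigma_pct (as % of mean). Two-sample normal approximation; a rough
    sketch, not the exact Cochran & Cox method behind the table."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_b = NormalDist().inv_cdf(power)           # assumed power
    n = 2 * ((z_a + z_b) * sigma_pct / diff_pct) ** 2
    return math.ceil(n)

# Example: sigma = 10% of mean, difference to detect = 10% of mean
print(replicates_needed(10, 10))  # → 16
```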


COMPLETELY RANDOMIZED DESIGNS

Essentially this design is the extension of t-tests for
comparing two treatments to the comparison of k
treatments which have been randomly allocated.

Example:

Coagulation times (sec) for blood drawn
from 24 animals randomly allocated to
four different diets (Box, Hunter and Hunter):

Diets (treatments):

              A     B     C     D
             62    63    68    56
             60    67    66    62
             63    71    71    60
             59    64    67    61
                   65    68    63
                   66    68    64
                               63
                               59
Treatment
totals:     244   396   408   488

- diets allocated randomly
- blood samples taken and tested randomly

k (or t) = 4 treatments
n1 = 4, n2 = 6, n3 = 6, n4 = 8

Grand total = 1536
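The data above can be entered and the treatment and grand totals verified in a few lines (a minimal sketch; variable names are illustrative):

```python
# Coagulation times (sec) by diet, from the table above
diets = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
    "C": [68, 66, 71, 67, 68, 68],
    "D": [56, 62, 60, 61, 63, 64, 63, 59],
}

totals = {d: sum(v) for d, v in diets.items()}
grand_total = sum(totals.values())
print(totals)        # {'A': 244, 'B': 396, 'C': 408, 'D': 488}
print(grand_total)   # 1536
```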


PLOT FOR DIET EXAMPLE

[Figure: dot plot of coagulation TIME (54 to 74 sec) versus DIET (A, B, C, D).]

COMPLETELY RANDOMIZED DESIGNS

It is assumed that the effects model for this design is:

y_ti = η + τ_t + ε_ti,    i = 1, 2, ..., n_t,    t = 1, 2, 3, 4

η = overall mean
τ_t = η_t − η = deviation of treatment mean from
overall mean (treatment effect)
ε_ti = "error" for the i-th observation of the t-th treatment

Question: Are there any real differences between
diets?

H0: τ_t = 0    for all k treatments (diets)

What we are trying to determine: Is the variation
between treatment means significantly larger than the
variation that occurs within treatments?

CALCULATION OF ANOVA TABLE

Total sum of squares:

S_Total = Σ_t Σ_i (y_ti − ȳ)²  =  Σ_t Σ_i y_ti²  −  (Σ_t Σ_i y_ti)² / N,    N = Σ_t n_t

          (observations)         (correction for the mean)

        = 62² + 60² + ... + 59² − (1536)²/24
        = 98644 − 98304 = 340

df_Total = N − 1 = 23

CALCULATION OF ANOVA TABLE

Between treatments sum of squares:

S_B = Σ_t n_t (ȳ_t − ȳ)²  =  Σ_t (treatment total_t)² / n_t  −  (Σ_t Σ_i y_ti)² / N

      (the second term is again the correction for the mean)

    = 98532 − 98304 = 228

df_B = k − 1 = 3

Within treatment sum of squares (residual):

S_W = Σ_t Σ_i (y_ti − ȳ_t)² = S_Total − S_B
    = 340 − 228 = 112

df_W = Σ_t (n_t − 1) = df_Total − df_B = 20

ANOVA TABLE

SOURCE                     df     SS      MS

DIETS                       3    228     76
(between treatments)

WITHIN DIETS               20    112     5.6
(error)

TOTAL                      23    340

df = degrees of freedom
SS = sum of squares
MS = mean square = SS/df
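The sums of squares and mean squares above can be reproduced directly from the data. A stdlib-only sketch (variable names are illustrative):

```python
# Reproduce the ANOVA table for the diet data
diets = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
    "C": [68, 66, 71, 67, 68, 68],
    "D": [56, 62, 60, 61, 63, 64, 63, 59],
}

all_obs = [y for v in diets.values() for y in v]
N = len(all_obs)
grand_mean = sum(all_obs) / N        # 1536/24 = 64

# Total and between-treatment sums of squares; within = total - between
ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
ss_between = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
                 for v in diets.values())
ss_within = ss_total - ss_between

df_between, df_within = len(diets) - 1, N - len(diets)
ms_between = ss_between / df_between    # 76.0
ms_within = ss_within / df_within       # 5.6

print(ss_total, ss_between, ss_within)   # 340.0 228.0 112.0
print(round(ms_between / ms_within, 1))  # F = 13.6
```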


DIAGNOSTIC CHECKING

So far the variance decomposition shown in the
ANOVA table is purely numerical. The relationships
among the sums of squares apply to any data set.

The ANOVA is meaningful and can be used to draw
conclusions, if the assumed model is correct:

y_ti = η + τ_t + ε_ti

Under the assumption that the errors ε_ti are normally
distributed, the terms in the model can be estimated
from:

η is estimated by ȳ
τ_t is estimated by ȳ_t − ȳ
ε_ti is estimated by the residual y_ti − ȳ_t
the fitted value is ŷ_ti = ȳ_t

We can then examine residual plots to check the
adequacy of the assumed model.
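The estimates above are simple averages, so computing the treatment effects and residuals for the diet data takes only a few lines (a minimal sketch; names are illustrative):

```python
# Estimated treatment effects and residuals for the diet data
diets = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
    "C": [68, 66, 71, 67, 68, 68],
    "D": [56, 62, 60, 61, 63, 64, 63, 59],
}

all_obs = [y for v in diets.values() for y in v]
grand_mean = sum(all_obs) / len(all_obs)   # estimate of eta = 64.0

# tau_t estimated by (treatment mean - grand mean)
effects = {d: sum(v) / len(v) - grand_mean for d, v in diets.items()}
# residuals: observation minus its treatment mean
residuals = {d: [y - sum(v) / len(v) for y in v] for d, v in diets.items()}

print(effects)  # {'A': -3.0, 'B': 2.0, 'C': 4.0, 'D': -3.0}
# Residuals within each treatment sum to zero by construction
print({d: round(sum(r), 10) for d, r in residuals.items()})
```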


DIAGNOSTICS

[Figure: overall dot diagram of all residuals, and a plot of residuals (−6 to +6) against predicted values (60 to 70).]

F-TEST

If there are no differences between diets, it can be
shown that the mean square for between diets is
estimating the error variance, as is the mean square
within diets.

This suggests that a test to determine if significant
differences exist between diets can be formulated
as the following F-test:

H0: τ_A = τ_B = τ_C = τ_D = 0
H1: τ_i ≠ 0 for at least one of i = A, B, C, D

F_observed = (S_B / df_B) / (S_W / df_W) = MS_B / MS_W = 76.0 / 5.6 = 13.6

The value of F_observed is now compared with the
tabulated value of F(df_B, df_W) (i.e., F critical value) at
a preselected significance level.

F-TEST

If we select α = 0.05 (i.e., 95% confidence level):

from the tables, F_{3,20,0.05} = 3.10

F_observed > F_tabulated

We can therefore reject the null hypothesis and
conclude that there are significant differences
between diets.

How are they different?

Examine methods for comparing multiple means.

MULTIPLE COMPARISONS
BONFERRONI t-TEST

Consider a set of k means. There are k(k−1)/2
possible pairs of means, and tests of the type:

H0: η_i = η_j
H1: η_i ≠ η_j

These can be tested by calculating:

T_observed = (ȳ_i − ȳ_j) / ( s √(1/n_i + 1/n_j) )

and comparing with t_{N−k, α/2}, where N = Σ_t n_t.

MULTIPLE COMPARISONS
BONFERRONI t-TEST

Performing k(k−1)/2 tests is laborious (maybe not so
much nowadays!), BUT in addition it has another
even more serious drawback.

Perform all tests at a level of significance α. Then
α', the overall probability of making at least one
incorrect rejection, is much larger than α and is
unknown.

An upper bound for a set of c tests at α is given by
the Bonferroni inequality:

α' ≤ 1 − (1 − α)^c

For k = 5, for example, there are (5×4)/2 = 10
possible comparisons. Even if each test is carried
out at α = 0.05:

α' ≤ 1 − (1 − 0.05)^10 ≈ 0.40
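The bound above is easy to evaluate for any k (a minimal sketch; the function name is an assumption):

```python
# Upper bound on the overall error rate for c pairwise tests,
# each at level alpha, as on the slide: alpha' <= 1 - (1 - alpha)^c
def overall_alpha_bound(alpha, c):
    return 1 - (1 - alpha) ** c

k = 5
c = k * (k - 1) // 2   # 10 possible pairwise comparisons
print(round(overall_alpha_bound(0.05, c), 2))  # → 0.4
```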


MULTIPLE COMPARISONS
BONFERRONI t-TEST

To compensate:

1. Only carry out those tests that are really of
interest to the investigation, say c.

2. Choose a reasonably small target upper
bound, b, for α'.

3. Conduct each test at α = b/c.

For example, for k = 5, if we want to choose b = 0.10
and want to do all possible comparisons, α = 0.1/10 =
0.01.

MULTIPLE COMPARISONS
LEAST SIGNIFICANT DIFFERENCE (LSD)

The standard error of the difference between two
means is √(2s²/n) when n_i = n_j = n; for the diets
example, approximate n ≈ 6:

s.e. = √(2(5.6)/6) = 1.366
t_{20, 0.025} = 2.086

Hence a difference between a specific pair of
means is significant at the 5% level if it exceeds
2.086 × 1.366 = 2.85 = LSD (Fisher's LSD).
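The arithmetic above can be checked directly (the mean square and t value are taken from the slides; variable names are illustrative):

```python
import math

# Fisher's LSD for the diet example
ms_within = 5.6    # error mean square from the ANOVA table
n = 6              # approximate common group size
t_crit = 2.086     # t_{20, 0.025} from tables

se_diff = math.sqrt(2 * ms_within / n)
lsd = t_crit * se_diff
print(round(se_diff, 3), round(lsd, 2))  # → 1.366 2.85
```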


MULTIPLE COMPARISONS: LSD

To apply the LSD simultaneously to all means, use
the previous results involving the Bonferroni
inequality:

k = 4
k(k − 1)/2 = 6
choose b = 0.05
α = 0.05/6 ≈ 0.01
t_{0.005, 20} = 2.845

LSD = (2.845)(1.366) = 3.89

means:

A     B     C     D
61    66    68    61

Therefore we can distinguish between diets A and
B, A and C, B and D, and C and D. However, we
cannot distinguish between diets A and D or B and
C.
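The pairwise screening against the Bonferroni-adjusted LSD can be sketched as follows (means and LSD taken from the slide; names are illustrative):

```python
import itertools

# Bonferroni-adjusted LSD comparisons for the diet means
means = {"A": 61, "B": 66, "C": 68, "D": 61}
lsd = 3.89  # (2.845)(1.366) from the slide

# A pair of diets is distinguishable if the gap between means exceeds LSD
significant = [(i, j) for i, j in itertools.combinations(means, 2)
               if abs(means[i] - means[j]) > lsd]
print(significant)  # [('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'D')]
```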

