
Strategies for Process Improvement (and Product Development)

CHAPTER 4
DESIGN/ANALYSIS OF SINGLE FACTOR EXPERIMENTS
Course Notes for ChE 425/622
Prof. Alexander Penlidis
Department of Chemical Engineering
University of Waterloo, Waterloo, Ontario, N2L 3G1
Tel: (519) 888-4567 x36634
E-mail: penlidis@uwaterloo.ca
A. Penlidis, 2015.
This copy is for individual use only in connection with this course.
It may not be resold or used to make additional copies.

OBJECTIVES

Introduce and discuss:

Randomization.
Replication.
Blocking.
Experimental design for comparative
studies.
Analysis of comparative studies.

DEPENDENT AND
INDEPENDENT VARIABLES
Independent variables or factors (settings, inputs,
regressors) are controlled by the experimenter.
Dependent variables or responses (outputs) are the
measured outcomes of an experiment and are
dependent upon the settings of the independent
variables.

Examples:
Independent Variables
Composition
Temperature
Water flowrate
% additive
Reaction time
Wood species
Plate gap
Supplier

Dependent Variables
Freeness
Tensile strength
Burst Index
Opacity
Paper breaks
Caliper
Tear Index
Cost

WHAT IS A SINGLE
FACTOR EXPERIMENT?
This is an experiment which studies the effect
of a single independent variable or factor.
The levels at which the factor is studied are
often called treatments. In an experiment
there will be two or more treatments.
The factor can be quantitative, like temperature
at 50, 60, and 70 °C, or qualitative, like
different wood species A, B, C, and D.
This experiment determines if there are
significant differences in the results of the
measured response variable, depending on
the level of the independent variable.

EXAMPLES OF SINGLE
FACTOR STUDIES
Analytical labs
  Method comparisons

Product development
  Comparing new products to the competition

Process optimization
  Comparing different operating conditions

Quality control
  Comparing different suppliers' raw materials

CONSIDERATIONS IN
DESIGNING A SINGLE
FACTOR STUDY
Guarantee the validity of the experiment with
randomization.

Obtain a measure of reproducibility.

Control sources of variability by blocking
(Block what you can and randomize what
you cannot!).

RANDOMIZATION
It is important that the sequence of trials in all experimentation
be assigned by some process of randomization.
1. To prevent personal bias on the part of the experimenter
or others from entering.
2. To eliminate biases in estimated effects caused by trends
in the errors or other independent variables not included
in the study. Under randomization, they are absorbed into
the error rather than the estimated effects.
3. To prevent time/order effects from masking the results.
4. To ensure that the observed effects were caused by the
changes to the factors made by the experimenter.
5. To make certain that the significance tests are based on
valid random variables.

RANDOMIZE, RANDOMIZE, RANDOMIZE!


If randomization is not possible (?!), at least be aware of the
possible problems.

REPRODUCIBILITY AND
EXPERIMENTAL ERROR

The analysis of these experiments essentially
consists of comparing the variability between
treatment levels with the background variability
(within treatment). The latter is the experimental
error.

Obtaining a reliable estimate of the experimental
error is key to the analysis.

To obtain an estimate of experimental error, you
must replicate your experiment.

What sources contribute to experimental error in an
experiment?

measurement error
uncontrollable or unknown errors
assignable sources of error.

REPLICATION

Question:

Suppose I want to compare four
manufacturing processes for the same
product. How many replicates should I
take for my study? (Replicate = repeat of
entire experiment from start to finish.)

Answer:

It depends!

You need to know:

What difference you want to detect.
Standard deviation of process (response).
What degree of confidence is desired.

SAMPLE SIZE TABLE

This table shows the sample sizes needed for a
comparative experiment with four treatments
using α = 0.05.

                            True difference as percent of the mean
True standard deviation
as a percent of mean        5     10     15     20     25     30
          2                 4      2      2      2      2      2
          3                 7      3      2      2      2      2
          4                11      4      3      2      2      2
          5                17      5      3      3      2      2
          6                24      7      4      3      2      2
          7                32      9      5      3      3      2
          8                41     11      6      4      3      3
          9                       14      7      5      3      3
         10                       17      8      5      4      3
         12                       24     11      7      5      4
         14                       32     15      9      6      5
         16                       41     19     11      7      6
         18                              24     14      9      7
         20                              29     17     11      8

Source: Cochran, W.G. and G.M. Cox, Experimental Designs, 2nd Edition, Wiley, 1957.
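The table entries come from the exact method in Cochran and Cox; as a rough cross-check, a two-sample normal approximation gives ballpark replicate counts. This is only a sketch: the function name is illustrative, and the 80% power value is an assumption not stated in the notes, so its answers will not match the table exactly.

```python
import math
from statistics import NormalDist

def replicates_needed(sigma_pct, diff_pct, alpha=0.05, power=0.80):
    """Approximate replicates per treatment to detect a true difference
    of diff_pct (as % of mean) when the true standard deviation is
    sigma_pct (as % of mean). Two-sample normal approximation; a rough
    sketch, not the exact Cochran & Cox method behind the table."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_b = NormalDist().inv_cdf(power)           # assumed power
    n = 2 * ((z_a + z_b) * sigma_pct / diff_pct) ** 2
    return math.ceil(n)

# Example: sigma = 10% of mean, difference to detect = 10% of mean
print(replicates_needed(10, 10))  # → 16
```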


COMPLETELY RANDOMIZED DESIGNS

Essentially this design is the extension of t-tests for
comparing two treatments to the comparison of k
treatments which have been randomly allocated.

Example:

Coagulation times (sec) for blood drawn
from 24 animals randomly allocated to
four different diets (Box, Hunter and Hunter):

Diets (treatments):

              A     B     C     D
             62    63    68    56
             60    67    66    62
             63    71    71    60
             59    64    67    61
                   65    68    63
                   66    68    64
                               63
                               59
Treatment
totals:     244   396   408   488

- diets allocated randomly
- blood samples taken and tested randomly

k (or t) = 4 treatments
n1 = 4, n2 = 6, n3 = 6, n4 = 8

Grand total = 1536
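The data above can be entered and the treatment and grand totals verified in a few lines (a minimal sketch; variable names are illustrative):

```python
# Coagulation times (sec) by diet, from the table above
diets = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
    "C": [68, 66, 71, 67, 68, 68],
    "D": [56, 62, 60, 61, 63, 64, 63, 59],
}

totals = {d: sum(v) for d, v in diets.items()}
grand_total = sum(totals.values())
print(totals)        # {'A': 244, 'B': 396, 'C': 408, 'D': 488}
print(grand_total)   # 1536
```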


PLOT FOR DIET EXAMPLE

[Figure: dot plot of coagulation TIME (54 to 74 sec) versus DIET (A, B, C, D).]

COMPLETELY RANDOMIZED DESIGNS

It is assumed that the effects model for this design is:

y_ti = η + τ_t + ε_ti,    i = 1, 2, ..., n_t,    t = 1, 2, 3, 4

η = overall mean
τ_t = η_t − η = deviation of treatment mean from
overall mean (treatment effect)
ε_ti = "error" for the i-th observation of the t-th treatment

Question: Are there any real differences between
diets?

H0: τ_t = 0    for all k treatments (diets)

What we are trying to determine: Is the variation
between treatment means significantly larger than the
variation that occurs within treatments?

CALCULATION OF ANOVA TABLE

Total sum of squares:

S_Total = Σ_t Σ_i (y_ti − ȳ)²  =  Σ_t Σ_i y_ti²  −  (Σ_t Σ_i y_ti)² / N,    N = Σ_t n_t

          (observations)         (correction for the mean)

        = 62² + 60² + ... + 59² − (1536)²/24
        = 98644 − 98304 = 340

df_Total = N − 1 = 23

CALCULATION OF ANOVA TABLE

Between treatments sum of squares:

S_B = Σ_t n_t (ȳ_t − ȳ)²  =  Σ_t (treatment total_t)² / n_t  −  (Σ_t Σ_i y_ti)² / N

      (the second term is again the correction for the mean)

    = 98532 − 98304 = 228

df_B = k − 1 = 3

Within treatment sum of squares (residual):

S_W = Σ_t Σ_i (y_ti − ȳ_t)² = S_Total − S_B
    = 340 − 228 = 112

df_W = Σ_t (n_t − 1) = df_Total − df_B = 20

ANOVA TABLE

SOURCE                     df     SS      MS

DIETS                       3    228     76
(between treatments)

WITHIN DIETS               20    112     5.6
(error)

TOTAL                      23    340

df = degrees of freedom
SS = sum of squares
MS = mean square = SS/df
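The sums of squares and mean squares above can be reproduced directly from the data. A stdlib-only sketch (variable names are illustrative):

```python
# Reproduce the ANOVA table for the diet data
diets = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
    "C": [68, 66, 71, 67, 68, 68],
    "D": [56, 62, 60, 61, 63, 64, 63, 59],
}

all_obs = [y for v in diets.values() for y in v]
N = len(all_obs)
grand_mean = sum(all_obs) / N        # 1536/24 = 64

# Total and between-treatment sums of squares; within = total - between
ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
ss_between = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
                 for v in diets.values())
ss_within = ss_total - ss_between

df_between, df_within = len(diets) - 1, N - len(diets)
ms_between = ss_between / df_between    # 76.0
ms_within = ss_within / df_within       # 5.6

print(ss_total, ss_between, ss_within)   # 340.0 228.0 112.0
print(round(ms_between / ms_within, 1))  # F = 13.6
```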


DIAGNOSTIC CHECKING

So far the variance decomposition shown in the
ANOVA table is purely numerical. The relationships
among the sums of squares apply to any data set.

The ANOVA is meaningful and can be used to draw
conclusions, if the assumed model is correct:

y_ti = η + τ_t + ε_ti

Under the assumption that the errors ε_ti are normally
distributed, the terms in the model can be estimated
from:

η is estimated by ȳ
τ_t is estimated by ȳ_t − ȳ
ε_ti is estimated by the residual y_ti − ȳ_t
the fitted value is ŷ_ti = ȳ_t

We can then examine residual plots to check the
adequacy of the assumed model.
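The estimates above are simple averages, so computing the treatment effects and residuals for the diet data takes only a few lines (a minimal sketch; names are illustrative):

```python
# Estimated treatment effects and residuals for the diet data
diets = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
    "C": [68, 66, 71, 67, 68, 68],
    "D": [56, 62, 60, 61, 63, 64, 63, 59],
}

all_obs = [y for v in diets.values() for y in v]
grand_mean = sum(all_obs) / len(all_obs)   # estimate of eta = 64.0

# tau_t estimated by (treatment mean - grand mean)
effects = {d: sum(v) / len(v) - grand_mean for d, v in diets.items()}
# residuals: observation minus its treatment mean
residuals = {d: [y - sum(v) / len(v) for y in v] for d, v in diets.items()}

print(effects)  # {'A': -3.0, 'B': 2.0, 'C': 4.0, 'D': -3.0}
# Residuals within each treatment sum to zero by construction
print({d: round(sum(r), 10) for d, r in residuals.items()})
```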


DIAGNOSTICS

[Figure: overall dot diagram of all residuals, and a plot of residuals (−6 to +6) against predicted values (60 to 70).]

F-TEST

If there are no differences between diets, it can be
shown that the mean square for between diets is
estimating the error variance, as is the mean square
within diets.

This suggests that a test to determine if significant
differences exist between diets can be formulated
as the following F-test:

H0: τ_A = τ_B = τ_C = τ_D = 0
H1: τ_i ≠ 0 for at least one of i = A, B, C, D

F_observed = (S_B / df_B) / (S_W / df_W) = MS_B / MS_W = 76.0 / 5.6 = 13.6

The value of F_observed is now compared with the
tabulated value of F(df_B, df_W) (i.e., F critical value) at
a preselected significance level.

F-TEST

If we select α = 0.05 (i.e., 95% confidence level):

from the tables, F_{3,20,0.05} = 3.10

F_observed > F_tabulated

We can therefore reject the null hypothesis and
conclude that there are significant differences
between diets.

How are they different?

Examine methods for comparing multiple means.

MULTIPLE COMPARISONS
BONFERRONI t-TEST

Consider a set of k means. There are k(k−1)/2
possible pairs of means, and tests of the type:

H0: η_i = η_j
H1: η_i ≠ η_j

These can be tested by calculating:

T_observed = (ȳ_i − ȳ_j) / ( s √(1/n_i + 1/n_j) )

and comparing with t_{N−k, α/2}, where N = Σ_t n_t.

MULTIPLE COMPARISONS
BONFERRONI t-TEST

Performing k(k−1)/2 tests is laborious (maybe not so
much nowadays!), BUT in addition it has another
even more serious drawback.

Perform all tests at a level of significance α. Then
α', the overall probability of making at least one
incorrect rejection, is much larger than α and is
unknown.

An upper bound for a set of c tests at α is given by
the Bonferroni inequality:

α' ≤ 1 − (1 − α)^c

For k = 5, for example, there are (5×4)/2 = 10
possible comparisons. Even if each test is carried
out at α = 0.05:

α' ≤ 1 − (1 − 0.05)^10 ≈ 0.40
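The bound above is easy to evaluate for any k (a minimal sketch; the function name is an assumption):

```python
# Upper bound on the overall error rate for c pairwise tests,
# each at level alpha, as on the slide: alpha' <= 1 - (1 - alpha)^c
def overall_alpha_bound(alpha, c):
    return 1 - (1 - alpha) ** c

k = 5
c = k * (k - 1) // 2   # 10 possible pairwise comparisons
print(round(overall_alpha_bound(0.05, c), 2))  # → 0.4
```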


MULTIPLE COMPARISONS
BONFERRONI t-TEST

To compensate:

1. Only carry out those tests that are really of
interest to the investigation, say c.

2. Choose a reasonably small target upper
bound, b, for α'.

3. Conduct each test at α = b/c.

For example, for k = 5, if we want to choose b = 0.10
and want to do all possible comparisons, α = 0.1/10 =
0.01.

MULTIPLE COMPARISONS
LEAST SIGNIFICANT DIFFERENCE (LSD)

The standard error of the difference between two
means is √(2s²/n) when n_i = n_j = n; for the diets
example, approximate n ≈ 6:

s.e. = √(2(5.6)/6) = 1.366
t_{20, 0.025} = 2.086

Hence a difference between a specific pair of
means is significant at the 5% level if it exceeds
2.086 × 1.366 = 2.85 = LSD (Fisher's LSD).
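The arithmetic above can be checked directly (the mean square and t value are taken from the slides; variable names are illustrative):

```python
import math

# Fisher's LSD for the diet example
ms_within = 5.6    # error mean square from the ANOVA table
n = 6              # approximate common group size
t_crit = 2.086     # t_{20, 0.025} from tables

se_diff = math.sqrt(2 * ms_within / n)
lsd = t_crit * se_diff
print(round(se_diff, 3), round(lsd, 2))  # → 1.366 2.85
```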


MULTIPLE COMPARISONS: LSD

To apply the LSD simultaneously to all means, use
the previous results involving the Bonferroni
inequality:

k = 4
k(k − 1)/2 = 6
choose b = 0.05
α = 0.05/6 ≈ 0.01
t_{0.005, 20} = 2.845

LSD = (2.845)(1.366) = 3.89

means:

A     B     C     D
61    66    68    61

Therefore we can distinguish between diets A and
B, A and C, B and D, and C and D. However, we
cannot distinguish between diets A and D or B and
C.
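The pairwise screening against the Bonferroni-adjusted LSD can be sketched as follows (means and LSD taken from the slide; names are illustrative):

```python
import itertools

# Bonferroni-adjusted LSD comparisons for the diet means
means = {"A": 61, "B": 66, "C": 68, "D": 61}
lsd = 3.89  # (2.845)(1.366) from the slide

# A pair of diets is distinguishable if the gap between means exceeds LSD
significant = [(i, j) for i, j in itertools.combinations(means, 2)
               if abs(means[i] - means[j]) > lsd]
print(significant)  # [('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'D')]
```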

