7/16/2009
Summary
The DOE Wizard can construct designs for studying the effects of quantitative or categorical
(non-quantitative) factors. This document considers experiments involving two or more
categorical factors. For such cases, the wizard will create a multilevel factorial design with runs
at each combination of the levels of the factors. Once the experiment is performed, the data are
analyzed using the Multifactor ANOVA procedure.
Sample Data:
The example in this document comes from Kutner et al. (1996). They describe a situation in
which researchers wished to conduct a stress test on a treadmill to determine the effects of three
factors: smoking history, amount of body fat, and gender. They expected that the factors might
interact, so they wished to obtain more than one individual with each combination of the factors.
The levels of the factors they selected were:
They decided to select 3 individuals for each of the 12 combinations of the factor levels for a
total of 36 subjects.
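The run count follows directly from crossing the factor levels. As a minimal sketch in Python (not part of STATGRAPHICS), the 12 combinations and 36 subject slots can be enumerated as shown here. The level names are taken from the plot labels elsewhere in this document, and the variable names are illustrative only.

```python
from itertools import product

# Level names come from the plot labels in this document;
# the dictionary keys and variable names are illustrative assumptions.
factors = {
    "smoking": ["none", "light", "heavy"],
    "body_fat": ["low", "high"],
    "gender": ["female", "male"],
}
replicates = 3  # individuals per combination

# Cross all factor levels to get the treatment combinations.
treatments = list(product(*factors.values()))

# Repeat each combination once per replicate.
runs = [combo for combo in treatments for _ in range(replicates)]

print(len(treatments))  # 12
print(len(runs))        # 36
```

The wizard also places the runs in a run order, which this sketch does not attempt to reproduce.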
Design Creation
To begin the design creation process, start with an empty StatFolio. Select DOE – Experimental
Design Wizard to load the DOE Wizard’s main window. Then push each button in sequence to
create the design.
The first step of the design creation process displays a dialog box used to specify the response
variables. For the current example, there is a single response variable:
Impact: The relative importance of each response (not relevant if only one response).
Sensitivity: The importance of being close to the best desired value (in this case, the
Maximum). Setting Sensitivity to Medium implies that the desirability attributed to the
response increases linearly between the Minimum and Maximum values indicated.
Minimum and Maximum: Range of desirable values for the response (20-40).
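To make the Medium sensitivity setting concrete, the following is a rough sketch of a linear desirability function over the 20-40 range. It assumes that the response (minutes on the treadmill) is to be maximized, which is consistent with the optimization results later in this document, where estimates below 20 minutes receive a desirability of 0. The function name and signature are hypothetical and are not a STATGRAPHICS interface.

```python
def desirability(y, low=20.0, high=40.0):
    """Linear desirability for a response that should be maximized.

    Assumes a straight-line ramp from 0 at `low` to 1 at `high`
    (the Medium sensitivity case); other sensitivity settings
    would bend this ramp and are not modeled here.
    """
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return (y - low) / (high - low)


print(desirability(15))  # 0.0
print(desirability(30))  # 0.5
print(desirability(45))  # 1.0
```

Values at or below the Minimum map to 0 and values at or above the Maximum map to 1, so only responses inside the range distinguish one treatment from another.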
2009 by StatPoint Technologies, Inc. DOE Wizard – Multi-Factor Categorical Designs - 2
Step #2 – Define Experimental Factors
The second step displays a dialog box on which to specify the factors that will be varied:
Type – Set the type of each factor to Categorical, since there is a discrete set of possible
values for each.
Levels – Identify the levels of the factor, separating each level by a comma.
Since all of the factors are controllable process factors, only one Options button is enabled.
Pressing that button displays a second dialog box:
1. Factorial - a design in which the factors are crossed, and data are collected at all
combinations of the factors.
2. Variance components (hierarchical) - a design in which the factors are nested, so that
each level of a factor is unique to a specific level of the factor above it.
3. User-specified – a design in which the user rather than the program will specify the runs
to be performed.
This document describes the use of Factorial designs, since the 3 factors are crossed. See the
document titled DOE Wizard – Variance Component Designs for a discussion of experiments
involving nested factors.
When OK is pressed, the tentatively selected design is displayed in the Select Design dialog box:
If the design is acceptable, press OK to save it to the STATGRAPHICS DataBook and return to
the DOE Wizard’s main window, which should now contain a summary of the design:
Before evaluating the properties of the design, a tentative model must be specified. Pressing the
fourth button on the DOE Wizard’s toolbar displays a dialog box to make that choice:
The default model contains main effects for each factor and interactions for all pairs of factors.
Since we intend to run all of the runs in the base design, this step can be omitted.
Design Properties
Several of the selections presented when pressing button #6 are helpful in evaluating the selected
design:
Design Worksheet
The design worksheet shows the 36 runs that have been created, in the order they are to be run:
ANOVA Table
Source           D.F.
Model               9
Total Error        26
  Lack-of-fit       2
  Pure error       24
Total (corr.)      35
9 of the 35 total degrees of freedom are used to estimate the main effects and two-factor
interactions. 26 degrees of freedom are left to estimate the experimental error, including 24
degrees of freedom attributable to pure error (directly from replicates). The 2 degrees of freedom
for lack-of-fit are attributable to a third-order ABC interaction that is not in the model.
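The degrees of freedom in the table can be checked by hand. The sketch below is plain Python arithmetic, not STATGRAPHICS output; it assumes 3 smoking levels, 2 body fat levels, 2 gender levels, and 3 replicates, as described for this example.

```python
from itertools import combinations
from math import prod

# Number of levels per factor (key names are illustrative).
levels = {"smoking": 3, "body_fat": 2, "gender": 2}
replicates = 3

n_cells = prod(levels.values())      # 12 treatment combinations
df_total = n_cells * replicates - 1  # 35


def df_for_order(order):
    """Total df of all effects that involve exactly `order` factors."""
    return sum(
        prod(levels[name] - 1 for name in combo)
        for combo in combinations(levels, order)
    )


df_model = df_for_order(1) + df_for_order(2)  # main effects + 2-factor interactions
df_lack_of_fit = df_for_order(3)              # three-factor interaction left out of the model
df_pure_error = n_cells * (replicates - 1)    # variation among replicates
df_total_error = df_total - df_model

print(df_model, df_total_error, df_lack_of_fit, df_pure_error, df_total)
# 9 26 2 24 35
```

The model's 9 degrees of freedom split into 4 for main effects (2 + 1 + 1) and 5 for two-factor interactions (2 + 2 + 1).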
Model Coefficients
The coefficients for each effect correspond to indicator variables defined in the underlying
regression model. For a factor with k levels, k – 1 indicator variables are created:
X_{k-1} = -1 for level 1, 1 for level k, and 0 for all other levels
This coding is convenient since the sum of each variable across the 36 runs equals 0, which sets
the constant term in the model to the grand mean.
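The zero-sum property can be illustrated with a short sketch. Only the last indicator variable is defined in the text above, so the code assumes the same pattern holds for every variable: X_j is -1 at level 1, +1 at level j+1, and 0 elsewhere, for j = 1, ..., k-1. The function name is hypothetical.

```python
def indicator_row(level_index, k):
    """Return the k-1 indicator values for one observation.

    `level_index` is 1-based. Assumed pattern (an extrapolation from
    the last variable given in the text): X_j = -1 at level 1,
    +1 at level j+1, and 0 at every other level.
    """
    if level_index == 1:
        return [-1] * (k - 1)
    return [1 if level_index == j + 1 else 0 for j in range(1, k)]


# Smoking has k = 3 levels; in the balanced design each level
# appears in 12 of the 36 runs.
k = 3
rows = [indicator_row(level, k) for level in (1, 2, 3) for _ in range(12)]

column_sums = [sum(row[j] for row in rows) for j in range(k - 1)]
print(column_sums)  # [0, 0]
```

The sums are zero only because the design is balanced; with unequal numbers of runs per level the columns would no longer sum to zero.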
The graph of the design points shows the experimental region. Runs are performed at all
combinations of the factor levels:
[Figure: Stress test design points. Axes: smoking (none, light, heavy), body fat (low, high), gender (female, male).]
Once the experiment has been created and any additional runs entered, it must be saved on disk.
Press the button labeled Step 7 and select a name for the experiment file:
Design files are extended data files and have the extension .sgx. They include the data together
with other information that was entered on the input dialog boxes.
To reopen an experiment file, select Open Data File from the File menu. The data will be loaded
into the datasheet, and the Experimental Design Wizard window will be displayed.
Once the data have been entered, press the button labeled Step #8 on the Experimental Design
Wizard toolbar. This will display a dialog box listing each of the response variables:
If more than one response has been measured, you should repeat this step once for each response.
When OK is pressed, the program will invoke the Multifactor ANOVA procedure for the designs
containing one or more blocking variables. Full details of that procedure are available in the
corresponding documentation.
Of particular interest in the current example are several tables and graphs:
A small P-value for any main effect or interaction (less than 0.05 if operating at the 5%
significance level) indicates that the corresponding factor has a significant effect on the response.
In the current example, all 3 factors have significant main effects, and there is a significant
interaction between body fat and smoking.
Graphical ANOVA
A new method for illustrating the results of an analysis of variance, from Hunter (2005), is
shown below:
[Figure: Graphical ANOVA. Rows shown: gender (female, male; P = 0.0000), body fat (high, low; P = 0.0000), and Residuals. Horizontal axis from -25 to 25.]
The plot shows the scaled deviations of the block and treatment averages from the grand mean,
together with the model residuals. Scaling is such that, if a factor has no effect, the variation
observed for that factor should be comparable to that of the residuals. Note that the variation for
all factors is considerably greater than that of the residuals.
Interaction Plot
The Interaction Plot is particularly important when two factors show a significant interaction:
[Figure: Interaction Plot. Vertical axis: minutes (14 to 32). Horizontal axis: smoking (none, light, heavy). Separate lines for body fat (low, high).]
For example, the above plot shows the average response at each combination of smoking and
body fat. The larger difference between the lines for non-smokers than for heavy smokers
indicates that the effect of body fat is much greater for individuals who do not smoke.
Optimization
Once a statistical model has been developed for each response, the analyst may now determine
what combination of factors will yield the best results. Pressing the button labeled Step #9 on the
Experimental Design Wizard toolbar instructs the program to examine each treatment and find
the treatment that maximizes the joint desirability of the estimated responses. When the
optimization is complete, a message similar to that shown below will be displayed:
The dialog box indicates the “Desirability” of the final result, based on a metric designed to
balance competing requirements of multiple responses (see the document titled DOE Wizard for
If you press OK, additional information will be added to the main DOE Wizard window:
The table shows that the estimated minutes for the best combination of factors (low body fat,
male, non-smoking subjects) equals 32.9, with a 95% confidence interval for the mean that
ranges between 30.2 and 35.6.
If you push the Tables and Graphs button on the analysis toolbar, you can display the estimated
desirability for each treatment by selecting the Desirability Plot:
[Figure: Desirability Plot. Vertical axis: Desirability (0 to 1). Horizontal axis: smoking (none, light, heavy). Separate lines for body fat (low, high).]
Since all combinations involving high body fat are estimated to be below the lower acceptable
limit of 20 minutes, those desirability values all equal 0.
The button labeled Step 10 allows you to save the results in a StatFolio:
Actually, the StatFolio can be saved at any point and reloaded at a later date.
IMPORTANT: When using the Experimental Design Wizard, two files are created:
1. An experiment file with the extension .sgx which stores information about the
experimental data.
2. A StatFolio with the extension .sgp that stores the results of the analysis.
If you move the experiment to another computer, be sure to transfer both files.