
Edition 08.1993

© 1993 Robert Bosch GmbH

Table of Contents:

1. System-Analytical Approach ... 5
1.1 One-Factor-at-a-Time Method ... 5
1.2 Two-Factor Method ... 9
1.3 General Case (Numerous Influence Factors) ... 12
2. Industrial Experimentation Methodology and System Theory ... 16
2.1 Hints on System Analysis ... 17
2.2 Short Description of the System-Theoretical Procedure ... 17
2.2.1 Global System Matrix (i.e. without quoting any levels) ... 19
2.2.2 Local System Consideration ... 19
2.2.3 Local System Matrix ... 20
2.3 Summary ... 20
3. Probability Plot ... 21
3.1 Probability Plot of Small-Size Samples ... 22
3.2 Probability Paper ... 23
4. Comparison of Sample Means ... 24
4.1 t Test ... 24
4.2 Minimum Sample Size ... 26
5. F Test ... 30
6. Analysis of Variance (ANOVA) ... 32
6.1 Deriving the Test Statistic ... 34
6.2 Equality Test of Several Variances (According to Levene) ... 36
7. Design of Experiments with Orthogonal Arrays and Evaluating such Experiments ... 38
7.1 Representing the Results of Measurement ... 43
7.2 Calculating the Effects ... 52
7.3 Regression Analysis ... 55
7.4 Factorial Designs ... 56
7.4.1 Design Matrix ... 56
7.4.2 Evaluation Matrix ... 58
7.4.3 Confounding ... 59
7.4.4 Fractional Factorial Designs ... 63
7.5 Designs for Three-Level Factors ... 65
7.6 Central Composite Designs ... 67
7.7 Screening Designs According to Plackett and Burman ... 69
8. Statistical Evaluation Procedures for Factorial Designs ... 71
8.1 One-Way Analysis of Variance ... 71
8.2 Factorial Analysis of Variance ... 72
8.3 Factorial Analysis of Variance with Respect to Variation ... 72
8.4 Computer Support ... 73
8.4.1 Evaluation of an Experiment using the FKM Program ... 75
8.4.2 Evaluation with the Help of the SAV Program ... 81
9. Hints on Practical Design of Experiments ... 86
9.1 Task and Target Formulation ... 86
9.2 System Analysis ... 86
9.3 Stipulating an Experimental Strategy ... 87
9.4 Executing and Documenting an Experiment ... 88
10. Shainin Method ... 89
11. List of References ... 92
12. Tables ... 93
Index ... 101

Within the framework of quality assurance and for effective new and further development of Bosch products, careful design of experiments is not only indispensable but is also required by our customers. In this connection, the commonly used term "Statistical Experimental Design" is not exactly defined, and labels such as Design of Experiments (DOE), Industrial Experimentation Methodology, Taguchi Method and Shainin Method(s) are often used interchangeably. This pamphlet is based on a seminar manuscript on "Industrielle Versuchsmethodik" (Industrial Experimentation Methodology) and is intended to clarify vital terms and procedures of statistical experimental design for the interested user.


1. System-Analytical Approach
Investigation of a system must often begin with the description of a particular system state. A basic requirement that we impose on an experiment is reproducibility, i.e. under definite conditions the result of an experiment must always be the same. Since there cannot be absolute equality (one cannot swim upstream twice), the reproducibility of an experiment is a relative term. One can use statistical terms to define reproducibility more precisely. The term self-control can be interpreted as a generalization of the term reproducibility. It is also possible to limit oneself to the statement that a (quantitative) result of an experiment must always lie within a specific bandwidth. Variation in the results of repeated experiments (e.g. process variation) can, in certain situations, itself be a vital parameter. The standard deviation can, under certain circumstances, serve as a measure of this variation. If one wishes to evaluate the variation quantitatively, one needs a sufficiently large sample size, i.e. sufficiently many repetitions of the experiment (see Chapter 4.2). Similar statements hold for the position of the mean.

1.1 One-Factor-at-a-Time Method


If one wishes to investigate the influence of a factor within a system, one varies this factor and leaves all other factors in the system unchanged. In general, this ensures that other factors, which are not the subject of the investigation, neither falsify the results nor restrict the corresponding deduced statements. (That is obviously easier said than done.) This approach is convenient, logical and should be reckoned as a fundamental experimental strategy. The only restriction: the nature of the influence may depend (possibly very strongly) upon the position of the other factors. Strangely, numerous textbook authors reckon the one-factor-at-a-time method to be inefficient. A practitioner can, nonetheless, confidently ignore these objections. It is evident that one-factor-at-a-time experiments must be carefully designed, executed and evaluated.

Systematic Approach

We differentiate between variable and discrete influence parameters. Before one determines experiments or experimental series, one must think about what type of influence a variable factor has. When preparing one-factor-at-a-time experiments we become acquainted with terms which are later of prime importance when investigating the general n-dimensional case.

a) The simplest type of influence is the linear influence. Increasing the influence factor by a fixed amount always brings about the same effect, independent of the chosen levels (see Chapter 7). Many known natural laws of physics or chemistry are linear (examples?).


Differential calculus linearizes (nearly) arbitrary functions. However, one should not assume that a fact to be investigated can be linearized just as a matter of simplicity.

The statement that every problem can be linearized when the difference between the steps of the influence parameter is small enough may be correct, but it is of little practical value, since what is considered "small enough" must then be clarified. For instance, a temperature difference of 1 °C can be small in many problems, but in other problems this increment may be large.

Extrapolation beyond the investigated region is only permissible if the function is known. The same restriction applies to interpolation. Because a system generally exhibits significant background noise, erroneous interpretations of experimental results easily occur even though linearity is ensured.

b) A further generalization of the linear influence is the monotonic influence (synonym: tendency, directional factor).

A monotonic influence is apparent when the input quantity can influence the output quantity in only one direction. More precisely: a monotonic influence is apparent when an increase of the input quantity invariably causes either an increase or a decrease of the output quantity.

Monotonic influence parameter: The choice of the steps influences the size of the effect.


c) Factors that are not monotonic are called non-monotonic. A non-monotonic influence is apparent when the target quantity is influenced in both directions. More precisely: a non-monotonic influence is present when an increase of the influence parameter increases or decreases the target quantity, depending upon the steps selected.

Non-monotonic influence parameter: The choice of the steps influences the size and the direction of the effect.

Remark 1: The pair of terms monotonic/non-monotonic leads to a basic characterization of variable influence factors.

Remark 2: By suitably restricting the interval, a factor that is non-monotonic over a large interval can become monotonic (localization). This suitable restriction naturally presumes system knowledge.

Scope of the investigations (number of experiments)

A generally applicable formula for determining a sample size does not exist. A sample size of 1 can already be too large (a finger on a hot oven plate), and no sample size, no matter how large, can confirm a false statement (from the fact that all numbers from 1 up to 999 are smaller than one thousand, it cannot be deduced that all numbers are smaller than one thousand). Likewise, a failed experiment does not mean that the hypothesis to be verified is false. The sample size is to be selected in such a manner that the effects or functional associations can be recognized against the background of system noise (see Chapter 4.2).
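The relationship between sample size and system noise can be made concrete with a short simulation. The sketch below uses invented values (the noise level sigma and the number of repetitions are assumptions): the spread of the experiment mean shrinks like sigma/sqrt(n), which is why an effect can only be recognized once this spread is clearly smaller than the effect itself.

```python
import math
import random
import statistics

random.seed(1)
sigma = 2.0  # assumed standard deviation of the system noise

def mean_of_n(n):
    """Mean of n noisy repetitions of the same experiment (true value 0)."""
    return statistics.mean(random.gauss(0.0, sigma) for _ in range(n))

# Empirical spread of the experiment mean for growing sample sizes:
# it shrinks like sigma / sqrt(n)
spread = {}
for n in (4, 16, 64):
    means = [mean_of_n(n) for _ in range(2000)]
    spread[n] = statistics.stdev(means)
    print(n, round(spread[n], 2), round(sigma / math.sqrt(n), 2))
```

Quadrupling the number of repetitions only halves the spread of the mean, so the required scope grows quickly when small effects must be detected against large noise.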

Evaluation

Always by means of graphs (graphs are compulsory), e.g. a dot frequency diagram.


Task 1
When doing a one-factor-at-a-time experiment one should make sure that factors not constituting the object under investigation neither falsify the results nor restrict the deduced statements. Discuss, using concrete cases, how this basic principle can be realized (e.g. by randomization).

Task 2
A glass of water is put inside a freezer and the time required for the water to freeze is recorded. The initial temperature of the water (between 10 °C and 100 °C) should be determined so that the time interval up to freezing is as long as possible (optimization problem). How would you investigate the process empirically?

Task 3
a) Assume that a process is definitely linear. How many supporting points does one need to represent the natural law explicitly? How should the system noise be considered? How does one select the supporting points? b) How can one invalidate linearity empirically?
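Task 3 can be sketched numerically: without noise, two supporting points determine a straight line; with noise, one repeats measurements at a few well-spread supporting points and fits the line by least squares. The linear law, the noise level and the supporting points below are hypothetical.

```python
import random

random.seed(7)
slope_true, intercept_true = 2.5, 1.0   # hypothetical linear law
noise_sd = 0.3                          # assumed system noise

# Supporting points at the ends and the middle of the region, repeated twice
xs = [0.0, 0.0, 5.0, 5.0, 10.0, 10.0]
ys = [slope_true * x + intercept_true + random.gauss(0.0, noise_sd) for x in xs]

# Ordinary least-squares fit of a straight line y = slope * x + intercept
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
slope = sxy / sxx
intercept = my - slope * mx
print(round(slope, 2), round(intercept, 2))
```

Spreading the supporting points over the whole region (large sxx) makes the slope estimate robust against the noise; clustering them in a narrow band would not.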

Task 4
Electron-impact experiment by Franck and Hertz: the anode current flowing to the exciting anode as a function of the anode voltage. a) How can the process represented in the adjacent figure be investigated empirically? b) What could a physicist conclude who only performed tests at 5 V, 10 V and 15 V?


Summary: From the considerations discussed in this chapter, it is clear that when investigating the influence of a single factor the given situation is decisive: how many measurements must be made, how many repetitions must be undertaken, and where they have to be placed. There is therefore no strict recipe for conducting empirical investigations, and thus it is not appropriate to teach recipes.

Scheme
Target quantity(ies):
Influence variable(s):
Other factors influencing the target quantity (which are nonetheless not the objective of the investigation):
How are these quantities considered?
Prior knowledge:
Number of steps: Reason:
Number of repetitions: Reason:
Additional points to be considered:

1.2 Two-Factor Method


If one wants to investigate the influence of two factors within a system, phenomena have to be observed that don't arise during one-factor-at-a-time investigations. Since these phenomena are symptomatic of the general n-dimensional case, a thorough investigation is beneficial. Before one determines an experimental arrangement (incl. the experiment's size), what is known and unknown regarding the two factors (and what is then to be investigated empirically) must be systematically established. At first, an investigation in principle takes place, i.e. a knowledge-based description of the two-factor system. As in the one-factor-at-a-time method, differentiation between discrete and variable influence parameters is helpful. There are 3 cases to be differentiated.


Case 1: Both influence factors are discrete

Influence factor A with k levels: A1, A2, ..., Ak
Influence factor B with l levels: B1, B2, ..., Bl

There are k · l system states.

Example:
Target quantity: Yield
Plants A1, A2
Pesticides B1, B2, B3

Remark: It is clear that, in general, one cannot derive the other system states Ai Bj from the knowledge of one empirical result.
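The combinatorics of Case 1 can be sketched in a few lines; the level names follow the plants/pesticides example above and are purely illustrative.

```python
from itertools import product

plants = ["A1", "A2"]             # factor A with k = 2 levels
pesticides = ["B1", "B2", "B3"]   # factor B with l = 3 levels

# Every combination of one A-level with one B-level is one system state
states = list(product(plants, pesticides))
print(len(states))   # k * l = 6
for a, b in states:
    print(a, b)
```

Each of the k · l combinations is a separate system state; since the factors are discrete, no state can be deduced from the others and each must, in principle, be investigated on its own.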

Case 2: One discrete influence factor A: A1, ..., Ak and one variable influence factor B.

Example:
System: Solution
Target quantity: Solubility
A: Chemical substance
B: Temperature

It is possible, in principle, to describe the system by k characteristic lines (a family of characteristics). In general, one deals with k different one-factor problems. Is it possible to make a general deduction from one characteristic line to another?

Solubility of several inorganic substances as a function of temperature

When characteristic curves are shifted upwards in parallel (depending on the discrete factor), one speaks of an interaction-free system.
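The notion of an interaction-free system can be checked numerically: the characteristic curves are shifted in parallel exactly when their pointwise difference is constant. The solubility values below are invented for illustration.

```python
# Hypothetical solubility values (g per 100 g water) at 20, 40, 60 °C
temps = [20, 40, 60]
curve = {
    "substance_1": [10.0, 14.0, 19.0],
    "substance_2": [13.0, 17.0, 22.0],   # parallel: shifted by +3
    "substance_3": [9.0, 16.0, 30.0],    # not parallel
}

def parallel(y1, y2, tol=1e-9):
    """Curves are parallel (interaction-free) if their difference is constant."""
    diffs = [a - b for a, b in zip(y1, y2)]
    return max(diffs) - min(diffs) <= tol

print(parallel(curve["substance_1"], curve["substance_2"]))  # True
print(parallel(curve["substance_1"], curve["substance_3"]))  # False
```

If the curves are parallel, knowledge of one characteristic line plus a single point of another is enough to reconstruct the second line; with interactions present, each line must be measured separately.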


Case 3: Both influence parameters are variable. The information can be represented in three dimensions (see figure).
Complex Motronic ignition map (ignition angle as a function of load and rotational speed)

Hereby the values of the two variable influence parameters (in this example, load and engine speed) constitute the coordinates of points in a plane. The function is then represented by a "mountain" above this plane (see Figure 7.1). The experimenter must now specify the region in which empirical investigations are to be performed. How many experimental points should be foreseen depends fully upon the physical question. The idea that the experimental scope can be reduced by means of combinatorial magic is simply erroneous. The scope can only be reduced through precise task formulation and use of the knowledge already verified.

Task:
System: Cake
Target quantity: Height of cake
Influence parameters: Yeast, water

How can one investigate the system empirically? What does an array of characteristic curves look like in principle?


Summary: Two influence factors are handled just as in the one-dimensional case: the combinatorial arrangement of experimental points, the number of repetitions per point etc. fully depend upon how the question is formulated. Generally binding rules, in an algorithmic sense, cannot exist. The case differentiation discrete/discrete, discrete/variable, variable/variable is helpful.

1.3 General Case (Numerous Influence Factors)


A complex system with numerous influence factors poses a challenge. It is clear that the time needed for an investigation increases with the number of factors to be considered. It would be good if one could reduce the time needed for an experiment through combinatorial magic. It is unfortunately not so. The only way to reduce experimental expense is to apply the existing knowledge in a systematic way. This systematic approach must help practitioners, not force them to employ terminology they do not understand (nor are expected to know). Practitioners must be able to present their knowledge or their presumptions in a simple and rational manner. (Furthermore, the assumptions must be system-based and plausible!)

We differentiate between variable and discrete influence parameters. The description of the type of influence of the individual quantities belongs to the description of the system in principle. In view of the fact that we may have to differentiate among numerous input quantities, a careful description of the influence of the individual input quantities is especially important. Naturally, the influence of an individual quantity depends upon the position of the other input quantities; because of this, it must be determined whether the physical-chemical character of the individual quantity permits making statements in principle about its type of influence, independent of the other quantities.

With the systematic approach, it is preferable to begin by considering the discrete influence parameters. If, for instance, A is a discrete influence parameter with the levels A1, A2 (e.g. metal type), then the following should be asked: with respect to the target quantity, is one of the two levels better in principle than the other, or not? If this is not the case, then the answer to the question depends upon the position of the other factors.


Example: characteristic curves for a definite factor A and for an ambiguous factor A (sketches)

Remark: It is usually preferable to begin by investigating the discrete influence parameters, basically because the different steps often represent the system states to be differentiated. (Don't compare apples with oranges!)

Variable Influence
Before determining the experiments or experimental series, the overall influence should be described (i.e. without determining the levels). The relevant terminology is known to us (see 1.1 and 1.2).

Black Box
If, after a careful analysis, all of the influence parameters are ambiguous or the character of the influence is unknown, then the matter cannot be investigated empirically. If one nevertheless wants to conduct the experiment, all strategies then become nearly equivalent (all cats are grey at night).

Trial Task:
System: A board fixed on one side.
Target quantity: Lowering of the free end.
Influence quantities: Types of wood H1, H2, H3; length; breadth; height; force F


I. a) Perform a global system analysis with the help of the system matrix!
b) If appropriate, draw an array of characteristic lines!

Global system matrix (to be completed):

                Length   Breadth   Height   Force
Linear
Monotonic
Non-monotonic
Unknown

II. Given are:

Length:  1.5 m
Breadth: 20 cm
Height:  4 cm
Force:   20 N

All 4 quantities can be reduced by up to 10%. The target is a board whose free end is lowered as little as possible. Perform a local analysis! Which experiments or experimental series would you perform?

Local system matrix (to be completed):

             Length   Breadth   Height   Force
Definite
Ambiguous
Unknown

Trial Task:
System: Greenhouse
Target quantity: Yield of useful plants
Influence quantities:
  Types of plants P1, P2, P3
  Types of soil B1, B2
  Chemicals C1, C2, C3
  Water quantity (irrigation)
  Light
  Temperature


1. Perform a detailed system analysis (with system matrix)! 2. Draw arrays of characteristic lines! What can be said about interactions? 3. Which experimental strategy is recommendable?
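Returning to the trial task of the board fixed on one side: if one assumes ideal cantilever behaviour (Euler-Bernoulli beam with rectangular cross-section), the deflection follows delta = 4*F*L^3 / (E*b*h^3), so all four influences are monotonic but far from equally strong. The formula and the elastic modulus below are assumptions for illustration, not part of the original task.

```python
def deflection(F, L, b, h, E=10e9):
    """End deflection of an ideal cantilever beam with rectangular
    cross-section: delta = 4 * F * L**3 / (E * b * h**3).
    E = 10 GPa is an assumed elastic modulus for wood."""
    return 4.0 * F * L**3 / (E * b * h**3)

nominal = dict(F=20.0, L=1.5, b=0.20, h=0.04)   # N, m, m, m
base = deflection(**nominal)

# Effect of reducing each quantity by 10%, one factor at a time
for name in ("F", "L", "b", "h"):
    args = dict(nominal)
    args[name] = 0.9 * args[name]
    print(name, round(deflection(**args) / base, 3))
```

Under this model, reducing the force (factor 0.9) or the length (factor 0.9^3, about 0.73) lowers the deflection, while reducing breadth or height raises it (height most strongly, by 1/0.9^3); a local analysis would thus classify all four factors as definite, with very different sensitivities.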

Trial Task:
a) What does the optimization strategy of a monotonic system look like?

Global system matrix:

                Factor ...
Monotonic
Non-monotonic
Unknown

b) What does the optimization strategy of the following system look like?

Factors     A       B       C       D       E       F
Levels      A1 A2   B1 B2   C1 C2   D1 D2   E1 E2   F1 F2
Definite    X                                       X
Ambiguous           X       X       X       X
Unknown


2. Industrial Experimentation Methodology and System Theory


The terminology and key words summarized under D.O.E., Statistical Experimental Design, Taguchi and Shainin methods are, as mentioned earlier, either required or initiated by customers and are also used in the specialized literature. With respect to the practical relevance of the methods mentioned above, the following remarks are made:

Taguchi Method
The Taguchi method is characterized by, among other things, the usage of so-called orthogonal arrays to reduce the required extent of the experiment. The use of the method depends upon the negligibility of interactions or, in exceptional cases, the predictability of interactions. These assumptions are controversial; nevertheless, successful examples are often quoted in the literature. These successes are not verifiable and usually not rationally comprehensible. What is verifiable, however, is that substantial misstatements can result from the use of orthogonal arrays.

D.O.E. (= Design of Experiments)
Anybody who has ever thought about performing an experiment has practiced experimental design. Thus, the question of whether one is for or against experimental design can never arise. With regard to the contents of textbooks on the D.O.E. subject, however, there are some reservations, for instance:

All algorithmic approaches are based on models, i.e. a mathematically quantitative model is assumed to represent the reality to be investigated. All subsequent procedures (experimental designs, evaluations etc.) are only reasonable if the model adequately describes the reality. The difficulty of selecting the right model is fundamental in nature. From the structure of the results it is not possible to recognize whether the model is adequate (i.e. verification is possible neither a priori nor a posteriori).

A way out of this difficulty is only possible via a system-theoretical approach.

Shainin Method For Shainin method see Chapter 10 and [11].


2.1 Hints on System Analysis


The prerequisite for a reasonable experimental design is a system analysis. The purpose of a system analysis is, among other things, to present the existing knowledge, or lack of knowledge, about the system to be investigated with the help of elementary terms. Theoretical DOE terms are to be avoided at this stage for various reasons. After executing the system analysis, a decision can be made, and to some extent deduced, about which experimental strategy is appropriate. Automation in the sense of a strict recipe is not appropriate and is therefore not to be pursued. In the formulation, the terminology of General Systems Theory is used. Generally it may be assumed that the system to be investigated does not represent a black box. (It is self-evident that a true black box cannot be investigated with formal procedures.) Hence the specialist will be able to make statements in principle about the input-output situation of the system. Explanations in principle, i.e. qualitatively correct explanations, are preferred to precise quantitative statements that are, for various reasons, often false (better approximately right than exactly wrong).

2.2 Short Description of the System-Theoretical Procedure


System analysis begins with the system definition. This includes listing all relevant target quantities (output) as well as all relevant influence parameters (input). Flow charts and cause-and-effect diagrams, for instance, can be helpful here. When dealing with input quantities, care should be taken about independence, susceptibility and the possibility of setting them definitely.

Subsequent to completion of the system definition, the system characteristics are to be described. System analysis is a recursive process. In the ideal case, all relevant system characteristics are known and investigating the system via experiments becomes unnecessary.

A statement about the system noise belongs to the description of the system characteristics, i.e. the description of the behaviour of the output quantities when the input quantities are kept constant. Knowledge of the system noise has vital consequences for the type and scope of the impending investigations.

Describing the functional input-output situation is important within the scope of information about the system characteristics. In view of the fact that normally several input quantities exist, describing the influence of the individual input quantity is especially important. Naturally, the influence of an individual quantity depends upon the position of the other input quantities, and for this reason it is especially important whether the physical-chemical character of the individual quantity permits making statements in principle about its type of influence, independent of the other quantities. Here the following formulation of terms can help further:

Global description


Linear influence (as a special case of the monotonic influence): A linear influence exists, if the functions f ( A, . . . ) are always linear (linear influence factors are certainly exceptional cases).

Characteristics of a linear influence factor

Monotonic influence: A monotonic influence exists if the input-quantity can only influence the output-quantity in one direction.

Characteristics of a monotonic influence factor

Non-monotonic influence factor: A non-monotonic (dichotomous) influence exists if the output quantity is influenced in both directions (i.e. both upwards and downwards). Here the characteristic of the influence factor also depends upon the position of the other influence factors. It is generally assumed, however, that the type of the dichotomy is an invariant of the influence factor, i.e. the dichotomy is independent of the position of the other factors.

Characteristics of a dichotomous influence factor


2.2.1 Global System Matrix (i.e. without quoting any levels)

Considering the special role of discrete input quantities, every single quantity is specified according to how someone conversant with the system judges its influence character (without quantification). Reference is hereby made to the type classification above. The results are summarized in the global system matrix:

              Factor ...
Linear
Monotonic
Dichotomous
Unknown

A completed global system matrix can already suggest a sensible experimental strategy. Example: if all influence factors are monotonic, then it is simple to optimize the system, and the only question that needs to be asked is which influence factors are decisive for the optimum. Here reference can be made to the Shainin method.
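The remark that a purely monotonic system is simple to optimize can be illustrated: the optimum then lies at a corner of the experimental region and is found by choosing the favourable end of each factor separately. The target function below is hypothetical.

```python
from itertools import product

# Hypothetical target quantity, monotonic in every factor:
# increasing in x1, decreasing in x2 and x3
def y(x1, x2, x3):
    return 2.0 * x1 - 3.0 * x2 - x3

low, high = 0.0, 1.0

# Strategy for a purely monotonic system: set each factor to its
# favourable end of the region, independently of the others
best = (high, low, low)

# A brute-force grid search confirms that no other setting is better
grid = [i / 4 for i in range(5)]
assert all(y(*p) <= y(*best) for p in product(grid, repeat=3))
print(best, y(*best))
```

With monotonic factors, only one experiment per factor direction is needed to locate the optimum; the remaining question, as stated above, is which factors matter at all.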

2.2.2 Local System Consideration

Often, an experimental strategy follows directly from the global system consideration. Because the global characteristics array, especially that of the dichotomous influence factors, is often very complex, the system consideration must be localized; i.e., the levels of the influence factors must be prescribed and the properties of the system relative to the prescribed levels considered. For the special case of two steps, the following case differentiation is to be made:

1. Univalent Influence Factor (univalent = definite)
If the target quantity is moved in only one direction by a change from A1 to A2, i.e. if always

f(A1) - f(A2) > 0    or always    f(A1) - f(A2) < 0,

then a univalent factor exists.

Hint: Because of the localization, a dichotomous factor can be univalent. To some extent, however, there exists a correspondence between univalent and monotonic factors.
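The definition can be turned into a small check: a factor is univalent between two levels if the difference f(A1, ...) - f(A2, ...) keeps one sign over all settings of the other factors. The response functions below are hypothetical.

```python
from itertools import product

def univalent(f, a1, a2, other_levels):
    """True if f(a1, ...) - f(a2, ...) keeps the same sign for every
    combination of levels of the other factors (local consideration)."""
    signs = set()
    for rest in product(*other_levels):
        signs.add(f(a1, *rest) - f(a2, *rest) > 0)
    return len(signs) == 1

# Hypothetical responses: g is additive in a, h lets a interact with b
g = lambda a, b, c: a + b + c
h = lambda a, b, c: a * (1 if b > 0 else -1) + c

others = [(-1, 1), (0, 5)]   # prescribed levels of the other factors B, C
print(univalent(g, 1, 2, others))   # True: the difference is always -1
print(univalent(h, 1, 2, others))   # False: the sign depends on b (bivalent)
```

In an experiment, the same check corresponds to changing A1 to A2 at several settings of the other factors and observing whether the target quantity always moves the same way.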


2. Bivalent Influence Factors (bivalent = ambiguous)
Bivalent factors are, by definition, factors which are not univalent. That means that the factor, depending on the position of the other factors, influences the target quantity both upwards and downwards when its level is changed as prescribed. The behaviour of a bivalent factor is, as such, synergetic or antagonistic. It is of special importance to find out which of the other factors cause the changes.

2.2.3 Local System Matrix
(depending upon the selected levels, i.e. there exists not only one local system matrix)

The results of the local system consideration are summarized in the local system matrix.

Example:

Factors     A       B       C     ...   Z
Levels      A1 A2   B1 B2   C1 C2 ...   Z1 Z2
Univalent
Bivalent
Unknown

A completed local system matrix gives an indication of the complexity of the localized problem. The simplest case exists when all influence factors are univalent; then the experimental strategy is obvious. The most difficult case exists when all influence factors are bivalent or when the character of the influence is unknown. In this case, a simple experimental strategy is impossible without further information. In particular, a reasonable optimization with a small experimental series is not attainable.

2.3 Summary
The statement made in QS-Info 1/90, "there is no alternative to statistical design of experiments", is only correct if, by statistical design of experiments, one understands the systematic, i.e. system-theoretical, design of experiments with due consideration of statistical points of view. If, however, by statistical design of experiments one understands the contents of the textbooks about statistical design of experiments (from Fisher via Box to Taguchi), then it must be assumed that these contents are not, or only seldom, transferable to real life. Similar reservations apply to commercial software packages. In particular, every polemic against the so-called conventional methods is uncalled-for. A consistent application of the system-theoretical attitude will often lead to conventional types of investigation; in other cases, however, it can show the formal approaches to be promising. Holding to stubborn schools of thought is certainly detrimental in the long term.


3. Probability Plot
When one speaks about a normal distribution, one mostly associates this concept with the Gaussian bell-shaped curve. The Gaussian bell-shaped curve is the graph of the probability density function f(x) of the normal distribution:

f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}

This function and its graphical representation are printed on the 10 DM bank note, next to the portrait, in honour of the mathematician Carl Friedrich Gauß. The distribution function of the normal distribution assigns to every value x the probability that a random variable X takes a value between -∞ and x. One obtains the distribution function F(x) of the normal distribution by integrating the density function given above:

F(x) = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2} \left( \frac{v - \mu}{\sigma} \right)^2} \, dv

F(x) corresponds to the area under the Gaussian bell-shaped curve up to the value x.
The graphical representation of this function has an s-shaped form. Strictly speaking, one must therefore always think of this curve whenever a normal distribution is concerned. If the y-axis of this representation is now distorted in such a way that the s-shaped curve becomes a straight line, a new coordinate system emerges: the probability paper. The x-axis remains unchanged. Because of this construction, a normal distribution is always portrayed as a straight line on the probability paper.

One uses this fact in order to check graphically whether a given data set is normally distributed. As long as the number of given measured values is large enough, one creates a histogram of these values, thereby determining the relative frequencies of the values within the classes of a grouping. If the cumulative relative frequencies found are then plotted over the upper class limits on the probability paper and a series of points approximately lying on a straight line is obtained, it can be inferred that the values of the data set are approximately normally distributed.

Remark: The plotting of measured values, or of groups of measured values ordered according to the factor levels, on probability paper is a component of the SAV program (see Chapter 8.4, Computer aid, and [9]).
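As an illustration, the distortion of the y-axis described above is exactly the inverse standard normal distribution function. The following Python sketch (function name and class count are our own choices) computes the points (upper class limit, z value of the cumulative relative frequency); for approximately normally distributed data these points lie roughly on a straight line.

```python
from statistics import NormalDist

def probability_plot_points(values, num_classes=5):
    """Return (upper class limit, z) pairs: the y-axis distortion of the
    probability paper is the inverse standard normal distribution
    function, so normally distributed data yield roughly collinear points."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_classes
    n = len(values)
    points = []
    for k in range(1, num_classes + 1):
        upper = lo + k * width
        count = sum(1 for v in values if v <= upper)
        h = count / n                    # cumulative relative frequency
        if 0 < h < 1:                    # z is undefined at 0 and 1
            points.append((upper, NormalDist().inv_cdf(h)))
    return points
```

Whether the resulting points lie on a straight line is then judged by eye, exactly as on pre-printed probability paper.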

Hint: In German, two different terms are used in this context: Wahrscheinlichkeitsnetz stands for the coordinate system in which the data are plotted, and Wahrscheinlichkeitspapier denotes the form (sheet) with the pre-printed coordinate system (see Chapter 3.2), whereas English textbooks use the term probability paper for both.


3.1 Probability Plot of Small-Size Samples


The size of a sample is often not sufficient for creating a histogram or calculating relative frequencies, so that a representation on probability paper according to the above-described method is not possible. There is a way out of this dilemma, which is explained below. The procedure can easily be understood by means of computer simulation.

One takes a sample of size n: x1, x2, …, xn from a standard normally distributed population (μ = 0, σ = 1) and arranges the values in order of magnitude: x(1) ≤ x(2) ≤ … ≤ x(n). The number assigned to each of the sample values in this increasing sequence is called its rank. The smallest value x(1) therefore has rank 1, the greatest value x(n) rank n. Then one determines the value Fi = F(x(i)) from the table of the standard normal distribution for every x(i) (i = 1, 2, …, n). If this process is repeated frequently, then the cumulative frequencies Hi(n) ensue for every rank i as the mean value of the Fi (strictly speaking, the median is considered). For every sample size 6 ≤ n ≤ 50, these cumulative frequencies Hi(n) are given for each rank i in Table 1 (Section 12).

We now consider, for example, a sample of size 10, which is to be tested for normal distribution: 2.1 2.9 2.4 2.5 2.5 2.8 1.9 2.7 2.7 2.3. The values are sorted according to magnitude: 1.9 2.1 2.3 2.4 2.5 2.5 2.7 2.7 2.8 2.9. The value 1.9 has rank 1, the value 2.9 rank 10. In the table in the appendix (sample size n = 10) one finds the cumulative frequencies (in percent) for every rank i: 6.2 15.9 25.5 35.2 45.2 54.8 64.8 74.5 84.1 93.8. Finally, one chooses a suitable division (scaling) of the x-axis of the probability paper corresponding to the values 1.9 to 2.9 and plots the cumulative frequencies versus the sorted sample values. One therefore marks the following points in the example considered above: (1.9; 6.2), (2.1; 15.9), (2.3; 25.5), …, (2.7; 74.5), (2.8; 84.1), (2.9; 93.8).

Because these points are well approximated by an eye-fitted straight line, it can be assumed that the sample values are approximately normally distributed.


3.2 Probability Paper


The plotting of the points described above is simplified if so-called probability paper is used. This is a special form on which horizontal lines are drawn at the positions of the cumulative relative frequencies corresponding to the ranks i. The probability paper for the sample size n = 10 therefore exhibits horizontal lines for the values: 6.2%, 15.9%, 25.5%, …, 74.5%, 84.1%, 93.8%.

Hint: The cumulative frequency Hi(n) for rank i can also be calculated with the following approximation formulas:

Hi(n) = (i − 0.5) / n    and    Hi(n) = (i − 0.3) / (n + 0.4).

The deviation from the exact values in the table is insignificant. Approximate values for n = 10 (first formula): 5% 15% 25% 35% 45% 55% 65% 75% 85% 95%.
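Both approximation formulas can be evaluated with a few lines of Python (a sketch; the function names are our own):

```python
def h_simple(i, n):
    """First approximation: H_i(n) = (i - 0.5) / n."""
    return (i - 0.5) / n

def h_median(i, n):
    """Median-rank approximation: H_i(n) = (i - 0.3) / (n + 0.4)."""
    return (i - 0.3) / (n + 0.4)

# Approximate values for n = 10 (first formula), in percent:
print([round(100 * h_simple(i, 10)) for i in range(1, 11)])
# → [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
```

The second formula gives 6.7% for rank 1 with n = 10, close to the exact table value 6.2% quoted above.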


4. Comparison of Sample Means


4.1 t Test
The t test is a statistical method with which it can be decided whether the mean values of two samples are significantly different. In order to clarify how the t test works, we perform the following mental experiment: We draw from a normally distributed population N(μ, σ) two samples, each of size n, calculate the mean values ȳ1 and ȳ2 as well as the standard deviations s1 and s2 (or the variances s1² and s2²), and finally determine the value

t = |ȳ1 − ȳ2| · √n / √(s1² + s2²).

t can take values between 0 and +∞. If we repeat this process very often, we expect that
mainly values near zero occur and very large values are rarely found. This mental experiment was performed by computer simulation. For n = 10 and 3,000 sample pairs ( t -values), the result was the histogram represented in Fig. 4.1.

Fig. 4.1


If one simultaneously lets the number of samples approach infinity and the class width approach zero, the histogram approaches more and more the curve that represents the density function of the t distribution. The upper limit of the 99% random variation range (percentage point) is, in this example, t(18; 0.99) = 2.88, i.e. only in 1% of all cases can values greater than 2.88 occur randomly. Percentage points of the t distribution are tabled for different error probabilities depending upon the number of degrees of freedom f = 2·(n − 1) (Table 2).

The t test procedure is based on the relationship represented above. A decision shall be made whether the arithmetic mean values of two existing series of measurements (each of size n) can belong to one and the same population or not. As the so-called null hypothesis, it is therefore assumed that the mean values of the respectively associated populations are equal. Hence, the test statistic is calculated from the mean values ȳ1 and ȳ2 as well as the variances s1² and s2²:

t = |ȳ1 − ȳ2| · √n / √(s1² + s2²)    for n1 = n2 = n.

If the result is t > t(2(n−1); 0.99), i.e. t lies outside the 99% random variation range, the null hypothesis is rejected.

Hint: The expression for the test statistic t in this simplest form is only applicable when both the variances of the populations and the sample sizes are equal (σ1² = σ2² and n1 = n2 = n). The prerequisite of equal variances can be tested with the help of an F test (see Section 5). The t test, in the form represented here, tests the null hypothesis μ1 = μ2 against the alternative μ1 ≠ μ2; as such, a two-sided question exists. For this reason, the absolute value of the difference of the means is contained in the expression for t.

t can hence only assume values ≥ 0, so that the distribution depicted in Figure 4.1 results.
Table 2 in Section 12 gives the 95%, 99% and 99.9% percentage points of the t distribution in correspondence with the two-sided question. They correspond to the one-sided percentage points 97.5%, 99.5% and 99.95%.
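As an illustration, the test statistic defined above can be computed in a few lines of Python (a sketch; the function name and the sample data are our own; the table value 2.88 for f = 18 is taken from the text):

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(sample1, sample2):
    """t = |ȳ1 − ȳ2| · √n / √(s1² + s2²) for two samples of equal size n."""
    n = len(sample1)
    assert len(sample2) == n, "this simple form requires n1 = n2 = n"
    diff = abs(mean(sample1) - mean(sample2))
    return diff * sqrt(n) / sqrt(variance(sample1) + variance(sample2))

y1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y2 = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
t = t_statistic(y1, y2)
# f = 2·(n − 1) = 18 degrees of freedom; table value t(18; 0.99) = 2.88.
# Here t ≈ 0.74 < 2.88, so the null hypothesis is not rejected.
```

For these hypothetical data, the observed mean difference is small compared with the scatter, so the test does not indicate a significant difference.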


4.2 Minimum Sample Size


In the preceding Section 4.1 it was explained how one can decide, by means of a t test, whether or not the mean values of two samples are significantly different. This decision is frequently the goal of experiments by which the change of a target characteristic in dependence upon two system states or two settings of an influence factor is to be determined. The subsequent intention, with respect to the pursued system optimization, is to choose the better of the two selected settings. This applies in particular to experiments which use orthogonal arrays, by which several influence factors are concurrently varied on two levels (see Chapter 7). The factorial analysis of variance (see 8.2) executed in the scope of the evaluation of such experiments is, in principle, nothing other than a comparison of the mean values of all experimental results attained for the two settings (levels) of an influence factor, under consideration of the experimental noise.

In the preparatory phase of such experimental investigations, the experimenter often asks which minimum mean value difference is actually of interest in view of his target (system optimization, production simplification, cost reduction), and which minimum sample size n must be chosen so that this minimum mean value difference, if actually existent, shows up as significant in the experimental evaluation.

From the expression for the test statistic t (see Section 4.1)

t = |ȳ1 − ȳ2| · √n / √(s1² + s2²)

it is apparent that, for a significant test result, n must be the greater, the smaller the mean value difference |ȳ1 − ȳ2| is and the greater the variances s1² and s2² of the two series to be compared are. Note that the table value t_Table becomes smaller with increasing number of degrees of freedom f = 2·(n − 1). Visually, a small difference of mean values combined with a large variance of the distributions means that the two groups of values are indistinguishable or hardly distinguishable in a graphical representation of the two measurement series.

Based on the previous discussion, it is possible to estimate the minimum sample size n roughly by specifying the mean value difference as a multiple of a mean standard deviation √((s1² + s2²)/2); for different n, the calculated test statistic t is then compared with t_Table (observe the degrees of freedom and the significance level!).


Besides this trial method, there is an exact statistical derivation of the minimum sample size, which we only sketch roughly at this point (derivation in [1] and [7]).

By comparing the mean values of two series of measurements and making the corresponding test decision, two types of error are possible. In the first case, both series of measurements originate from the same population, i.e. there is no significant difference. If one decides here, due to a t test, that a difference of the two mean values exists, then an error of the first kind (α) is made. It corresponds to the significance level of the t test (for example α = 1%). If, in the second case, a difference of the mean values actually exists, i.e. the measured series originate from two different populations, then this will not be indicated with absolute certainty by the test. The test result can coincidentally indicate that this difference does not exist. One speaks in this case of an error of the second kind (β).

For the person performing the experiment, both of these error types are unpleasant: due to an apparently significant effect of an influence factor, further expensive investigations or even changes in the production process may be initiated (error of the first kind; type I error), or, because an actually significant effect is not identified, the chance to make possible process improvements is missed (error of the second kind; type II error).

The minimum sample size n which is required in order to identify a real mean value difference depends upon the distance μ2 − μ1 = D·σ of the mean values, given in units of the standard deviation in correspondence with the above plausibility consideration, and upon the error probabilities α and β:

n = 2 · (u_α + u_β)² / D²

In the concrete case of comparing two series of measurements, the mean values μ1 and μ2 as well as the standard deviation σ of the populations (and consequently also D) are not known. They are estimated by the empirical values ȳ1, ȳ2 and s. For this reason, when calculating n according to the given formula, the t distribution must be taken as a basis. Accordingly, u_α and u_β are the abscissa values at which the distribution assumes the probabilities α (two-sided) and β (one-sided).

Smaller error probabilities, i.e. smaller type I (α) and type II (β) error risks, mean that the two distributions to be compared, and thus also the distributions of the mean values, may only marginally overlap. For this, with a given mean value distance D, the sample size n must be chosen adequately large.
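For a rough evaluation of the formula n = 2·(u_α + u_β)²/D², the u quantiles of the standard normal distribution can be used directly (a Python sketch; for small n, the exact t-based tables in [7] give somewhat larger values; the error risks α = 2% and β = 5% are the ones quoted in the text):

```python
from math import ceil
from statistics import NormalDist

def minimum_sample_size(D, alpha=0.02, beta=0.05):
    """Rough minimum size n per measurement series for detecting a mean
    difference of D standard deviations: n = 2·(u_alpha + u_beta)² / D²
    (normal approximation of the exact t-based derivation)."""
    u_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided
    u_beta = NormalDist().inv_cdf(1 - beta)         # one-sided
    return ceil(2 * (u_alpha + u_beta) ** 2 / D ** 2)
```

A stronger effect (larger D) requires far fewer experiments than a weaker one, since n grows with 1/D².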


The representation on the following page gives the minimum sample size n for the cases in which a weaker, medium or stronger effect is to be identified. The given figures are based on error risks of α = 2% and β = 5% (see Table E1 in [7]).

If one intends to conduct an experiment using a complete orthogonal array for investigating influence parameters on 2 levels each, then the number of experiments per experimental row should be chosen large enough that the product

(number of rows · number of runs per row) / 2

substantially exceeds the given minimum sample size.


(Figure: minimum sample sizes n for a stronger, a medium and a weaker effect)


5. F Test
The F test is a statistical method with which it can be decided whether the variances of two samples are significantly different. How the test works can be explained, just as in the case of the t test, using the result of a computer simulation.

We take two samples of sizes n1 and n2 from a normally distributed population N(μ, σ), calculate the sample variances s1² and s2², and from these finally calculate the quantity

F = s1² / s2².

F can take values between 0 and +∞. It is plausible that, by frequent repetition of this procedure, values near zero as well as very large values result only rarely.

The results of a computer simulation, in which the F values for N = 3,000 sample pairs were determined with sample sizes n1 = n2 = n = 9, are represented as a histogram in the following figure.

Figure 5.1


If one lets the number of samples approach infinity and, at the same time, the class width approach zero, the histogram approximates the curve in Fig. 5.1 (density function of the F distribution). The shape of the histogram depends upon the sample sizes n1 and n2 of the investigated sample pairs; the shape of the density function of the F distribution correspondingly depends upon the degrees of freedom f1 = n1 − 1 and f2 = n2 − 1.

The upper limit of the 99% random variation range (percentage point) in the calculated example is F(8; 8; 0.99) = 6.03, i.e. only in 1% of all cases (error probability) can s1² ≥ 6.03 · s2² occur randomly.

The percentage points of the F distribution are tabled in the appendix for different error probabilities depending upon the degrees of freedom f1 and f2.

The relationship represented above makes the F test procedure understandable. It is to be decided whether or not two series of measurements, with sizes n1 and n2, originate from two normally distributed populations with the same variance (the mean values do not need to be known). As the null hypothesis, it is assumed that the variances of the respective populations are equal: σ1² = σ2². Finally, the test statistic

F = s1² / s2²

is calculated from the variances s1² and s2² of the two measurement series and compared with the percentage point of the F distribution. If the result is F > F(n1 − 1; n2 − 1; 0.99), i.e. F lies outside the 99% random variation range, then the null hypothesis is rejected.

Remark: The alternative hypothesis is σ1² > σ2²; a one-sided question is involved. If, in principle, one writes the greater of the two variances s1² and s2² above the fraction line, then F can only assume values greater than 1; now a two-sided question exists. If an error probability of α = 1% is chosen, the 99.5% percentage point must then be used.
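As an illustration, the test statistic, with the greater variance above the fraction line as described in the remark, can be computed as follows (a Python sketch with hypothetical data; the function name is our own):

```python
from statistics import variance

def f_statistic(sample1, sample2):
    """F = s1²/s2² with the greater variance in the numerator, so that
    F >= 1 (two-sided question, see the remark above)."""
    v1, v2 = variance(sample1), variance(sample2)
    return max(v1, v2) / min(v1, v2)

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]                      # s² = 7.5
b = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]    # s² = 1.875
F = f_statistic(a, b)
# F = 4.0; compare with the tabled percentage point for f1 = f2 = 8.
```

Note that the result does not depend on the order of the two samples, since the greater variance is always placed in the numerator.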


6. Analysis of Variance (ANOVA)


With the help of the t test (Section 4.1), a determination is made whether the mean values of two series of measurements are significantly different. The series of measurements to be compared can be considered formally as experimental results for the two levels 1 (e.g. material A) and 2 (material B) of an individual influence factor (material). If one expands the one-factor-at-a-time experiment to more than two levels (in general: k levels), then it is no longer possible to compare the mean values using the t test. In this case, an evaluation can be performed by means of the analysis of variance.

If the factor A has no influence upon the measurement results, then all individual results yij can be seen as originating from the same population. The yij, and thus also the mean values ȳi, are then subject only to random deviations (experimental noise) from the common mean value μ. In the other case, when the factor A has a significant influence upon the result of measurement, the mean values μ1, …, μk of the distributions belonging to the levels A1, …, Ak of the factor A will be different.

In the scope of the analysis of variance, one presupposes k independent, normally distributed populations with the same variance and formulates the null hypothesis: all measured values originate from populations with the same mean value, μ1 = μ2 = … = μk = μ. (Remark: since identical variances were a prerequisite, the null hypothesis means that all measured values originate from one and the same population.)

Therefore one calculates the mean variance within the experimental rows (levels of A)

s̄y² = (1/k) · Σ_{i=1..k} s_yi²

as well as the variance between the experimental rows (levels of A), sȳ². Here s̄y² is a measure for the experimental noise, and sȳ² is the variance of the mean values ȳi.

If the null hypothesis is correct, both quantities are estimates of the variance σ² of the underlying population:

σ̂1² = n · sȳ²   and   σ̂2² = s̄y².

The factor n is to be considered because of the relationship σȳ = σy / √n.


Finally, one conducts an F test with the test statistic

F = n · sȳ² / s̄y²

(comparison of the two estimates), and rejects the above formulated null hypothesis if

F > F(k−1; (n−1)·k; 0.99)

(percentage points of the F distribution in the appendix).

Rejection of the null hypothesis means: a significant difference exists with regard to the mean values ȳi of the results of measurement for the levels of factor A; in other words, factor A has a significant influence upon the result of measurement.
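As an illustration, the one-way analysis of variance described above can be written in a few lines of Python (a sketch; the function name and the example data are our own):

```python
from statistics import mean, variance

def one_way_anova_F(rows):
    """F = n · sȳ² / s̄y² for k experimental rows of n results each."""
    n = len(rows[0])
    row_means = [mean(r) for r in rows]
    noise = mean(variance(r) for r in rows)   # s̄y², experimental noise
    between = variance(row_means)             # sȳ², variance of the means
    return n * between / noise

# k = 3 levels of a factor A, n = 4 results per level (hypothetical data):
rows = [[1, 2, 3, 4],
        [2, 3, 4, 5],
        [3, 4, 5, 6]]
F = one_way_anova_F(rows)
# compare with F(k−1; (n−1)·k; 0.99), here F(2; 9; 0.99)
```

If the computed F exceeds the tabled percentage point, factor A has a significant influence on the result of measurement.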

Figure 6.1

Figure 6.2


Figures 6.1 and 6.2 illustrate the importance of this fact. Along the diagonals, the density functions of normal distributions with equal variance are represented. In the corners of the figures, the density functions of the mixture of the distributions (top left) and of the distribution of the mean values (bottom right) are shown. The distributions in Figure 6.1 are subject only to small mean-value fluctuations; the mixture of the distributions is nearly normally distributed. The variances of the distribution of the mean values and of the original distributions hardly differ, so that an F test does not reject the null hypothesis (identical mean values). In comparison, the mean values of the seven distributions in Figure 6.2 show greater fluctuations; the variance of the mean-value distribution is substantially (significantly) greater than that of the single distributions. Accordingly, the null hypothesis, i.e. the assumption of identical mean values, will in this case be rejected within the scope of an analysis of variance.

6.1 Deriving the Test Statistic


The term analysis of variance is based on the decomposition of the variation of all measured values into two parts, random variation (experimental noise) and systematic deviation of the mean values, in accordance with the formalism represented above. This decomposition is described as follows. If k denotes the number of rows and n the number of measured values (experiments) per row, then the overall variance of all n·k measured values is given by

s² = (1/(n·k − 1)) · Σ_{j=1..k} Σ_{i=1..n} (yij − ȳ)².

The quantity Q = (n·k − 1)·s² is called the sum of squares (SS).


Q = Σ_{j=1..k} Σ_{i=1..n} (yij − ȳ)²

Q = Σ_{j=1..k} Σ_{i=1..n} (yij − ȳj + ȳj − ȳ)²    (expansion with zero)

Q = Σ_{j=1..k} Σ_{i=1..n} [ (yij − ȳj)² + 2·(yij − ȳj)·(ȳj − ȳ) + (ȳj − ȳ)² ]

If we first consider the middle term:

Σ_{j=1..k} Σ_{i=1..n} (yij − ȳj)·(ȳj − ȳ) = Σ_{j=1..k} (ȳj − ȳ) · Σ_{i=1..n} (yij − ȳj)

= Σ_{j=1..k} (ȳj − ȳ) · (n·ȳj − n·ȳj) = 0.

Therefore:

Q = Σ_{j=1..k} Σ_{i=1..n} (yij − ȳj)² + n · Σ_{j=1..k} (ȳj − ȳ)²

Q = Σ_{j=1..k} (n − 1)·sj² + n·(k − 1)·sȳ²

(n·k − 1)·s² = k·(n − 1)·s̄y² + n·(k − 1)·sȳ²

Q = Q1 + Q2

Overall variation = experimental noise + variation of mean values

Degrees of freedom of Q1:  f1 = k·(n − 1)
Degrees of freedom of Q2:  f2 = k − 1
Degrees of freedom of Q:   f = n·k − 1

Balance of the numbers of degrees of freedom: f = f2 + f1, since

n·k − 1 = (k − 1) + k·(n − 1).

Test statistic:

F = (Q2 / f2) / (Q1 / f1) = (n·(k − 1)·sȳ² / (k − 1)) / (k·(n − 1)·s̄y² / (k·(n − 1))) = n · sȳ² / s̄y²
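The decomposition Q = Q1 + Q2 and the balance of the degrees of freedom can be verified numerically, for example in Python (a sketch with hypothetical data):

```python
from statistics import mean

def sums_of_squares(rows):
    """Decompose Q = ΣΣ (y_ij − ȳ)² into Q1 (within the rows, noise)
    and Q2 (between the row means)."""
    n = len(rows[0])
    grand = mean(v for r in rows for v in r)
    row_means = [mean(r) for r in rows]
    Q = sum((v - grand) ** 2 for r in rows for v in r)
    Q1 = sum((v - m) ** 2 for r, m in zip(rows, row_means) for v in r)
    Q2 = n * sum((m - grand) ** 2 for m in row_means)
    return Q, Q1, Q2

rows = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]   # k = 3, n = 4
Q, Q1, Q2 = sums_of_squares(rows)
# Q = Q1 + Q2 and n·k − 1 = k·(n − 1) + (k − 1)
```

For these data, Q = 23, Q1 = 15 and Q2 = 8, so the decomposition holds exactly.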


6.2 Equality Test of Several Variances (According to Levene)


With the one-way analysis of variance, it is investigated whether a factor A has a significant influence upon the result of measurement. Thus a determination is made whether the mean values μ1, …, μk of the measurement results which belong to the levels A1, …, Ak are significantly different.


Frequently, the aim of the experiments in this case is to maximize or minimize a target quantity. In connection with the investigation of disturbance-insensitive (robust) designs, it can also be of interest to find parameter settings at which the experimental results exhibit as little variation (variance) as possible. For this reason, it is sensible to check initially whether the variances of the results in the individual experimental rows are significantly different.

Experiment No.   Results                    Mean   Variance
1                x11, x12, …, x1n           x̄1     s1²
2                x21, x22, …, x2n           x̄2     s2²
⋮                ⋮                          ⋮      ⋮
k                xk1, xk2, …, xkn           x̄k     sk²

Deviating from our notation to date, we designate the determined measured values with x and calculate the row mean values x̄i as well as the variances si² within the rows. To test the equality of these variances si², Levene proposes the following method:

0. Formulate the null hypothesis: all results of measurement originate from populations with equal variance, σ1² = σ2² = … = σk².

1. Calculate the absolute deviations of the results of measurement xij from the mean values x̄i. This corresponds to a transformation according to the equation

yij = |xij − x̄i|.

The transformed values yij are entered in the evaluation scheme. Further calculation is done exclusively with the transformed values yij.


Experiment No.   Results                    Mean   Variance
1                y11, y12, …, y1n           ȳ1     s1²
2                y21, y22, …, y2n           ȳ2     s2²
⋮                ⋮                          ⋮      ⋮
k                yk1, yk2, …, ykn           ȳk     sk²

2. Calculate the mean values ȳi and the variances si² of the transformed values.

3. Calculate the mean of the variances within the rows, s̄y².

4. Calculate the variance sȳ² of the mean values ȳi.

5. Conduct an F test with the test statistic

F = n · sȳ² / s̄y²

with the degrees of freedom f1 = k − 1 and f2 = (n − 1)·k.

If F, for example, is greater than the percentage point F(k−1; (n−1)·k; 0.99), then the null hypothesis is rejected with an error probability < 1%, i.e. the variances si² of the original values xij differ significantly.

Remark: After transforming the measured values, the method to be followed is identical to the oneway analysis of variance described in the preceding section.
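The procedure can be sketched in a few lines of Python (function names and example data are our own; after the transformation, the F statistic is the same one used in the one-way analysis of variance):

```python
from statistics import mean, variance

def levene_transform(rows):
    """Step 1: replace each result by its absolute deviation from the
    row mean, y_ij = |x_ij − x̄_i|."""
    transformed = []
    for row in rows:
        m = mean(row)
        transformed.append([abs(x - m) for x in row])
    return transformed

def levene_F(rows):
    """Steps 2 to 5: one-way ANOVA F statistic on the transformed values."""
    y = levene_transform(rows)
    n = len(y[0])
    row_means = [mean(r) for r in y]
    noise = mean(variance(r) for r in y)   # mean of the row variances
    between = variance(row_means)          # variance of the row means
    return n * between / noise             # compare with F(k-1; (n-1)k; 0.99)

# two rows with clearly different spread (hypothetical data):
F = levene_F([[1, 2, 3, 4], [10, 20, 30, 40]])
```

A large F indicates that the original rows differ in variance, not merely in mean value.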


7. Design of Experiments with Orthogonal Arrays and Evaluating such Experiments

In this section, two simple examples will be used to represent how orthogonal arrays are applied:

Example 1: One-factor-at-a-time method


The change in length of an alloy should be determined through experiment. Two experiments are performed. 1st experiment: length at T1 = 25 °C. 2nd experiment: length at T2 = 100 °C.

Figure 7.0.1

L1(25 °C) = 100.04 cm    L2(100 °C) = 100.16 cm


One starts from the assumption that a linear relationship exists between expansion and temperature and therefore wants to calculate the equation of the straight line in order to determine arbitrary intermediate values. Equation of the straight line: L = A0 + A1·T. Through a coordinate transformation, as schematically represented in Figure 7.0.1 by the second x-axis, the pair of values (T1, T2) is formally transformed into (−1, +1).


The transformation equation is

x = (T − (T2 + T1)/2) / ((T2 − T1)/2).

Remark: This equation can be written in the form given in 7.1 through a simple transformation:

x = 2·(T − T2)/(T2 − T1) + 1.

Substituting the values T1 = 25 °C and T2 = 100 °C gives: x = (T − 62.5)/37.5.

For T = T2 follows: x = +1. For T = T1 follows: x = −1.
In the transformed coordinate system, the equation of the straight line is L = a0 + a1·x. From this follows

for x = +1:  100.16 = a0 + a1,
for x = −1:  100.04 = a0 − a1.

At this point the reason for the coordinate transformation becomes clear: the coefficients a0 and a1 are easy to calculate by addition or subtraction of the two equations:

a0 = (100.16 + 100.04)/2 = 100.1
a1 = (100.16 − 100.04)/2 = 0.06

The coefficient a0 is the mean value of both lengths: a0 = (L2 + L1)/2.

The coefficient a1 is the half effect (see Figure 7.0.1): a1 = (L2 − L1)/2.

Thus, in the transformed system, the equation of the straight line is

L = (L2 + L1)/2 + ((L2 − L1)/2)·x,  i.e.  L = 100.1 + 0.06·x.

The equation of the straight line in the original system is found by reverse transformation:

L = 100.1 + 0.06·(T − 62.5)/37.5,  i.e.  L = 100 + 0.0016·T.
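The calculation of this example can be summarized in a few lines (a Python sketch; the function name is our own):

```python
def fit_line_two_points(T1, T2, L1, L2):
    """Coefficients of L = a0 + a1·x in the transformed system, where
    x = (T − (T2+T1)/2) / ((T2−T1)/2), and the reverse-transformed
    straight line L = b0 + b1·T in the original system."""
    a0 = (L2 + L1) / 2           # mean value of both lengths
    a1 = (L2 - L1) / 2           # half effect
    b1 = a1 / ((T2 - T1) / 2)    # slope in the original system
    b0 = a0 - b1 * (T2 + T1) / 2
    return a0, a1, b0, b1

a0, a1, b0, b1 = fit_line_two_points(25, 100, 100.04, 100.16)
# a0 ≈ 100.1, a1 ≈ 0.06, and L ≈ 100 + 0.0016·T in the original system
```

The same two steps, averaging for a0 and halving the difference for a1, reappear in the evaluation of orthogonal arrays below.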


Example 2: Two-Factor Design


This example is intended to clarify the mathematical procedure followed when evaluating experiments with orthogonal arrays, using a known and analytically exact physical fact: Ohm's law. We put ourselves in the position of an experimenter who does not know the relationship between voltage, current and resistance and wants to investigate it with the help of a simple experiment. We assume that he has conducted four individual experiments according to Figure 7.0.2, ignoring experimental repetitions and measurement errors.

Figure 7.0.2

R1 = 20 Ω, R2 = 60 Ω;  I1 = 4 A, I2 = 12 A

Searched: U = f(R, I)

Transformation:

x1 = (R − (R2 + R1)/2) / ((R2 − R1)/2) = (R − 40)/20

x2 = (I − (I2 + I1)/2) / ((I2 − I1)/2) = (I − 8)/4


Multilinear formulation of solution:

U = a0 + a1·x1 + a2·x2 + a12·x1·x2

1. x1 = −1, x2 = −1:  a0 − a1 − a2 + a12 = 80
2. x1 = +1, x2 = −1:  a0 + a1 − a2 − a12 = 240
3. x1 = −1, x2 = +1:  a0 − a1 + a2 − a12 = 240
4. x1 = +1, x2 = +1:  a0 + a1 + a2 + a12 = 720

On the right side are the voltages U determined in the individual experiment combinations.

a0 = (80 + 240 + 240 + 720)/4 = 320
a1 = (−80 + 240 − 240 + 720)/4 = 160
a2 = (−80 − 240 + 240 + 720)/4 = 160
a12 = (80 − 240 − 240 + 720)/4 = 80

Substituted into the formulated solution, one gets: U = 320 + 160·x1 + 160·x2 + 80·x1·x2.

Reverse transformation:

U = 320 + 160·(R − 40)/20 + 160·(I − 8)/4 + 80·((R − 40)/20)·((I − 8)/4)

U = R · I

Remark: In this example, the right solution (Ohm's law) was bound to come out, because the multilinear form U = a0 + a1·x1 + a2·x2 + a12·x1·x2 was exactly the right formulation. A more complex functional relationship with quotients or exponential functions of the influence factors would be described by this formulation only approximately, or not at all (see 7.3).
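The evaluation scheme of this example, the sign-weighted averaging of the four results, can be sketched as follows (Python; names are our own):

```python
# Sign pattern of the 2² orthogonal array: columns I, x1, x2, x1·x2
ROWS = [(+1, -1, -1, +1),
        (+1, +1, -1, -1),
        (+1, -1, +1, -1),
        (+1, +1, +1, +1)]

def coefficients(y):
    """a0, a1, a2, a12 from the four results y1..y4: each coefficient is
    the sign-weighted mean of the results over the corresponding column."""
    return [sum(signs[c] * yi for signs, yi in zip(ROWS, y)) / 4
            for c in range(4)]

a0, a1, a2, a12 = coefficients([80, 240, 240, 720])
# a0 = 320, a1 = 160, a2 = 160, a12 = 80
```

Substituting these coefficients back into the multilinear form reproduces the four measured voltages exactly, as it must in this noise-free example.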


Generalization: For two factors and two levels, the equation of the multilinear form in the transformed system is:

y = a 0 + a 1 x 1 + a 2 x 2 + a12 x 1 x 2 .
The coefficients can easily be determined with the following matrix. One designates this matrix as an orthogonal arrangement or an orthogonal array (see 7.4). The term orthogonality in this connection means, simply said, that in each column both levels (−) and (+) appear equally frequently (see also the general scheme in 7.4.1). Orthogonality is explained in [1] through mathematical orthogonality conditions.

No.   I   x1   x2   x1·x2   y
1     +   −    −    +       y1
2     +   +    −    −       y2
3     +   −    +    −       y3
4     +   +    +    +       y4

a0 = (y1 + y2 + y3 + y4)/4
a1 = [(y2 + y4) − (y1 + y3)]/4
a2 = [(y3 + y4) − (y1 + y2)]/4
a12 = [(y1 + y4) − (y2 + y3)]/4

The coefficient a0 is the mean value of all measurement results. The coefficient a1 is half the mean effect of changing x1 from −1 to +1. The interaction coefficient a12 is half the difference of the half effects of x1 at the two levels of x2:

a12 = [ Effect(x2 = +1)/2 − Effect(x2 = −1)/2 ] / 2
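The orthogonality mentioned above can also be stated numerically: the scalar product of any two distinct columns of the array is zero, and every column except I contains both signs equally often (a Python sketch; names are our own):

```python
COLUMNS = {  # signs per experiment row of the 2² orthogonal array
    "I":    [+1, +1, +1, +1],
    "x1":   [-1, +1, -1, +1],
    "x2":   [-1, -1, +1, +1],
    "x1x2": [+1, -1, -1, +1],
}

def dot(u, v):
    """Scalar product of two columns."""
    return sum(a * b for a, b in zip(u, v))

# A coefficient is the scalar product of its column with the results,
# divided by the number of rows, e.g. a1 for the Ohm's-law example:
a1 = dot(COLUMNS["x1"], [80, 240, 240, 720]) / 4
```

Because the columns are mutually orthogonal, each coefficient can be computed independently of the others by this simple averaging.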


7.1 Representing the Results of Measurement


In this section, we want to limit ourselves to 2^m designs, i.e. designs in which m influence quantities (factors) are varied on 2 levels each. The simplest, trivial case m = 1 corresponds to a one-factor-at-a-time experiment, in which the dependence of a target characteristic y on only one influence factor A is investigated. The functional relationship y = f(A) is easily represented through the graph of the function (Figure 7.1.1).

Figure 7.1.1

The change of the target characteristic y when changing from A− to A+ is called the effect of the factor A. It depends upon the selection of the settings A− and A+. By connecting the two points (A−, y1) and (A+, y2) one obtains a straight-line segment, which represents a simple approximation of the actual, unknown curve. These basic considerations can be transferred to full and fractional factorial designs with two or more factors. The possibilities for two-dimensional representation of the results of experiments, however, are limited. Target characteristics which depend upon two or more factors are mostly represented in the technical literature in the form of contours or families of characteristic curves (performance curves, operating diagrams). This association is illustrated by the following figures.


Figure 7.1.2

Figure 7.1.3

The representation in Figure 7.1.3 shows the contours of a hill (see Figure 7.1.2), as found on topographic charts. In the example shown, a jump from one line to the neighbouring line corresponds to a height difference of 10 m. Closely neighbouring contours represent a steep ascent in the direction perpendicular to the contours. If one remains on a closed contour, then one moves, pictorially speaking, at constant height around the hill. If we move away from the picture of a hill, we can consider, instead of the height, generally a function y which depends upon the parameters A and B: y = f(A, B).


y is a target characteristic whose value is determined by the settings of the factors A and B. Each setting (A, B) corresponds to a point in the A-B plane, and this in turn to a value y = f(A, B). One finds, for instance, the following results:

A     B     y
6     12    43
12    12    62
6     20    78
12    20    113

The four points (A, B) form a rectangle in Figure 7.1.3. They are drawn in Figure 7.1.2 over the A-B plane with y as the third coordinate, which corresponds to the height above this plane. From this representation it is just as apparent as in Figure 7.1.1 that factorial designs at two levels are based on a linear model (straight line, plane) in order to approximate the unknown, generally curved response surface. Figure 7.1.4 shows a further way to represent these results: the target characteristic y is plotted as a function of A with B as a fixed parameter.

Figure 7.1.4

In Figure 7.1.3, an attempt is made to illustrate the three-dimensional surface y = f(A, B), which corresponds to the hill surface, two-dimensionally in dependence upon both factors A and B. The dotted curves in Figure 7.1.4, on the contrary, represent the function y for fixed B: y = f(A, B = const.). They are, as such, the intersection lines of a vertical cut through the hill surface at constant B (see Figure 7.1.2).

Analogously, the dotted lines in Figure 7.1.5 represent the function y when A is held constant.

Figure 7.1.5

The following figures illustrate these facts with further examples.

Figure 7.1.6


Fig. 7.1.7

Fig. 7.1.8

Fig. 7.1.9


In principle, one can also use these methods of representation for the results of real experiments. The above scheme can be simplified by transforming the factor levels A1 = 6, A2 = 12, B1 = 12, B2 = 20 according to the following rule:

X* = 2 (X - X2) / (X2 - X1) + 1 .

Example:

A1* = 2 (A1 - A2) / (A2 - A1) + 1 = -1
A2* = 2 (A2 - A2) / (A2 - A1) + 1 = +1
B1* = 2 (B1 - B2) / (B2 - B1) + 1 = -1
B2* = 2 (B2 - B2) / (B2 - B1) + 1 = +1

If one considers only the resulting signs, then after the coordinate transformation one obtains, instead of the above scheme, the following design matrix for the two-factor design at two levels:

No.   A   B   y
 1    -   -   y1
 2    +   -   y2
 3    -   +   y3
 4    +   +   y4

The second row corresponds accordingly to an experiment in which factor A is set to the upper level (+) and factor B to the lower level (-). Instead of the notation A1, A2 for the settings of factor A, one frequently uses A- and A+. The column y contains the results y1, ..., y4 of the four experiment rows. They can be represented in the following form.


Figure 7.1.10

Figure 7.1.11

This form of representation is also applicable when one (or several) of the investigated factors is not a quantitatively adjustable variable but a qualitative variable with fixed levels (e.g. material 1 - material 2). Naturally, an interpolation of intermediate values is not meaningful in this case. The results for three influence factors can be represented graphically by expanding Figure 7.1.10 into the form of a cube. Each corner point then corresponds to a combination of levels of the factors A, B and C. When dealing with more than three factors, only two- or three-dimensional projections of the n-dimensional experimental space can be represented.


The linear model which underlies the two-level designs is described mathematically by a multilinear form (two-dimensional example):

y = f(x1, x2) = a0 + a1 x1 + a2 x2 + a12 x1 x2 .

Here, x1 and x2 are the variables assigned to the factors A and B. For fixed x1 or fixed x2 (x1 = const. or x2 = const.), the function y defines a straight line:

y = f(x1, k) = a0 + a1 x1 + a2 k + a12 x1 k = (a0 + a2 k) + (a1 + a12 k) x1 = b + m x1 .
The expression a12 x1 x2 describes the interaction of the factors A and B. If a12 is different from zero, then the two factors do not behave purely additively, i.e. the change of the target characteristic y upon the transition from A- to A+ (the total effect of A) depends upon the setting of B. Thus, in the example of Figure 7.1.2 and Figure 7.1.3, y increases from 43 to 62 upon the transition from A- = 6 to A+ = 12 (total effect of A = 19) when B = 12, but from 78 to 113 (total effect of A = 35) when B = 20. The influence of an interaction can even lead to a negative effect of A (B) at one setting of B (A) instead of a positive one at another, as the following figures illustrate. It should be intuitively clear that neglecting an interaction can lead to completely false results or conclusions. Beyond that, the mean effect of a factor can be equal to zero as a result of an interaction (see Section 7.2). The examples in Figures 7.1.12 - 7.1.15 illustrate the importance of interactions; they show how the response surface of y is deformed when the interaction term is altered step by step.

Fig. 7.1.12


Fig. 7.1.13

Fig. 7.1.14

Fig. 7.1.15


These representations clearly show the principal appearance of a surface described by a multilinear form. The linearity with respect to both coordinates is obvious. In addition, it can be seen that the minimum or maximum of every such straight line lies on the boundary of the experimental space.
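The dependence of the total effect of A on the level of B can be checked directly from the four corner results; a minimal Python sketch (values taken from the table earlier in this section, variable names are ours):

```python
# Total effects of A at each level of B, read off the four corner results
# of Figures 7.1.2/7.1.3 (values from the table in Section 7.1).

corners = {(6, 12): 43, (12, 12): 62, (6, 20): 78, (12, 20): 113}

effect_A_at_B12 = corners[(12, 12)] - corners[(6, 12)]   # 62 - 43
effect_A_at_B20 = corners[(12, 20)] - corners[(6, 20)]   # 113 - 78

print(effect_A_at_B12, effect_A_at_B20)   # 19 35
# The two values differ, so the total effect of A depends on B:
# the factors interact (a12 != 0 in the multilinear form).
```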

7.2 Calculating the Effects


The effect of a factor is the change of the target characteristic y upon the transition from the - level to the + level, averaged over the settings of all other factors. Naturally, the effect depends upon the explicit choice of the levels. A graph of the effects for the example of the two-factor design is shown in Fig. 7.2.1. As long as the factors behave additively, the two lines are parallel (see Figure 7.1.11). If, on the contrary, the effect of a factor depends upon the setting (level) of another factor, then an interaction of these factors exists, since they do not behave additively. The evaluation matrix of the two-factor design contains, in addition to the columns for the factors A and B, a column AB for their interaction:

No.   A   B   AB   y
 1    -   -   +    y1
 2    +   -   -    y2
 3    -   +   -    y3
 4    +   +   +    y4

Fig. 7.2.1


Fig. 7.2.2

Fig. 7.2.3

The effect of factor X is calculated as the difference between the mean value of all y obtained when X is at the + level and the mean value of all y obtained when X is at the - level. This calculation rule applies analogously to interactions and may be used generally for orthogonal designs with m factors. For this example (m = 2) the following is valid:

Effect(A) = Σ y(A+) / 2^(m-1) - Σ y(A-) / 2^(m-1) = (y2 + y4)/2 - (y1 + y3)/2

Effect(B) = Σ y(B+) / 2^(m-1) - Σ y(B-) / 2^(m-1) = (y3 + y4)/2 - (y1 + y2)/2

Effect(AB) = Σ y(AB+) / 2^(m-1) - Σ y(AB-) / 2^(m-1) = (y1 + y4)/2 - (y2 + y3)/2

Here, the designation of the factor levels with + and -, as opposed to the frequently used notation 1 and 2, proves advantageous, since the signs of the yi on the right-hand side of these equations can be read directly from the columns A, B and AB of the evaluation matrix. Furthermore, the column AB of the evaluation matrix can be determined element-wise as the product of the columns A and B ((-1)·(-1) = +1). When dealing with fractional factorial designs, confounding of factors with interactions can occur; the effects of confounded quantities can then no longer be calculated separately.

Hint: The calculation of mean effects is given here only for the sake of completeness. Using Figures 7.1.6 - 7.1.9, one can easily see that if a strong interaction AB exists, the mean effects of the factors A and B can both become zero, although each factor exhibits large total effects.
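The column-wise rule above can be sketched in a few lines of Python; the corner results y1..y4 are taken from the example in Section 7.1, everything else follows the evaluation matrix:

```python
# Column-wise effect calculation for the 2^2 evaluation matrix of
# Section 7.2, applied to the corner results of the example in 7.1.

signs = {"A": [-1, +1, -1, +1],
         "B": [-1, -1, +1, +1]}
# Interaction column AB as the element-wise product of columns A and B:
signs["AB"] = [a * b for a, b in zip(signs["A"], signs["B"])]

y = [43, 62, 78, 113]        # y1 .. y4

def effect(column, results):
    """mean(y at + level) - mean(y at - level)."""
    plus = [r for s, r in zip(column, results) if s > 0]
    minus = [r for s, r in zip(column, results) if s < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)

results = {name: effect(col, y) for name, col in signs.items()}
print(results)   # {'A': 27.0, 'B': 43.0, 'AB': 8.0}
```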


7.3 Regression Analysis


From the factor effects, the coefficients of the multilinear form (regression polynomial) can be calculated by using the coordinate transformation which converts the setting values of the factors into the coded form (+ level, - level). The coefficients sought correspond to half of the effects. Consider, as an example, the function y = 3 + 4 x1 - 2 x2 + 5 x1 x2. The four experiments with the settings

A- = 5    A+ = 10
B- = 6    B+ = 12

would accordingly deliver the following results, if experimental noise is disregarded:

y1 = 3 + 4·5 - 2·6 + 5·5·6 = 161
y2 = 3 + 4·10 - 2·6 + 5·10·6 = 331
y3 = 3 + 4·5 - 2·12 + 5·5·12 = 299
y4 = 3 + 4·10 - 2·12 + 5·10·12 = 619 .


We now proceed as though the above initial polynomial were unknown and try to derive its coefficients from the experimental data (see 7.2):

Effect(A) = (y2 + y4)/2 - (y1 + y3)/2 = 245

Effect(B) = (y3 + y4)/2 - (y1 + y2)/2 = 213

Effect(AB) = (y1 + y4)/2 - (y2 + y3)/2 = 75

Constant term = (y1 + y2 + y3 + y4)/4 = 352.5

If one now substitutes half of the effects as coefficients into the polynomial (model)

y = a0 + a1 x1 + a2 x2 + a12 x1 x2

and considers the coordinate transformation (see Section 7.1)

X* = 2 (X - X2) / (X2 - X1) + 1 ,


then this results in

y = 352.5 + (245/2) · [2 (A - 10)/(10 - 5) + 1]
          + (213/2) · [2 (B - 12)/(12 - 6) + 1]
          + (75/2) · [2 (A - 10)/(10 - 5) + 1] · [2 (B - 12)/(12 - 6) + 1]

and after expanding this expression:

y = 3 + 4 A - 2 B + 5 AB .
It is therefore possible to calculate, from the results of the experiment, the coefficients of the regression polynomial which was chosen as the model for the experimental design, and hence to determine interpolated values within the experimental space. If one or several additional experiments are conducted in the center of the experimental space (e.g. of the rectangle in Figure 7.1.3; a design with center point), it is possible to obtain information about the adequacy of the underlying model, i.e. about the quality of the fit, by comparing the results for this point with the corresponding interpolated values. If large deviations occur between the results of the additional experiments and the values interpolated with the help of the regression polynomial, then the chosen model describes reality insufficiently, if not completely wrongly. Here the whole crux of DOE with orthogonal arrays shows itself: right results can only be attained with the right model.
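The whole derivation of this section can be reproduced in a short Python sketch: half-effects become coefficients, and the coded coordinate transformation recovers the original polynomial at every point (names are ours):

```python
# Recovering the regression polynomial of Section 7.3 from the four
# results: coefficients in coded units are half the effects (the constant
# term is the overall mean), and the coordinate transformation maps the
# physical settings onto -1/+1.

y1, y2, y3, y4 = 161, 331, 299, 619        # results from the example
A_lo, A_hi = 5, 10                         # A-, A+
B_lo, B_hi = 6, 12                         # B-, B+

a0 = (y1 + y2 + y3 + y4) / 4                      # 352.5
a1 = ((y2 + y4) / 2 - (y1 + y3) / 2) / 2          # Effect(A)/2 = 122.5
a2 = ((y3 + y4) / 2 - (y1 + y2) / 2) / 2          # Effect(B)/2 = 106.5
a12 = ((y1 + y4) / 2 - (y2 + y3) / 2) / 2         # Effect(AB)/2 = 37.5

def model(A, B):
    """Regression polynomial in physical units via the coded coordinates."""
    x1 = 2 * (A - A_hi) / (A_hi - A_lo) + 1
    x2 = 2 * (B - B_hi) / (B_hi - B_lo) + 1
    return a0 + a1 * x1 + a2 * x2 + a12 * x1 * x2

print(model(5, 6), model(10, 12))   # 161.0 619.0 (the corner results)
print(model(0, 0))                  # 3.0, the constant of y = 3 + 4A - 2B + 5AB
```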

7.4 Factorial Designs


7.4.1 Design Matrix

In Section 7.1, the creation of a simple scheme for a 2^2-design was shown by means of a coordinate transformation:

No.   A   B
 1    -   -
 2    +   -
 3    -   +
 4    +   +

Strictly speaking, one can interpret the first two rows of the design as a one-factor-at-a-time experiment, in which factor A is set to the lower (-) or upper (+) level while factor B remains at the - level.


In rows 3 and 4, A is again set to the - and + levels, while B is held fixed at the + level. This scheme is the basis of a general construction rule for factorial designs, which is made clear by the following representation.

Rows 1-4 form the 2^2-design, rows 1-8 the 2^3-design, rows 1-16 the 2^4-design and rows 1-32 the 2^5-design.

Experiment   A   B   C   D   E
    1        -   -   -   -   -
    2        +   -   -   -   -
    3        -   +   -   -   -
    4        +   +   -   -   -
    5        -   -   +   -   -
    6        +   -   +   -   -
    7        -   +   +   -   -
    8        +   +   +   -   -
    9        -   -   -   +   -
   10        +   -   -   +   -
   11        -   +   -   +   -
   12        +   +   -   +   -
   13        -   -   +   +   -
   14        +   -   +   +   -
   15        -   +   +   +   -
   16        +   +   +   +   -
   17        -   -   -   -   +
   18        +   -   -   -   +
   19        -   +   -   -   +
   20        +   +   -   -   +
   21        -   -   +   -   +
   22        +   -   +   -   +
   23        -   +   +   -   +
   24        +   +   +   -   +
   25        -   -   -   +   +
   26        +   -   -   +   +
   27        -   +   -   +   +
   28        +   +   -   +   +
   29        -   -   +   +   +
   30        +   -   +   +   +
   31        -   +   +   +   +
   32        +   +   +   +   +

Scheme for illustrating the general rule for factorial designs (see [1] p. 53).


Should three factors A, B, C be varied on two levels each, the four experiments of the 2^2-design must each be conducted once with factor C at the - level and once at the + level, resulting in the 2^3-design with 8 rows. Analogously, one obtains full factorial designs for 4, 5 and generally m factors with 2^4, 2^5 and 2^m rows respectively. It becomes clear that the number of rows grows exponentially with the number of factors to be investigated, and repetitions of experiments are not yet considered here (see Section 8).
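The construction rule above (A alternates every row, B every 2 rows, C every 4, and so on) can be sketched in a few lines of Python:

```python
# Generating the design matrix of a full factorial 2^m design in the
# standard order of the scheme above; a minimal sketch using itertools.
from itertools import product

def full_factorial(m):
    """Rows of a 2^m design; each row is a tuple of -1/+1 levels (A, B, ...)."""
    # product varies the last position fastest, so reverse each combination
    # to make factor A the fastest-changing column.
    return [row[::-1] for row in product((-1, +1), repeat=m)]

for i, row in enumerate(full_factorial(3), start=1):
    print(i, row)
# 8 rows; row 1 = (-1, -1, -1), row 2 = (+1, -1, -1), ..., row 8 = (+1, +1, +1)
```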

7.4.2 Evaluation Matrix

Until now, only the design matrix of an experiment has been considered in this section. In Sections 7.1 to 7.3, however, the importance of interactions was explained, and it was shown for the example of the 2^2-design how these are handled within the scope of an evaluation matrix. An evaluation matrix contains, in addition to the columns of the factors (which are identical with the columns of the design matrix), columns for the interactions. While the evaluation matrix of the 2^2-design contains only the interaction AB, for a 2^3-design with the three factors A, B and C the three two-factor interactions AB, AC and BC as well as the three-factor interaction ABC must be considered:

No.   A   B   AB   C   AC   BC   ABC
 1    -   -   +    -   +    +    -
 2    +   -   -    -   -    +    +
 3    -   +   -    -   +    -    +
 4    +   +   +    -   -    -    -
 5    -   -   +    +   -    -    +
 6    +   -   -    +   +    -    -
 7    -   +   -    +   -    +    -
 8    +   +   +    +   +    +    +

Evaluation matrix of a three-factor design

It should be remembered that these interactions correspond to the coefficients of a multilinear form. Considering all interactions of the 2^3-design, a model of the form

y = a0 + a1 x1 + a2 x2 + a12 x1 x2 + a3 x3 + a13 x1 x3 + a23 x2 x3 + a123 x1 x2 x3

is chosen (see 7.1).


Remark: In this model, the designations x1, x2 and x3 are used instead of the names A, B and C for the three factors. Correspondingly, a12, for example, is the coefficient of the interaction AB.

The columns of the evaluation matrix assigned to the interactions can be calculated element-wise as products of the columns of the related factors ((-1)·(-1) = +1). For example, the column for the interaction AC results when one multiplies the columns of the factors A and C with each other.
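This element-wise product rule is easy to sketch in Python for the 2^3 evaluation matrix above:

```python
# Building the interaction columns of the 2^3 evaluation matrix as
# element-wise products of the factor columns (standard row order).
from itertools import product
from math import prod

rows = [r[::-1] for r in product((-1, +1), repeat=3)]   # 8 rows, A fastest
cols = {name: [r[i] for r in rows] for i, name in enumerate("ABC")}

for combo in ("AB", "AC", "BC", "ABC"):
    cols[combo] = [prod(vals) for vals in zip(*(cols[f] for f in combo))]

print(cols["ABC"])   # [-1, 1, 1, -1, 1, -1, -1, 1]
```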

7.4.3 Confounding

If all 8 experiments of a 2^3-design are conducted, the effects, and thus the model coefficients, can be calculated separately for all factors and interactions. Mathematically considered, the calculation of the coefficients means solving a system of 8 equations with 8 unknowns (see model, design and evaluation matrix):

y1 = a0 - a1 - a2 + a12 - a3 + a13 + a23 - a123
y2 = a0 + a1 - a2 - a12 - a3 - a13 + a23 + a123
y3 = a0 - a1 + a2 - a12 - a3 + a13 - a23 + a123
y4 = a0 + a1 + a2 + a12 - a3 - a13 - a23 - a123
y5 = a0 - a1 - a2 + a12 + a3 - a13 - a23 + a123
y6 = a0 + a1 - a2 - a12 + a3 + a13 - a23 - a123
y7 = a0 - a1 + a2 - a12 + a3 - a13 + a23 - a123
y8 = a0 + a1 + a2 + a12 + a3 + a13 + a23 + a123

The coefficients of this system of equations are easy to calculate owing to its simple structure. For example, the constant a0 can be determined by adding all rows and dividing the sum by 8 (mean of all results yi, see 7.3, Regression Analysis). Owing to the balanced nature of the system of equations (in front of every coefficient, a plus sign appears as frequently as a minus sign), all members on the right-hand side except a0 cancel each other out upon addition. In order to calculate a1, the rows 1, 3, 5 and 7 are each multiplied by -1 and then all 8 rows are added together. Again, apart from a1, all elements on the right-hand side cancel each other out. The calculation of all remaining coefficients is analogous. If one compares this procedure with the equations in Section 7.2, it becomes evident that the calculation of the coefficients of the system of equations and the calculation of the half-effects of the factors are identical processes. Because a plus sign appears in front of a0 in every row of the equation system, the evaluation matrix is often given a leading column with exclusively plus signs, designated I (for identity) or 0.


If fewer than 8 experiments are conducted, then it is clear that it is no longer possible to determine all coefficients separately: so-called confounding occurs. This is explained by means of the example of the 2^(3-1) fractional factorial design, with which three factors are investigated in only 4 experiments. Design matrix of the 2^(3-1) design (see [9]):

No.   A   B   C
 1    -   -   +
 2    +   -   -
 3    -   +   -
 4    +   +   +

We now consider how the interaction columns AB, AC and BC of the related evaluation matrix look. They can be calculated as element-wise products of the corresponding columns of the design matrix:

AB   AC   BC
 +    -    -
 -    -    +
 -    +    -
 +    +    +

If one compares these columns with the columns of the design matrix, it is evident that AB is identical with C, AC with B, and BC with A. Thus, the columns A and BC, B and AC, C and AB are not distinguishable in the evaluation matrix. One says that the factor A is confounded with the interaction BC, the factor B with the interaction AC, and the factor C with the interaction AB.

No.   A=BC   B=AC   C=AB
 1     -      -      +
 2     +      -      -
 3     -      +      -
 4     +      +      +

Evaluation matrix of the 2^(3-1) fractional factorial design


The occurrence of confounding becomes still clearer if one directly considers the incomplete system of equations corresponding to the 2^(3-1) design:

y1 = a0 - a1 - a2 + a12 + a3 - a13 - a23 + a123
y2 = a0 + a1 - a2 - a12 - a3 - a13 + a23 + a123
y3 = a0 - a1 + a2 - a12 - a3 + a13 - a23 + a123
y4 = a0 + a1 + a2 + a12 + a3 + a13 + a23 + a123 .

If, in this case, the first and third equations are multiplied by -1 and subsequently all four equations are added together, then all elements on the right-hand side except a1 and a23 cancel out. These are the coefficients assigned to the factor A and to the interaction BC; therefore A and BC are confounded. The remaining confoundings follow analogously.

Remark: Strictly considered, one should list extra columns in the evaluation matrix for the identity (column for the constant term a0) and for the three-factor interaction ABC. They are omitted here for simplicity.

It is therefore not possible, in the preceding example, to calculate the effect of factor A separately from the effect of the interaction BC. At this point a rather strange logic is commonly applied, which is found in most of the literature on the subject of DOE: the effect of factor A can be determined if one assumes that the interaction BC does not exist. This means that one must be certain that the factors B and C behave purely additively. But if this is certain, then it is already sufficient to investigate B and C with a one-factor-at-a-time experiment. In textbooks on DOE, it is often assumed that three-factor and higher interactions are improbable, and this assumption is exploited in order to formulate fractional factorial designs of the type 2^(m-1).
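The confounding of the half-fraction can be verified directly with a few lines of Python (the generator C = AB reproduces the design matrix above):

```python
# Confounding in the 2^(3-1) half-fraction: with the generator C = A*B,
# each interaction column coincides with one of the factor columns.

A = [-1, +1, -1, +1]
B = [-1, -1, +1, +1]
C = [a * b for a, b in zip(A, B)]        # generator: column C = AB

AB = [a * b for a, b in zip(A, B)]
AC = [a * c for a, c in zip(A, C)]
BC = [b * c for b, c in zip(B, C)]

print(AB == C, AC == B, BC == A)   # True True True
# A is confounded with BC, B with AC, C with AB: the pairs cannot be
# separated by any evaluation of the four results.
```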


We investigate the evaluation matrix of the 2 4 1 design as an example.

No.   A   B   AB=CD   C   AC=BD   BC=AD   D=ABC
 1    -   -    +      -    +       +       -
 2    +   -    -      -    -       +       +
 3    -   +    -      -    +       -       +
 4    +   +    +      -    -       -       -
 5    -   -    +      +    -       -       +
 6    +   -    -      +    +       -       -
 7    -   +    -      +    -       +       -
 8    +   +    +      +    +       +       +

Evaluation matrix of the 2^(4-1) fractional factorial design

Instead of the 2^4 = 16 experiments which would be necessary for investigating four factors on two levels each with the full factorial design, only 8 experiments are conducted here. If one determines the column of the interaction ABC, one sees that it corresponds with the column of factor D; factor D is therefore confounded with a three-factor interaction. When applying this design, it is assumed that the three-factor interaction ABC does not exist. If this assumption is false, a false effect results for D. In addition, two-factor interaction effects cannot be calculated separately. If, for instance, a high significance of the third column occurs during the column-wise evaluation (factorial analysis of variance), then it cannot be determined whether this is due to the interaction AB or to CD. Conversely, AB and CD can compensate each other (equal, counteracting effects); this is not recognisable from the evaluation. The reduction in the extent of experimentation is therefore a trade-off against the risk of a faulty result as well as a loss of information. This statement is especially valid for fractional factorial designs with a reduction of the experimental extent to less than half (Taguchi methods, see [10]). The rows 1-8 of the 2^(4-1)-design correspond to the rows 1, 10, 11, 4, 13, 6, 7, 16 of the complete 2^4-design (see [9], Appendix). An experiment based on the 2^(4-1)-design can therefore still be rescued, if necessary, by adding the missing (complementary) eight rows; the confounding of the two-factor interactions amongst themselves is thereby cancelled. When executing the supplementary experiment there is, however, no guarantee for the success of the overall experiment. These considerations regarding the supplementary experiment are in principle transferable to other designs of the type 2^(m-1).


7.4.4 Fractional Factorial Designs

We want to explain, using the 2^(5-2)-design as an example, the idea upon which fractional factorial designs are based. The evaluation matrix of this design shows that, owing to the stronger reduction (in comparison with the 2^(4-1)-design), the main effects (factors) are already confounded with two-factor interactions (columns 1, 2, 4 and 7).

No.   A=BE   B=AE   E=AB=CD   C=DE   AC=BD   BC=AD   D=CE
 1     -      -       +        -      +       +       -
 2     +      -       -        -      -       +       +
 3     -      +       -        -      +       -       +
 4     +      +       +        -      -       -       -
 5     -      -       +        +      -       -       +
 6     +      -       -        +      +       -       -
 7     -      +       -        +      -       +       -
 8     +      +       +        +      +       +       +

Evaluation matrix of a 2^(5-2) fractional factorial design

The design matrix of the 2^(5-2)-design is combinatorially identical with that of the 2^(4-1) (and the 2^(6-3), 2^(7-4)) designs (see [9]); only the confounding structure in the related evaluation matrix is different. In textbooks about design of experiments with orthogonal arrays, it is mostly recommended, in this connection, to assign the factors "skillfully" to the letters (columns) A-E. What is meant by that? Let us assume that an experiment for a coating process is planned, in which a metal layer is applied to a substrate by vapour deposition. Using the 2^(5-2)-design, the influence of the factors pre-treatment, pressure, temperature, deposition rate and type of metal on the target characteristic (e.g. strength of bonding of the coat) is to be investigated.


Suppose that system analysis has shown that no interaction of the (surface) temperature with the other factors is to be expected. According to textbook opinion, it would then be best to designate the temperature as factor E, because the interactions AE, BE, CE and DE then play no role and their confounding with the factors A-D has no negative consequences. If the temperature actually has no interaction with the remaining factors, this means, mathematically, that E behaves purely additively: changing the setting of E always causes the same effect, independent of the settings of the factors A-D. In this case it would be sufficient to investigate E in a one-factor-at-a-time experiment. If, on the other hand, the assumption of freedom from interaction of E was false, the experimental evaluation will evidently lead to false deductions. In the experimental evaluation, one tries to determine the significant factors with the help of the factorial analysis of variance and to ascertain a better level setting using the calculated means for the factor levels (see 8.4.2). Through this approach, one obtains a factor level combination from which one can expect an optimum result of the experiment. In the above example this could be, say, the combination A+, B+, C-, D-, E+. It is not contained in the 2^(5-2)-design (the design contains only 8 of the possible 2^5 = 32 combinations). Through a verification run with these settings, one must therefore check whether the combination is actually optimal. If it is not, then either the assumption of freedom from interaction of E or some other prerequisite upon which the experiment is based was false.


7.5 Designs for Three-Level Factors


In the preceding sections it was shown that factorial designs of the type 2^m always assume a linear model. Extreme values (maximum or minimum) thus always lie on the boundary of the experimental space. The correctness of the model can only be verified through additional experiments. If, for example, a local maximum lies within the experimental space (the square in Figure 7.1.10), then its existence can only be ascertained by conducting at least one experiment within this area. In general, deviations from linearity can only be recognized when at least three levels are selected for every factor. Owing to the exponentially increasing number of experiments, three levels represent the upper limit for practical applications, so that for the user the 3^2 and 3^3-designs may be of particular interest. It is advisable to designate the three factor levels in the transformed coordinate system with -1, 0, +1 (in the literature, designations such as 1, 2, 3 are also used). The 3^3-design then has the following form:

Experiment    A    B    C
     1       -1   -1   -1
     2       +1   -1   -1
     3       -1   +1   -1
     4       +1   +1   -1
     5       -1   -1   +1
     6       +1   -1   +1
     7       -1   +1   +1
     8       +1   +1   +1
     9       -1   -1    0
    10       +1   -1    0
    11       -1   +1    0
    12       +1   +1    0
    13       -1    0   -1
    14       +1    0   -1
    15       -1    0   +1
    16       +1    0   +1
    17        0   -1   -1
    18        0   +1   -1
    19        0   -1   +1
    20        0   +1   +1
    21       -1    0    0
    22       +1    0    0
    23        0   -1    0
    24        0   +1    0
    25        0    0   -1
    26        0    0   +1
    27        0    0    0

3^3-design for three factors with three levels each


The complete model in this case is:

y = a0 + a1 x1 + a2 x2 + a12 x1 x2 + a3 x3 + a13 x1 x3 + a23 x2 x3 + a123 x1 x2 x3
    + a11 x1^2 + a22 x2^2 + a33 x3^2 .

This is a second-degree regression polynomial. Contrary to the linear model (see 7.4.2), it contains quadratic terms, so that curved surfaces can also be represented. In the evaluation matrix of the 3^3-design, the fractions -2/3 and +1/3 also appear in addition to the integers -1, 0 and +1; the coefficients of the regression polynomial are thus not all calculated in the same manner (see [1] p. 209). For the evaluation of such a design, a computer program should be used in every case. A design for three-level factors will mainly be considered when an investigation can be limited to a few factors and an individual experiment can be conducted with comparatively little effort. In practice it is possible, for example, that in the environment of the center point (see Figure 7.5.1), which corresponds to a production state, one searches for better settings of the factors A, B and C. Often, one or several of the investigated factors are discrete influence quantities, e.g. materials or machines. The consideration of a regression polynomial and the calculation of theoretical intermediate values is completely senseless in this case. It is sufficient, then, to restrict oneself to a variance-analytical evaluation (Section 8) and to determine the factor settings with the best result according to the principle "pick the winner". The experimental points of the 3^3-design can be represented schematically by a cube in the three-dimensional experimental space, where, besides the cube corners of the 2^3-design, the midpoints of the faces and edges as well as the center point of the cube are investigated (Figure 7.5.1).
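The 27 experimental points of the 3^3-design can be generated in one line of Python; only the point set is reproduced here, not the row ordering of the table above:

```python
# Generating the 3^m design: all combinations of the coded levels
# -1, 0, +1 for m factors; a minimal sketch using itertools.
from itertools import product

def three_level_design(m):
    """All 3^m level combinations with coded levels -1, 0, +1 per factor."""
    return [row[::-1] for row in product((-1, 0, +1), repeat=m)]

design = three_level_design(3)
print(len(design))   # 27 experimental points, including the center (0, 0, 0)
```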

Figure 7.5.1: Schematic representation of 3 3 -design


7.6 Central Composite Designs


The SAV software (see 8.4.2) enables the calculation and evaluation of central composite designs for three factors. The structure of such a design is apparent from the following representation. In addition to the experiments at the corner points (factorial points), as prescribed by a complete 2^3-design, experiments are conducted at the so-called star points as well as at the center point of the star. Such a design is termed a central composite design when the center point of the star coincides with the center point of the cube, as shown in Figure 7.6.1.

Figure 7.6.1: Central composite design for three factors

Apparently, each of the three factors is varied on five levels. It would be conceivable to select the levels -2, -1, 0, +1, +2 for every factor. However, the orthogonality of the design would then be lost, i.e. the coefficients of the regression polynomial could no longer be determined uncorrelated and with the same variance (see [1] p. 228). For these properties to be retained, the factor levels -α, -1, 0, +1, +α must be selected, where α depends upon the number of experimental runs at the factorial points, at the star points and at the center point. The SAV program supports the design construction in that it requests these numbers as well as the levels (in physical units) belonging to the factorial points, and then calculates the star points (levels in physical units) (see [9]). The design matrix of a central composite design for three factors is depicted on the following page. It is apparent that the design is organized into factorial points, star points and the center point and exhibits a simple scheme within these groups, which enables expanding it for investigating four factors (see [1] p. 226). Obviously, the clarity is lost with more than three factors; a four-dimensional hyper-cube is beyond our imagination.


Experiment            A    B    C
Factorial points
     1               -1   -1   -1
     2               +1   -1   -1
     3               -1   +1   -1
     4               +1   +1   -1
     5               -1   -1   +1
     6               +1   -1   +1
     7               -1   +1   +1
     8               +1   +1   +1
Star points
     9               -α    0    0
    10               +α    0    0
    11                0   -α    0
    12                0   +α    0
    13                0    0   -α
    14                0    0   +α
Center point
    15                0    0    0

Central composite design for three factors

In practice, the choice of a central composite design is convenient when an initially conducted design of the type 2^3 is to be complemented with additional experiments: only seven additional experimental points (six star points and one center point) then need to be investigated. Understandably, the number of experimental points of central composite designs also increases very quickly with the number of factors (repetitions not considered):

Number of factors        2    3    4    5    6
Number of experiments    9   15   25   43   77

It nevertheless remains substantially below the number 5^m of experiments required for a complete five-level design for m factors.
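The counts in the table above follow directly from the construction (2^m factorial points + 2m star points + 1 center point), which can be sketched in Python. The value of α used here, α = (2^m)^(1/4), is the common rotatable choice and is only an illustrative assumption; the text determines α from the run counts instead:

```python
# Constructing a central composite design for m factors: 2^m factorial
# points, 2m star points at distance alpha on the axes, one center point.
from itertools import product

def central_composite(m, alpha):
    rows = [list(r[::-1]) for r in product((-1, +1), repeat=m)]  # factorial
    for i in range(m):                                           # star points
        for sign in (-1, +1):
            star = [0.0] * m
            star[i] = sign * alpha
            rows.append(star)
    rows.append([0.0] * m)                                       # center point
    return rows

for m in (2, 3, 4, 5, 6):
    print(m, len(central_composite(m, (2 ** m) ** 0.25)))
# 2 9 / 3 15 / 4 25 / 5 43 / 6 77, matching the table above
```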


7.7 Screening Designs According to Plackett and Burman


In the professional literature it is frequently recommended to employ the so-called screening designs according to Plackett and Burman in order to screen out the essential factors from a large number of possible influence factors. The number of experiments (number of rows) is, in contrast to factorial designs (N = 2^k), an integer multiple of four, thus N = 4, 8, 12, 16, 20, 24, ... Furthermore, these screening designs are constructed according to different rules from factorial designs (see the scheme in 7.4.1). Plackett-Burman designs are highly confounded arrays, which bring with them the problems already discussed in 7.4.3. We want to illustrate this by means of the design with twelve rows (Figure 7.7.1). For that, we consider once more the second example from Section 7 (Ohm's law), select the resistance R and the current I as the factors A and B, and conduct experiments with the following settings (see Figure 7.0.2):

R1 = 20 Ω    R2 = 60 Ω
I1 = 4 A     I2 = 12 A .

[Figure 7.7.1: Plackett-Burman design with 12 rows. The eleven columns A-K carry the +/- sign pattern of the design. The column y contains the resulting voltages in V: 80, 80, 80, 240, 240, 240, 240, 240, 240, 720, 720, 720. The bottom row gives the rounded effect of every column: 320 for A and B, and -53 or +53 for each of the columns C-K.]


If experimental noise and measurement errors are neglected, the values of the target characteristic y (voltage in volts) entered in the evaluation matrix (Figure 7.7.1) follow from the exact validity of Ohm's law. This design formally allows investigating a maximum of eleven factors. It is quite possible that an experimenter wrongly fills the columns C to K with conceivable influence quantities such as colour of the insulator, diameter of the connecting plug, number of conductor windings, ... (imaginary variables). In the last row of the scheme, the rounded effect is given for every factor (column) (see 7.2):

Effect = Σ y(+)/6 - Σ y(-)/6 .

From these results the experimenter must conclude that, besides A and B, all the factors C to K have a significant influence on the target characteristic y. Beyond this, an additive behaviour of A and B is simulated which does not actually exist. Generally, nothing in this false result changes fundamentally even with repeated experimentation and variance-analytical evaluation. The example shows that the interaction effect AB distributes itself uniformly over all columns except the first two. This is a basic characteristic of screening designs: two-factor interaction effects distribute themselves among all main-effect columns with the exception of those from which they result (see [1] p. 120). Through this confounding structure, illusory effects can be simulated, or actually present main effects can be compensated. In [1] (p. 119) it is expressly emphasized that, owing to this confounding structure, Plackett-Burman designs are only successfully applicable when interactions do not exist. Otherwise completely false conclusions can be drawn.
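The illusory effects can be reproduced in a short Python sketch. The design is built from the standard 12-run Plackett-Burman generator row (an assumption of ours; the row order, and hence the sign pattern of the illusory effects, may differ from the printed Figure 7.7.1):

```python
# Reproducing the Plackett-Burman illustration for Ohm's law (y = R * I):
# the interaction of the two real factors spreads illusory effects of
# about +/-53 onto all nine unused columns.

GEN = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]   # PB12 generator row

rows = [[GEN[(j + i) % 11] for j in range(11)] for i in range(11)]
rows.append([-1] * 11)                               # final all-minus row

R = {-1: 20, +1: 60}                                 # Ohm,    factor A = column 0
I = {-1: 4, +1: 12}                                  # Ampere, factor B = column 1
y = [R[row[0]] * I[row[1]] for row in rows]          # voltage in Volt

def effect(col):
    """Effect of a column: mean(y at +) - mean(y at -), six runs each."""
    plus = sum(yi for row, yi in zip(rows, y) if row[col] > 0)
    minus = sum(yi for row, yi in zip(rows, y) if row[col] < 0)
    return (plus - minus) / 6

effects = [round(effect(c), 1) for c in range(11)]
print(effects)
# Columns A and B show their true effects (320); every unused column
# shows an illusory effect of about +/-53 fed by the AB interaction.
```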


8. Statistical Evaluation Procedures for Factorial Designs


Generally, due to disturbance influences, the measurement results of an individual experiment vary more or less strongly when the experiment is repeatedly carried out. It is thus recommended that every experiment, represented by an individual row, be executed several times. During evaluation, the mean ȳ_i and the variance s_i² of the individual results are calculated row by row. The result of the experiment can then be judged with respect to experimental variation within the scope of an analysis of variance. The evaluation matrix of the design (in the example of a 2²-design) is complemented with the corresponding columns on the right, as a matter of convenience:

Experiment No.   A   B   AB   Results y_ij              Mean ȳ_i   Variance s_i²
1                -   -   +    y_11, y_12, ..., y_1n     ȳ_1        s_1²
2                +   -   -    y_21, y_22, ..., y_2n     ȳ_2        s_2²
3                -   +   -    y_31, y_32, ..., y_3n     ȳ_3        s_3²
4                +   +   +    y_41, y_42, ..., y_4n     ȳ_4        s_4²

The mean s̄² of the individual variances s_i² represents a measure of the experimental variation.

8.1 One-Way Analysis of Variance


Initially, one checks whether the variation of the row means is significantly larger than the experimental variation. This takes place in three steps.

1. Calculation of the mean variance

   s̄² = (1/k) · Σ (i = 1 .. k) s_i² .

   This quantity is a measure of the experimental variation.

2. Calculation of the variance s_ȳ² of the mean values ȳ_i:

   s_ȳ² = 1/(k - 1) · Σ (i = 1 .. k) (ȳ_i - ȳ)²   with   ȳ = (1/k) · Σ (i = 1 .. k) ȳ_i   (total mean).

3. F test with the test statistic

   F = n · s_ȳ² / s̄² .

If F is greater than the percentage point F(95%) or F(99%) of the F distribution with f1 = k - 1 and f2 = (n - 1) · k degrees of freedom, then a significant difference exists between the results ȳ_i (with an error probability < 5% or < 1%; n = number of repetitions per row, k = number of rows).
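For illustration, the three steps can be carried out directly on the press-out-force data of the example in Section 8.4.1 (a sketch in Python, added here; k = 8 rows, n = 3 repetitions):

```python
import numpy as np

# Press-out forces from the example in Section 8.4.1 (k = 8 rows, n = 3).
y = np.array([[135, 140, 130], [160, 170, 165], [145, 150, 160],
              [180, 185, 180], [260, 265, 230], [215, 235, 275],
              [225, 270, 230], [295, 260, 285]], dtype=float)
k, n = y.shape

s2_exp = y.var(axis=1, ddof=1).mean()   # step 1: mean of the row variances
s2_bar = y.mean(axis=1).var(ddof=1)     # step 2: variance of the row means
F = n * s2_bar / s2_exp                 # step 3: test statistic

print(round(F, 1))  # 29.8 -- far above F(f1 = 7, f2 = 16; 99%) = 4.03
```

The row means thus differ significantly, in agreement with the program output discussed in Section 8.4.2.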


8.2 Factorial Analysis of Variance


With the factorial analysis of variance it can be decided, for every factor X, whether it has a significant influence on the result of the experiment. Procedure:

1. Calculation of the mean variance s̄². This calculation is identical with step 1 in 8.1 (one-way analysis of variance).

2. Calculation of the variance s_x² of the means of the measured values per level of factor X.

3. Significance test with the test statistic

   F = s_x² · (number of measured values per level) / s̄² .

If the test statistic F lies above the percentage point F(95%) or F(99%) of the F distribution with f1 = (number of levels) - 1 and f2 = (n - 1) · k degrees of freedom, then factor X has a significant influence on the result of the experiment (with an error probability < 5% or < 1%; n = number of repetitions per row, k = number of rows).

Hint: The SAV program (see 8.4) calculates the significance level for the one-way and the factorial analysis of variance corresponding to the value of the test statistic.
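Applied to factor C of the example in Section 8.4.1, the procedure looks as follows (a sketch in Python, added here; the data are taken from the form sheet there, 12 measured values per level):

```python
import numpy as np

# Press-out forces (2^3 design); factor C is "-" in rows 1-4, "+" in rows 5-8.
y = np.array([[135, 140, 130], [160, 170, 165], [145, 150, 160],
              [180, 185, 180], [260, 265, 230], [215, 235, 275],
              [225, 270, 230], [295, 260, 285]], dtype=float)
low, high = y[:4].ravel(), y[4:].ravel()

s2_exp = y.var(axis=1, ddof=1).mean()               # step 1 (as in 8.1)
s2_x = np.var([low.mean(), high.mean()], ddof=1)    # step 2: variance of the level means
F = s2_x * low.size / s2_exp                        # step 3

print(round(F, 2))  # 186.62 -- above F(f1 = 1, f2 = 16; 99%) = 8.53
```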

8.3 Factorial Analysis of Variance with Respect to Variation


In practice, situations can arise in which maximization or minimization of the target characteristic y as an experimental target is of less concern than minimization of the variation of this target characteristic. Factor settings are then sought at which the individual values exhibit the least possible variation. With a factorial analysis of variance with respect to variation, for every factor the mean of all row variances at the lower level and the mean of all row variances at the upper level are calculated. For these two mean variances, an F test is conducted with

f1 = f2 = k · (n - 1) / 2

degrees of freedom (two-sided question).
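A sketch of this test for factor C of the example in Section 8.4.1 (added here for illustration; k = 8 rows, n = 3 repetitions, so f1 = f2 = 8):

```python
import numpy as np

# Row variances of the press-out-force example (2^3 design, n = 3).
y = np.array([[135, 140, 130], [160, 170, 165], [145, 150, 160],
              [180, 185, 180], [260, 265, 230], [215, 235, 275],
              [225, 270, 230], [295, 260, 285]], dtype=float)
s2 = y.var(axis=1, ddof=1)

# Factor C is on the lower level in rows 1-4 and on the upper level in rows 5-8.
F = s2[4:].mean() / s2[:4].mean()
print(round(F, 1))  # the variation is much larger on the upper level of C
```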


8.4 Computer Support


Quality Vanguard, in cooperation with ZQF, created software for evaluating experiments (SAV). This program supports the evaluation of full factorial and fractional factorial designs as well as the evaluation of central composite designs (see [1]). Interested users of design of experiments can obtain the program, including a user manual, free of charge from ZQF on a 3.5" diskette (both only available in German). In addition, a program by the Forschungskuratorium Maschinenbau e.V. (FKM) can be obtained from ZQF. From the following overview it is evident that an essential advantage of the FKM program is the print-out of form sheets for experimentation (factor settings) and for hand-written entry of the measurement results, whereas the SAV program offers somewhat extended possibilities with respect to the selection of experimental designs and makes lower demands on computer equipment. Detailed explanations about installation and application of both programs can be found in the related manuals ([8] and [9]). In Sections 8.4.1 and 8.4.2 it will be shown, by means of an experimental example, that in principle both programs use the same evaluation algorithm (the analysis of variance) and therefore come to identical results. Unfortunately, the designations used in the two programs are not identical. The FKM program, for instance, uses the abbreviations customary in the English-language literature on the analysis of variance. With the explanations in 8.4.1 and 8.4.2 and by comparison with 8.1 and 8.2, the evaluations of the example remain comprehensible to the reader. The print-outs of the FKM program are in landscape format; for this reason, the tables and graphics had to be slightly changed for this document. With the help of the SAV program, designs with more than two levels per factor can also be evaluated. The program uses natural numbers (1, 2, 3, ...) to designate the factor levels.

The process investigated in this example concerns inserting a connector-plug tongue into a polyamide plate. The connector-plug tongue melts into the plastic material under the effect of ultrasonic vibration. The aim of the investigation was to optimise the strength of this bond, i.e. to obtain the highest possible press-out forces. Thus, within the scope of a two-level design, the factors cylinder pressure, starting speed and vibrating amplitude of the sonotrode were varied. The importance of the vibrating amplitude (factor C) was proved through the experiment. A substantial improvement of the press-out strength was achieved by switching the amplitude over to the higher value (upper level).


Comparing the FKM and SAV programs


Program options                                                FKM                    SAV
Computer and software requirements                             Windows 3.0, EXCEL 4   MS-DOS
Form-sheets print-out for experimental support                 X
Numerical and graphical print-out of mean effects              X                      X
One-way ANOVA with respect to the rows                         X                      X
Row-wise print-out of mean and variance                        X                      X
Factorial ANOVA                                                X                      X
Print-out of the significance level                            2-levelled,            variable,
                                                               >95% or >99%           90% up to 100%
Scree plot                                                     X
Possible number of experiments per row                         maximum 5              arbitrary
Factorial ANOVA with respect to variation                                             X
Representation of grouped results on probability paper
for every factor and every interaction                                                X
Evaluation of 3-level designs                                                         X
Evaluation of central composite designs for 3 factors                                 X
Evaluation of screening designs                                                       X
Evaluation of highly confounded designs according to Taguchi                          X

X = option available


8.4.1 Evaluation of an Experiment using the FKM-Program

Print-out of input data Page 1 - Description of experiment - Target characteristic - Factors - Levels
Investigation: Ultrasonic                Done by: EA Al
Article: Plugged relay B                 Date: Nov 78

Target characteristic
  Designation: Press-out force           Unit: N

Influence quantities (factors)
  KB   Factor              -      +      Unit
  A    Cylinder pressure   1.5    2.0    bar
  B    Speed               25     50     mm/s
  C    Amplitude           22     32     mm/1000

Page 2 - Experimental prescription (experimental design) - Scheme for hand-written entry of the results of measurement (individual values) - Number of experimental runs (experiments per row)
Experimental prescription (columns 1, 2, 4)

Row   Cylinder pressure   Speed    Amplitude    Individual values press-out force
No.   [bar]               [mm/s]   [mm/1000]
      A                   B        C            y1     y2     y3
1     1.5                 25       22           135    140    130
2     2                   25       22           160    170    165
3     1.5                 50       22           145    150    160
4     2                   50       22           180    185    180
5     1.5                 25       32           260    265    230
6     2                   25       32           215    235    275
7     1.5                 50       32           225    270    230
8     2                   50       32           295    260    285

Number of runs: 3


This form-sheet can be printed out prior to conducting the experiment and used for handwritten entry of the measurement results during experimentation.

Print-out of evaluation results

Statistical evaluation of the experimental results essentially provides four important pieces of information:

1. Representation of the mean ȳ and the variance s² of the individual results for every row

Often a simple comparison of the means ȳ already enables an important statement about the result of the experiment. The quantities considered here are given in the evaluation scheme on Page 3 of the above print-out. One glance at the results for this example reveals that the mean of the 8th row is substantially higher (280) than that of the 1st row (135), and that in the 4th row of the design a remarkably small variance (8.33) occurs. With regard to the largest possible value of the target characteristic press-out force, the parameter combination corresponding to row 8 (all factors on the upper (+) level) is thus the best ("pick the winner").

Summarized results

Row   A    B    AB   C    AC   BC   ABC      ȳ        s²
1     -    -    +    -    +    +    -      135.00     25.00
2     +    -    -    -    -    +    +      165.00     25.00
3     -    +    -    -    +    -    +      151.67     58.33
4     +    +    +    -    -    -    -      181.67      8.33
5     -    -    +    +    -    -    +      251.67    358.33
6     +    -    -    +    +    -    -      241.67    933.33
7     -    +    -    +    -    +    -      241.67    608.33
8     +    +    +    +    +    +    +      280.00    325.00

Column     1 (A)      2 (B)      3 (AB)    4 (C)       5 (AC)    6 (BC)    7 (ABC)
Total +    868.33     855.00     848.33    1,015.00    808.33    821.67    848.33
Total -    780.00     793.33     800.00    633.33      840.00    826.67    800.00
C          88.33      61.67      48.33     381.67      -31.67    -5.00     48.33
e(C)       22.08      15.42      12.08     95.42       -7.92     -1.25     12.08
SS(C)      2,926.04   1,426.04   876.04    54,626.04   376.04    9.38      876.04

In the original print-out, the so-called Scree Plot is located directly below this table (see Page 80). By means of the quantity SS(C) (Sum of Squares), this graphic represents to what extent the individual factors or interactions contribute to the overall variation (explanation of the terms SS(C) and S.d.q.A. on the following pages).


2. Mean effects of every factor and of the interactions considered in the design

The (mean) effect of a factor gives information about the extent to which the target characteristic y changes on average when the setting of this factor is changed from level (-) to level (+). The effect e(C) of factor A, for example, is the difference between the mean of all results where A is on the + level and the mean of all results where A is on the - level (see above):

e(C) = (Total +)/4 - (Total -)/4 = C/4 ;

for factor A:   e(C) = 868.33/4 - 780.00/4 = 88.33/4 = 22.08 .

The calculation of the effects for the remaining factors B and C and the interactions takes place analogously.

Remark: The character C in the expression e(C) designates the contrast C = (Total +) - (Total -) and not the factor C.
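The complete e(C) row of the evaluation scheme can be reproduced from the row means; the following sketch (added here for illustration) uses the sign columns of the 2³ design in standard order:

```python
import numpy as np

# Sign columns of the 2^3 design (standard order) and the row means
# from the evaluation scheme.
A = np.array([-1, 1, -1, 1, -1, 1, -1, 1])
B = np.array([-1, -1, 1, 1, -1, -1, 1, 1])
C = np.array([-1, -1, -1, -1, 1, 1, 1, 1])
X = np.column_stack([A, B, A * B, C, A * C, B * C, A * B * C])
ybar = np.array([135.00, 165.00, 151.67, 181.67,
                 251.67, 241.67, 241.67, 280.00])

# e(C) = (Total+)/4 - (Total-)/4 for every column:
effects = X.T @ ybar / 4
for name, e in zip(['A', 'B', 'AB', 'C', 'AC', 'BC', 'ABC'], effects):
    print(f"{name:3s} {e:7.2f}")
```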

[Figures: mean press-out force at the - and + levels of the factors cylinder pressure and amplitude]
The corresponding representation for factor C has been left out due to lack of space. The figure on the following page shows the linear effects e(C) for the individual factors and interactions.


[Figure: bar chart of the linear effects e(C) for the factors A, B, C and the interactions AB, AC, BC, ABC; y-axis: press-out force]

On Page 6 of the print-out, the importance of an interaction is graphically illustrated.

[Figure: interaction plot of the press-out force over factor C (levels - and +) for B- and B+; y-axis: press-out force]

If the setting of factor C is varied from - to +, the target characteristic press-out force increases by an amount which is nearly independent of the level at which factor B is set (no interaction, parallel lines).


[Figure: interaction plot of the press-out force over factor B (levels - and +) for A- and A+; y-axis: press-out force]

The increase of the press-out force when varying factor B from - to + depends upon the setting (level) of factor A: an interaction AB exists, and the lines do not run parallel.

3. Assessing the statistical significance of a factor or of an interaction with respect to the experimental noise

The assessment of the statistical significance of a factor (or an interaction) is the result of an analysis of variance of the measured data. Analysis of variance means that the sum of squares Q, which is a measure for the deviation of all individual values (i-th row, j-th column) from the overall mean, is decomposed into a sum of individual contributions which are based on the influence of the factors and interactions:

Q = Q A + Q B + Q C + Q AB + Q AC + Q BC + Q ABC + Q Residual
All components of Q which are not explainable by the influence of factors or interactions are accounted for by the term Q_Residual (residual sum of squares = experimental noise). These sums of squares are designated SS (Sum of Squares) in the evaluation schemes on Pages 3 and 4; Q_Residual is designated SSW (Sum of Squares Within rows). A number of degrees of freedom f belongs to every SS. If one divides each SS by the corresponding number of degrees of freedom f, one gets the respective variances:

MS = SS / f   (Mean Squares)

MSW = SSW / f_W   (Mean Squares Within).

- 79 -

In order to judge the significance of a factor, an F test is conducted. The F test respectively compares the variance MS caused by the considered factor with the experimental variation MSW:

F = MS / MSW .

If this test statistic F is greater than the table value F(95%) or F(99%) of the F distribution, then the considered factor is significant with a probability of 95% or 99%.
Column  Factor  SS(C)       f(C)   MS(C)       F(C)     Significant
1       A       2,926.04    1      2,926.04    10.00    **
2       B       1,426.04    1      1,426.04    4.87     *
3       AB      876.04      1      876.04      2.99
4       C       54,626.04   1      54,626.04   186.62   **
5       AC      376.04      1      376.04      1.28
6       BC      9.38        1      9.38        0.03
7       ABC     876.04      1      876.04      2.99

F(95%) = 4.49,  F(99%) = 8.53;   SSW = 4,683.33,  f_W = 16,  MSW = 292.71
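This table can be reproduced from the raw data of the form sheet in 8.4.1; the following sketch (added here for illustration) computes each SS(C) from the column contrast of the row totals:

```python
import numpy as np

# Raw press-out forces (2^3 design, k = 8 rows, n = 3 repetitions).
y = np.array([[135, 140, 130], [160, 170, 165], [145, 150, 160],
              [180, 185, 180], [260, 265, 230], [215, 235, 275],
              [225, 270, 230], [295, 260, 285]], dtype=float)
k, n = y.shape
A = np.array([-1, 1, -1, 1, -1, 1, -1, 1])
B = np.array([-1, -1, 1, 1, -1, -1, 1, 1])
C = np.array([-1, -1, -1, -1, 1, 1, 1, 1])
cols = {'A': A, 'B': B, 'AB': A * B, 'C': C,
        'AC': A * C, 'BC': B * C, 'ABC': A * B * C}

SSW = ((n - 1) * y.var(axis=1, ddof=1)).sum()   # residual sum of squares
MSW = SSW / ((n - 1) * k)                       # f_W = 16, MSW = 292.71
totals = y.sum(axis=1)
for name, x in cols.items():
    SS = (x @ totals) ** 2 / (k * n)            # f(C) = 1, so MS(C) = SS(C)
    print(f"{name:3s}  SS = {SS:9.2f}   F = {SS / MSW:6.2f}")
```

The printed values match the columns SS(C) and F(C) of the scheme above.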

4. Pareto analysis of factors and interactions with regard to their contribution to the overall variation (Scree Plot)

Pareto analysis means that the factors and their interactions are ranked according to the size of their contribution to the overall variation. In the FKM program the sums of squares SS are ranked according to their size and then graphically represented (Scree plot, Page 3). One acquires the same information by ranking the F test values F (scheme on Page 4) according to their size.

[Figure: Scree plot of the sums of squares SS(C) for the factors and interactions, ordered by size]


8.4.2 Evaluation of an Experiment using the SAV Program

Since the use of the SAV program is described in detail in [9], the following explanations are confined to interpreting the data represented on the screen. Before beginning the actual evaluation, it should be checked under Planverwaltung/Auswählen/Anzeigen whether the design used for the evaluation corresponds to the design used during experimentation. The example deals with a classical 2³ design with 8 rows:

By selecting the option Datenverwaltung/Anzeige one gets a listing of the entered results:


Auswerten/Versuchszeilen gives the following representation:

These are the intermediate values and results of an analysis of variance based on the rows of the design (see Sections 8.1 and 8.4.1). The overall sum of squares Q is split into the portion between the rows (Q2) and the portion within the rows (Q1 = Q_Residual). Both contributions to the variation are then compared with the help of an F test, taking into account the degrees of freedom.

Q = Q1 + Q2:   65,798.9 = 4,683.3 + 61,115.6

F = (Q2 / (k - 1)) / (Q1 / (k · (n - 1))) = (61,115.6 / (8 - 1)) / (4,683.3 / (8 · (3 - 1))) = 29.8

Instead of comparing this test statistic with the table value F(7; 16; 99%) = 4.03, the program directly calculates the significance level belonging to the value 29.8. Since this lies near 100% for this example, the rounded value 100% is given. The result, "the mean values of the rows are significantly different", is, as discussed in Section 8.4.1, no surprise. After pressing the arrow button, the means and standard deviations of the individual results of every row, given in the lower part of the representation (see below), can be seen in detail. Of primary importance to the program user is the confirmation that the factors selected for the experimental design actually have an influence on the target characteristic. Otherwise any further statistical evaluation would be senseless, and the initial problem analysis prior to the experiment would have to be repeated.


Lower part of screen representation (Auswerten/Versuchszeilen):

Selecting the option Auswerten/Spaltenweise/Mittelwertanalyse, leads to the following representation:

A factorial analysis of variance (see Section 8.2) is conducted, i.e. every column of the design is examined to determine whether it has a significant influence on the result of the experiment. Unfortunately, it is not apparent from the numbering of the columns which factors and interactions correspond to the individual columns. This assignment, however, is easy to find in the corresponding representations in [9] (Appendix) (see also Section 7.4.2).


For the example shown here, the following assignment is valid:

Column no.             1   2   3    4   5    6    7
Factor/interaction     A   B   AB   C   AC   BC   ABC

The factors A (Column 1, significance 99.39%) and C (Column 4, 100%) apparently have a substantial influence on the result of the experiment. In practice it can occur that several factors (and/or interactions) possess significances near 100%. In this case one can rank these factors by means of their F test values. The Scree plot in the FKM program is nothing but a graphical representation of this ranking. With the help of the arrow buttons it is possible to invoke a representation of the grouped values of every column on probability paper (Section 3). For Column 4 (factor C), for example, one gets the following figure:


Here, all values where factor C was set on the lower level (level 2) are marked with + and all values where C was set on the upper level (level 1) are marked with *. The representation illustrates the result of the factorial analysis of variance:

The groups of points can each be approximated by a best-fit line (approximately normally distributed values). The mean values, which correspond to the intersections of the best-fit lines with the 50% line (cumulative relative frequency), are substantially different. The slopes of the best-fit lines are nearly equal, i.e. the groups of values have approximately equal standard deviations.

Auswertung/Spaltenweise/Streuungsanalyse: By selecting this program option, an analysis with respect to the variation is done for every column. For the considered factor, the mean of all row variances at the lower level and the mean of all row variances at the upper level are calculated. The two mean variances are compared within the scope of an F test with

f1 = f2 = k · (n - 1) / 2

degrees of freedom. On the screen, the mean variances for every level appear, as well as the corresponding significance. In the example, the variance within the rows is quite large when factor C (Column 4) is set on the upper level and quite small when it is set on the lower level.


9. Hints on Practical Design of Experiments


The previous chapters describe, almost without exception, mathematical-statistical procedures which are extremely important when evaluating experiments under statistical viewpoints. This could give the impression that the problems occurring within the scope of technical investigations are of a purely statistical nature and thus solvable only through the application of mathematical algorithms. The relevant literature on the subject of experimental design makes little attempt to refute this impression. The following general comments address this misconception.

9.1 Task and Target Formulation


Obviously, experimental design does not take place in a vacuum. In general, the development department is given a task which initially is only vaguely formulated, e.g. improvement of the pollutant emissions of a diesel car through modifications of the injection nozzle design, or reduction of the noise of generator type XY. It is essential in this case to formulate the task more precisely and to give target specifications and evaluation criteria which will later allow a determination of whether the target was fully or partly met. Normally, one will have to determine a directly or indirectly measurable physical quantity as the target, by means of which the degree of improvement can be judged.

9.2 System Analysis


For a system analysis, the responsible project group should try to take stock of the available knowledge and the lack of knowledge about problem-relevant system components (see [12]). In this connection it is recommended to consistently apply the Elementary Quality Assurance Tools (see [14]). From the start it is not clear what is meant by the term system. The definition of the system to be investigated is quickly arrived at by listing all quantities which, in some form, can have an influence on the target characteristic. In general, one gets a number of possible quantitative or qualitative influence quantities, which must be evaluated and structured with regard to

- independence,
- possibility of setting,
- realization costs,
- expected measure of effect on the target characteristic.

This phase of collecting, evaluating and structuring the influence quantities is actually the most critical phase of experimental design. If the practical experimental investigation does not become dispensable as a result of the system analysis, one should, in the interest of limiting the complexity of the experiment, try to select as small a number of influence quantities (factors) as possible according to the evaluation criteria mentioned. The last criterion, the expected measure of effect on the target characteristic, will however be decisive for the selection.


Ultimately, in this phase the prior knowledge and intuition of the test engineers determine the success or failure of an experiment. As important as the consideration of the presumably relevant factors are thoughts concerning the treatment of the remaining influence quantities, which basically have to be considered possible noise factors. In order to prevent these noise factors from significantly influencing the target characteristic, they should be held constant. If this is not possible, then an undesired influence on the overall result can be prevented as far as possible by so-called randomization, i.e. executing the individual experiments in a random sequence. This naturally also covers influence quantities which may have been overlooked within the scope of the system analysis. In any case, uncontrolled noise factors can increase the experimental noise, i.e. the variation of the results when experiments are repeated with unmodified factor settings. In this connection the quality level of the measuring process (capability of the measurement system) must also be considered, which can additionally have a negative effect on the experimental variation.

9.3 Stipulating an Experimental Strategy


The DOE user who at this point expects a patent remedy on how to proceed in a concrete situation must be disappointed. Selecting an experimental strategy naturally depends upon the result of the system analysis. If, for instance, no interaction of factor A with other factors is expected, then nothing prevents conducting a one-factor-at-a-time experiment in which A is varied while all the remaining factors are held constant. However, one can never be absolutely sure that no interaction of A with one or several of the remaining factors exists. In that case the statement resulting from the experiment about the influence of A on the target characteristic y is only correct for the settings chosen in the experiment. Strictly speaking, there is then no way around combining all the intended settings of A with all the settings of the remaining factors and conducting the corresponding experiments. The number of experiments to be conducted, k^p · n, increases exponentially with the number p of the factors to be investigated, the number k of factor levels and the number n of repetitions per individual experiment (replications). The promise to reduce the scope of the experiment by applying orthogonal arrays can strictly only be kept when interactions can be ruled out. In practice, the extent of the experiment can be halved without risk only via the application of uncritically reduced designs of type 2^(p-1) with p > 4. It should also be noted that the variables search according to Shainin (Section 10, [5] and [11]) and also the full factorial designs actually represent a sequence of one-factor-at-a-time experiments. Considering the rapidly increasing number of experiments according to the formula k^p · n, one must try to reduce the number p of the factors to be considered and, in the simplest case, limit oneself to two factor levels.


According to the remarks in 7.3, limiting oneself to two factor levels implies the choice of a linear model. Interpolations and extrapolations through calculations with the help of a theoretical model equation are strictly not possible, since their accuracy depends upon the adequacy of the model, which cannot be verified without additional experiments. In the case of qualitative factors (e.g. material, supplier, charge, machine), usually only a few discrete level settings are possible, and theoretical intermediate values are not sensible. If it is possible to estimate the experimental noise and to provide information about the desired improvement (set target), then the number of replications should be chosen corresponding to the minimum sample size (Section 4.2). In addition, the distance between the factor levels (in physical units) must be sufficiently large.

9.4 Executing and Documenting an Experiment


Before beginning with the experiment, a written experimental instruction (see e.g. 8.4.1) should be prepared which ensures that the experiments are conducted in the planned manner (sequence, factor combinations, number of replications, constraints). If samples for an experiment must be manufactured and assembled with special parameter combinations, providing the experimental instructions early can help prevent possible errors during this preliminary stage. Furthermore, the instruction can serve as a basis for the time and cost estimate of the overall experiment. From experience, inquiries about how an experiment was conducted are sometimes made even months after its conclusion. These can usually only be answered with the help of detailed documentation. This is, for example, nearly always the case when designing a supplementary or subsequent experiment. It is thus absolutely necessary to plan and complete an extensive documentation and backup of the experiment. It should also be mentioned that any peculiarities occurring during the execution of an experiment should be documented. Of course, the experimenter may not prejudge any results (e.g. outliers) or select only successful individual trials.


10. Shainin Method


The essential characteristics of the Shainin method were summarized in [11]. We thus reproduce the unchanged QS-Info text in the following. A detailed representation is found in [5].

Design of experiments according to Shainin

Currently, in the field of statistical experimental design, original equipment customers, especially VW, prefer the so-called Shainin method. In the following, several hints on the Shainin method are given; detailed information can be acquired through ZQF. The method is taught in the RB seminar VP1. The American management consultant Dorian Shainin propagates a procedure that strongly distinguishes itself from the Taguchi method. The procedure recommended by Shainin originates nearly fully from classical experimental design. The starting point of the Shainin method is the assumption that for many questions concerning industrial products or processes the so-called Pareto principle is valid. The Pareto principle states that a phenomenon which theoretically can have very many causes has in reality only very few causes. The Pareto principle surely has no universal validity; however, it is generally a reasonable working hypothesis. The Pareto principle plays no role in the Taguchi method; instead, the basis of the Taguchi method is the assumption that in an investigation one can define the target characteristics such that the potential influence factors behave free of interaction, or that the behaviour with respect to interactions is in principle predictable. Statistical experimental design offers numerous methods which screen the most important influence factors from a large number of potential influence factors with as few experiments as possible. Shainin recommends four screening processes which are characterized by substantial simplicity but are only applicable to special questions.

The processes are:


Multi-Vari Charts Components Search Paired Comparison Variables Search

Multi-Vari Charts

The Multi-Vari chart method was developed in 1950 by L. Seder. Fluctuations of a process are represented graphically. Through original values or vertical lines, the place-dependent, charge-dependent, time-dependent etc. fluctuations are recorded separately. Under certain circumstances it thereby becomes clear where the main causes of the fluctuations lie.


Components Search

By systematically interchanging single parts between a good unit and a bad unit, one can, under certain circumstances, locate the individual part which led to the quality deficiency of the bad unit. When repairing defective appliances, the component search principle is frequently used.

Paired Comparison

The method of paired comparison is very similar to the components search. It is used when units cannot be dismantled into individual parts. Good and bad units are taken from the corresponding populations, and their measurable quality characteristics are compared with one another. The quality characteristics which differ from one another in a remarkable manner are possibly responsible for the quality difference between the units. These are then checked through further comparisons.

Variables Search

Variables search is a process which screens the most important factors from a medium number (5 - 15) of influence factors in an effective manner. To a certain degree, variables search is a one-factor-at-a-time experiment. Variables search is only applicable if, for every influence factor, the better and the worse of two determined levels is known in principle. (This indispensable prerequisite is naturally very restrictive.) That means that in variables search one does not deal with the search for an optimum, but rather with the question of which factors contribute to the optimum in a more decisive manner. The procedure in detail:

(i) Execute a comparative experiment: all factors on the worse level (V_s) versus all factors on the better level (V_g). If no greater difference is recognized, then further investigation is not beneficial.

(ii) Now the influence of the individual factors is investigated separately. If A is presumed to be the strongest factor, then an experiment is carried out where A is put on the good level and all other factors are put on the bad level. If the result roughly corresponds to the result of the experiment V_g of (i), then it can be assumed that A has a dominant influence on the target characteristic. This is confirmed through a counter-trial, where A is on the bad level and all the others are on the good level. If A is dominant, the same result as in case V_s should occur.
If A is dominant, the investigation need not be continued. If A is not dominant, then B is investigated in the same manner, etc.


(iii) If none of the factors is dominant, then the factors which indicated an influence are combined and put jointly on the good (or bad) level, with the other factors on the bad (or good) level. If the results of (i) are thereby reproduced, this is proof that it can be sufficient to put the factors which indicated an influence in the one-factor-at-a-time comparisons on their good level, with the rest of the factors on their bad level.

If the number of influence factors has now been reduced to a maximum of 2 - 4, then Shainin recommends the well-known full factorial 2²-, 2³- or 2⁴-designs, preferentially with a maximum of 2 repetitions per parameter combination. These full factorial designs are evaluated in the normal manner, e.g. with the analysis of variance. In this manner it is possible to determine which influence factors or interactions are especially important. Because Shainin recommends relatively small sample sizes, it is clear that only strong effects are verifiable. It is plausible that one can reliably give an optimum when dealing with full factorial designs, since all parameter combinations are explicitly investigated. One-factor-at-a-time comparisons (B versus C) also belong to the processes recommended by Shainin. In conclusion, it is worth mentioning that Shainin rejects the application of orthogonal arrays, since with confounded designs misinterpretations due to interaction problems are nearly inevitable.


11. List of References


[1] E. Scheffler: Einführung in die Praxis der statistischen Versuchsplanung, VEB Deutscher Verlag für Grundstoffindustrie, Leipzig, 1986

[2] G. Box, W. Hunter, J. Hunter: Statistics for Experimenters, Wiley & Sons, New York, 1978

[3] Retzlaff, Rust, Waibel: Statistische Versuchsplanung, Verlag Chemie, Weinheim, 1978

[4] D. Wheeler: Understanding Industrial Experimentation, Statistical Process Controls, Inc., Knoxville, 1988

[5] Keki R. Bhote: Qualität - Der Weg zur Weltspitze, Institut für Qualitätsmanagement, Großbottwar, 1990; German translation of: Keki R. Bhote: World Class Quality, American Management Association, New York, 1988

[6] J. M. Juran, F. M. Gryna: Juran's Quality Control Handbook, McGraw-Hill Book Company, New York, 1988

[7] O. L. Davies (ed.): The Design and Analysis of Industrial Experiments, London, 1956, 2nd ed. 1963

[8] Forschungskuratorium Maschinenbau e.V. (FKM): Fehlerverhütung vor Produktionsbeginn durch Verfahren der statistischen Versuchsmethodik, Vorhaben No. 137, Abschlußbericht, FKM, Frankfurt, 1992 (report and program available from ZQF)

[9] Quality Vanguard: Software zur Auswertung von Versuchen (SAV), Benutzerhandbuch (manual and program available from ZQF)

[10] QS-Info No. 1/1990

[11] QS-Info No. 13/1990

[12] QS-Info No. 15/1991

[13] VDA (ed.): Schriftenreihe Qualitätskontrolle in der Automobilindustrie, Band 4: Sicherung der Qualität vor Serieneinsatz (draft text for the new edition), Frankfurt, Verband der Automobilindustrie

[14] Elementary Quality Assurance Tools (ZQF)


12. Tables

Table 1: Cumulative relative frequencies H_i(n) (in percent) for entering the points (x_i, H_i) of sequentially arranged sample values on probability paper

n=6:   10.2 26.1 42.1 57.9 73.9 89.8
n=7:    8.9 22.4 36.3 50.0 63.7 77.6 91.2
n=8:    7.8 19.8 31.9 44.0 56.0 68.1 80.2 92.2
n=9:    6.8 17.6 28.4 39.4 50.0 60.6 71.6 82.4 93.2
n=10:   6.2 15.9 25.5 35.2 45.2 54.8 64.8 74.5 84.1 93.8
n=11:   5.6 14.5 23.3 32.3 41.3 50.0 58.7 67.7 76.7 85.5 94.4
n=12:   5.2 13.1 21.5 29.5 37.8 46.0 54.0 62.2 70.5 78.5 86.9 94.9
n=13:   4.8 12.3 19.8 27.4 34.8 42.5 50.0 57.5 65.2 72.6 80.2 87.7 95.3
n=14:   4.5 11.3 18.4 25.5 32.3 39.4 46.4 53.6 60.6 67.7 74.5 81.6 88.7 95.5
n=15:   4.1 10.6 17.1 23.9 30.2 36.7 43.3 50.0 56.7 63.3 69.8 76.1 82.9 89.4 95.9

(Each row lists the values H_i(n) for i = 1, 2, ..., n.)

The cumulative relative frequency H_i(n) for rank i can also be calculated with one of the approximation formulas

H_i(n) = (i - 0.5) / n * 100 %    and    H_i(n) = (i - 0.3) / (n + 0.4) * 100 % .

The deviation from the exact table value is thereby insignificant.

Example: n = 15, i = 12; table value: 76.1

H_12(15) = (12 - 0.5) / 15 * 100 % = 76.7 %    or    H_12(15) = (12 - 0.3) / (15 + 0.4) * 100 % = 76.0 %
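The two approximation formulas are easy to evaluate in a short script (a sketch; the function names are our own, not part of the booklet):

```python
def h_half(i, n):
    """(i - 0.5)/n rule, in percent."""
    return 100.0 * (i - 0.5) / n

def h_median(i, n):
    """(i - 0.3)/(n + 0.4) rule, in percent; closest to the exact table values."""
    return 100.0 * (i - 0.3) / (n + 0.4)

# Example from the text: n = 15, i = 12, exact table value 76.1
print(round(h_half(12, 15), 1))    # 76.7
print(round(h_median(12, 15), 1))  # 76.0
```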


Table 1 (Continued)

n=16:  3.9 10.0 16.1 22.4 28.4 34.8 40.9 46.8 53.2 59.1 65.2 71.6 77.6 83.9 90.0 96.1
n=17:  3.7  9.3 15.2 20.9 26.8 32.6 38.2 44.0 50.0 56.0 61.8 67.4 73.2 79.1 84.8 90.7 96.3
n=18:  3.4  8.9 14.2 19.8 25.1 30.9 36.3 41.7 47.2 52.8 58.3 63.7 69.1 74.9 80.2 85.8 91.2 96.6
n=19:  3.3  8.4 13.6 18.7 23.9 29.1 34.5 39.7 44.8 50.0 55.2 60.3 65.5 70.9 76.1 81.3 86.4 91.6 96.7
n=20:  3.1  7.9 12.9 17.9 22.7 27.8 32.6 37.8 42.5 47.6 52.4 57.5 62.2 67.4 72.2 77.3 82.1 87.1 92.1 96.9
n=21:  2.9  7.6 12.3 17.1 21.8 26.4 31.2 35.9 40.5 45.2 50.0 54.8 59.5 64.1 68.8 73.6 78.2 82.9 87.7 92.4 97.1
n=22:  2.8  7.2 11.7 16.4 20.6 25.1 29.8 34.1 38.6 43.3 47.6 52.4 56.7 61.4 65.9 70.2 74.9 79.4 83.6 88.3 92.8 97.2
n=23:  2.7  6.9 11.3 15.6 19.8 24.2 28.4 32.6 37.1 41.3 45.6 50.0 54.4 58.7 62.9 67.4 71.6 75.8 80.2 84.4 88.7 93.1 97.3
n=24:  2.6  6.7 10.7 14.9 18.9 23.3 27.4 31.6 35.6 39.7 43.6 48.0 52.0 56.4 60.3 64.4 68.4 72.6 76.7 81.1 85.1 89.3 93.3 97.4
n=25:  2.4  6.4 10.4 14.2 18.1 22.4 26.1 30.2 34.1 38.2 42.1 46.0 50.0 54.0 57.9 61.8 65.9 69.8 73.9 77.6 81.9 85.8 89.6 93.6 97.6

(Each row lists the values H_i(n) for i = 1, 2, ..., n.)


Table 1 (Continued)

n=26:  2.4  6.2  9.9 13.8 17.6 21.5 25.1 29.1 33.0 36.7 40.5 44.4 48.0 52.0 55.6 59.5 63.3 67.0 70.9 74.9 78.5 82.4 86.2 90.2 93.8 97.6
n=27:  2.3  5.9  9.5 13.4 16.9 20.6 24.2 28.1 31.6 35.2 39.0 42.5 46.4 50.0 53.6 57.5 61.0 64.8 68.4 71.9 75.8 79.4 83.1 86.6 90.5 94.1 97.7
n=28:  2.2  5.7  9.2 12.7 16.4 19.8 23.3 27.1 30.5 34.1 37.4 41.3 44.8 48.4 51.6 55.2 58.7 62.6 65.9 69.5 72.9 76.7 80.2 83.6 87.3 90.8 94.3 97.8
n=29:  2.1  5.5  8.9 12.3 15.9 19.2 22.7 26.1 29.5 33.0 36.3 39.7 43.3 46.4 50.0 53.6 56.7 60.3 63.7 67.0 70.5 73.9 77.3 80.8 84.1 87.7 91.2 94.5 97.9
n=30:  2.1  5.3  8.7 11.9 15.2 18.7 21.8 25.1 28.4 31.9 35.2 38.6 41.7 45.2 48.4 51.6 54.8 58.3 61.4 64.8 68.1 71.6 74.9 78.2 81.3 84.8 88.1 91.3 94.7 97.9
n=31:  2.0  5.1  8.3 11.6 14.8 18.0 21.2 24.4 27.6 30.8 34.0 37.2 40.4 43.6 46.8 50.0 53.2 56.4 59.6 62.8 66.0 69.2 72.4 75.6 78.8 82.0 85.3 88.5 91.7 94.9 98.0
n=32:  1.9  5.0  8.1 11.2 14.3 17.4 20.5 23.6 26.7 29.8 32.9 36.0 39.1 42.2 45.3 48.4 51.6 54.7 57.8 60.9 64.0 67.1 70.2 73.3 76.4 79.5 82.6 85.7 88.8 91.9 95.0 98.1
n=33:  1.9  4.8  7.8 10.8 13.9 16.9 19.9 22.9 25.9 28.9 31.9 34.9 37.9 41.0 44.0 47.0 50.0 53.0 56.0 59.0 62.1 65.1 68.1 71.1 74.1 77.1 80.1 83.1 86.2 89.2 92.2 95.2 98.1
n=34:  1.8  4.7  7.6 10.5 13.5 16.4 19.3 22.2 25.1 28.1 31.0 33.9 36.8 39.8 42.7 45.6 48.5 51.5 54.4 57.3 60.2 63.2 66.1 69.0 71.9 74.9 77.8 80.7 83.6 86.5 89.5 92.4 95.3 98.2
n=35:  1.8  4.6  7.4 10.2 13.1 15.9 18.8 21.6 24.5 27.3 30.1 33.0 35.8 38.6 41.5 44.3 47.2 50.0 52.8 55.7 58.5 61.4 64.2 67.0 69.9 72.7 75.6 78.4 81.3 84.1 86.9 89.8 92.6 95.5 98.2

(Each row lists the values H_i(n) for i = 1, 2, ..., n.)


Table 2: Percentage points of the t distribution (two-sided)

   f     95%     99%    99.9%
   1    12.7    63.7    636.6
   2     4.3     9.93    31.6
   3     3.18    5.84    12.9
   4     2.78    4.60     8.61
   5     2.57    4.03     6.87
   6     2.45    3.71     5.96
   7     2.37    3.50     5.41
   8     2.31    3.36     5.04
   9     2.26    3.25     4.78
  10     2.23    3.17     4.59
  11     2.20    3.11     4.44
  12     2.18    3.06     4.32
  13     2.16    3.01     4.22
  14     2.15    2.98     4.14
  15     2.13    2.95     4.07
  16     2.12    2.92     4.02
  17     2.11    2.90     3.97
  18     2.10    2.88     3.92
  19     2.09    2.86     3.88
  20     2.09    2.85     3.85
  25     2.06    2.79     3.73
  30     2.04    2.75     3.65
  35     2.03    2.72     3.59
  40     2.02    2.70     3.55
  45     2.01    2.69     3.52
  50     2.01    2.68     3.50
 100     1.98    2.63     3.39
 200     1.97    2.60     3.34
 300     1.97    2.59     3.32
 400     1.97    2.59     3.32
 500     1.97    2.59     3.31
  ∞      1.96    2.58     3.30

(f = degrees of freedom; columns give the percentage points for the stated two-sided significance level.)
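A significance decision with Table 2 amounts to a simple lookup: compare the computed t value against the tabulated percentage point for the given degrees of freedom. A minimal sketch, using only a few rows of the table (the function name is our own):

```python
# Excerpt of Table 2: two-sided percentage points of the t distribution
# at the 95 % significance level
T_95 = {5: 2.57, 10: 2.23, 15: 2.13, 20: 2.09, 30: 2.04, 50: 2.01}

def exceeds_t_95(t_value, f):
    """True if |t| exceeds the tabulated two-sided 95 % value for f degrees of freedom."""
    return abs(t_value) > T_95[f]

print(exceeds_t_95(2.5, 10))   # True: 2.5 > 2.23, the difference of means is significant
print(exceeds_t_95(-1.8, 10))  # False: no significant difference at the 95 % level
```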


Table 3: Percentage points of the F distribution (PA = 95 %, one-sided)

f2:    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 24 26 28 30 32 34 36 38 40 50 60 70 80 90 100 150 200 1000

f1=1:  161 18.5 10.1 7.71 6.61 5.99 5.59 5.32 5.12 4.96 4.84 4.75 4.67 4.60 4.54 4.49 4.45 4.41 4.38 4.35 4.30 4.26 4.23 4.20 4.17 4.15 4.13 4.11 4.10 4.08 4.03 4.00 3.98 3.96 3.95 3.94 3.90 3.89 3.85
f1=2:  200 19.0 9.55 6.94 5.79 5.14 4.74 4.46 4.26 4.10 3.98 3.89 3.81 3.74 3.68 3.63 3.59 3.55 3.52 3.49 3.44 3.40 3.37 3.34 3.32 3.30 3.28 3.26 3.24 3.23 3.18 3.15 3.13 3.11 3.10 3.09 3.06 3.04 3.00
f1=3:  216 19.2 9.28 6.59 5.41 4.76 4.35 4.07 3.86 3.71 3.59 3.49 3.41 3.34 3.29 3.24 3.20 3.16 3.13 3.10 3.05 3.01 2.98 2.95 2.92 2.90 2.88 2.87 2.85 2.84 2.79 2.76 2.74 2.72 2.71 2.70 2.66 2.65 2.61
f1=4:  225 19.2 9.12 6.39 5.19 4.53 4.12 3.84 3.63 3.48 3.36 3.26 3.18 3.11 3.06 3.01 2.96 2.93 2.90 2.87 2.82 2.78 2.74 2.71 2.69 2.67 2.65 2.63 2.62 2.61 2.56 2.53 2.50 2.49 2.47 2.46 2.43 2.42 2.38
f1=5:  230 19.3 9.01 6.26 5.05 4.39 3.97 3.69 3.48 3.33 3.20 3.11 3.03 2.96 2.90 2.85 2.81 2.77 2.74 2.71 2.66 2.62 2.59 2.56 2.53 2.51 2.49 2.48 2.46 2.45 2.40 2.37 2.35 2.33 2.32 2.31 2.27 2.26 2.22
f1=6:  234 19.3 8.94 6.16 4.95 4.28 3.87 3.58 3.37 3.22 3.09 3.00 2.92 2.85 2.79 2.74 2.70 2.66 2.63 2.60 2.55 2.51 2.47 2.45 2.42 2.40 2.38 2.36 2.35 2.34 2.29 2.25 2.23 2.21 2.20 2.19 2.16 2.14 2.11
f1=7:  237 19.4 8.89 6.09 4.88 4.21 3.79 3.50 3.29 3.14 3.01 2.91 2.83 2.76 2.71 2.66 2.61 2.58 2.54 2.51 2.46 2.42 2.39 2.36 2.33 2.31 2.29 2.28 2.26 2.25 2.20 2.17 2.14 2.13 2.11 2.10 2.07 2.06 2.02
f1=8:  239 19.4 8.85 6.04 4.82 4.15 3.73 3.44 3.23 3.07 2.95 2.85 2.77 2.70 2.64 2.59 2.55 2.51 2.48 2.45 2.40 2.36 2.32 2.29 2.27 2.24 2.23 2.21 2.19 2.18 2.13 2.10 2.07 2.06 2.04 2.03 2.00 1.98 1.95
f1=9:  241 19.4 8.81 6.00 4.77 4.10 3.68 3.39 3.18 3.02 2.90 2.80 2.71 2.65 2.59 2.54 2.49 2.46 2.42 2.39 2.34 2.30 2.27 2.24 2.21 2.19 2.17 2.15 2.14 2.12 2.07 2.04 2.02 2.00 1.99 1.97 1.94 1.93 1.89

(Each row lists the values for f2 = 1, 2, ..., 1000 in the order of the f2 line.)


Table 3 (Continued): Percentage points of the F distribution (PA = 95 %, one-sided)

f2:     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 24 26 28 30 32 34 36 38 40 50 60 70 80 90 100 150 200 1000

f1=10:  242 19.4 8.79 5.96 4.74 4.06 3.64 3.35 3.14 2.98 2.85 2.75 2.67 2.60 2.54 2.49 2.45 2.41 2.38 2.35 2.30 2.25 2.22 2.19 2.16 2.14 2.12 2.11 2.09 2.08 2.03 1.99 1.97 1.95 1.94 1.93 1.89 1.88 1.84
f1=15:  246 19.4 8.70 5.86 4.62 3.94 3.51 3.22 3.01 2.85 2.72 2.62 2.53 2.46 2.40 2.35 2.31 2.27 2.23 2.20 2.15 2.11 2.07 2.04 2.01 1.99 1.97 1.95 1.94 1.92 1.87 1.84 1.81 1.79 1.78 1.77 1.73 1.72 1.68
f1=20:  248 19.4 8.66 5.80 4.56 3.87 3.44 3.15 2.94 2.77 2.65 2.54 2.46 2.39 2.33 2.28 2.23 2.19 2.16 2.12 2.07 2.03 1.99 1.96 1.93 1.91 1.89 1.87 1.85 1.84 1.78 1.75 1.72 1.70 1.69 1.68 1.64 1.62 1.58
f1=30:  250 19.5 8.62 5.75 4.50 3.81 3.38 3.08 2.86 2.70 2.57 2.47 2.38 2.31 2.25 2.19 2.15 2.11 2.07 2.04 1.98 1.94 1.90 1.87 1.84 1.82 1.80 1.78 1.76 1.74 1.69 1.65 1.62 1.60 1.59 1.57 1.53 1.52 1.47
f1=40:  251 19.5 8.59 5.72 4.46 3.77 3.34 3.04 2.83 2.66 2.53 2.43 2.34 2.27 2.20 2.15 2.10 2.06 2.03 1.99 1.94 1.89 1.85 1.82 1.79 1.77 1.75 1.73 1.71 1.69 1.63 1.59 1.57 1.54 1.53 1.52 1.48 1.46 1.41
f1=50:  252 19.5 8.58 5.70 4.44 3.75 3.32 3.02 2.80 2.64 2.51 2.40 2.31 2.24 2.18 2.12 2.08 2.04 2.00 1.97 1.91 1.86 1.82 1.79 1.76 1.74 1.71 1.69 1.68 1.66 1.60 1.56 1.53 1.51 1.49 1.48 1.44 1.41 1.36
f1=100: 253 19.5 8.55 5.66 4.41 3.71 3.27 2.97 2.76 2.59 2.46 2.35 2.26 2.19 2.12 2.07 2.02 1.98 1.94 1.91 1.85 1.80 1.76 1.73 1.70 1.67 1.65 1.62 1.61 1.59 1.52 1.48 1.45 1.43 1.41 1.39 1.34 1.32 1.26
f1=∞:   254 19.5 8.53 5.63 4.37 3.67 3.23 2.93 2.71 2.54 2.40 2.30 2.21 2.13 2.07 2.01 1.96 1.92 1.88 1.84 1.78 1.73 1.69 1.65 1.62 1.59 1.57 1.55 1.53 1.51 1.44 1.39 1.35 1.32 1.30 1.28 1.22 1.19 1.08

(Each row lists the values for f2 = 1, 2, ..., 1000 in the order of the f2 line.)


Table 3 (Continued): Percentage points of the F distribution (PA = 99 %, one-sided)

f2:    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 24 26 28 30 32 34 36 38 40 50 60 70 80 90 100 150 200 1000

f1=1:  4052 98.5 34.1 21.2 16.3 13.7 12.2 11.3 10.6 10.0 9.65 9.33 9.07 8.86 8.68 8.53 8.40 8.29 8.18 8.10 7.95 7.82 7.72 7.64 7.56 7.50 7.44 7.40 7.35 7.31 7.17 7.08 7.01 6.96 6.93 6.90 6.81 6.76 6.66
f1=2:  4999 99.0 30.8 18.0 13.3 10.9 9.55 8.65 8.02 7.56 7.21 6.93 6.70 6.51 6.36 6.23 6.11 6.01 5.93 5.85 5.72 5.61 5.53 5.45 5.39 5.34 5.29 5.25 5.21 5.18 5.06 4.98 4.92 4.88 4.85 4.82 4.75 4.71 4.63
f1=3:  5403 99.2 29.5 16.7 12.1 9.78 8.45 7.59 6.99 6.55 6.22 5.95 5.74 5.56 5.42 5.29 5.18 5.09 5.01 4.94 4.82 4.72 4.64 4.57 4.51 4.46 4.42 4.38 4.34 4.31 4.20 4.13 4.08 4.04 4.01 3.98 3.92 3.88 3.80
f1=4:  5625 99.3 28.7 16.0 11.4 9.15 7.85 7.01 6.42 5.99 5.67 5.41 5.21 5.04 4.89 4.77 4.67 4.58 4.50 4.43 4.31 4.22 4.14 4.07 4.02 3.97 3.93 3.89 3.86 3.83 3.72 3.65 3.60 3.56 3.54 3.51 3.45 3.41 3.34
f1=5:  5764 99.3 28.2 15.5 11.0 8.75 7.46 6.63 6.06 5.64 5.32 5.06 4.86 4.70 4.56 4.44 4.34 4.25 4.17 4.10 3.99 3.90 3.82 3.75 3.70 3.65 3.61 3.57 3.54 3.51 3.41 3.34 3.29 3.26 3.23 3.21 3.14 3.11 3.04
f1=6:  5859 99.3 27.9 15.2 10.7 8.47 7.19 6.37 5.80 5.39 5.07 4.82 4.62 4.46 4.32 4.20 4.10 4.01 3.94 3.87 3.76 3.67 3.59 3.53 3.47 3.43 3.39 3.35 3.32 3.29 3.19 3.12 3.07 3.04 3.01 2.99 2.92 2.89 2.82
f1=7:  5928 99.4 27.7 15.0 10.5 8.26 6.99 6.18 5.61 5.20 4.89 4.64 4.44 4.28 4.14 4.03 3.93 3.84 3.77 3.70 3.59 3.50 3.42 3.36 3.30 3.26 3.22 3.18 3.15 3.12 3.02 2.95 2.91 2.87 2.84 2.82 2.76 2.73 2.66
f1=8:  5982 99.4 27.5 14.8 10.3 8.10 6.84 6.03 5.47 5.06 4.74 4.50 4.30 4.14 4.00 3.89 3.79 3.71 3.63 3.56 3.45 3.36 3.29 3.23 3.17 3.13 3.09 3.05 3.02 2.99 2.89 2.82 2.78 2.74 2.72 2.69 2.63 2.60 2.53
f1=9:  6022 99.4 27.3 14.7 10.2 7.98 6.72 5.91 5.35 4.94 4.63 4.39 4.19 4.03 3.89 3.78 3.68 3.60 3.52 3.46 3.35 3.26 3.18 3.12 3.07 3.02 2.98 2.95 2.92 2.89 2.79 2.72 2.67 2.64 2.61 2.59 2.53 2.50 2.43

(Each row lists the values for f2 = 1, 2, ..., 1000 in the order of the f2 line.)


Table 3 (Continued): Percentage points of the F distribution (PA = 99 %, one-sided)

f2:     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 24 26 28 30 32 34 36 38 40 50 60 70 80 90 100 150 200 1000

f1=10:  6056 99.4 27.2 14.5 10.1 7.87 6.62 5.81 5.26 4.85 4.54 4.30 4.10 3.94 3.80 3.69 3.59 3.51 3.43 3.37 3.26 3.17 3.09 3.03 2.98 2.93 2.89 2.86 2.83 2.80 2.70 2.63 2.59 2.55 2.52 2.50 2.44 2.41 2.34
f1=15:  6157 99.4 26.9 14.2 9.72 7.56 6.31 5.52 4.96 4.56 4.25 4.01 3.82 3.66 3.52 3.41 3.31 3.23 3.15 3.09 2.98 2.89 2.82 2.75 2.70 2.66 2.62 2.58 2.55 2.52 2.42 2.35 2.31 2.27 2.24 2.22 2.16 2.13 2.06
f1=20:  6209 99.4 26.7 14.0 9.55 7.40 6.16 5.36 4.81 4.41 4.10 3.86 3.66 3.51 3.37 3.26 3.16 3.08 3.00 2.94 2.83 2.74 2.66 2.60 2.55 2.50 2.46 2.43 2.40 2.37 2.27 2.20 2.15 2.12 2.09 2.07 2.00 1.97 1.90
f1=30:  6261 99.5 26.5 13.8 9.38 7.23 5.99 5.20 4.65 4.25 3.94 3.70 3.51 3.35 3.21 3.10 3.00 2.92 2.84 2.78 2.67 2.58 2.50 2.44 2.39 2.34 2.30 2.26 2.23 2.20 2.10 2.03 1.98 1.94 1.92 1.89 1.83 1.79 1.72
f1=40:  6287 99.5 26.4 13.7 9.29 7.14 5.91 5.12 4.57 4.17 3.86 3.62 3.43 3.27 3.13 3.02 2.92 2.84 2.76 2.69 2.58 2.49 2.42 2.35 2.30 2.25 2.21 2.17 2.14 2.11 2.01 1.94 1.89 1.85 1.82 1.80 1.73 1.69 1.61
f1=50:  6300 99.5 26.4 13.7 9.24 7.09 5.86 5.07 4.52 4.12 3.81 3.57 3.38 3.22 3.08 2.97 2.87 2.78 2.71 2.64 2.53 2.44 2.36 2.30 2.25 2.20 2.16 2.12 2.09 2.06 1.95 1.88 1.83 1.79 1.76 1.73 1.66 1.63 1.54
f1=100: 6330 99.5 26.2 13.6 9.13 6.99 5.75 4.96 4.42 4.01 3.71 3.47 3.27 3.11 2.98 2.86 2.76 2.68 2.60 2.54 2.42 2.33 2.25 2.19 2.13 2.08 2.04 2.00 1.97 1.94 1.82 1.75 1.70 1.66 1.62 1.60 1.52 1.48 1.38
f1=∞:   6366 99.5 26.1 13.5 9.02 6.88 5.65 4.86 4.31 3.91 3.60 3.36 3.17 3.00 2.87 2.75 2.65 2.57 2.49 2.42 2.31 2.21 2.13 2.06 2.01 1.96 1.91 1.87 1.84 1.80 1.68 1.60 1.54 1.49 1.46 1.43 1.33 1.28 1.11

(Each row lists the values for f2 = 1, 2, ..., 1000 in the order of the f2 line.)
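Using Table 3 for the F test is again a lookup: the ratio of the two sample variances is compared against the tabulated one-sided percentage point for (f1, f2) degrees of freedom. A minimal sketch with a few tabulated entries only (the function name is our own):

```python
# Excerpt of Table 3: one-sided 95 % percentage points of the F distribution,
# keyed by (f1, f2)
F_95 = {(1, 10): 4.96, (2, 10): 4.10, (5, 10): 3.33, (5, 20): 2.71}

def f_ratio_significant(s1_sq, s2_sq, f1, f2):
    """F test sketch: True if the variance ratio exceeds the tabulated value."""
    return (s1_sq / s2_sq) > F_95[(f1, f2)]

print(f_ratio_significant(8.0, 1.5, 5, 10))  # True: ratio 5.33 > 3.33
print(f_ratio_significant(4.0, 2.0, 5, 10))  # False: ratio 2.0 < 3.33
```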


Index
Additivity 64; 70
Adequacy 88
Analysis of variance 32; 79
  factorial 72; 83
  factorial, with respect to variation 72
  one-way 71; 82
Best fitting straight line 22; 85
Black Box 13; 17
Center point 56; 67
Characteristic lines 10
Coefficient 39; 42; 55; 59
Components search 90
Computer support 73
Confounding 54; 59
Constant term 61
Contours 43; 44
Contrast 77
Coordinate transformation 38; 39; 55
Degrees of freedom 31; 35
Design
  central composite 67; 73
  experimental 75
  factorial 57
  fractional factorial 60; 62; 63
  matrix 56; 60
  of Experiments (DoE) 16
  Plackett-Burman 69
  screening 69
Dichotomy 18
Effect 39; 42; 43; 50; 52; 55; 77
  illusory effect 70
  total 50
Error probability 27; 31; 37
Evaluation matrix 52; 58; 71
Expenses 87
Experimental
  space 49; 52; 56
  strategy 87
Extrapolation 6; 88
F Test 30
Factor 86
  level 54; 65
Factorial point 67
Formulation
  rule 57
  scheme 67
Histogram 21; 31
Hypothesis
  alternative 25; 31
  null 25; 31; 32; 37
Identity 59
Influence
  linear 5
  monotonic 6; 18
  non-monotonic 7
Input 17
Interaction 50; 52; 78; 79
  three-factor 58
  two-factor 58
Interpolation 6; 56; 88
Levene 36
Minimum sample size 26
Model 65; 88
Multilinear form 42; 50; 52; 58
Multi-Vari Chart 89
Noise
  experimental 32; 34; 79
  factor 87
Normal distribution 21
One-factor-at-a-time
  experiment 43; 56
  method 5; 38
Orthogonal array 91
Orthogonality 67
Output 17
Paired comparison 90
Pareto
  analysis 80
  principle 89
Percentage point 25; 31
Polynomial 55
Probability paper 21; 84
Randomization 87
Regression
  analysis 55
  polynomial 66
Reproducibility 5
Response surface 45; 50
Scree plot 80; 84
Shainin 16; 87; 89
Significance 72; 80; 82; 85
Significance level 26; 27
Star point 67
Straight line 39
Sum of squares 34; 79; 82
  residual 79
  within rows 79
Supplementary experiment 62
System 86
  analysis 17
  noise 17


  of equations 59; 61
  theory 16
System matrix 14
  global 19
  local 20
t Test 24
Taguchi 16; 62; 89
Target characteristic 45; 78
Transformation 40
  equation 39
  reverse 39; 41
Two-factor
  design 40
  method 9
Variables search 87; 90
Variance 26; 30
Variation, experimental 71
Verification run 64


Robert Bosch GmbH
Zentralstelle Qualitätsförderung (ZQF)
Postfach 30 02 20
D-70442 Stuttgart
Revision: Rach
Telephone (07 11) 8 11-4 47 88
Telefax (07 11) 8 11-4 51 55
Edition 08.1993
