This file is part of a program based on the Bio 4835 Biostatistics class taught at Kean University in Union, New Jersey.
The course uses the following text:
Daniel, W. W. 1999. Biostatistics: a foundation for analysis in the health sciences. New York: John Wiley and Sons.
The file follows this text very closely and readers are encouraged to consult the text for further information.
ANALYSIS OF VARIANCE (ANOVA)
Introduction
ANOVA means Analysis of Variance. It is used to separate the total variation in a set of data
into two or more components. The source of variation is identified so that one can see its influence on
the total variation. ANOVA is also used to compare means when there are three or more groups.
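For the one-way case treated in this file, the separation of the total variation into components can be written compactly. The identity below is the standard textbook form, given in LaTeX notation as a sketch; the index names i and j are assumptions introduced here for illustration.

```latex
% x_{ij}: measurement i in group j; \bar{x}_j: mean of group j; \bar{x}: grand mean
% k: number of groups; n_j: number of measurements in group j
\sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij}-\bar{x})^2
  = \sum_{j=1}^{k} n_j\,(\bar{x}_j-\bar{x})^2
  + \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij}-\bar{x}_j)^2
% that is, SS Total = SS Group + SS Error
```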
ANOVA is used to analyze the data from experiments. Its purposes are to estimate and
test hypotheses about population variances and population means. There are several types of
experiments and techniques which utilize ANOVA. These include one-way ANOVA, two-way
ANOVA and multiple ANOVA, which come from experiments employing the completely randomized
design, randomized complete block design, repeated measures design or factorial experiment design.
One-way ANOVA is used to determine if there is any significant difference between the means
of groups of data. These groups may vary under the effect of one factor. The data are organized into
groups and presented in a data table.
Data
For ANOVA work, the data are presented in a data table. There must be at least three groups of
data although more are possible.
The sample table above shows four groups. Additional columns are added as necessary to
accommodate each group. The groups do not need to be the same size. For each group of data we
need to find Σx, Σx² and n.
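The three quantities needed for each group can be sketched in a few lines of Python. The group names and measurements below are hypothetical and are not data from the text; the function name summarize is likewise an assumption made for illustration.

```python
# Hypothetical measurements for three groups of unequal size (not from the text).
groups = {
    "Group 1": [4.0, 5.0, 6.0],
    "Group 2": [7.0, 8.0, 9.0, 10.0],
    "Group 3": [1.0, 2.0],
}


def summarize(values):
    """Return (sum of x, sum of x squared, n) for one group."""
    sum_x = sum(values)
    sum_x2 = sum(v * v for v in values)
    return sum_x, sum_x2, len(values)


# One summary tuple per group, matching the rows added below a data table.
summary = {name: summarize(values) for name, values in groups.items()}

for name, (sum_x, sum_x2, n) in summary.items():
    print(f"{name}: sum x = {sum_x}, sum x^2 = {sum_x2}, n = {n}")
```

Because each group is summarized separately, the groups may have different sizes, as noted above.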
Assumptions
Hypotheses
Statistical test
The ANOVA test statistic is the variance ratio, V.R., which is distributed as F with the
appropriate number of numerator degrees of freedom and denominator degrees of freedom at the
chosen α level.
If the calculated value of F exceeds the critical value, the null hypothesis is rejected. If it does not, the null hypothesis is not rejected.
Calculations
Basic statistical calculations are made to determine Σx, Σx² and n for each group. Also
required are N, the total number of measurements and k, the total number of groups. Then, an
ANOVA table is made as shown below.
The ANOVA table has columns for degrees of freedom (df), sums of squares (SS), mean
squares (MS) and the variance ratio (F). These values are found using a series of calculations.
TOTAL df = N - 1
GROUP df = k - 1
ERROR df = N - k
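The three degrees-of-freedom formulas can be checked with a small Python sketch. The function name is hypothetical; the values N = 48 and k = 6 are those of the goldfish example later in this file.

```python
def anova_degrees_of_freedom(total_n, k):
    """Return (total df, group df, error df) for a one-way ANOVA."""
    if k < 2 or total_n <= k:
        raise ValueError("need at least 2 groups and more measurements than groups")
    df_total = total_n - 1
    df_group = k - 1
    df_error = total_n - k
    # The group and error df always add up to the total df.
    assert df_group + df_error == df_total
    return df_total, df_group, df_error


print(anova_degrees_of_freedom(48, 6))  # (47, 5, 42)
```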
The error term reflects how much each individual measurement differs from the population
mean of its group.
Steps for ANOVA calculations
SS Total = Σx² - CF
All of the above equations are used in the ANOVA calculations. All except equation [A]
appear in the ANOVA calculation table.
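For reference, the standard computing formulas for a one-way ANOVA are sketched below in LaTeX notation. The letters [A] through [E] follow the storage locations named in the worked example in this file. The letters [F] and [G], the subscript j, and the reading of CF as the usual "correction factor" are assumptions added for illustration.

```latex
% \sum x and \sum x^2 run over all N measurements;
% \sum x_j is the sum of the n_j measurements in group j.
\begin{align*}
\text{CF}       &= \frac{\left(\sum x\right)^2}{N}                                   && \text{[A]}\\
\text{SS Total} &= \sum x^2 - \text{CF}                                               && \text{[B]}\\
\text{SS Group} &= \sum_{j=1}^{k} \frac{\left(\sum x_j\right)^2}{n_j} - \text{CF}     && \text{[C]}\\
\text{SS Error} &= \text{SS Total} - \text{SS Group}                                  && \text{[D]}\\
\text{MS Group} &= \frac{\text{SS Group}}{k-1}                                        && \text{[E]}\\
\text{MS Error} &= \frac{\text{SS Error}}{N-k}                                        && \text{[F]}\\
\text{V.R.}     &= \frac{\text{MS Group}}{\text{MS Error}}                            && \text{[G]}
\end{align*}
```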
a. Given
For this problem, data were obtained from goldfish breathing experiments conducted in a biology
laboratory. The opercular breathing rates in counts per minute were collected in groups of 8
measurements at different temperatures ranging from 12 °C to 27 °C. The data are given in the table
below.
N = 48 (number of measurements)
k = 6 (number of groups)
b. Assumptions
It is assumed that the data in each group are normally distributed, that the groups represent
independent random samples and that the groups share a constant (common) variance.
c. Hypotheses
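With k = 6 temperature groups, the hypotheses of a one-way ANOVA take the usual form, sketched here in LaTeX notation. The subscripted means are notation assumed for illustration, one per temperature group.

```latex
H_0:\ \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6
\qquad
H_A:\ \text{not all } \mu_j \text{ are equal}
```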
d. Statistical test
Test statistic
Distribution
The test statistic is distributed as F with 5 numerator degrees of freedom (k-1) and 42
denominator degrees of freedom (N-k).
Decision criteria
The critical value of F with 5 numerator degrees of freedom and 42 denominator degrees of
freedom is about 2.45 at the 95% confidence level (α = 0.05). We reject H0 if V.R. > 2.45.
e. Calculations
Recall that the calculator uses exact values. Each calculation formula has its own letter
corresponding to a cell in the ANOVA calculation table. It is suggested that the value resulting from
each calculation be stored in its corresponding storage location on the calculator. Do not round off
intermediate results, because rounding errors carry through the later calculations and can give an erroneous variance ratio.
The term (Σx)² is obtained by adding all of the values in the Σx row of the data table, then squaring the sum.
Store the result in location [A].
SS Total = Σx² - CF
To find SS Total, the values in the Σx² row of the data table are all added together, then the value of
CF is subtracted. The result is stored in location [B]. It is assumed that the value of CF is stored in
location [A].
For the sample calculations, it is assumed that SS Total is located in [B] and SS Group is located in
[C]. The result is stored in [D].
This value is also known as the Mean Square Factor (MS Factor). The result is stored in location [E].
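The sequence of calculations can be sketched in Python as follows. This is a minimal illustration using the standard one-way ANOVA computing formulas, not a reproduction of the calculator procedure. The function name and the three small groups are hypothetical, and the bracketed letters in the comments follow the storage locations named in this section.

```python
def one_way_anova(groups):
    """Compute one-way ANOVA table entries from a list of groups of measurements."""
    k = len(groups)                                   # number of groups
    total_n = sum(len(g) for g in groups)             # N, total number of measurements
    sum_x = sum(sum(g) for g in groups)               # grand sum of all measurements
    sum_x2 = sum(x * x for g in groups for x in g)    # grand sum of squared measurements

    cf = sum_x ** 2 / total_n                                    # [A]
    ss_total = sum_x2 - cf                                       # [B]
    ss_group = sum(sum(g) ** 2 / len(g) for g in groups) - cf    # [C]
    ss_error = ss_total - ss_group                               # [D]
    ms_group = ss_group / (k - 1)                                # [E]
    ms_error = ss_error / (total_n - k)
    return {
        "CF": cf,
        "SS Total": ss_total,
        "SS Group": ss_group,
        "SS Error": ss_error,
        "MS Group": ms_group,
        "MS Error": ms_error,
        "V.R.": ms_group / ms_error,
    }


# Hypothetical data chosen so the arithmetic is easy to follow by hand.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in table.items():
    print(f"{name}: {value}")
```

For these made-up numbers, CF = 121, SS Total = 32, SS Group = 26 and SS Error = 6, which gives a variance ratio of 13 on 2 and 6 degrees of freedom.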
f. Discussion
The critical value of F at the 95% confidence level with 5 numerator degrees of freedom and 42
denominator degrees of freedom is about 2.45 as read from the F tables. The calculated value of V.R.
is 12.01 with a probability (calculator value) of 2.98 × 10⁻⁷. Because 12.01 is greater than 2.45, H0 is rejected.
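The tail probabilities quoted in this discussion can be checked approximately with the Python standard library alone. The sketch below rests on the identity P(F > f) = I_x(d2/2, d1/2), with x = d2/(d2 + d1·f), where I is the regularized incomplete beta function, and it evaluates the integral with Simpson's rule. The function name and the step count are assumptions. The approach is adequate for the degrees of freedom in this example but is not a general-purpose routine.

```python
import math


def f_upper_tail(f_value, df_num, df_den, steps=2000):
    """Approximate P(F > f_value) for an F distribution with (df_num, df_den) df."""
    if f_value <= 0 or df_num < 1 or df_den < 2:
        raise ValueError("sketch requires f_value > 0, df_num >= 1 and df_den >= 2")
    if steps % 2:
        steps += 1                                # Simpson's rule needs an even count
    a = df_den / 2.0
    b = df_num / 2.0
    x = df_den / (df_den + df_num * f_value)      # upper limit of the beta integral
    h = x / steps

    def integrand(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    # Composite Simpson's rule on [0, x].
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3.0

    # Divide by the complete beta function B(a, b) to regularize.
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)


print("P(F > 2.45)  =", f_upper_tail(2.45, 5, 42))
print("P(F > 12.01) =", f_upper_tail(12.01, 5, 42))
```

The first probability should come out close to 0.05, consistent with 2.45 being the approximate critical value, and the second should be close to the calculator value quoted above.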
g. Conclusions
We conclude that not all the means of the groups are equal.
Modern calculation methods
With the advent of statistical calculators, such as the TI-83, and spreadsheet programs with
built-in statistical calculation capabilities, a researcher no longer needs to do any of these
calculations manually.
In earlier times, researchers worked with paper and pencil. It is amazing what they
accomplished that way. But it is no longer necessary to be overly concerned with performing
manual calculations, even with a calculator, when automated methods can calculate a
complete ANOVA in seconds.
It is important, however, to understand the basis of why a technique such as ANOVA is useful
and the mathematical and statistical basis underlying it. Once that basis is understood, it is relatively
easy to do the calculations and go back to the really important considerations, such as the effects of the
factors or treatments on the groups in the study under consideration.
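As one illustration of the automated approach described in this section, the sketch below uses the f_oneway function from SciPy, a third-party Python library that must be installed separately. SciPy is not mentioned in the text and is used here only as an example of such a tool. The three groups are hypothetical data.

```python
from scipy import stats

# Hypothetical measurements for three groups (not data from the text).
group_1 = [1.0, 2.0, 3.0]
group_2 = [2.0, 3.0, 4.0]
group_3 = [5.0, 6.0, 7.0]

# f_oneway performs a one-way ANOVA and returns the variance ratio and its p-value.
result = stats.f_oneway(group_1, group_2, group_3)
print("V.R. (F) =", result.statistic)
print("p-value  =", result.pvalue)
```

The function reports only the variance ratio and its probability, so the sums of squares and mean squares must still be understood well enough to interpret the result.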