Unit IV − ANALYSIS OF VARIANCE
t-tests
[Table: Test | Test statistic | Standard error | Confidence limits. Entries not recoverable.]
F-test
[Table: Test | Test statistic. Entries not recoverable.]
Problems
Problem 1 A machinist is making engine parts with axle diameter of 0.7 inch. A random sample of 10
parts shows mean diameter 0.742 inch with a S.D. of 0.04 inch. On the basis of this sample, would you
say that the work is inferior?
Solution:
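Given size $n = 10\ (< 30)$, so this is a small sample.
Sample mean $\bar{x} = 0.742$
Population mean $\mu = 0.7$
Sample S.D. $s = 0.04$
We want to test the difference between the sample mean and the population mean, so we apply the t-test for a single mean.
$H_0 : \mu = 0.7$
$H_1 : \mu \neq 0.7$ (two-tailed test)
\[
t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{0.742 - 0.7}{0.04/\sqrt{10-1}} = \frac{0.042 \times \sqrt{10-1}}{0.04} = 3.15
\]
$|t| = 3.15$
Degrees of freedom: $\nu = n - 1 = 10 - 1 = 9$
LOS: For $\nu = 9$, $\alpha = 5\%$, the tabulated (two-tailed) value is $t_{0.05} = 2.26$.
Inference:
Since calculated $|t| > 2.26$, $H_0$ is rejected and $H_1$ is accepted at the 5% level. The mean diameter differs significantly from 0.7 inch, so the work can be regarded as inferior.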
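The arithmetic of the single-mean statistic can be checked with a minimal Python sketch. The function name `t_single_mean` is an illustrative choice, not part of these notes, and `s` is assumed to be the sample S.D. computed with divisor n, which is why the formula divides by the square root of n − 1.

```python
import math

def t_single_mean(xbar, mu, s, n):
    """t statistic for a single mean; s is the sample S.D. with divisor n."""
    return (xbar - mu) / (s / math.sqrt(n - 1))

# Values from Problem 1: n = 10, sample mean 0.742, S.D. 0.04, hypothesised mean 0.7
t = t_single_mean(xbar=0.742, mu=0.7, s=0.04, n=10)
print(round(t, 2))
```

The result is then compared by hand with the tabulated value for n − 1 degrees of freedom.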
Problem 2 Eight individuals are chosen at random from a population and their heights (in cm) are found to be 163, 163, 164, 165, 166, 169, 170, 171. In the light of these data discuss the suggestion that the mean height in the universe is 165 cm.
Solution:
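Take the assumed mean A = 163.
x     | d = x − A | d^2
163   | 0         | 0
163   | 0         | 0
164   | 1         | 1
165   | 2         | 4
166   | 3         | 9
169   | 6         | 36
170   | 7         | 49
171   | 8         | 64
Total | 27        | 163
Sample mean:
\[
\bar{x} = A + \frac{\sum d}{n} = 163 + \frac{27}{8} = 166.375
\]
Sample S.D.:
\[
s^2 = \frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2 = \frac{163}{8} - \left(\frac{27}{8}\right)^2 = 8.984
\]
\[
\therefore\ s = \sqrt{8.984} = 2.997
\]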
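$H_0 : \mu = 165$
$H_1 : \mu \neq 165$ (two-tailed test, since the suggestion is only that the mean equals 165 cm)
\[
t = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{166.375 - 165}{2.997/\sqrt{8-1}} = \frac{1.375 \times \sqrt{8-1}}{2.997} = 1.214
\]
$|t| = 1.214$
Degrees of freedom: $\nu = n - 1 = 8 - 1 = 7$
LOS: For $\nu = 7$, $\alpha = 5\%$, the tabulated (two-tailed) value is $t_{0.05} = 2.365$.
Inference:
Since calculated $|t| < 2.365$, $H_0$ is accepted at the 5% level. The data are consistent with a mean height of 165 cm in the universe.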
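The assumed-mean shortcut used in this solution can be reproduced with a short Python sketch, working from the raw heights. The variable names are illustrative; the formulas are the ones used above (divisor n for the variance, n − 1 under the square root in the statistic).

```python
import math

heights = [163, 163, 164, 165, 166, 169, 170, 171]
A = 163                      # assumed mean
mu0 = 165                    # hypothesised population mean

n = len(heights)
d = [x - A for x in heights]                     # deviations from the assumed mean
mean = A + sum(d) / n                            # sample mean
var = sum(v * v for v in d) / n - (sum(d) / n) ** 2   # sample variance, divisor n
s = math.sqrt(var)
t = (mean - mu0) / (s / math.sqrt(n - 1))

print(mean, round(var, 3), round(s, 3), round(t, 3))
```

Any assumed mean A gives the same mean and variance; A = 163 simply keeps the deviations small.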
Problem 3 Two horses A and B were tested according to the time (in seconds) to run a particular race
with the following results.
Horse A 28 30 32 33 33 29 34
Horse B 29 30 30 24 27 29
Test whether horse A is running faster than B at 5% level.
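Take the assumed means $A_1 = 33$ (Sample I, horse A) and $A_2 = 30$ (Sample II, horse B).
x1    | d1 = x1 − A1 | d1^2 | x2 | d2 = x2 − A2 | d2^2
28    | −5           | 25   | 29 | −1           | 1
30    | −3           | 9    | 30 | 0            | 0
32    | −1           | 1    | 30 | 0            | 0
33    | 0            | 0    | 24 | −6           | 36
33    | 0            | 0    | 27 | −3           | 9
29    | −4           | 16   | 29 | −1           | 1
34    | 1            | 1    |    |              |
Total | −12          | 52   |    | −11          | 47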
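Sample means:
\[
\bar{x}_1 = A_1 + \frac{\sum d_1}{n_1} = 33 + \frac{-12}{7} = 31.29
\]
\[
\bar{x}_2 = A_2 + \frac{\sum d_2}{n_2} = 30 + \frac{-11}{6} = 28.17
\]
Sample variances:
\[
s_1^2 = \frac{\sum d_1^2}{n_1} - \left(\frac{\sum d_1}{n_1}\right)^2 = \frac{52}{7} - \left(\frac{-12}{7}\right)^2 = 4.49
\]
\[
s_2^2 = \frac{\sum d_2^2}{n_2} - \left(\frac{\sum d_2}{n_2}\right)^2 = \frac{47}{6} - \left(\frac{-11}{6}\right)^2 = 4.47
\]
$H_0 : \mu_1 = \mu_2$
$H_1 : \mu_1 > \mu_2$ (right-tailed test)
\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
  = \frac{31.29 - 28.17}{\sqrt{\dfrac{7 \times 4.49 + 6 \times 4.47}{7 + 6 - 2}\left(\dfrac{1}{7} + \dfrac{1}{6}\right)}}
  = \frac{3.12}{1.280} = 2.44
\]
$t = 2.44$
Degrees of freedom: $\nu = n_1 + n_2 - 2 = 7 + 6 - 2 = 11$
LOS: For $\nu = 11$, $\alpha = 5\%$ (one-tailed), the tabulated value is $t_{0.05} = 1.796$.
Inference:
Since calculated $t > 1.796$, $H_0$ is rejected at the 5% level. The mean time taken by horse A is significantly greater than that of horse B.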
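Because this hand calculation passes through several rounded intermediate values, it may help to recompute the statistic from the raw times. The sketch below is one way to do so in Python; `mean_and_var` and `pooled_t` are illustrative helper names, and the variances use divisor n to match the formula above.

```python
import math

def mean_and_var(xs):
    """Sample mean and sample variance with divisor n."""
    n = len(xs)
    m = sum(xs) / n
    return m, sum((x - m) ** 2 for x in xs) / n

def pooled_t(xs, ys):
    """t statistic for the difference of two means (small samples, pooled variance)."""
    n1, n2 = len(xs), len(ys)
    m1, v1 = mean_and_var(xs)
    m2, v2 = mean_and_var(ys)
    pooled = (n1 * v1 + n2 * v2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))

horse_a = [28, 30, 32, 33, 33, 29, 34]
horse_b = [29, 30, 30, 24, 27, 29]
t = pooled_t(horse_a, horse_b)
print(round(t, 2))   # about 2.44 when no intermediate rounding is applied
```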
Problem 4 IQ tests were administered to 5 persons before and after they were trained. The results are
given below.
Candidates 1 2 3 4 5
IQ before training 110 120 123 132 125
IQ after training 120 118 125 136 121
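Solution: Let $\mu$ be the mean of the differences (after − before) in IQ for all the persons.
$H_0 : \mu = 0$
$H_1 : \mu > 0$ (right-tailed test)
We find $\bar{d}$ and $s$:
x (before) | y (after) | d = y − x | d^2
110        | 120       | 10        | 100
120        | 118       | −2        | 4
123        | 125       | 2         | 4
132        | 136       | 4         | 16
125        | 121       | −4        | 16
Total      |           | 10        | 140
\[
\therefore\ \bar{d} = \frac{\sum d}{n} = \frac{10}{5} = 2
\]
and
\[
s^2 = \frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2 = \frac{140}{5} - \left(\frac{10}{5}\right)^2 = 24, \qquad s = \sqrt{24} = 4.899
\]
Under $H_0$, the test statistic is
\[
t = \frac{\bar{d}}{s/\sqrt{n-1}} = \frac{2}{4.899/\sqrt{4}} = 0.816
\]
Degrees of freedom: $\nu = n - 1 = 5 - 1 = 4$
LOS: For $\nu = 4$, $\alpha = 5\%$ (one-tailed), the tabulated value is $t_{0.05} = 2.132$.
Inference:
Since calculated $t < 2.132$, $H_0$ is accepted at the 5% level. There is no significant increase in IQ after training.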
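The paired calculation above can be sketched in a few lines of Python. The variable names are illustrative; the differences are taken as after − before, and the S.D. uses divisor n as in the solution.

```python
import math

before = [110, 120, 123, 132, 125]
after = [120, 118, 125, 136, 121]

d = [y - x for x, y in zip(before, after)]   # paired differences, after - before
n = len(d)
d_bar = sum(d) / n
s = math.sqrt(sum(v * v for v in d) / n - d_bar ** 2)   # S.D. of differences, divisor n
t = d_bar / (s / math.sqrt(n - 1))

print(d_bar, round(s, 3), round(t, 3))
```

Pairing matters here: treating the two rows as independent samples would use a different statistic and different degrees of freedom.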
Problem 5 Two independent samples of sizes 7 and 6 have the following values
Sample A 28 30 32 33 31 29 34
Sample B 29 30 30 24 27 28
Examine whether the samples have been drawn from normal populations having same variance using
0.05 level of significance.
Solution: Let $\sigma_1^2$ and $\sigma_2^2$ denote the variances of the two populations. Given sizes $n_1 = 7\ (< 30)$ and $n_2 = 6\ (< 30)$, these are small samples. We use the F-test for equality of variances.
$H_0 : \sigma_1^2 = \sigma_2^2$
$H_1 : \sigma_1^2 \neq \sigma_2^2$ (two-tailed test)
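Take the assumed means $A_1 = 33$ (Sample I) and $A_2 = 30$ (Sample II).
x1    | d1 = x1 − A1 | d1^2 | x2 | d2 = x2 − A2 | d2^2
28    | −5           | 25   | 29 | −1           | 1
30    | −3           | 9    | 30 | 0            | 0
32    | −1           | 1    | 30 | 0            | 0
33    | 0            | 0    | 24 | −6           | 36
31    | −2           | 4    | 27 | −3           | 9
29    | −4           | 16   | 28 | −2           | 4
34    | 1            | 1    |    |              |
Total | −14          | 56   |    | −12          | 50
Sample variances:
\[
s_1^2 = \frac{\sum d_1^2}{n_1} - \left(\frac{\sum d_1}{n_1}\right)^2 = \frac{56}{7} - \left(\frac{-14}{7}\right)^2 = 4
\]
\[
s_2^2 = \frac{\sum d_2^2}{n_2} - \left(\frac{\sum d_2}{n_2}\right)^2 = \frac{50}{6} - \left(\frac{-12}{6}\right)^2 = 4.33
\]
\[
\therefore\ S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{7 \times 4}{6} = 4.67
\]
and
\[
S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{6 \times 4.33}{5} = 5.2
\]
Since $S_2^2 > S_1^2$, the larger estimate goes in the numerator:
\[
F = \frac{S_2^2}{S_1^2} = \frac{5.2}{4.67} = 1.114
\]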
Degrees of freedom:
Nr $= \nu_2 = n_2 - 1 = 5$
Dr $= \nu_1 = n_1 - 1 = 6$
The table value of F is $F_{0.05}(5, 6) = 4.39$ at the 5% level.
Inference:
Since calculated $F < 4.39$, $H_0$ is accepted at the 5% level. ∴ The samples are taken from populations having the same variance.
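A minimal Python sketch of the variance-ratio computation follows. The helper names `unbiased_var` and `f_ratio` are illustrative; the estimate $S^2 = n s^2/(n-1)$ used above is the same as dividing the sum of squared deviations from the sample mean by n − 1, which is what the sketch does directly.

```python
def unbiased_var(xs):
    """Estimate of the population variance: sum of squared deviations / (n - 1)."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

def f_ratio(xs, ys):
    """Return (F, numerator d.f., denominator d.f.) with the larger estimate on top."""
    vx, vy = unbiased_var(xs), unbiased_var(ys)
    if vx >= vy:
        return vx / vy, len(xs) - 1, len(ys) - 1
    return vy / vx, len(ys) - 1, len(xs) - 1

sample_a = [28, 30, 32, 33, 31, 29, 34]
sample_b = [29, 30, 30, 24, 27, 28]
F, nu_num, nu_den = f_ratio(sample_a, sample_b)
print(round(F, 3), nu_num, nu_den)
```

Returning the degrees of freedom together with F makes it harder to look up the table value with the numerator and denominator swapped.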
Problem 6 The nicotine contents in two random samples of tobacco are given below.
Sample I 21 24 25 26 27
Sample II 22 27 28 30 31 36
Can you say that the two samples came from the same population?
Solution: We want to test whether the two samples could have been drawn from the same normal population, so we use both tests, i.e., the t-test for difference of means and the F-test for equality of variances. Let $\sigma_1^2$ and $\sigma_2^2$ denote the variances of the two populations. First we find the two sample means and variances.
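Take the assumed means $A_1 = 25$ and $A_2 = 30$.
x1    | d1 = x1 − A1 | d1^2 | x2 | d2 = x2 − A2 | d2^2
21    | −4           | 16   | 22 | −8           | 64
24    | −1           | 1    | 27 | −3           | 9
25    | 0            | 0    | 28 | −2           | 4
26    | 1            | 1    | 30 | 0            | 0
27    | 2            | 4    | 31 | 1            | 1
      |              |      | 36 | 6            | 36
Total | −2           | 22   |    | −6           | 114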
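Sample means:
\[
\bar{x}_1 = A_1 + \frac{\sum d_1}{n_1} = 25 + \frac{-2}{5} = 24.6
\]
\[
\bar{x}_2 = A_2 + \frac{\sum d_2}{n_2} = 30 + \frac{-6}{6} = 29
\]
Sample variances:
\[
s_1^2 = \frac{\sum d_1^2}{n_1} - \left(\frac{\sum d_1}{n_1}\right)^2 = \frac{22}{5} - \left(\frac{-2}{5}\right)^2 = 4.24
\]
\[
s_2^2 = \frac{\sum d_2^2}{n_2} - \left(\frac{\sum d_2}{n_2}\right)^2 = \frac{114}{6} - \left(\frac{-6}{6}\right)^2 = 18
\]
Estimates of the population variances:
\[
S_1^2 = \frac{n_1}{n_1 - 1}\, s_1^2 = \frac{5}{4} \times 4.24 = 5.3
\]
\[
S_2^2 = \frac{n_2}{n_2 - 1}\, s_2^2 = \frac{6}{5} \times 18 = 21.6
\]
F-test for equality of variances:
$H_0 : \sigma_1^2 = \sigma_2^2$, $H_1 : \sigma_1^2 \neq \sigma_2^2$
\[
F = \frac{S_2^2}{S_1^2} = \frac{21.6}{5.3} = 4.075
\]
Degrees of freedom: Nr $= n_2 - 1 = 5$, Dr $= n_1 - 1 = 4$. The table value is $F_{0.05}(5, 4) = 6.26$.
Since calculated $F < 6.26$, $H_0$ is accepted at the 5% level.
t-test for difference of means:
$H_0 : \mu_1 = \mu_2$, $H_1 : \mu_1 \neq \mu_2$ (two-tailed test)
\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
  = \frac{24.6 - 29}{\sqrt{\dfrac{5 \times 4.24 + 6 \times 18}{5 + 6 - 2}\left(\dfrac{1}{5} + \dfrac{1}{6}\right)}}
  = \frac{-4.4}{2.294} = -1.918
\]
$|t| = 1.918$; degrees of freedom $\nu = n_1 + n_2 - 2 = 9$; the tabulated (two-tailed) value is $t_{0.05} = 2.262$.
Since calculated $|t| < 2.262$, $H_0$ is accepted at the 5% level.
Inference:
Both hypotheses are accepted. ∴ The two samples could have been drawn from the same normal population.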
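Both statistics needed for this problem can be obtained from the raw nicotine data with Python's standard `statistics` module. This is a sketch under the conventions used in these notes: `statistics.variance` divides by n − 1 (the estimate $S^2$) and `statistics.pvariance` divides by n (the sample variance $s^2$). The variable names are illustrative.

```python
import math
import statistics

sample_1 = [21, 24, 25, 26, 27]
sample_2 = [22, 27, 28, 30, 31, 36]
n1, n2 = len(sample_1), len(sample_2)

# F-test: ratio of the variance estimates (divisor n - 1), larger on top
S1_sq = statistics.variance(sample_1)
S2_sq = statistics.variance(sample_2)
F = max(S1_sq, S2_sq) / min(S1_sq, S2_sq)

# t-test: pooled statistic built from the sample variances (divisor n)
pooled = (n1 * statistics.pvariance(sample_1)
          + n2 * statistics.pvariance(sample_2)) / (n1 + n2 - 2)
t = (statistics.mean(sample_1) - statistics.mean(sample_2)) / math.sqrt(
    pooled * (1 / n1 + 1 / n2))

print(round(F, 3), round(t, 3))
```

The two values are then compared with the F table (numerator d.f. from the sample with the larger estimate) and the t table with n1 + n2 − 2 degrees of freedom.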
Exercise
1) A random sample of 10 boys has the following IQs: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100. Do these data support the assumption of a population mean IQ of 100? [Answer: t = 0.62; H0 is accepted.]
Sample | Size | Sample mean | Sum of squares of deviations from the mean
I      | 10   | 15          | 90
II     | 12   | 14          | 108
Examine whether the samples come from the same normal population. [Answer: t = 0.74; H0 is accepted.]
4) A group of 10 rats fed on a diet A and another group of 8 rats fed on a different diet B recorded the following increases in weight. Does it show the superiority of diet A over diet B?
Diet A 5 6 8 1 12 4 3 9 6 10
Diet B 2 3 6 8 1 10 2 8
5) The table below represents the values of protein content from cow's milk and buffalo's milk at a certain level. Examine if these differences are significant.
6) A company arranged an intensive training course for its team of salesmen. A random sample of 10 salesmen was selected and the values (in 1000) of their sales made in the weeks immediately before and after the course are shown in the following table.
Salesmen 1 2 3 4 5 6 7 8 9 10
Sales before 12 23 5 18 10 21 19 15 8 14
Sales after 18 22 15 21 13 22 17 19 12 16
Test whether there is evidence of an increase in mean sales. [ Answer:t = 2.76; H1 is accepted. ]
7) Two random samples from two normal populations are given below.
Sample I 24 27 26 21 25
Sample II 27 30 32 36 28 25
Test if the two populations have the same variance. [Answer: F = 2.918; H0 is accepted.]
I 8 9.6 1.2
II 11 16.5 2.5
Can we conclude that the two samples have been drawn from the same normal population? [Answer: F = 2.007; H0 is accepted, and t = −10.05; H1 is accepted. ∴ The two samples could not have been drawn from the same population.]
One-way classification
$H_0 : \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
$H_1$ : Not all the means are equal.
Step 1:
N = Total no. of observations = r × c, where r and c are the numbers of rows and columns in the given data.
Step 2:
\[
T = \sum x_1 + \sum x_2 + \cdots + \sum x_c
\]
Step 3:
\[
\text{C.F.} = \frac{T^2}{N}
\]
Step 4:
\[
TSS = \sum x_1^2 + \sum x_2^2 + \cdots + \sum x_c^2 - \text{C.F.}
\]
Step 5:
\[
SSC = \frac{\left(\sum x_1\right)^2}{n_1} + \frac{\left(\sum x_2\right)^2}{n_2} + \cdots + \frac{\left(\sum x_c\right)^2}{n_c} - \text{C.F.}
\]
where $n_1, n_2, \cdots, n_c$ are the numbers of entries in each column.
Step 6:
\[
SSE = TSS - SSC
\]
Step 7:
ANOVA table
Source of variation | Sum of squares | d.f.  | Mean square       | F-ratio                                            | F-table value
Between columns     | SSC            | c − 1 | MSC = SSC/(c − 1) | F = MSC/MSE if MSC > MSE; F = MSE/MSC if MSE > MSC | F_{0.05}(Nr, Dr)
Error               | SSE            | N − c | MSE = SSE/(N − c) |                                                    |
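The one-way steps can be sketched as a single Python function. This is a minimal illustration rather than a reference implementation: the name `one_way_anova` and the dictionary it returns are assumptions made here, and the usage data are the flaw counts for machines A to D from Problem 1 of the ANOVA problems in these notes.

```python
def one_way_anova(columns):
    """Shortcut computation for one-way classification; columns is a list of lists."""
    N = sum(len(col) for col in columns)                          # Step 1
    T = sum(sum(col) for col in columns)                          # Step 2
    cf = T ** 2 / N                                               # Step 3: correction factor
    tss = sum(x * x for col in columns for x in col) - cf         # Step 4
    ssc = sum(sum(col) ** 2 / len(col) for col in columns) - cf   # Step 5
    sse = tss - ssc                                               # Step 6
    c = len(columns)
    msc, mse = ssc / (c - 1), sse / (N - c)                       # Step 7: mean squares
    if msc >= mse:                                                # larger mean square on top
        f, dof = msc / mse, (c - 1, N - c)
    else:
        f, dof = mse / msc, (N - c, c - 1)
    return {"TSS": tss, "SSC": ssc, "SSE": sse, "MSC": msc, "MSE": mse, "F": f, "dof": dof}

machines = [[8, 9, 11, 12], [6, 8, 10, 4], [14, 12, 18, 9], [20, 22, 25, 23]]
result = one_way_anova(machines)
print({k: (round(v, 2) if isinstance(v, float) else v) for k, v in result.items()})
```

The function divides each column total by that column's own size, so columns of unequal length are handled as in Step 5.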
Two-way classification
Step 1:
N = Total no. of observations = r × c, where r and c are the numbers of rows and columns in the given data.
Step 2:
\[
T = \sum x_1 + \sum x_2 + \cdots + \sum x_c
\]
Step 3:
\[
\text{C.F.} = \frac{T^2}{N}
\]
Step 4:
\[
TSS = \sum x_1^2 + \sum x_2^2 + \cdots + \sum x_c^2 - \text{C.F.}
\]
Step 5:
\[
SSC = \frac{\left(\sum x_1\right)^2}{n_1} + \frac{\left(\sum x_2\right)^2}{n_2} + \cdots + \frac{\left(\sum x_c\right)^2}{n_c} - \text{C.F.}
\]
where $n_1, n_2, \cdots, n_c$ are the numbers of entries in each column.
Step 6:
\[
SSR = \frac{\left(\sum y_1\right)^2}{m_1} + \frac{\left(\sum y_2\right)^2}{m_2} + \cdots + \frac{\left(\sum y_r\right)^2}{m_r} - \text{C.F.}
\]
where $m_1, m_2, \cdots, m_r$ are the numbers of entries in each row.
Step 7:
\[
SSE = TSS - SSC - SSR
\]
Step 8:
ANOVA table
Source of variation | Sum of squares | d.f.           | Mean square                | F-ratio                                                | F-table value
Between columns     | SSC            | c − 1          | MSC = SSC/(c − 1)          | F_C = MSC/MSE if MSC > MSE; F_C = MSE/MSC if MSE > MSC | F_{0.05}(Nr, Dr)
Between rows        | SSR            | r − 1          | MSR = SSR/(r − 1)          | F_R = MSR/MSE if MSR > MSE; F_R = MSE/MSR if MSE > MSR | F_{0.05}(Nr, Dr)
Error               | SSE            | (r − 1)(c − 1) | MSE = SSE/((r − 1)(c − 1)) |                                                        |
Total               | TSS            | rc − 1         |                            |                                                        |
Here Nr = degrees of freedom of the numerator in the F-ratio
and Dr = degrees of freedom of the denominator in the F-ratio.
Step 9:
Inference:
For between columns:
If Cal. $F_C$ < tab. F, we accept $H_0$.
If Cal. $F_C$ > tab. F, we reject $H_0$.
For between rows:
If Cal. $F_R$ < tab. F, we accept $H_0$.
If Cal. $F_R$ > tab. F, we reject $H_0$.
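The two-way procedure can be sketched in Python along the same lines. The names `two_way_anova` and `f_ratio` are illustrative assumptions, the table is assumed to hold one observation per cell, and the usage data are the salesmen-by-season figures from Problem 2 of the problems that follow.

```python
def f_ratio(ms, mse, df, df_err):
    """Larger mean square on top; returns (F, (Nr, Dr))."""
    if ms >= mse:
        return ms / mse, (df, df_err)
    return mse / ms, (df_err, df)

def two_way_anova(table):
    """Shortcut computation for two-way classification; table is a list of rows."""
    r, c = len(table), len(table[0])
    N = r * c
    T = sum(sum(row) for row in table)
    cf = T ** 2 / N                                              # correction factor
    tss = sum(x * x for row in table for x in row) - cf
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - cf     # r entries per column
    ssr = sum(sum(row) ** 2 for row in table) / c - cf           # c entries per row
    sse = tss - ssc - ssr
    msc, msr = ssc / (c - 1), ssr / (r - 1)
    mse = sse / ((r - 1) * (c - 1))
    df_err = (r - 1) * (c - 1)
    return {"TSS": tss, "SSC": ssc, "SSR": ssr, "SSE": sse,
            "FC": f_ratio(msc, mse, c - 1, df_err),
            "FR": f_ratio(msr, mse, r - 1, df_err)}

sales = [[45, 40, 38, 37],    # Summer
         [43, 41, 45, 38],    # Winter
         [39, 39, 41, 41]]    # Monsoon
result = two_way_anova(sales)
print(round(result["SSC"], 3), round(result["SSR"], 3), round(result["SSE"], 3))
```

`zip(*table)` transposes the rows into columns, so the same data structure serves both the column and the row sums.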
Problems
Problem 1 Four machines A, B, C, D are used to produce a certain kind of cotton fabric. Samples of
size 4 with each unit as 100 square meters are selected from the outputs of the machines at random and
the number of flaws in each 100 square meters are counted, with the following result.
A B C D
8 6 14 20
9 8 12 22
11 10 18 25
12 4 9 23
Solution: Here only one factor (the machine) is involved, and we have a sample of 4 observations for each machine. So, we use one-way classification.
$H_0$ : The four machines do not differ in the mean number of flaws.
Step 1:
N = Total no. of observations = r × c = 4 × 4 = 16.
Step 2:
\[
T = 40 + 28 + 53 + 90 = 211
\]
Step 3:
\[
\text{C.F.} = \frac{T^2}{N} = \frac{211^2}{16} = 2782.56
\]
Step 4:
\[
TSS = \sum x_1^2 + \sum x_2^2 + \sum x_3^2 + \sum x_4^2 - \text{C.F.} = 410 + 216 + 745 + 2038 - 2782.56 = 626.44
\]
Step 5:
\[
SSC = \frac{\left(\sum x_1\right)^2}{n_1} + \frac{\left(\sum x_2\right)^2}{n_2} + \frac{\left(\sum x_3\right)^2}{n_3} + \frac{\left(\sum x_4\right)^2}{n_4} - \text{C.F.}
    = \frac{40^2}{4} + \frac{28^2}{4} + \frac{53^2}{4} + \frac{90^2}{4} - 2782.56 = 540.69
\]
Step 6:
\[
SSE = TSS - SSC = 626.44 - 540.69 = 85.75
\]
Step 7:
ANOVA table
Source of variation | Sum of squares | d.f.        | Mean square             | F-ratio                | F-table value
Between columns     | 540.69         | c − 1 = 3   | MSC = 540.69/3 = 180.23 | F = 180.23/7.15 = 25.2 | F_{0.05}(3, 12) = 3.49
Error               | 85.75          | 16 − 4 = 12 | MSE = 85.75/12 = 7.15   |                        |
Step 8:
Inference:
Since Cal. F (25.2) > tab. F (3.49), we reject $H_0$. ∴ The 4 machines differ significantly in their performance.
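As a cross-check on the shortcut formulas, the same sums of squares can be computed directly from their definitions as squared deviations from the column means and the grand mean. The Python sketch below does this for the machine data; the variable names are illustrative.

```python
machines = {
    "A": [8, 9, 11, 12],
    "B": [6, 8, 10, 4],
    "C": [14, 12, 18, 9],
    "D": [20, 22, 25, 23],
}

all_values = [x for col in machines.values() for x in col]
grand_mean = sum(all_values) / len(all_values)

# Between columns: column size times squared deviation of column mean from grand mean
ssc = sum(len(col) * (sum(col) / len(col) - grand_mean) ** 2 for col in machines.values())
# Error: squared deviations of each observation from its own column mean
sse = sum((x - sum(col) / len(col)) ** 2 for col in machines.values() for x in col)
# Total: squared deviations of each observation from the grand mean
tss = sum((x - grand_mean) ** 2 for x in all_values)

print(round(ssc, 4), round(sse, 4), round(tss, 4))
```

The identity TSS = SSC + SSE holds for the definitional sums just as it does for the shortcut ones, which is a quick way to catch arithmetic slips.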
Problem 2 The sales of 4 salesmen in 3 seasons are tabulated here. Carry out an analysis of variance.
Seasons \ Salesmen | A  | B  | C  | D
Summer             | 45 | 40 | 38 | 37
Winter             | 43 | 41 | 45 | 38
Monsoon            | 39 | 39 | 41 | 41
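Solution:
In this problem the data are given according to two factors, season and salesman. So, we do a two-way analysis of variance.
In order to simplify computations we shall code the data by subtracting 40 from each value.
Seasons \ Salesmen | A | B  | C  | D  | Row total
Summer             | 5 | 0  | −2 | −3 | 0
Winter             | 3 | 1  | 5  | −2 | 7
Monsoon            | −1 | −1 | 1  | 1  | 0
Column total       | 7 | 0  | 4  | −4 | 7
Step 1:
N = Total no. of observations = r × c = 3 × 4 = 12.
Step 2:
\[
T = 7 + 0 + 4 - 4 = 7
\]
Step 3:
\[
\text{C.F.} = \frac{T^2}{N} = \frac{7^2}{12} = 4.083
\]
Step 4:
\[
TSS = \sum x_1^2 + \sum x_2^2 + \sum x_3^2 + \sum x_4^2 - \text{C.F.} = 35 + 2 + 30 + 14 - 4.083 = 76.917
\]
Step 5:
\[
SSC = \frac{\left(\sum x_1\right)^2}{n_1} + \frac{\left(\sum x_2\right)^2}{n_2} + \frac{\left(\sum x_3\right)^2}{n_3} + \frac{\left(\sum x_4\right)^2}{n_4} - \text{C.F.}
    = \frac{7^2}{3} + 0 + \frac{4^2}{3} + \frac{(-4)^2}{3} - 4.083 = \frac{81}{3} - 4.083 = 22.917
\]
Step 6:
\[
SSR = \frac{\left(\sum y_1\right)^2}{m_1} + \frac{\left(\sum y_2\right)^2}{m_2} + \frac{\left(\sum y_3\right)^2}{m_3} - \text{C.F.}
    = 0 + \frac{7^2}{4} + 0 - 4.083 = 8.167
\]
Step 7:
\[
SSE = TSS - SSC - SSR = 76.917 - 22.917 - 8.167 = 45.833
\]
Step 8:
ANOVA table
Source of variation | Sum of squares | d.f.               | Mean square              | F-ratio                     | F-table value
Between columns     | SSC = 22.917   | c − 1 = 4 − 1 = 3  | MSC = 22.917/3 = 7.639   | F_C = 7.639/7.6388 = 1      | F_{0.05}(3, 6) = 4.76
Between rows        | SSR = 8.167    | r − 1 = 3 − 1 = 2  | MSR = 8.167/2 = 4.0835   | F_R = 7.6388/4.0835 = 1.87  | F_{0.05}(6, 2) = 19.33
Error               | SSE = 45.833   | (r − 1)(c − 1) = 6 | MSE = 45.833/6 = 7.6388  |                             |
Total               | TSS = 76.917   | rc − 1 = 11        |                          |                             |
Since MSE > MSR, the row ratio is taken as $F_R = MSE/MSR$ with (6, 2) degrees of freedom.
Step 9:
Inference:
For between columns:
Since Cal. $F_C$ (1) < tab. F (4.76), we accept $H_0$.
For between rows:
Since Cal. $F_R$ (1.87) < tab. F (19.33), we accept $H_0$.
∴ There is no significant difference between the salesmen or between the seasons so far as sales are concerned.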
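The coding step in this solution relies on the fact that subtracting a constant from every observation leaves all the sums of squares unchanged. A small Python sketch can confirm this for the sales data; `sums_of_squares` is an illustrative helper that returns (TSS, SSC, SSR, SSE) for a table with one observation per cell.

```python
def sums_of_squares(table):
    """Return (TSS, SSC, SSR, SSE) for a two-way table given as a list of rows."""
    r, c = len(table), len(table[0])
    cf = sum(map(sum, table)) ** 2 / (r * c)                 # correction factor
    tss = sum(x * x for row in table for x in row) - cf
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf
    return tss, ssc, ssr, tss - ssc - ssr

sales = [[45, 40, 38, 37],    # Summer
         [43, 41, 45, 38],    # Winter
         [39, 39, 41, 41]]    # Monsoon
coded = [[x - 40 for x in row] for row in sales]             # subtract 40 from each value

raw_ss = sums_of_squares(sales)
coded_ss = sums_of_squares(coded)
print([round(v, 3) for v in coded_ss])
```

Only a shift of origin is checked here; multiplying the data by a constant would rescale every sum of squares by the square of that constant, although the F-ratios would still be unaffected.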
Exercise
1) A completely randomized design experiment with 10 plots and 3 treatments gave the following
results:
Plot No. 1 2 3 4 5 6 7 8 9 10
Treatment A B C A C C A B A B
Yield 5 4 3 7 5 1 3 4 1 7
2) The following are the numbers of mistakes made in 5 successive days by 4 technicians working for a photographic laboratory. Test at a level of significance α = 0.01 whether the differences among the four sample means can be attributed to chance.
Technician
I  | II | III | IV
6  | 14 | 10  | 9
14 | 9  | 12  | 12
10 | 12 | 7   | 8
8  | 10 | 15  | 10
3) The following data represent the number of units of production per day turned out by different workers using 4 different types of machines.
Workers \ Machine type | A  | B  | C  | D
1                      | 44 | 38 | 47 | 36
2                      | 46 | 40 | 52 | 43
3                      | 34 | 36 | 44 | 32
4                      | 43 | 38 | 46 | 33
5                      | 38 | 42 | 49 | 39
(a) Test whether the five men differ with respect to mean productivity and
(b) Test whether the mean productivity is the same for the four different machine types.
4) The sales of 4 salesmen in 3 seasons are tabulated here. Carry out an analysis of variance.
Seasons \ Salesmen | A  | B  | C  | D
Summer             | 36 | 36 | 21 | 35
Winter             | 28 | 29 | 31 | 32
Monsoon            | 26 | 28 | 29 | 29