
Testing hypothesis and analysis of variance

Introduction
One of the most important techniques for making statistical inferences about population
parameters, or about the form of the population distribution, is the testing of statistical
hypotheses.

Statistical Hypothesis:
Any statement or assumption about either the population parameter(s) or the form of the
population distribution is called a hypothesis and is denoted by $H$.

A sample investigation produces results, and decisions about the population are made from
these results. Because they rest on a sample, such decisions carry an element of uncertainty
and can be wrong. A hypothesis is an assumption about a population parameter that may or may
not be true. For example, on tossing a coin 300 times, one may get 190 heads and 110 tails.
We are then interested in testing whether the coin is unbiased. We therefore conduct a test
of significance to judge whether the departure from the expected 150 heads can be attributed
to sampling fluctuation alone.
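
The coin example can be checked numerically. The following is a minimal sketch, assuming the normal approximation to the binomial is adequate for 300 tosses; the function name `coin_z_test` is illustrative and not part of the original notes.

```python
from math import sqrt
from statistics import NormalDist


def coin_z_test(heads, tosses, p0=0.5):
    """Return (z, two-sided p-value) for H0: P(head) = p0.

    Uses the normal approximation to the binomial distribution.
    """
    expected = tosses * p0                      # expected number of heads under H0
    std_dev = sqrt(tosses * p0 * (1 - p0))      # standard deviation under H0
    z = (heads - expected) / std_dev
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = coin_z_test(heads=190, tosses=300)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.2e}")
```

A value of z this far from zero (about 4.6) would be very unusual for an unbiased coin, so the difference is unlikely to be due to sampling alone.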

Null Hypothesis
The null hypothesis states that there is no difference between the value obtained from the
sample and the corresponding value of the population under study; that is, the true
difference between the sample mean and the population mean is nil. Rejecting the null
hypothesis indicates that the observed difference is too large to be explained by sampling
fluctuation alone.
For example:
(i) The average daily sales of a firm are $4000.
(ii) The average income of a man in a particular locality is $2000.

All such statements have to be verified on the basis of sample tests. Generally, a null
hypothesis states that there is no difference between the sample mean and the population mean.
A null hypothesis is denoted by $H_0$.

Alternative Hypothesis
Rejection of 𝐻0 leads to the acceptance of alternative hypothesis, which is denoted by 𝐻1 .
For example,
$H_0: \mu = \$4000$ (Null hypothesis)
$H_1: \mu \neq \$4000$, i.e., $\mu > \$4000$ or $\mu < \$4000$ (Alternative hypothesis)

Type I and Type II errors


When two hypotheses are set up, the acceptance or rejection of the null hypothesis is
based on a sample study. This can lead to two kinds of wrong conclusion: (i) rejecting $H_0$
when $H_0$ is true, and (ii) accepting $H_0$ when $H_0$ is false. This can be expressed in the
following table:

              Decision from sample
              Accept H0                           Reject H0
H0 true       Correct decision                    Wrong decision (Type I error, α)
H0 false      Wrong decision (Type II error, β)   Correct decision

Level of significance
The maximum probability of committing a Type I error that we are willing to allow in a test
is known as the level of significance. In most statistical tests the level of significance is
fixed at 5%. This means that, when $H_0$ is true, the test wrongly rejects it in at most 5% of
samples; equivalently, we can be 95% confident of not rejecting a true hypothesis.

Power of a test
In hypothesis testing, the power of a test is the probability of rejecting the null
hypothesis $H_0$ when $H_0$ is false. It is denoted by $1 - \beta$, where $\beta$ is the
probability of a Type II error.
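
To make the relation between α, β and power concrete, here is a small sketch for a right-tailed z-test with known standard deviation. The numbers used (true mean 4100, σ = 300, n = 36) are hypothetical values chosen only for illustration, and `power_one_sided_z` is an assumed helper name.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the right-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)    # critical value for level alpha
    shift = (mu1 - mu0) * sqrt(n) / sigma      # standardized true difference
    beta = std_normal.cdf(z_alpha - shift)     # P(do not reject H0 | mu = mu1)
    return 1 - beta


# Hypothetical inputs: H0 mean 4000, true mean 4100, sigma 300, sample size 36
power = power_one_sided_z(mu0=4000, mu1=4100, sigma=300, n=36)
print(f"power = {power:.3f}")
```

When the true mean equals the hypothesized mean, the same function returns α, which is the probability of a Type I error.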

Acceptance and rejection regions


The range of variation of the test statistic is divided into two regions: the acceptance
region and the rejection (critical) region. If the computed sample statistic falls in the
critical region, we reject $H_0$ in favour of $H_1$, because such a value would be unlikely if
$H_0$ were true. Otherwise we do not reject $H_0$.

Steps in performing a test of hypothesis


• State the null hypothesis
• Select an appropriate test statistic
• Choose a significance level α of the test, usually α = 5%
• Calculate the test statistic (z, t, F, χ², etc.) and determine the p-value
• Draw a conclusion on the basis of p, i.e., decide whether the observed difference is due
  to chance or to the play of some external factors on the sample under study

[Figure: flowchart of a hypothesis test. State hypothesis → select test statistic → state
decision rule (α level) → calculate test statistic → make statistical decision: either do not
reject $H_0$ (conclude $H_0$ may be accepted) or reject $H_0$ (conclude $H_0$ is rejected).]

Different Tests for Testing Null Hypothesis


• Normal test
• t-test
• Chi-square test
• F-test

• Normal Test:
It is called a large-sample test (n > 30) and is usually two-tailed. The normal test is a
powerful and widely used test for hypotheses about population means, population proportions
and the population correlation coefficient.
The test statistic is:
$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$

Examples & Exercises of Normal Test:


• Normal Distribution Chapter
• Sampling Distribution Chapter

• t-test:
When the sample size is small (n < 30) and the population variance is unknown, the Z test is
not appropriate. In this case the t-test is recommended.

The test statistic is:
$$t = \frac{\bar{x} - \mu}{\sqrt{s^2 / n}}$$
which follows the t-distribution with (n − 1) degrees of freedom. If n is very large, the
t-test approaches the normal test. The t-test is therefore called a small-sample test, and the
normal test can be regarded as its limiting case for large n. Like the normal test, the t-test
is usually two-tailed, although a one-tailed version is used when the alternative hypothesis
is directional. The t-test is widely used for testing hypotheses about population means,
population regression coefficients and population correlation coefficients.

Example: (population mean is equal to a specified value of mean)


A large number of customers complain to the sales manager of the Coca-Cola company that each
2-liter bottle contains less than the labeled amount. A quality control inspector of the
company therefore wants to test whether the mean content of the bottles differs from the
labeled 2 liters. The inspector draws a random sample of 15 bottles from different markets and
finds that the average content is 1.85 liters with a standard deviation of 0.5 liter. Does the
sample evidence indicate that the customers' claim is true at the 5% level of significance?

Solution:
The null hypothesis to be tested is
$$H_0: \mu = 2$$
against the alternative hypothesis
$$H_1: \mu < 2$$

The test statistic is
$$t = \frac{\bar{x} - \mu}{\sqrt{s^2 / n}} = \frac{1.85 - 2}{\sqrt{0.25 / 15}} = -1.1619$$

Let the level of significance be α = 0.05.

At the 5% level of significance with 14 degrees of freedom, the critical value of the test
statistic is −1.761.

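Decision/comments:
Since the alternative is $\mu < 2$, the rejection region is the left tail, $t < -1.761$. The
calculated value −1.1619 is greater than the critical value −1.761, so it does not fall in the
rejection region and we do not reject the null hypothesis. At the 5% level, the sample does
not provide sufficient evidence to support the customers' claim.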
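
The arithmetic of this example can be reproduced with a short sketch. The critical value −1.761 is taken from the text rather than computed, and the function name `one_sample_t` is illustrative. For the left-tailed alternative μ < 2, the null hypothesis is rejected only when the statistic falls below the critical value.

```python
from math import sqrt


def one_sample_t(sample_mean, mu0, sample_sd, n):
    """t statistic for H0: mu = mu0, using the sample standard deviation."""
    return (sample_mean - mu0) / sqrt(sample_sd ** 2 / n)


t_stat = one_sample_t(sample_mean=1.85, mu0=2.0, sample_sd=0.5, n=15)
t_critical = -1.761                 # 5% left-tail value for 14 d.f., from the text
reject_h0 = t_stat < t_critical     # left-tailed test: reject only in the lower tail

print(f"t = {t_stat:.4f}, critical value = {t_critical}, reject H0: {reject_h0}")
```

Here the statistic lies above −1.761, so it is outside the rejection region.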

t-test for testing equality of two population means (population variances unknown and
unequal).

Example:
Suppose the manager of a textile industry suspects that the mean time lost due to sickness by
the night shift workers exceeds the mean time lost by the day shift workers. To check this,
the manager randomly selects 12 workers from each shift and records the number of days lost
due to sickness within the past year.

Night Shift   12  10  20  15  18   9  12  10  21  25  13   8
Day Shift      8  10  15   9  12  16  15  20   5  18  12   7

If the number of days per year lost due to sickness for the night shift and day shift workers
is normally distributed with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and
$\sigma_2^2$ respectively, test the significance of the difference between the population
means when the population variances are not equal.

Solution:
Here the null hypothesis to be tested is:
𝐻0 : 𝜇1 = 𝜇2
against the alternative hypothesis
𝐻1 : 𝜇1 > 𝜇2

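Let $\bar{x}_1$ be the sample mean for the night shift workers and $\bar{x}_2$ the sample mean
for the day shift workers, which are given by
$$\bar{x}_1 = \frac{1}{12}\sum_{i=1}^{12} x_{1i} = \frac{173}{12} = 14.42, \quad \bar{x}_2 = \frac{1}{12}\sum_{i=1}^{12} x_{2i} = \frac{147}{12} = 12.25$$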

The sample variances are given by
$$s_1^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2 = 29.36$$
and
$$s_2^2 = \frac{1}{n_2 - 1}\sum_{i=1}^{n_2}(x_{2i} - \bar{x}_2)^2 = 21.48$$

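Under the null hypothesis the test statistic is given by
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim t_m \text{ d.f.}, \quad \text{where } m = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}}$$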

Therefore,
$$t = \frac{14.42 - 12.25}{\sqrt{\frac{29.36}{12} + \frac{21.48}{12}}} = 1.0543$$

The degrees of freedom are
$$m = \frac{\left(\frac{29.36}{12} + \frac{21.48}{12}\right)^2}{\frac{(29.36 / 12)^2}{11} + \frac{(21.48 / 12)^2}{11}} = 21.48 \approx 21$$

Let the level of significance be 5%.

At the 5% level of significance with 21 degrees of freedom, the critical value of the test
statistic is 1.721.

Decision:
Since the calculated value of the test statistic is less than the critical value, the null
hypothesis is not rejected. This implies that the data do not support the manager's suspicion.
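
As a check on the hand calculation, the sketch below computes the unequal-variance t statistic and its approximate degrees of freedom directly from the two samples. It uses only the Python standard library; the function name `welch_t` is illustrative. Because it works with unrounded means and variances, the statistic comes out near 1.05, marginally different from the hand value obtained with rounded inputs.

```python
from math import sqrt
from statistics import mean, variance

night = [12, 10, 20, 15, 18, 9, 12, 10, 21, 25, 13, 8]
day = [8, 10, 15, 9, 12, 16, 15, 20, 5, 18, 12, 7]


def welch_t(x, y):
    """Return (t, m): unequal-variance t statistic and its approximate d.f."""
    vx = variance(x) / len(x)       # s1^2 / n1 (sample variance uses n - 1)
    vy = variance(y) / len(y)       # s2^2 / n2
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    m = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, m


t_stat, df = welch_t(night, day)
t_critical = 1.721                  # 5% right-tail value for 21 d.f., from the text
print(f"t = {t_stat:.4f}, d.f. = {df:.2f}, reject H0: {t_stat > t_critical}")
```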

• Chi-square (χ²) test

The chi-square (χ²) test is one of the most popular, simple and widely applicable
nonparametric tests in business, economics, banking, finance, management, medical sciences,
etc. It helps us analyze data that come in the form of counts, and it can be applied to
nominal or categorical data.

The most important applications of the χ² test are:

• testing the hypothesis that the population variance $\sigma^2$ is equal to a specified value $\sigma_0^2$
• testing the homogeneity of a set of several population variances
• testing the homogeneity of several population correlation coefficients
• testing the homogeneity of several population proportions
• testing the independence of several attributes or variates
• testing goodness of fit
• testing whether two or more populations are identical

χ² as a goodness-of-fit test:

Example:
A bank has an ATM installed inside the bank, and it is available to its customers only from 7
AM to 6 PM Monday through Friday. The manager of the bank wanted to investigate if the
number of transactions made on this ATM are the same for each of the 5 days (Monday
through Friday) of the week. She randomly selected one week and counted the number of
transactions made on this ATM on each of the 5 days during this week. The information she
obtained is given in the following table, where the number of users represents the number of
transactions on this ATM on these days. For convenience, we will refer to these transactions
as "people" or "users".

Day               Monday   Tuesday   Wednesday   Thursday   Friday
Number of users   253      197       204         279        267

At a 1% level of significance, can we reject the null hypothesis that the number of people
who use this ATM each of the 5 days of the week is same? Assume that this week is typical
of all weeks in regard to the use of this ATM.

Solution:
Step 1: State the null and alternative hypothesis
𝐻0 : The number of people using the ATM is the same for all 5 days of the week
𝐻1 : The number of people using the ATM is not the same for all 5 days of the week

If the number of people using the ATM is the same for all 5 days of the week, then 0.20 of
the users will use this ATM on any of the 5 days of the week.

Let $p_1, p_2, p_3, p_4$ and $p_5$ be the proportions of people who use this ATM on Monday,
Tuesday, Wednesday, Thursday and Friday respectively.
Then the null and alternative hypotheses can also be written as

$H_0: p_1 = p_2 = p_3 = p_4 = p_5 = 0.20$
$H_1$: At least two of the five proportions are not equal to 0.20

Step 2: Select a distribution to use

Since there are 5 categories (i.e., 5 days on which the ATM is used), this is a multinomial
experiment. Consequently, we use the Chi-square distribution to make this test.

Step 3: Determine the rejection region/critical value


The significance level is given as 1% = 0.01, and the goodness-of-fit test is always
right-tailed.
The degrees of freedom (d.f.) are
$$\text{d.f.} = k - 1 = 5 - 1 = 4$$
where k = 5 is the number of categories (days).

At 1% level and 4 (d.f), the critical value of Chi-square is 13.277.

Step 4: Calculate the value of the test statistic

The test statistic is
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Table: Calculating the value of test statistic


Category    Observed        p      Expected               O − E   (O − E)²   (O − E)²/E
(Day)       frequency O            frequency E
Monday      253             0.20   1200 × 0.20 = 240        13      169       0.704
Tuesday     197             0.20   1200 × 0.20 = 240       −43     1849       7.704
Wednesday   204             0.20   1200 × 0.20 = 240       −36     1296       5.400
Thursday    279             0.20   1200 × 0.20 = 240        39     1521       6.338
Friday      267             0.20   1200 × 0.20 = 240        27      729       3.038
            n = 1200                                                          Sum = 23.184

Now,
$$\chi^2 = \sum \frac{(O - E)^2}{E} = 23.184$$

Step 5: Make a decision

The calculated value χ² = 23.184 is greater than the critical value 13.277 (1% level, 4 d.f.).
So we reject the null hypothesis and conclude that the number of people who use this ATM is
not the same for the 5 days of the week. In other words, more people use this ATM on some
days of the week than on others.
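
The table calculation can be reproduced in a few lines. This is a minimal sketch using the observed counts from the example; the critical value 13.277 is taken from the text rather than computed, and the function name `chi_square_gof` is illustrative.

```python
observed = [253, 197, 204, 279, 267]        # Monday through Friday
proportions = [0.20] * 5                    # equal use on every day under H0


def chi_square_gof(observed, proportions):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over all categories."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi2 = chi_square_gof(observed, proportions)
chi2_critical = 13.277                      # 1% level, 4 d.f., from the text
print(f"chi-square = {chi2:.3f}, reject H0: {chi2 > chi2_critical}")
```

The unrounded sum is about 23.183; the table value 23.184 comes from adding terms that were each rounded to three decimals.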

Exercise:
In a 2011 Time/Money Magazine survey, Americans age 18 years and older were asked if
"we are less sure that our children will achieve the American Dream." Of the respondents,
65% said yes, 29% said no, and 6% said that they did not know (Time, October 10, 2011).
Assume that these percentages hold true for the 2011 population of Americans age 18 years
and older. Recently 1000 randomly selected Americans age 18 years and older were asked
the same question. The following table lists the number of Americans in this sample who
made the respective response.

Response    Yes   No    Do not know
Frequency   624   306   70

Test at a 2.5% level of significance whether the current distribution of opinions is different
from that for 2011.

𝛘𝟐 - test for comparison of several population proportions

Example:
Six samples of T-shirts are selected at random from six different lots of the products of six
garments factories. The sample size and the number of defective items for each factory are
shown as:

Attributes               Factory 1  Factory 2  Factory 3  Factory 4  Factory 5  Factory 6  Total
Defective T-shirts       10         20         15         20         20         15         100
Non-defective T-shirts   80         100        120        100        80         120        600
Total                    90         120        135        120        100        135        700

Test the homogeneity of proportion of defective T-shirts of six factories.

Solution:
Here six independent random samples are drawn from six different binomial populations, with
$x_i \sim B(n_i, P_i)$, $i = 1, 2, \ldots, 6$, where $P_i$ is the proportion of defective
T-shirts of the $i$th factory.

Here the null hypothesis to be tested is
$$H_0: P_1 = P_2 = \cdots = P_6$$
against the alternative hypothesis
$H_1$: At least two of them are not equal

The pooled estimate of the sample proportion is given by
$$p = \frac{100}{700} = 0.1429$$

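Under the null hypothesis the test statistic is given by
$$\chi^2 = \frac{1}{p(1 - p)}\left[\sum_{i=1}^{6}\frac{x_i^2}{n_i} - \frac{X^2}{N}\right] \sim \chi^2 \text{ with 5 d.f.}$$
where $X = \sum x_i$ is the total number of defective items and $N = \sum n_i$ is the total
sample size. Substituting the values,
$$\chi^2 = \frac{1}{0.1429(1 - 0.1429)}\left[\frac{10^2}{90} + \frac{20^2}{120} + \frac{15^2}{135} + \frac{20^2}{120} + \frac{20^2}{100} + \frac{15^2}{135} - \frac{100^2}{700}\right] = 6.7353$$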

Let the level of significance be 5%.

At the 5% level of significance with 5 degrees of freedom, the critical value of the χ² test
statistic is 11.0705.

Decision:
Since the calculated value of the test statistic is less than the critical value, we do not
reject the null hypothesis. Thus we can say that the proportion of defective T-shirts is
homogeneous across these factories.
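
The homogeneity statistic can also be computed directly from the table. The sketch below follows the formula used in the solution, with the pooled proportion kept unrounded; the function name is illustrative and the critical value 11.0705 is taken from the text. Without rounding p to 0.1429 the statistic is about 6.74, marginally different from the hand value, and the conclusion is unchanged.

```python
defective = [10, 20, 15, 20, 20, 15]            # x_i for factories 1 to 6
sample_size = [90, 120, 135, 120, 100, 135]     # n_i for factories 1 to 6


def chi_square_homogeneity(x, n):
    """Chi-square statistic for equality of several binomial proportions."""
    total_x, total_n = sum(x), sum(n)
    p = total_x / total_n                       # pooled proportion of defectives
    bracket = sum(xi ** 2 / ni for xi, ni in zip(x, n)) - total_x ** 2 / total_n
    return bracket / (p * (1 - p))


chi2_hom = chi_square_homogeneity(defective, sample_size)
chi2_hom_critical = 11.0705                     # 5% level, 5 d.f., from the text
print(f"chi-square = {chi2_hom:.4f}, reject H0: {chi2_hom > chi2_hom_critical}")
```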

Exercises:

F-test for testing equality of k (k>2) population means
Under the null hypothesis the test statistic is given by
$$F = \frac{BSS / (k - 1)}{WSS / (n - k)} \sim F(k - 1, n - k) \text{ d.f.}$$

where BSS = between sum of squares, WSS = within sum of squares, n = total number of
observations and k = number of groups.

Example:
Random samples of the daily sales of the beverage Coca-Cola are taken from four departmental
stores. The daily sales of these stores are recorded below:
Store 1 (in $) Store 2 (in $) Store 3 (in $) Store 4 (in $)
750 620 980 850
620 750 870 980
710 850 1050 970
900 1200 1250 1050
750 1000 1200 1100
560 650 810 990
890 650 650 760
980 560 710 860
1000 640 680 780
990 650 770 960
1110 850 950 1090
1060 750 780 990
1050 890 780 1060
1060 680 850 890
1150 950 940 1030
1250 960 810 990
Test the hypothesis that the average sales of the four departmental stores are equal.

Solution:
Here the null hypothesis to be tested:
H0 : μ1 = μ2 = μ3 = μ4
against the alternative hypothesis is H1 : at least two of them are not equal

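Here it is given that $n_1 = n_2 = n_3 = n_4 = 16$, $n = 64$ and $k = 4$.

Now, with $T_i$ denoting the total of the $i$th store,
$$\bar{x}_1 = \frac{T_1}{n_1} = \frac{14830}{16} = 926.875, \quad \bar{x}_2 = \frac{T_2}{n_2} = \frac{12650}{16} = 790.625,$$
$$\bar{x}_3 = \frac{T_3}{n_3} = \frac{14080}{16} = 880, \quad \bar{x}_4 = \frac{T_4}{n_4} = \frac{15350}{16} = 959.375$$

and the grand mean is
$$\bar{x} = \frac{\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}}{n} = \frac{56910}{64} = 889.22, \quad n = n_1 + n_2 + n_3 + n_4$$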

The total sum of squares is given by:
$$TSS = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - n\bar{x}^2 = 52530500 - 50605439.06 = 1925060.94$$
The between sum of squares is given by:
$$BSS = \sum_{i=1}^{k} n_i \bar{x}_i^2 - n\bar{x}^2$$
$$= 16 \times (926.875)^2 + 16 \times (790.625)^2 + 16 \times (880)^2 + 16 \times (959.375)^2 - 64 \times (889.21875)^2 = 258329.69$$

The within sum of squares is given by

WSS = TSS − BSS = 1925060.94 − 258329.69 = 1666731.25

Under the null hypothesis the test statistic is given by
$$F = \frac{BSS / (k - 1)}{WSS / (n - k)} = \frac{258329.69 / 3}{1666731.25 / 60} = 3.0998 \sim F(3, 60) \text{ d.f.}$$

Let the level of significance be 5%.

Critical value: At the 5% level of significance with 3 and 60 degrees of freedom, the critical
value of the F-test statistic is 2.76.

Comments:
Since the calculated value of the test statistic is greater than the critical value, we reject
the null hypothesis. This means that the average sales of the four departmental stores are
not all equal.
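
The sums of squares in this example are tedious by hand, so a short sketch that recomputes them from the raw sales data may be useful. It uses the same formulas as the solution (TSS, BSS, WSS = TSS − BSS); the function name `one_way_anova` is illustrative and the critical value 2.76 is taken from the text.

```python
stores = [
    [750, 620, 710, 900, 750, 560, 890, 980, 1000, 990, 1110, 1060, 1050, 1060, 1150, 1250],
    [620, 750, 850, 1200, 1000, 650, 650, 560, 640, 650, 850, 750, 890, 680, 950, 960],
    [980, 870, 1050, 1250, 1200, 810, 650, 710, 680, 770, 950, 780, 780, 850, 940, 810],
    [850, 980, 970, 1050, 1100, 990, 760, 860, 780, 960, 1090, 990, 1060, 890, 1030, 990],
]


def one_way_anova(groups):
    """Return (BSS, WSS, F) for a one-way classification."""
    n = sum(len(g) for g in groups)             # total number of observations
    k = len(groups)                             # number of groups
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / n           # n * (grand mean)^2
    tss = sum(x ** 2 for g in groups for x in g) - correction
    bss = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    wss = tss - bss
    f_stat = (bss / (k - 1)) / (wss / (n - k))
    return bss, wss, f_stat


bss, wss, f_stat = one_way_anova(stores)
f_critical = 2.76                               # 5% level, (3, 60) d.f., from the text
print(f"BSS = {bss:.2f}, WSS = {wss:.2f}, F = {f_stat:.4f}, reject H0: {f_stat > f_critical}")
```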

Exercises:
1. Three different treatments are given to 3 groups of patients with anemia. The increase in
Hb% level was noted after one month and is given below. Test whether the difference in
improvement among the 3 groups is significant.
Group A: 3 1 2 0 1 2 2
Group B: 3 2 2 3 1 3 2
Group C: 3 4 5 4 2 2 4

2. Fifteen fourth-grade students were randomly assigned to three groups to experiment with
three different methods of teaching arithmetic. At the end of the semester, the same test was
given to all students. The following table gives the scores of students in the three groups.
Method I Method II Method III
48 55 84
73 85 68
51 70 95
65 69 74
87 90 67
Calculate the value of the test statistic F and prepare the ANOVA table.

Analysis of Variance (ANOVA)

Experiment: An experiment is any process or study which results in the collection of data,
the outcome of which is unknown. In statistics, the term is usually restricted to situations in
which the researcher has control over some of the conditions under which the experiment
takes place.

Example
Before introducing a new drug treatment to reduce high blood pressure, the manufacturer
carries out an experiment to compare the effectiveness of the new drug with that of one
currently prescribed. Newly diagnosed subjects are recruited from a group of local general
practices. Half of them are chosen at random to receive the new drug, the remainder receiving
the present one. So, the researcher has control over the type of subject recruited and the way
in which they are allocated to treatment.

Experimental Unit
A unit is a person, animal, plant or thing which is actually studied by a researcher; the basic
objects upon which the study or experiment is carried out.

For example, a person, a monkey, a sample of soil, a pot of seedlings, a postcode area, a
doctor's practice.

The basic principles of experimental designs are:


(1) Randomization (2) Replication and (3) Local control. These principles make a valid test
of significance possible.

(1) Randomization.
The first principle of an experimental design is randomization, which is a random process of
assigning treatments to the experimental units. The random process implies that every
possible allotment of treatments has the same probability. An experimental unit is the
smallest division of the experimental material and a treatment means an experimental
condition whose effect is to be measured and compared.
Purpose of randomization
-The purpose of randomization is to remove bias and other sources of extraneous variation,
which are not controllable.
-Another advantage of randomization (accompanied by replication) is that it forms the basis
of any valid statistical test.
Hence the treatments must be assigned at random to the experimental units. Randomization is
usually done by drawing numbered cards from a well-shuffled pack of cards, or by drawing
numbered balls from a well-shaken container or by using tables of random numbers.
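
In practice a pseudo-random number generator is often used in place of cards or random number tables. The following is a minimal sketch of such an allocation, assuming equal replication of each treatment; the function name `randomize_treatments`, the unit labels and the seed are hypothetical choices for illustration.

```python
import random


def randomize_treatments(units, treatments, seed=None):
    """Randomly assign treatments to units with equal replication.

    Every arrangement of the treatment labels over the units is equally likely.
    """
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    replications = len(units) // len(treatments)
    allocation = list(treatments) * replications    # each treatment repeated equally
    random.Random(seed).shuffle(allocation)         # random permutation of the labels
    return dict(zip(units, allocation))


# Hypothetical example: 10 subjects, half to receive each drug
subjects = [f"subject_{i}" for i in range(1, 11)]
plan = randomize_treatments(subjects, ["new drug", "current drug"], seed=1)
for subject, drug in plan.items():
    print(subject, "->", drug)
```

Fixing the seed makes the allocation reproducible for auditing; omitting it gives a fresh random allocation on each run.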

(2) Replication
Replication is a repetition of the basic experiment. In other words, it is a complete run for all
the treatments to be tested in the experiment. In all experiments, some variation is introduced
because of the fact that the experimental units such as individuals or plots of land in
agricultural experiments cannot be physically identical. This type of variation can be
removed by using a number of experimental units. We therefore perform the experiment
more than once, i.e., we repeat the basic experiment. An individual repetition is called a
replicate. The number, the shape and the size of replicates depend upon the nature of the
experimental material.

Replication is used:
-To secure a more accurate estimate of the experimental error, a term that represents the
differences that would be observed if the same treatments were applied several times to the
same experimental units.

-To decrease the experimental error and thereby increase precision, which is a measure of the
variability of the experimental error.

-To obtain a more precise estimate of the mean effect of a treatment, since the variance of a
treatment mean, $\sigma^2 / n$, decreases as $n$ increases, where $n$ denotes the number of
replications.

(3) Local Control


It has been observed that all extraneous sources of variation are not removed by
randomization and replication. This necessitates a refinement in the experimental technique.
In other words, we need to choose a design in such a manner that all extraneous sources of
variation are brought under control.

For this purpose, we make use of local control, a term referring to the amount of balancing,
blocking and grouping of the experimental units.
-Balancing means that the treatments should be assigned to the experimental units in such a
way that the result is a balanced arrangement of the treatments.
-Blocking means that like experimental units should be collected together to form a relatively
homogeneous group. A block is also a replicate.
The main purpose of the principle of local control is to increase the efficiency of an
experimental design by decreasing the experimental error. Note that the term local control
should not be confused with the word control. In experimental design, a control is a group
that receives no treatment and serves as the baseline against which the effectiveness of the
other treatments is compared.
