HANOI UNIVERSITY

Tutor: Ms. Lê Thị Ngọc Tú

Tutorial: 4 – AC 09
Tutorial time: Tuesday – 12.30 – 14.00
Group members:
Nguyễn Huyền Trang 0904010116 Nguyễn Thị Huệ 0904010043
Trần Thu Hằng 0904010030 Nguyễn Thị Hồng Nga 0904010077
Nguyễn Thị Tươi 0904010112 Trần Thi Thanh Vân 0604040183
Lê Thị Minh Thành 0904010098 Nguyễn Thị Mai 0904010065
Case study - ANOVA


Scenario
In recent years, along with the increasing demand for human resources, a growing
number of universities have planned to open new faculties and to increase student
admissions for in-demand sectors. However, a mismatch between student enrollment and
the number of teachers/lecturers undeniably has a large effect on the quality of
education and training. Aware of this important issue, our group decided to find out
whether the number of students per teacher differed between 2005 and 2009 (specifically
in 2005, 2007 and 2009) by using a statistical technique (two-way ANOVA). The available
data are blocked into six main regions of Vietnam. The test results show that during
this five-year period, despite changes in the numbers of both students and teachers,
the number of students per teacher stayed nearly the same, which leads to our
conclusion that there is no significant difference among the three years.


Methodology

Data collection
As the problem objective is to test whether the number of students per teacher in
Viet Nam has changed in recent years, we conducted the test over three years: 2005,
2007, and 2009. Because the data are quantitative and three populations are compared,
we decided to use the analysis of variance. The data were collected from the Vietnam
General Statistics Office website (shown in Appendix E).
However, many other factors may affect the result of our test, so the variability
within the samples might be large. To reduce the variation within each year, we grouped
the observations into blocks before running the test. The blocks are six regions: Red
River delta, Northern midlands & mountainous, Northern Central and Central Coastal,
Highlands, South East, and Mekong River delta. Because it was too difficult to cover
every province in those areas, we used Excel to randomly select one province to
represent each region. The six selected provinces are Hai Phong, Son La, Da Nang,
Kon Tum, Dong Nai, and Kien Giang. Thus, the test has six blocks (the six regions) and
three treatments (the three years). The experimental design is a randomized block
design in which the treatments are the years 2005, 2007, and 2009.
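
The random draw of one province per region was done in Excel; a comparable draw could be sketched in Python as follows. The province lists are abbreviated and unaccented for illustration (a real run would list every province from Appendix E), and the function name and the seed are assumptions, not part of the original procedure.

```python
import random

# Abbreviated, illustrative province lists for the six regions (blocks).
# These are NOT the full lists from Appendix E.
REGIONS = {
    "Red River delta": ["Ha Noi", "Hai Phong", "Hai Duong", "Nam Dinh"],
    "Northern midlands & mountainous": ["Thai Nguyen", "Son La", "Hoa Binh"],
    "Northern Central and Central Coastal": ["Thanh Hoa", "Da Nang", "Khanh Hoa"],
    "Highlands": ["Kon Tum", "Gia Lai", "Dak Lak", "Lam Dong"],
    "South East": ["Dong Nai", "Binh Duong", "TP. Ho Chi Minh"],
    "Mekong River delta": ["Kien Giang", "Can Tho", "An Giang"],
}


def pick_one_province_per_region(regions, seed=None):
    """Draw one province uniformly at random from each region."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return {region: rng.choice(provinces) for region, provinces in regions.items()}


# The seed value is arbitrary; it only makes the example reproducible.
selection = pick_one_province_per_region(REGIONS, seed=2009)
for region, province in selection.items():
    print(f"{region}: {province}")
```

Fixing the seed is a design choice that lets a reader repeat the same draw; an unseeded call gives a fresh selection each time.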

Region                             2005          2007          2009
Red River delta                    23.04452467   28.10416667   28.43558606
Northern midlands & mountainous    21.81818182   31.32592593   10.34782609
North Central and Central Coast    45.16666667   33.19047619   27.26348748
Highlands                          24.68253968   12.05464481   38.86703383
South East                         19.89583333   25.53491436   37.99269006
Mekong River delta                 14.95890411   8.356495468   11.10789474
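
The summary statistics used later (count, sum, average and sample variance per year and per region) can be recomputed from this table. The sketch below uses only the Python standard library; the names `RATIOS`, `summarize` and the two summary dictionaries are illustrative choices.

```python
from statistics import mean, variance

YEARS = (2005, 2007, 2009)

# Students per teacher, copied from the table above (one row per region).
RATIOS = {
    "Red River delta": (23.04452467, 28.10416667, 28.43558606),
    "Northern midlands & mountainous": (21.81818182, 31.32592593, 10.34782609),
    "North Central and Central Coast": (45.16666667, 33.19047619, 27.26348748),
    "Highlands": (24.68253968, 12.05464481, 38.86703383),
    "South East": (19.89583333, 25.53491436, 37.99269006),
    "Mekong River delta": (14.95890411, 8.356495468, 11.10789474),
}


def summarize(values):
    """Return count, sum, mean and sample variance (n - 1 denominator)."""
    values = list(values)
    return {
        "count": len(values),
        "sum": sum(values),
        "mean": mean(values),
        "variance": variance(values),
    }


# One summary per treatment (year) and one per block (region).
year_summary = {
    year: summarize(row[i] for row in RATIOS.values())
    for i, year in enumerate(YEARS)
}
region_summary = {region: summarize(row) for region, row in RATIOS.items()}

for year, stats in year_summary.items():
    print(year, round(stats["mean"], 4), round(stats["variance"], 3))
```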


Approach
To determine whether the number of students per teacher differs over the three years,
it is first necessary to check the required conditions for the F-test of two-way
ANOVA: the random variable is normally distributed and the population variances are
equal. We check each condition in turn.

Check the required conditions

Normality
As the histograms in Appendix D show, the three samples do not look normally
distributed, and with only six observations per year normality cannot be confirmed.
In order to use two-way ANOVA, we assume that all three populations are normally
distributed.
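
The bin counts behind the histograms in Appendix D can be reproduced with a short sketch. It assumes Excel-style bins, where each value is counted in the first bin whose upper edge it does not exceed and anything above 50 falls into "More"; the function and variable names are illustrative.

```python
from bisect import bisect_left

# Upper bin edges as labelled in Appendix D; the extra last slot is "More".
BIN_UPPER_EDGES = [10, 20, 30, 40, 50]

SAMPLES = {
    2005: [23.04452467, 21.81818182, 45.16666667, 24.68253968, 19.89583333, 14.95890411],
    2007: [28.10416667, 31.32592593, 33.19047619, 12.05464481, 25.53491436, 8.356495468],
    2009: [28.43558606, 10.34782609, 27.26348748, 38.86703383, 37.99269006, 11.10789474],
}


def bin_counts(values, edges=BIN_UPPER_EDGES):
    """Count values per bin, where bin i holds edges[i-1] < value <= edges[i]."""
    counts = [0] * (len(edges) + 1)
    for value in values:
        # bisect_left returns the index of the first edge that is >= value.
        counts[bisect_left(edges, value)] += 1
    return counts


histograms = {year: bin_counts(values) for year, values in SAMPLES.items()}
for year, counts in histograms.items():
    print(year, counts)
```

With six observations per year, each histogram has very few bars, which is why the shape alone cannot settle the normality question.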

Variances equality
Since the sample variance is the best estimator of the population variance, we applied
the F-test to compare the variability of two populations (those with the largest and
the smallest sample variances, shown in Appendix B). With α = 5%, the p-value of the
test is higher than 0.05. Therefore, we cannot conclude that the variances differ, and
we treat them as equal.
As for its applicability, two-way ANOVA is a procedure that tests whether differences
exist among two or more population means. It measures how much variation is
attributable to differences among populations and how much to differences within
populations. A randomized block design reduces the within-treatment variation, making
it easier to detect differences among the treatment means. However, the technique only
tests whether a difference exists; it does not indicate which population means exceed
the others.
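
The partition of variation described above can be written out for a randomized block design with k treatments, b blocks and n = kb observations. This is the standard textbook decomposition; the symbols for individual observations and means are notation assumed here, while MST, MSB and MSE match the test statistics used later in the report.

```latex
% x_{ij}: observation in block i, treatment j
% \bar{x}[T]_j: mean of treatment j;  \bar{x}[B]_i: mean of block i;  \bar{\bar{x}}: grand mean
\begin{align*}
SS(\mathrm{Total}) &= \sum_{j=1}^{k}\sum_{i=1}^{b}\left(x_{ij}-\bar{\bar{x}}\right)^{2}
                    = SST + SSB + SSE \\
SST &= b\sum_{j=1}^{k}\left(\bar{x}[T]_{j}-\bar{\bar{x}}\right)^{2} \\
SSB &= k\sum_{i=1}^{b}\left(\bar{x}[B]_{i}-\bar{\bar{x}}\right)^{2} \\
SSE &= \sum_{j=1}^{k}\sum_{i=1}^{b}\left(x_{ij}-\bar{x}[T]_{j}-\bar{x}[B]_{i}+\bar{\bar{x}}\right)^{2} \\
MST &= \frac{SST}{k-1}, \qquad MSB = \frac{SSB}{b-1}, \qquad MSE = \frac{SSE}{n-k-b+1}
\end{align*}
```

Blocking moves the share SSB out of the error term, which is how the design reduces the within-treatment variation.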


After calculating the variances (shown in Appendix A), the largest variance is that of
2009 while the smallest is that of 2007, so we use the F-test to make an inference
about those two population variances, with 2007 as population 1 and 2009 as
population 2.

1. Testing hypothesis:

$$H_O: \frac{\sigma_1^2}{\sigma_2^2} = 1$$

$$H_A: \frac{\sigma_1^2}{\sigma_2^2} \neq 1$$

2. Test statistic:

$$F = \frac{s_1^2}{s_2^2}$$

is F-distributed with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$. Each sample has six
observations, so $\nu_1 = \nu_2 = 5$ (as in Appendix B).

3. Significance level: α = 0.05

4. Decision rule:

Reject Ho if $F > F_{\alpha/2,\nu_1,\nu_2} = F_{.025,5,5} = 7.15$ or $F < F_{1-\alpha/2,\nu_1,\nu_2} = 1/F_{.025,5,5} = 0.14$

5. Value of the test statistic:

F = 107.965 / 156.604 = 0.69 (Appendix B)

6. Conclusion:

Since 0.14 < F = 0.69 < 7.15, do not reject Ho.

Therefore, there is not enough evidence to conclude that the population variances differ.
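
The variance-ratio test can be checked numerically from the 2007 and 2009 columns of the data table. In this sketch the upper critical value is an assumption taken from a standard F table for 5 and 5 degrees of freedom (Appendix B lists df = 5 for each sample); the function name is illustrative.

```python
from statistics import variance

ratios_2007 = [28.10416667, 31.32592593, 33.19047619, 12.05464481, 25.53491436, 8.356495468]
ratios_2009 = [28.43558606, 10.34782609, 27.26348748, 38.86703383, 37.99269006, 11.10789474]


def variance_ratio_test(sample_1, sample_2, upper_critical):
    """Two-tailed F-test for equal variances; assumes equal sample sizes."""
    f_stat = variance(sample_1) / variance(sample_2)
    # With equal degrees of freedom the lower critical value is the
    # reciprocal of the upper one.
    lower_critical = 1.0 / upper_critical
    reject = f_stat > upper_critical or f_stat < lower_critical
    return f_stat, lower_critical, reject


# Assumed table value: F(0.025; 5, 5) is approximately 7.15.
F_UPPER = 7.15
f_stat, f_lower, reject_h0 = variance_ratio_test(ratios_2007, ratios_2009, F_UPPER)
print(f"F = {f_stat:.4f}, lower = {f_lower:.3f}, upper = {F_UPPER}, reject Ho: {reject_h0}")
```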


Hypothesis testing

## 2.1. Testing block means

1. Testing hypothesis:

Ho: Block means are all equal

Ha: At least two block means differ

2. Test statistic:

$$F = \frac{MSB}{MSE}$$

is F-distributed with ν1 = b – 1 and ν2 = n – k – b + 1

3. Significance level: α = 0.05

4. Decision rule:

Reject Ho if $F > F_{\alpha,\nu_1,\nu_2} = F_{.05,5,10} = 3.33$

5. Value of the test statistic:

F = 1.98995

6. Conclusion:

Since F = 1.98995 < 3.33, do not reject Ho.

Therefore, there is not enough evidence to conclude that the block means differ.
Blocking by region thus removed little variability in this sample, but the randomized
block analysis remains valid, so we proceed with the two-way ANOVA test of the
treatment means.


## 2.2. Testing treatment means

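1. Testing hypothesis:

$$H_O: \mu_1 = \mu_2 = \mu_3$$

$H_A$: At least two treatment means differ

2. Test statistic:

$$F = \frac{MST}{MSE}$$

is F-distributed with ν1 = k – 1 and ν2 = n – k – b + 1

3. Significance level: α = 0.05

4. Decision rule:

Reject Ho if $F > F_{\alpha,\nu_1,\nu_2} = F_{.05,2,10} = 4.10$

[Figure: F distribution f(F) with the rejection region to the right of 4.10; the test statistic 0.11241 lies to the left of it]

5. Value of the test statistic:

F = 0.11241

6. Conclusion:

Since F = 0.11241 < 4.10, we do not reject Ho.

Hence, there is not sufficient evidence to conclude that differences exist among
the three years.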
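
Both F statistics above can be recomputed from the data table with a small randomized block ANOVA routine. This is a minimal sketch in plain Python, not the Excel procedure the report used; the critical values are the ones quoted in the report, and the function and key names are illustrative.

```python
# Rows are blocks (regions); columns are treatments (2005, 2007, 2009).
DATA = [
    [23.04452467, 28.10416667, 28.43558606],  # Red River delta
    [21.81818182, 31.32592593, 10.34782609],  # Northern midlands & mountainous
    [45.16666667, 33.19047619, 27.26348748],  # North Central and Central Coast
    [24.68253968, 12.05464481, 38.86703383],  # Highlands
    [19.89583333, 25.53491436, 37.99269006],  # South East
    [14.95890411, 8.356495468, 11.10789474],  # Mekong River delta
]


def randomized_block_anova(data):
    """Two-way ANOVA without replication (randomized block design)."""
    b = len(data)       # number of blocks
    k = len(data[0])    # number of treatments
    n = b * k
    grand_mean = sum(sum(row) for row in data) / n
    treatment_means = [sum(row[j] for row in data) / b for j in range(k)]
    block_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    sst = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssb = k * sum((m - grand_mean) ** 2 for m in block_means)
    sse = ss_total - sst - ssb

    mse = sse / (n - k - b + 1)
    return {
        "SS_total": ss_total,
        "SST": sst,
        "SSB": ssb,
        "SSE": sse,
        "MSE": mse,
        "F_treatments": (sst / (k - 1)) / mse,
        "F_blocks": (ssb / (b - 1)) / mse,
    }


result = randomized_block_anova(DATA)

# Critical values as quoted in the report (alpha = 0.05).
F_CRIT_TREATMENTS = 4.10  # df = (2, 10)
F_CRIT_BLOCKS = 3.33      # df = (5, 10)

print("F (treatments):", round(result["F_treatments"], 5),
      "reject Ho:", result["F_treatments"] > F_CRIT_TREATMENTS)
print("F (blocks):", round(result["F_blocks"], 5),
      "reject Ho:", result["F_blocks"] > F_CRIT_BLOCKS)
```

Computing SSE as the remainder of the total keeps the routine short; it relies on the identity SS(Total) = SST + SSB + SSE for this design.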


Discussion of findings
The hypothesis tests show that there is not enough evidence to reject the null
hypothesis that the ratio of students per teacher in Vietnam did not differ over the
five-year period. The data analysis also found no significant difference among the
block means representing the six main regions of Vietnam. The ratio therefore appears
balanced across these six areas.

If the test had not been conducted, people might think that the ratio of students per
teacher has increased over the years because of student growth in Vietnam. In fact,
owing to the high demand for high-quality human resources to meet the challenges of
economic growth, many universities and colleges have increased their admissions year
by year. Aware of this, educational institutions have planned to recruit more teachers
to keep up with the increase in students and to maintain or improve teaching quality.
This partly explains why the number of students per teacher has not changed over the
years. However, compared with the world standard (15-20 students/teacher) and the goal
of the Ministry of Education and Training (20 students/teacher), the current ratio in
Vietnam is still much higher, at 28 students/teacher. Therefore, we need to increase
the number of teachers to improve the quality of our country's education. Teacher
quality (degrees, teaching skills, etc.), which directly affects education quality,
also deserves attention.
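
The gap between the current ratio and the 20 students/teacher goal can be put into numbers. The sketch below assumes the 2009 whole-country totals as read from Appendix E (65,115 teachers and 1,796,174 students); if the appendix is read differently, these two constants change. The function names are illustrative.

```python
import math

# Assumed 2009 whole-country totals, as read from Appendix E.
TEACHERS_2009 = 65_115
STUDENTS_2009 = 1_796_174
TARGET_RATIO = 20  # Ministry of Education and Training goal (students per teacher)


def students_per_teacher(students, teachers):
    """Plain ratio of students to teachers."""
    return students / teachers


def teachers_needed(students, target_ratio):
    """Smallest whole number of teachers that brings the ratio to the target."""
    return math.ceil(students / target_ratio)


current_ratio = students_per_teacher(STUDENTS_2009, TEACHERS_2009)
extra_teachers = teachers_needed(STUDENTS_2009, TARGET_RATIO) - TEACHERS_2009

print(f"Current ratio: {current_ratio:.1f} students per teacher")
print(f"Additional teachers needed for {TARGET_RATIO}:1: {extra_teachers}")
```

The calculation holds enrolment fixed at the 2009 level, so it is a lower bound if student numbers keep growing.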


Limitations
Although we made our best effort in conducting the test, some limitations remained.
The following limitations may reduce the accuracy of our test:

• Lack of information: it is difficult to find data over longer periods (5-year
periods instead of the 1-year periods we used). A 1-year period may be too short to
reflect changes in the number of professors accurately. As a result, our conclusions
may not be very exact.

• Rejection regions: we chose α = 0.05, which might lead to a Type II error. However,
we believe this does not affect our result much.

• Time consuming: because checking the assumptions is necessary for the test, we spent
a lot of time checking the normality of the populations and the equality of their
variances. The histograms did not clearly show normally distributed populations, so
normality had to be assumed (see the next point). We also tested the block means to
check whether the blocks differ.

• Normality: in order to apply the two-way ANOVA test above, we have assumed that the
three populations are normally distributed.

Recommendation and conclusion

Recommendation
In recent years, the number of students in universities has increased continuously.
As we expected, the ratio of students to professors does not change from one year to
the next, which means that it has not had a strong influence on the quality of
teaching and studying. However, we still have some recommendations for improving
that quality.
+ Strengthening the body of highly qualified professors: the growing number of
university students puts a lot of pressure on education, and a shortage of highly
qualified teachers is unavoidable. Therefore, adding highly qualified professors is
the first priority.
+ Motivating teachers: teachers should be supported in further study and given
suitable compensation. Building good relationships between teachers and their students
also matters. These measures would reduce the number of teachers who quit their jobs.
+ Changing from traditional classes to new models: at Hanoi University, for example,
students and teachers attend five lectures and five tutorials each week. Consequently,
professors and students have extra time for self-study.
+ Flexible time: both teachers and students can take part in social activities,
voluntary events, and part-time jobs to gain practical experience and soft skills such
as communication. In addition, universities should provide adequate facilities and
equipment for teaching.

Conclusion
In conclusion, this report addresses a statistical question: whether the number of
students per teacher differed over the five-year period from 2005 to 2009 in six main
regions of Vietnam: Red River delta, Northern midlands & mountainous, Northern Central
and Central Coastal, Highlands, South East, and Mekong River delta. The findings show
no significant difference in the number of students per teacher across the years in
these regions. This suggests that Vietnamese universities have been able to recruit
enough teachers to keep pace with society's needs and with rising enrolment targets
over the years. However, improvements to the education system are still needed, as
recommended in this report.
While we were conducting the research, some limitations arose that may have made the
results less accurate. In addition, because of the nature of the ANOVA test and time
constraints, we cannot show the whole picture of the issue, for example the trend in
enrolment targets or changes in teaching methods and class models. If our report
arouses the interest of other researchers in this topic, we hope that future studies
will be conducted over a longer time span, with more detailed data, and with deeper
knowledge of statistics.


References
• General Statistics Office, Number of teachers, students in universities and colleges by province, http://www.gso.gov.vn/default_en.aspx?tabid=474&idmid=3&ItemID=10207

• http://vietbao.vn/Tuyen-sinh/Chi-tieu-tuyen-sinh-vao-cac-truong-DH-CD-nam-2005/30050060/290/

• http://vietbao.vn/Giao-duc/Chi-tieu-tuyen-sinh-cac-truong-DH-nam-2007/70079320/202/


Appendices

A. Summary statistics

SUMMARY                                          Count   Sum       Average   Variance
Red river delta                                  3       79.5843   26.5281   9.12889
Northern midlands and mountains areas            3       63.4919   21.164    110.341
Northern Central area and Central coastal area   3       105.621   35.2069   83.1804

2005                                             6       149.567   24.9278   109.518
2007                                             6       138.567   23.0944   107.965
2009                                             6       154.015   25.6691   156.604

B. Check the variances equality
F-Test Two-Sample for Variances
                        Variable 1     Variable 2
Mean                    23.09443724    25.66908638
Variance                107.9649341    156.6043958
Observations            6              6
df                      5              5
F                       0.6894119
P(F<=f) one-tail        0.346572174
F Critical one-tail     0.1980069

Since 0.198 < F = 0.689 < 5.05 (= 1/0.198), we do not reject Ho.

There is not enough evidence to conclude that the two population variances differ.


Source of Variation   SS         df   MS
Error                 937.5724   10   93.7572
Total                 1891.514   17


D. Histograms
• Population 1: 2005

[Figure: histogram "Population 1 - 2005"; x-axis: bins 10, 20, 30, 40, 50, More; y-axis: Frequency]

Bin     Frequency
10      0
20      2
30      3
40      0
50      1
More    0

• Population 2: 2007

[Figure: histogram "Population 2 - 2007"; x-axis: bins 10, 20, 30, 40, 50, More; y-axis: Frequency]

Bin     Frequency
10      1
20      1
30      2
40      2
50      0
More    0

• Population 3: 2009

[Figure: histogram "Population 3 - 2009"; x-axis: bins 10, 20, 30, 40, 50, More; y-axis: Frequency]

Bin     Frequency
10      0
20      2
30      2
40      2
50      0
More    0

E. Data from GSO
(S/T = students per teacher; "-" = no value in the source)

Num Region / province   2007                          2008                          2009
                        Teacher  Student  S/T         Teacher  Student  S/T         Teacher  Student  S/T
    Whole country       61321    1928436  -           60651    1675700  -           65115    1796174  -

Red river delta
    (region total)      25384    791671   -           25310    695089   -           26409    725976   -
1   Hà Nội              16476    606207   36.793336   17065    529211   31.011485   18083    541671   29.954709
2   Hà Tây              1404     29435    20.9651     -        -        -           -        -        -
3   Vĩnh Phúc           536      17704    33.029851   568      18384    32.366197   646      19576    30.303406
4   Bắc Ninh            522      7624     14.605364   632      11676    18.474684   543      14530    26.758748
5   Quảng Ninh          896      8100     9.0401786   811      9272     11.432799   870      10277    11.812644
6   Hải Dương           761      9677     12.716163   848      13437    15.845519   876      13312    15.196347
7   Hải Phòng           1776     49913    28.104167   1862     51070    27.427497   1894     53857    28.435586
8   Hưng Yên            624      22875    36.658654   907      22195    24.470783   963      24067    24.991693
9   Thái Bình           621      8409     13.541063   612      7222     11.800654   613      8450     13.784666
10  Hà Nam              118      3922     33.237288   268      3668     13.686567   315      4070     12.920635
11  Nam Định            1517     27081    17.851681   1504     27590    18.344415   1372     34802    25.365889
12  Ninh Bình           133      724      5.443609    233      1364     5.8540773   234      1364     5.8290598

Northern midlands and mountain areas
    (region total)      4863     112385   -           5702     105105   -           5978     120033   -
1   Hà Giang            71       2134     30.056338   65       1001     15.4        71       1441     20.295775
2   Cao Bằng            107      1410     13.17757    110      1734     15.763636   97       1571     16.195876
3   Bắc Kạn             212      2080     9.8113208   45       967      21.488889   45       688      15.288889
4   Tuyên Quang         80       530      6.625       73       925      12.671233   73       905      12.39726
5   Lào Cai             97       1917     19.762887   81       1552     19.160494   81       714      8.8148148
6   Yên Bái             70       829      11.842857   109      935      8.5779817   111      1264     11.387387
7   Thái Nguyên         2437     70666    28.997128   2929     69822    23.83817    3019     75433    24.986088
8   Lạng Sơn            148      1252     8.4594595   166      883      5.3192771   166      3188     19.204819
9   Bắc Giang           228      3592     15.754386   223      2333     10.461883   244      3001     12.29918
10  Phú Thọ             725      10519    14.508966   1112     9959     8.9559353   1031     13820    13.404462
11  Lai Châu            124      2547     20.540323   187      2838     15.176471   214      2869     13.406542
12  Sơn La              405      12687    31.325926   417      10226    24.522782   23       238      10.347826
13  Hòa Bình            159      2222     13.974843   185      1930     10.432432   471      11706    24.853503

Northern Central area and Central coastal area
    (region total)      9601     316394   -           9640     268741   -           332      3195     -
1   Thanh Hóa           700      16646    23.78       808      15276    18.905941   10866    292413   26.910823
2   Nghệ An             1282     41358    32.26053    1134     40293    35.531746   830      16022    19.303614
3   Hà Tĩnh             162      1172     7.2345679   157      2555     16.273885   1325     39175    29.566038
4   Quảng Bình          138      4889     35.427536   148      4952     33.459459   167      2854     17.08982
5   Quảng Trị           78       1272     16.307692   79       1171     14.822785   148      5039     34.047297
6   Thừa Thiên-Huế      1952     97154    49.771516   2009     52141    25.953708   80       1246     15.575
7   Đà Nẵng             2394     79458    33.190476   2785     82229    29.525673   2076     56599    27.263487
8   Quảng Nam           650      3771     5.8015385   537      6984     13.005587   3135     90889    28.991707
9   Quảng Ngãi          403      5553     13.779156   280      5769     20.603571   634      10616    16.744479
10  Bình Định           609      27751    45.568144   628      19825    31.568471   375      6270     16.72
11  Phú Yên             329      4192     12.741641   241      4693     19.473029   696      22994    33.037356
12  Khánh Hòa           724      30423    42.020718   651      28795    44.231951   370      6287     16.991892
13  Ninh Thuận          54       847      15.685185   53       558      10.528302   852      30733    36.071596
14  Bình Thuận          126      1908     15.142857   130      3500     26.923077   53       446      8.4150943

Central highlands
    (region total)      1853     54774    -           1178     45317    -           125      3243     -
1   Kon Tum             183      2206     12.054645   90       1539     17.1        1271     49400    38.867034
2   Gia Lai             111      1163     10.477477   100      1415     14.15       190      2984     15.705263
3   Đắk Lắk             450      14021    31.157778   457      13278    29.054705   103      1570     15.242718
4   Đắk Nông            565      8976     15.886726   -        -        -           491      15761    32.099796
5   Lâm Đồng            544      28408    52.220588   531      29085    54.774011   487      29085    59.722793

South East
    (region total)      15381    549900   -           13720    447998   -           15318    485285   -
1   Bình Phước          97       766      7.8969072   109      952      8.733945    105      879      8.3714286
2   Tây Ninh            84       805      9.5833333   77       662      8.5974026   77       904      11.74026
3   Bình Dương          761      20824    27.363995   527      13409    25.444023   883      15529    17.586636
4   Đồng Nai            759      19381    25.534914   607      19558    32.220758   684      25987    37.99269
5   Bà Rịa - Vũng Tàu   251      5171     20.601594   335      7808     23.307463   304      7684     25.276316
6   TP. Hồ Chí Minh     13429    502953   37.452752   12065    405609   33.618649   13265    434302   32.740445

Mekong river delta
    (region total)      4239     103312   -           5101     113450   -           5273     123067   -
1   Long An             84       1295     15.416667   77       1309     17          161      3762     23.36646
2   Tiền Giang          215      3622     16.846512   315      4940     15.68254    325      5879     18.089231
3   Bến Tre             178      1506     8.4606742   170      1559     9.1705882   166      1803     10.861446
4   Trà Vinh            216      5072     23.481481   413      5179     12.539952   472      5535     11.726695
5   Vĩnh Long           572      12563    21.963287   853      12834    15.045721   469      14212    30.302772
6   Đồng Tháp           438      15400    35.159817   344      10785    31.351744   412      12321    29.90534
7   An Giang            384      8327     21.684896   482      8360     17.344398   514      10767    20.947471
8   Kiên Giang          331      2766     8.3564955   384      3226     8.4010417   380      4221     11.107895
9   Cần Thơ             1523     47008    30.865397   1662     57411    34.543321   1816     53766    29.606828
10  Hậu Giang           43       797      18.534884   48       1326     27.625      126      3625     28.769841
11  Sóc Trăng           105      2097     19.971429   156      2784     17.846154   171      2989     17.479532
12  Bạc Liêu            101      2083     20.623762   101      2557     25.316832   170      2546     14.976471
13  Cà Mau              49       776      15.836735   96       1180     12.291667   91       1641     18.032967