Eugenia Camnahas Going Viral: Understanding Disease Maths Folio SACE ID:
570070T

Introduction

There has been an outbreak of an unknown disease in all Local Government Areas
(LGAs) in South Australia. The disease has affected a proportion of people
in these communities. Medical professionals have not yet reached a conclusion
about what is to be done about the outbreak. It is essential that this analysis is
completed as soon as possible so that it yields conclusions that enable health
authorities to take immediate and appropriate action to handle the epidemic.

There has been a request to analyse the data. Using the existing data, the
analysis focuses on the incidence of the disease, the trends over the five-week
period (20/02 - 20/03/2017), the distribution across all LGAs, and the impact of
people taking and not taking preventative measures. The analysis concludes
with recommendations and the challenges faced in the analysis.

- Hypothesis testing


Method

The raw data given was dense. It covered all the LGAs (City of Adelaide,
Adelaide Hills Council, City of Burnside, City of Campbelltown, City of Charles
Sturt, Town of Gawler, City of Holdfast Bay, City of Marion, City of Mitcham, City
of Norwood Payneham & St Peters, City of Onkaparinga, City of Playford, City of
Port Adelaide Enfield, City of Prospect, City of Salisbury, City of Tea Tree Gully,
City of Unley, Town of Walkerville, and City of West Torrens). It contains
information on the number of sick people with and without preventative measures,
as well as the number of people disabled and dead. The temperatures and ages of
the confirmed sick people were also provided, but these are excluded from the
analysis.

The data was then aggregated. The counts of sick people were grouped by LGA and
by whether preventative measures had or had not been taken, across the five-week
period. The data was summarised by calculating measures of central tendency
(mean, median and mode) and measures of spread (range, interquartile range, and
standard deviation). The normal distribution of sick people was calculated and
presented. A comparison between sick people who have and have not taken
preventative measures was calculated and presented in scatter plots, line graphs
and bar graphs. Some general statistical information (i.e. total healthy
population, percentage of total average, percentage of total sick people for five
weeks) was gathered (see the statistical summary table in the Discussion).
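As an illustration of the summary measures named above, the following is a minimal Python sketch rather than the Excel workbook actually used. The values are read from the data labels of Figure 18 (week 5, with preventative actions). Reading the garbled labels "3 7" as 37 and "4546" as 45 and 46 is an assumption, chosen because it gives 19 values that sum to 721, the week-five label in Figure 21.

```python
import statistics

# Week 5 counts of sick people WITH preventative actions in the 19 LGAs,
# read from the data labels of Figure 18. Reading "3 7" as 37 and "4546"
# as 45 and 46 is an assumption; it gives 19 values that sum to 721.
week5_with_prevention = [118, 86, 86, 51, 57, 52, 45, 46, 36, 31, 37,
                         25, 22, 1, 2, 4, 6, 7, 9]

# Measures of central tendency
mean_value = statistics.mean(week5_with_prevention)
median_value = statistics.median(week5_with_prevention)
mode_value = statistics.mode(week5_with_prevention)

# Measures of spread
data_range = max(week5_with_prevention) - min(week5_with_prevention)
q1, _, q3 = statistics.quantiles(week5_with_prevention, n=4, method="inclusive")
iqr = q3 - q1
sample_sd = statistics.stdev(week5_with_prevention)

print(f"mean={mean_value:.2f} median={median_value} mode={mode_value}")
print(f"range={data_range} IQR={iqr} sample sd={sample_sd:.2f}")
```

The mean (about 37.9) sits above the median (36), which is the pattern expected for a positively skewed data set.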

Excel was used to create all my tables and representations. When aggregating
the data, all the disease data was put into one spreadsheet, from week one
(20/02) to week five (20/03). To filter the data further, I added a separate sheet
for the data required for the investigation: the number of people who got sick
with and without preventative measures. Other sub-sheets were made from the
main sheet to show different representations. In the end, a summary was made.

For the normal distributions (see Figure 8 - Frequency Distribution of Confirmed
Sick People Without Preventative Actions in 19 LGAs in Adelaide Within 5 Weeks
(20/02 - 20/03/2017) or Figure 12 - Frequency Distribution of Sick People With
Preventative Actions in 19 LGAs in Adelaide Within 5 Weeks (20/02 - 20/03/2017)),
the mean and standard deviation of each data set were calculated with the
following functions:

=AVERAGE: This function takes all the selected data, from the first to the last
value, and returns the average of its arguments. Using this function makes
finding the mean more efficient and less error-prone, because mistakes are easily
made when calculating manually with a lot of data.

=STDEV: This function estimates the standard deviation based on a sample. This is
more efficient than doing it by hand because the formula for standard deviation
has many components that need to be calculated correctly.
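For readers without Excel, the same two calculations can be sketched in Python. This is an illustrative equivalent, not the workbook used in this folio. The weekly totals are read from the data labels of Figure 21; assigning each label to a series is an assumption, although it reproduces the means reported in the summary table (190.2 and 958.8).

```python
import statistics

# Weekly totals of confirmed sick people (weeks 1 to 5), read from the
# data labels of Figure 21. Assigning the labels to each series is an
# assumption; it is consistent with the means in the summary table.
with_prevention = [0, 1, 30, 199, 721]
without_prevention = [16, 164, 754, 1672, 2188]

def mean_and_sample_sd(values):
    """Return (mean, sample standard deviation), like =AVERAGE and =STDEV."""
    return statistics.mean(values), statistics.stdev(values)

mean_with, sd_with = mean_and_sample_sd(with_prevention)
mean_without, sd_without = mean_and_sample_sd(without_prevention)

print(f"With preventative measures:    mean={mean_with:.1f}, sd={sd_with:.1f}")
print(f"Without preventative measures: mean={mean_without:.1f}, sd={sd_without:.1f}")
```

`statistics.stdev` divides by n - 1, so it corresponds to the sample-based estimate that =STDEV returns, not the population standard deviation.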


After calculating the mean and standard deviation, a normal distribution graph
can be made. The frequency of incidents is placed on the x-axis of a scatter
plot and the corresponding normal distribution values on the y-axis.

=NORM.DIST(x, mean, standard_dev, cumulative): This function returns the
normal distribution for the specified mean and standard deviation. This function
has a very wide range of applications in statistics, including hypothesis testing.
The function calculates the y-value of the normal curve for each x-value given.
Here x is the value at which the distribution is evaluated, mean is the
arithmetic mean of the distribution, standard_dev is the standard deviation of
the distribution, and cumulative is a logical value that determines the form of
the function. If cumulative is TRUE, NORM.DIST returns the cumulative
distribution function; if FALSE, it returns the probability density function. In
this case, FALSE is used.
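The role of the cumulative argument can be sketched in Python with the standard library's NormalDist class. The helper norm_dist below is a hypothetical name introduced only for illustration, and the weekly totals are read from the data labels of Figure 21 (an assumption about which label belongs to which series).

```python
from statistics import NormalDist, mean, stdev

# Weekly totals without preventative measures, read from Figure 21 (assumed).
without_prevention = [16, 164, 754, 1672, 2188]
dist = NormalDist(mu=mean(without_prevention), sigma=stdev(without_prevention))

def norm_dist(x, distribution, cumulative):
    """Hypothetical helper mirroring =NORM.DIST(x, mean, standard_dev, cumulative)."""
    # cumulative=True  -> cumulative distribution function (area to the left of x)
    # cumulative=False -> probability density function (height of the curve at x)
    return distribution.cdf(x) if cumulative else distribution.pdf(x)

# Points for the scatter plot: x = frequency of incidents, y = density.
curve = [(x, norm_dist(x, dist, cumulative=False)) for x in sorted(without_prevention)]
for x, y in curve:
    print(f"x={x:>5}  density={y:.6f}")
```

The densities are very small numbers because the x-values span thousands, which is why a chart axis formatted to few decimal places can display them as 0.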

For a normal distribution curve (see Figure 1 - Normal Distribution of Body
Temperature Among Sick People), the empirical rule can be applied (see Figure 2
- Empirical Rule Applied to a Normal Curve):
- approximately 68% of the values lie within one standard deviation of
  the mean (μ ± 1σ),
- approximately 95% of the values lie within two standard deviations of the
  mean (μ ± 2σ), and
- approximately 99.7% of the values lie within three standard deviations of
  the mean (μ ± 3σ).
The mean, median and mode (central tendencies) lie exactly in the middle.

The Empirical Rule tells you about what percentage of values are within a certain
range of the mean. These results are approximations only, and they only apply if
the data follow a normal distribution. In the case of the data given, the curves for
all graphs are positively skewed, meaning that the Empirical Rule cannot be
applied. However, the Empirical Rule is an important result in statistics because
the concept of going out about two standard deviations to get about 95% of the
values is one that is mentioned often with confidence intervals and hypothesis
tests.
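The percentages in the Empirical Rule can be checked numerically. The short Python sketch below uses the standard normal distribution (mean 0, standard deviation 1); the function name share_within is introduced here for illustration only.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
# within 1 standard deviation(s): 68.3%
# within 2 standard deviation(s): 95.4%
# within 3 standard deviation(s): 99.7%
```

These proportions hold only for normally distributed data, so they should not be read off the positively skewed curves in this folio.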

Because the curves of all my graphs are positively skewed, the mean is greater
than the median and the median is closer to the first quartile than to the third
quartile. For skewed distributions, the standard deviation gives no information
on the asymmetry. It is better to use the first and third quartiles (in this case
the first quartile), since these give some sense of the asymmetry of the
distribution. To calculate the quartiles, the QUARTILE function can be used:

=QUARTILE(array, quart): This function returns the quartile of a data set. The
array is the array or cell range of numeric values for which you want the
quartile value; quart indicates which value to return. In this case, quart is
set to 3, which returns the third quartile (75th percentile).
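The quartile calculation and the skewness reasoning can be sketched in Python as follows. This is an illustration under assumptions: the weekly totals are read from the data labels of Figure 21, and method="inclusive" is used because, to my understanding, it interpolates in the same way as Excel's QUARTILE.

```python
import statistics

# Weekly totals without preventative measures, read from Figure 21 (assumed).
without_prevention = [16, 164, 754, 1672, 2188]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median) and Q3.
q1, median_value, q3 = statistics.quantiles(
    without_prevention, n=4, method="inclusive"
)
mean_value = statistics.mean(without_prevention)

print(f"Q1={q1}, median={median_value}, Q3={q3}, mean={mean_value}")
print("mean greater than median:", mean_value > median_value)
print("median closer to Q1 than to Q3:", (median_value - q1) < (q3 - median_value))
```

Both checks come out True for these totals, which matches the description of a positively skewed distribution given above.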

Based on the aggregated data, analyses of the trend of the disease and of its
distribution across all LGAs were presented in graphs. When the data was examined
in detail, there were several inadequacies. Parameters that were not clearly
defined included the definition of being healthy and what counts as a
preventative measure. Vaccination was excluded from preventative measures, and
the antigen and disease targeted by the vaccination were unclear. Signs and
symptoms were not explicit and only showed general signs such as increases in
body temperature. To assist the analysis, I have looked up the meanings of
preventative measures, health, target groups for vaccinations and diseases
preventable through vaccination. The results and analysis are presented in the
discussion.


Discussion

In this section, the analysis of the epidemic is presented in four ways: (a) a
statistical summary table, (b) a total comparison of the effect of the disease
between people who took preventative measures and those who did not, (c) the
distribution of cases across the five-week period for people who took
preventative measures and those who did not, and (d) a snapshot of the incidence
of the disease for weeks four and five.

The statistical summary table indicates that the total number of people who got
sick during the five-week period (20 February - 20 March 2017) was less than one
percent of the population, at 0.45 percent. Of the total people affected, 83
percent were identified in weeks four and five. The weekly average of sick people
who had taken preventative measures was 190, and the most affected category was
those without preventative measures, with a weekly average of 959.

Categories                                                  Total        %
Total healthy people                                        1,270,008
Total sick people                                           5,745        0.45%
Total sick people, weeks 4 & 5                              4,780        83%
Total mean sick people who took preventative measures       190          3%
Total mean sick people without preventative measures        959          17%
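The percentages in the summary table can be reproduced with a few lines of Python. This is a sketch of the arithmetic only; the choice of denominators (healthy population for the first row, total sick people for the others) is inferred from the reported values, and the unrounded means 190.2 and 958.8 are taken from the second copy of the table later in this folio.

```python
total_healthy = 1_270_008
total_sick = 5_745
sick_weeks_4_and_5 = 4_780
mean_with_prevention = 190.2      # unrounded value from the second table
mean_without_prevention = 958.8   # unrounded value from the second table

def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole

# Denominators below are inferred from the reported percentages.
sick_share = percent(total_sick, total_healthy)
late_share = percent(sick_weeks_4_and_5, total_sick)
with_share = percent(mean_with_prevention, total_sick)
without_share = percent(mean_without_prevention, total_sick)

print(f"Sick people as % of healthy population: {sick_share:.2f}%")
print(f"Weeks 4 & 5 as % of all sick people:    {late_share:.0f}%")
print(f"Mean with prevention as % of sick:      {with_share:.0f}%")
print(f"Mean without prevention as % of sick:   {without_share:.0f}%")
```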

The averages are detailed in Figure 1, the comparison of the incidence of the
disease during the five-week period. Moving from week one to the following
weeks, the number of cases in both categories increases and the gap between them
becomes wider. The figure shows that the group that took preventative measures
has fewer sick people than the group that did not. As seen in Figure 1, there is
a definite increase in sick people who don't take preventative actions.
Figure 1 (also listed as Figure 21) - Comparison of Disease Incidents in 19 LGAs
With and Without Preventative Actions
[Line graph. x-axis: Week; y-axis: Frequency of Cases. Data labels as follows.]

Week                    1      2      3      4       5
Preventative            0      1      30     199     721
Without Preventative    16     164    754    1672    2188

For example, of the total sick people in week five, seventy-five percent (75%)
were those without preventative measures, and in week one the figure was 100%.
Although a significant number of people are sick regardless, this suggests that
preventative measures are working.
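The weekly shares quoted above can be recomputed from the weekly totals. The Python sketch below assumes the totals read from the data labels of Figure 21; which label belongs to which series and week is an inference, not something stated explicitly in the raw data.

```python
weeks = [1, 2, 3, 4, 5]
# Weekly totals read from the data labels of Figure 21 (assumed assignment).
with_prevention = [0, 1, 30, 199, 721]
without_prevention = [16, 164, 754, 1672, 2188]

def share_without(with_count, without_count):
    """Percentage of a week's cases that had no preventative measures."""
    return 100 * without_count / (with_count + without_count)

shares = {
    week: share_without(w, wo)
    for week, w, wo in zip(weeks, with_prevention, without_prevention)
}

for week, share in shares.items():
    print(f"Week {week}: {share:.1f}% of cases without preventative measures")
```

These are shares of cases, not infection rates. Without knowing how many people in total did and did not take preventative measures, the shares alone cannot show how much protection the measures give.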


The significant difference in the impact of the disease between the two
categories is further explored by looking at the normal distribution of cases in
Figures 2 and 3. There is a clear distinction between the two.

In Figure 2, the distribution is closer to normal than the one in Figure 3,
which has a long tail on the right side. For Figure 2, this means that
approximately 68% of the cases lie within one standard deviation of the mean,
except on the right side, where two points lie up to two standard deviations
away from the mean.

Figure 2 (also listed as Figure 8) - Distribution of Confirmed Sick People
Without Preventative Actions in 19 LGAs in Adelaide Within 5 Weeks
(20/02 - 20/03/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 2500); y-axis: Normal Distribution Scale.]

If Figure 8 and Figure 12 are compared, Figure 8 looks vaguely normally
distributed, whereas Figure 12 is positively skewed like the rest of the
distribution curves.

When normal distribution curves were made for the cases with and without
preventative measures across the 5 weeks, the distributions were not normal but
positively skewed. This shows that the mean is greater than the median and the
median is closer to the first quartile. In this case, the standard deviation
gives no information about the asymmetry of the curve, so the first quartile is
used to give some sense of the asymmetry of the distribution.


Figure 3 (also listed as Figure 12) - Distribution of Sick People With
Preventative Actions in 19 LGAs in Adelaide Within 5 Weeks (20/02 - 20/03/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 800); y-axis: Normal Distribution Scale.]

An emphasis was put on weeks four and five because during those weeks,
vaccines were introduced.

Incubation period

Hypothesis testing


Conclusion
What is your overall key message, linked to the work throughout your report?
Give a brief summation of key information from each section, including final
recommendations.


References List

                                         Total       %
Total healthy                            1270008
Total sick                               5745        0.45%
Total Sick Weeks 4 & 5                   4780        0.38%
Total Mean Preventative Measures         190.20      3%
Total Mean No Preventative Measures      958.80      17%

parameters


Figure 1 - Normal Distribution of Body Temperature Among Sick People
[Chart. x-axis: Temperatures (40.4 to 42); y-axis: Normal Distribution.]

Figure 2 - Empirical Rule Applied to a Normal Curve


Figure 3 - Distribution of Number of Sick People Without Preventative Actions Week 1 (20/02/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 6); y-axis: Normal Distribution Scale.]

Figure 4 - Distribution of Number of Sick People Without Preventative Actions Week 2 (27/02/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 60); y-axis: Normal Distribution Scale.]


Figure 5 - Distribution of Number of Sick People Without Preventative Actions Week 3 (06/03/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 250); y-axis: Normal Distribution Scale.]

Figure 6 - Distribution of Number of Sick People Without Preventative Actions Week 4 (13/03/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 200); y-axis: Normal Distribution Scale.]


Figure 7 - Distribution of Number of Sick People Without Preventative Actions Week 5 (20/03/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 250); y-axis: Normal Distribution Scale.]

Figure 8 - Frequency Distribution of Confirmed Sick People Without Preventative Actions in 19 LGAs
in Adelaide Within 5 Weeks (20/02 - 20/03/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 2500); y-axis: Normal Distribution Scale.]


Figure 9 - Distribution of Number of Sick People With Preventative Actions Week 3 (06/03/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 14); y-axis: Normal Distribution Scale.]

Figure 10 - Distribution of Number of Sick People With Preventative Actions Week 4 (13/03/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 50); y-axis: Normal Distribution Scale.]


Figure 11 - Distribution of Number of Sick People With Preventative Actions Week 5 (20/03/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 140); y-axis: Normal Distribution Scale.]

Figure 12 - Frequency Distribution of Sick People With Preventative Actions in 19 LGAs in Adelaide
Within 5 Weeks (20/02 - 20/03/2017)
[Scatter plot. x-axis: Frequency of Incidents (0 to 800); y-axis: Normal Distribution Scale.]


Figure 13 - Average Amount of People Sick After 5 Weeks Without Preventative Actions
[Bar graph. x-axis: LGA; y-axis: Number of Confirmed Sick People.]

Figure 14 - Number of Confirmed Sick People Without Preventative Actions Week 1
[Bar graph. x-axis: LGA; y-axis: Number of Confirmed Sick People.]


Figure 15 - Number of Confirmed Sick People Without Preventative Actions Week 5
[Bar graph. x-axis: LGA; y-axis: Number of Confirmed Sick People.]

Figure 16 - Average Number of Sick People After 5 Weeks With Preventative Actions
[Bar graph. x-axis: LGA; y-axis: Number of Confirmed Sick People.]


Figure 17 - Number of Confirmed Sick People With Preventative Actions Week 1
[Bar graph. x-axis: LGA; y-axis: Number of Confirmed Sick People.]

Figure 18 - Number of Confirmed Sick People With Preventative Actions Week 5
[Bar graph. x-axis: LGA; y-axis: Number of Confirmed Sick People.]


Figure 19 - Comparison of Sick People per LGA With Preventative Actions Within 5 Weeks
[Bar graph. x-axis: LGA; y-axis: Frequency of Sick People; series: wk1, wk2, wk3, wk4, wk5.]

Figure 20 - Comparison of Sick People per LGA Without Preventative Actions Within 5 Weeks
[Bar graph. x-axis: LGA; y-axis: Frequency of Sick People; series: wk1, wk2, wk3, wk4, wk5.]


Figure 21 - Comparison of Disease Incidents in 19 LGAs With and Without Preventative Actions
[Line graph. x-axis: Week Number; y-axis: Frequency of Cases. Data labels as follows.]

Week                    1      2      3      4       5
Preventative            0      1      30     199     721
Without Preventative    16     164    754    1672    2188

Figure 22 - Total Number of People Who Have and Haven't Taken Preventative Measures During
Weeks 4 & 5
[Bar graph. x-axis: LGA; y-axis: Number of People; series: wk4&5 Preventative, wk4&5 #Prev.]
