Basic Biostatistics 2022

BASIC BIOSTATISTICS
SOURCES :
http://statistics.laerd.com>spss-tutorials
WHAT WILL YOU LEARN?
• Types of data (by level of measurement)
• Measures of central tendency and dispersion
• Entering data in SPSS
• Normality test
• Data analysis
– Chi-square
– Independent t-test
– One-way ANOVA
– Pearson's correlation
Research objective, research question and analysis
• Research objective: To measure the level of knowledge of gestational diabetes mellitus among antenatal mothers at HTAA.
– Research question: What is the level of knowledge of gestational diabetes among antenatal mothers at HTAA?
– Analysis: Frequency and percentage (%); mean and standard deviation
• Research objective: To determine the association between sociodemographic data and the level of knowledge of gestational diabetes mellitus among antenatal mothers at HTAA.
– Research question: Is there an association between sociodemographic data and the level of knowledge of gestational diabetes mellitus among antenatal mothers at HTAA?
– Analysis: Independent-samples t-test and one-way ANOVA
DUMMY TABLE
MEASUREMENT SCALES
• The choice of scale for measuring a variable is determined by the variable itself.
• When a variable is measured in a study, it yields data.
• A basic understanding of these data helps the researcher choose suitable analysis methods and statistical tests.
NOMINAL DATA
• Data classified into categories, names or labels.
• The categories cannot be placed in order.
• Numeric values can be assigned to the categories, but mathematical operations cannot be performed on those values.
• Example:
– Male - 1
– Female - 2
ORDINAL DATA
• The same as nominal data, except that the data can be placed in order.
• However, the difference between two values still cannot be measured mathematically.
• Examples:
– Education level
No schooling - 1
Primary school - 2
Secondary school - 3
Diploma - 4
– Job grade
Grade U29 - 1
Grade U32 - 2
Grade U36 - 3
NUMERICAL DATA
• Discrete data
- Values that can only be whole numbers (with gaps between values)
- E.g. no. of students, no. of teeth extracted, no. of accidents, parity
• Continuous data
- Can assume any numerical value over a certain interval (including decimal values)
- E.g. height, weight, exam marks
EXAMPLE
Question                                        Variable          Measurement scale
1. What is the blood pressure level             Blood pressure    Ordinal
   of this patient?
   -Low
   -Normal
   -High
2. What is the height of this patient?          Height            Continuous
   ________ cm
How to check the type of data?
• Look at the questionnaires/checklist
• Determine the type of data :
– Age 
– Gender 
– Exercise 
– Smoking status 
– Weight 
– Height 
– Systolic/Diastolic
– HbA1c
– Homocysteine
Descriptive statistics
Example:
Sociodemographic data: gender, smoking status
Identify: categorical or numerical?
→ Categorical: report frequency (%)
Descriptive statistics
Example:
To identify knowledge of GDM among antenatal mothers at HTAA.
Identify: categorical or numerical?
→ Numerical: run a normality test
– Normal: report mean (SD)
– Not normal: report median (IQR)

Analytical / inferential statistics
Example:
To determine the association between sociodemographic data and knowledge of GDM among antenatal mothers at HTAA.
Null hypothesis:
There is no association between sociodemographic data and knowledge of GDM among antenatal mothers at HTAA.
How do we know whether there is an association or not?
Run the analysis → check the p-value
• Cut-off point: p = 0.05
• The cut-off point is used to decide whether or not to reject the null hypothesis
• If p-value < 0.05: significant → there is a difference / there is an association
• If p-value ≥ 0.05: not significant → no difference / no association
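The decision rule above can be sketched in Python. This is a minimal illustration only: the function name `interpret_p_value` and its `alpha` argument are assumptions made for this sketch, 0.168 is the t-test p-value reported later in these slides, and 0.03 is a hypothetical value.

```python
def interpret_p_value(p_value, alpha=0.05):
    """Apply the cut-off rule: reject the null hypothesis only when p < alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        return "significant: reject the null hypothesis"
    return "not significant: do not reject the null hypothesis"


print(interpret_p_value(0.168))  # p-value from the t-test example in these slides
print(interpret_p_value(0.03))   # hypothetical p-value below the cut-off
```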
Inferential Statistics
POPULATION → SAMPLE → STATISTICAL CONCLUSION → infer back to the POPULATION
MEASURE OF CENTRAL TENDENCY
Measures of Central Tendency
• Mean : the average score
• Median : the value that lies in the middle after

ranking all the scores
• Mode: the most frequently occurring score

The mean
• The mean is the most commonly used
average.
• To calculate the mean of a set of values we
add together the values and divide by the
total number of values.
Mean = Sum of values / Number of values
Measures of central tendency
• Mean: the average score. It is sensitive to extreme scores, as the three sets below show: only the last score changes.
• Ex: 1 2 3 4 5 6 7 8 9 10
• Ex: 1 2 3 4 5 6 7 8 9 20
• Ex: 1 2 3 4 5 6 7 8 9 100
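A short Python sketch of the three sets above, using the standard `statistics` module, shows how a single extreme score pulls the mean while the median stays where it was. The variable names are illustrative.

```python
from statistics import mean, median

examples = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 20],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 100],
]

# Only the last score differs between the sets, yet the mean moves with it.
means = [mean(scores) for scores in examples]
medians = [median(scores) for scores in examples]

for scores, m, md in zip(examples, means, medians):
    print(f"largest score = {scores[-1]:>3}  mean = {m}  median = {md}")
```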
Median
• Is not sensitive to extreme scores
• Use it when you are unable to use the mean
because of extreme scores
• Def: It is the middle value of a set of numbers
arranged in order.
• Find the median of:
10 , 7 , 9, 12, 7, 8, 6
Write the values in order:

6, 7, 7, 8, 9, 10, 12
The median is the middle value = 8

• When there is an even number of values,
there will be two values in the middle.
• In this case, we have to find the mean of the
two middle values.
• Find the median : 56, 42, 47, 51, 65, 43
• The values in order: 42, 43, 47, 51, 56, 65
• There are two middle values, 47 and 51
• Median = (47 + 51) / 2 = 98 / 2 = 49
Mode
• Does not involve any calculation or ordering of
data.
• The mode in a set of data is the data value
that appears the most often.
• Ex: 2, 1, 2, 0, 0, 2, 3, 1, 2, 1
• The mode is : 2
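The median and mode examples above can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import median, mode

odd_values = [10, 7, 9, 12, 7, 8, 6]
even_values = [56, 42, 47, 51, 65, 43]
mode_values = [2, 1, 2, 0, 0, 2, 3, 1, 2, 1]

median_odd = median(odd_values)    # middle value after sorting
median_even = median(even_values)  # mean of the two middle values
most_common = mode(mode_values)    # value that appears most often

print(median_odd, median_even, most_common)
```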
Exercise
• Find the mean, median and mode from the
following set of data values:
13, 2, 9, 16, 22, 4
Mean:
Median:
Mode:
Exercise
• This is the frequency distribution of the ages of
students in a class.
Age Frequency
11 7
12 19
13 10
14 4
Find out their mean age:

A. 12 B. 12.3 C. 12.5 D. 19
Exercise
• The table shows the number of mobile phones owned
by 100 families.
No. of mobile phone Frequency
2 2
3 6
4 38
5 16
6 20
7 18
Find the mode:

A. 4 B. 4.5 C. 5 D. 38
MEASURES OF DISPERSION
Measures of Dispersion / Variability
• Range
• Interquartile Range
• Variance
• Standard Deviation
Range
• The range of a set of data is a measure of how
the data is spread across the distribution.
• Range = highest value – lowest value
• Ex: 2, 5, 9, 17, 8, 27, 1, 34

• Range : 34 – 1 = 33
Quartile
• Split the ordered data into 4 quarters
• The quartiles Q1, median (Q2) and Q3 divide the ordered data into four parts, each holding 25% of the values
Minimum | Q1 | Median | Q3 | Maximum
Inter Quartile Range
• IQR is the difference between Q3 and Q1 (IQR = Q3 - Q1)
• The IQR spans Q1 (25%) to Q3 (75%); the range spans the minimum to the maximum
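As a rough illustration, the range and interquartile range can be computed in Python for the data set used in the range example (2, 5, 9, 17, 8, 27, 1, 34). The sketch relies on `statistics.quantiles`, whose default "exclusive" method is only one of several quartile conventions, so other software may report slightly different quartiles.

```python
from statistics import quantiles

values = [2, 5, 9, 17, 8, 27, 1, 34]

# Range = highest value - lowest value
data_range = max(values) - min(values)

# quantiles() sorts the data itself; n=4 returns the cut points Q1, Q2, Q3.
q1, q2, q3 = quantiles(values, n=4)
iqr = q3 - q1

print(f"range = {data_range}, Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
```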
Variance
• The variance is a measure of variability. It
is calculated by taking the average of
squared deviations from the mean.
• Variance tells you the degree of spread in
your data set. The more spread the data,
the larger the variance is in relation to
the mean.
Variance
• Data: 6, 5, 8, 6, 4, 6, 7, 6, 8, 4 (n = 10)
• Mean = (6 + 5 + 8 + 6 + 4 + 6 + 7 + 6 + 8 + 4) / 10 = 60 / 10 = 6
• Variance = [(6-6)² + (6-5)² + (6-8)² + (6-6)² + (6-4)² + (6-6)² + (6-7)² + (6-6)² + (6-8)² + (6-4)²] / (n - 1)
• Variance = 18 / 9 = 2
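The calculation can be checked in Python. This sketch assumes a data set of ten values ending in 4, which is what the divisor of 10 in the mean and the result 18 / 9 = 2 imply. `statistics.variance` divides by n - 1, like the formula above.

```python
from statistics import mean, variance

# Assumed ten-value data set (see the note above).
values = [6, 5, 8, 6, 4, 6, 7, 6, 8, 4]

m = mean(values)
squared_deviations = [(x - m) ** 2 for x in values]
sample_variance = sum(squared_deviations) / (len(values) - 1)

print(m, sum(squared_deviations), sample_variance)
print(variance(values))  # library cross-check of the manual calculation
```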
Standard Deviation (SD)
• Square root of variance
• Most widely used and better measure of
variability
• A smaller SD indicates that the values lie closer to the mean
• Like mean, SD is sensitive to extreme values
Example
• Let's calculate the variance of the following data set: 2, 7, 3, 12, 9.
Example
• The first step is to calculate the mean. The sum is 33 and there are 5 data points.
• Therefore, the mean is 33 ÷ 5 = 6.6.
• Then take each value in the data set, subtract the mean and square the difference. For instance, for the first value:
• (2 - 6.6)² = 21.16
Example
• The squared differences for all values are added:
• 21.16 + 0.16 + 12.96 + 29.16 + 5.76 = 69.20
• Dividing the sum by the number of data points gives the population variance:
• 69.20 ÷ 5 = 13.84, so SD = √13.84 = 3.72.
• Dividing by n - 1 instead, as in the previous example, gives the sample variance:
• 69.20 ÷ 4 = 17.30, so SD = √17.30 = 4.16.
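The worked example can be reproduced step by step in Python. The sketch below is only an illustration with made-up variable names; it computes the variance both ways, dividing by the number of data points (n) and by n - 1, so the two conventions can be compared.

```python
from math import sqrt

values = [2, 7, 3, 12, 9]
n = len(values)

m = sum(values) / n                          # mean
sum_sq = sum((x - m) ** 2 for x in values)   # sum of squared differences

population_variance = sum_sq / n             # divide by the number of data points
sample_variance = sum_sq / (n - 1)           # divide by n - 1

print(f"mean = {m:.1f}, sum of squares = {sum_sq:.2f}")
print(f"divide by n:     variance = {population_variance:.2f}, SD = {sqrt(population_variance):.2f}")
print(f"divide by n - 1: variance = {sample_variance:.2f}, SD = {sqrt(sample_variance):.2f}")
```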
Small & High SD
Statistical Test
NORMAL DISTRIBUTION
Characteristic
• Shape: bell-shaped, smooth and symmetrical
• Mean = median = mode
• Skewness within the normal range (-2 to 2)
• Kurtosis within the normal range (-2 to 2)
Checking for normality
• Test statistic
- Kolmogorov-Smirnov (n > 50) or Shapiro-Wilk (n < 50)
• Graphical
- Normal curve : bell shaped, symmetrical
• Descriptive data
- Mean=mode=median
- Skewness
- Kurtosis
SKEWNESS
KURTOSIS
Test of Normality
• P value < 0.05 indicates that the distribution is not normal.

• P value > 0.05 indicates that the distribution of the sample is not significantly different from the normal distribution.
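The skewness and kurtosis screen (both within -2 to 2) can be sketched in Python. This is an illustration under stated assumptions: the functions use the simple moment-based formulas, whereas SPSS reports bias-adjusted versions, so values will differ slightly on small samples. The helper names are hypothetical, and the last line reuses the skewness (0.163) and kurtosis (-0.184) from the SPSS output shown later in these slides.

```python
from statistics import fmean


def skewness(values):
    """Moment-based skewness: third central moment / (second moment ** 1.5)."""
    m = fmean(values)
    m2 = fmean([(x - m) ** 2 for x in values])
    m3 = fmean([(x - m) ** 3 for x in values])
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    m = fmean(values)
    m2 = fmean([(x - m) ** 2 for x in values])
    m4 = fmean([(x - m) ** 4 for x in values])
    return m4 / m2 ** 2 - 3


def within_normal_range(skew, kurt, limit=2.0):
    """Rule of thumb from the slides: both statistics lie between -2 and 2."""
    return abs(skew) <= limit and abs(kurt) <= limit


print(skewness([1, 2, 3, 4, 5]), excess_kurtosis([1, 2, 3, 4, 5]))
print(within_normal_range(0.163, -0.184))  # values from the SPSS output
```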
HOW TO ENTER DATA IN SPSS?
• BMI = Wt / ((ht/100) * (ht/100)), with Wt in kg and ht in cm
Normal : 18.5 – 22.9
Overweight : 23 – 27.5
Obese : > 27.5
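The BMI formula and cut-offs above can be sketched in Python. The function names are hypothetical, and two details are assumptions because the slide does not state them: values below 18.5 are labelled "below normal", and values between 22.9 and 23 are treated as normal.

```python
def bmi(weight_kg, height_cm):
    """BMI = weight (kg) / height (m) squared; height is entered in cm."""
    height_m = height_cm / 100
    return weight_kg / (height_m * height_m)


def bmi_category(value):
    """Categorise a BMI value using the cut-offs on the slide."""
    if value < 18.5:
        return "below normal"  # label not given on the slide (assumption)
    if value < 23:
        return "normal"
    if value <= 27.5:
        return "overweight"
    return "obese"


example = bmi(70, 170)  # hypothetical respondent: 70 kg, 170 cm
print(f"BMI = {example:.1f} ({bmi_category(example)})")
```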
Table 2: Descriptive statistics for respondents' age
                                              Statistic   Std. Error
Age (years)  Mean                             42.16       0.722
             95% Confidence Interval for Mean
               Lower Bound                    40.74
               Upper Bound                    43.59
             5% Trimmed Mean                  42.08
             Median                           42
             Variance                         79.782
             Std. Deviation                   8.932
             Minimum                          21
             Maximum                          64
             Range                            43
             Interquartile Range              12
             Skewness                         0.163       0.196
             Kurtosis                         -0.184      0.39
Figure 1: Distribution of respondents' age
Question
Table 2 and Figure 1 show the descriptive statistics for respondents' age in a study. Answer the questions below based on Table 2 and Figure 1 above.
i- Interpret the data in Table 2 (4 marks)

ii- Interpret the data in Figure 1 (2 marks)
i- Interpret the data in Table 2
Sample =
Minimum age =
Maximum age =
Mean age =
Standard deviation of age =
Skewness =
Kurtosis =
ii- Interpret the data in Figure 1 (2 marks)
Shape of the histogram =
Distribution = ?
PEARSON'S CHI-SQUARE
bmicat * blood pressure Crosstabulation
                                      Blood pressure
bmicat                                normal    high      Total
normal       Count                    48        13        61
             % within bmicat          78.7%     21.3%     100.0%
overweight   Count                    20        25        45
             % within bmicat          44.4%     55.6%     100.0%
obese        Count                    20        27        47
             % within bmicat          42.6%     57.4%     100.0%
Total        Count                    88        65        153
             % within bmicat          57.5%     42.5%     100.0%
Chi-Square Tests
                                Value      df    Asymp. Sig. (2-sided)
Pearson Chi-Square              18.644a    2     .000
Likelihood Ratio                19.494     2     .000
Linear-by-Linear Association    15.159     1     .000
N of Valid Cases                153
a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 19.12.
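The Pearson chi-square value in the SPSS output can be reproduced from the observed counts alone. The Python sketch below is a hand calculation for illustration, not the SPSS procedure. The p-value line uses the fact that, for 2 degrees of freedom only, the chi-square upper-tail probability equals exp(-x / 2); other degrees of freedom would need a statistics library.

```python
from math import exp

# Observed counts from the crosstabulation: rows are BMI categories
# (normal, overweight, obese); columns are blood pressure (normal, high).
observed = [
    [48, 13],
    [20, 25],
    [20, 27],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = row total * column total / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Exact upper-tail probability, valid for df = 2 only.
p_value = exp(-chi_square / 2) if df == 2 else None
min_expected = min(min(row) for row in expected)

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.6f}")
print(f"minimum expected count = {min_expected:.2f}")
```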
Presenting chi-square results
                       Blood Pressure Status
Variable         Normal            High
                 N      %          N      %          X²        P
BMI                                                  18.644    <0.001
  Normal         48     78.7       13     21.3
  Overweight     20     44.4       25     55.6
  Obese          20     42.6       27     57.4
Reporting result:
• Based on the table above, among respondents with a normal BMI, 48 (78.7%) had normal blood pressure status. Among respondents with an obese BMI, 27 (57.4%) had high blood pressure, compared with 20 (42.6%) who had normal blood pressure.
• The chi-square analysis showed a significant association between blood pressure status and respondents' BMI, X² (2, N = 153) = 18.644, p < 0.001.
INDEPENDENT SAMPLE T-TEST
Presenting the data
         BP status    N     Mean (SD)       t         df     p-value
Age      Normal       88    41.31 (9.7)     -1.384    151    0.168
         High         65    43.32 (7.6)

Reporting result
t-test findings
• The independent-samples t-test showed no significant difference in mean age between respondents with normal blood pressure status (M = 41.31, SD = 9.730) and respondents with high blood pressure status (M = 43.32, SD = 7.643), t(151) = -1.384, p = 0.168.
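As a rough check, the t statistic can be recomputed from the summary values in the table (normal: N = 88, M = 41.31, SD = 9.730; high: N = 65, M = 43.32, SD = 7.643). The sketch assumes the equal-variances (pooled) form of the test, and the function name `pooled_t` is hypothetical. Because the means and SDs are rounded, the result is close to, but not exactly, the reported -1.384.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Independent-samples t statistic and df, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Group 1 = normal blood pressure, group 2 = high blood pressure
t_stat, df = pooled_t(88, 41.31, 9.730, 65, 43.32, 7.643)
print(f"t({df}) = {t_stat:.3f}")
```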
ONE WAY ANOVA
Presenting the data
BMI            HbA1c Mean (SD)    F        df    P value
Normal         6.78 (2.13)        0.517    2     0.597
Overweight     6.98 (2.29)
Obese          7.23 (2.36)
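The F statistic in a one-way ANOVA is the between-group mean square divided by the within-group mean square. The Python sketch below illustrates that calculation on hypothetical values, since the raw HbA1c data are not part of these slides; the function name is also an assumption made for the example.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [x for group in groups for x in group]
    grand_mean = fmean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for three groups (not the study data)
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```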
PEARSON'S CORRELATION COEFFICIENT TEST
Presenting the data
Variable    Mean ± SD        r        Sig. (2-tailed)
HbA1c       6.976 ± 2.24     0.075    0.356
Age         42.16 ± 8.93
Association between the amount of water that one consumed and rating of skin elasticity.
• State the statistical test used.
• State the p-value from the table.
• State the value of the correlation coefficient (r) from the table.
• Explain the result.
• There was a positive correlation between the two variables, r = 0.985, n = 5, p = 0.002.
• There was a strong, positive correlation between water consumption and skin elasticity.
• Increases in water consumption were correlated with increases in rating of skin elasticity.
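Pearson's r can be computed by hand as the sum of cross-products of deviations divided by the square root of the product of the two sums of squares. The sketch below uses hypothetical numbers, because the water consumption and skin elasticity data are not reproduced in these slides; the function and variable names are illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data for five respondents (not the study data)
water = [1, 2, 3, 4, 5]
elasticity = [1, 3, 2, 5, 4]
r = pearson_r(water, elasticity)
print(f"r = {r:.3f}, n = {len(water)}")
```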
