
BASIC BIOSTATISTICS

SOURCES :

http://statistics.laerd.com>spss-tutorials
WHAT WILL YOU LEARN?
• Types of data (by level of measurement)
• Measures of central tendency and dispersion
• Entering data in SPSS
• Normality test
• Data analysis
– Chi-square
– Independent t-test
– One-way ANOVA
– Pearson’s correlation
Research objective: To measure the level of knowledge of gestational diabetes mellitus among antenatal mothers at HTAA.
Research question: What is the level of knowledge of gestational diabetes among antenatal mothers at HTAA?
Analysis: Frequency and percentage (%); mean and standard deviation

Research objective: To determine the association between sociodemographic data and the level of knowledge of gestational diabetes mellitus among antenatal mothers at HTAA.
Research question: Is there an association between sociodemographic data and the level of knowledge of gestational diabetes mellitus among antenatal mothers at HTAA?
Analysis: Independent sample t-test and one-way ANOVA
DUMMY TABLE
MEASUREMENT SCALES

• The choice of scale for measuring a variable is determined by the variable itself.
• When a variable is measured in a study, it yields data.
• A basic understanding of these data helps the researcher choose suitable analysis methods and statistical tests.
NOMINAL DATA

• Data classified into categories, names or labels.
• The categories cannot be arranged in order.
• Numerical values can be assigned to the categories, but mathematical operations cannot be performed on those values.
• Example:
– Male - 1
– Female - 2
ORDINAL DATA
• Same as nominal data, except that the data can be arranged in order.
• However, the difference between two values still cannot be measured mathematically.
• Examples:
– Education level
  No schooling - 1
  Primary school - 2
  Secondary school - 3
  Diploma - 4
– Job grade
  Grade U29 - 1
  Grade U32 - 2
  Grade U36 - 3
NUMERICAL DATA
• Discrete data
- Values that can only be whole numbers (gaps between values)
- E.g. number of students, number of teeth extracted, number of accidents, parity

• Continuous data
- Can take any numerical value within a certain interval (including decimal values)
- E.g. height, weight, exam marks
EXAMPLE
Question                                                 Variable         Measurement scale
1. What is the blood pressure level of this patient?     Blood pressure   Ordinal
   - Low
   - Normal
   - High
2. What is the height of this patient? ________ cm       Height           Continuous
How to check the type of data?
• Look at the questionnaires/checklist
• Determine the type of data :
– Age 
– Gender 
– Exercise 
– Smoking status 
– Weight 
– Height 
– Systolic/Diastolic
– HbA1c
– Homocysteine
Descriptive statistics
Example:
Sociodemographic data: gender, smoking status

Identify: categorical or numerical?
→ Categorical
→ Frequency (%)
Descriptive statistics
Example:
To identify knowledge of GDM among antenatal
mothers at HTAA.

Identify: categorical or numerical?
→ Numerical
→ Normality test
  Normal → Mean (SD)
  Not normal → Median (IQR)


Analytical statistics / inferential statistics
Example:
To determine the association between
sociodemographic data and knowledge of GDM
among antenatal mothers at HTAA.

Null hypothesis:
There is no association between sociodemographic
data and knowledge of GDM among antenatal
mothers at HTAA.
How do we know
whether there is an
association or not?
Run the analysis → check the p-value
• Cut-off: p value < 0.05
• The cut-off point is used to decide whether or not to reject the
null hypothesis

• If p value < 0.05: significant → there is a
difference / there is an association

• If p value ≥ 0.05: not significant → no significant
difference / no significant association
Inferential Statistic
POPULATION
→ SAMPLE
→ STATISTICAL CONCLUSION
→ Infer back to the population
MEASURE OF CENTRAL TENDENCY
Measures of Central Tendency
• Mean : the average score

• Median : the value that lies in the middle after


ranking all the scores

• Mode: the most frequently occurring score


The mean
• The mean is the most commonly used
average.
• To calculate the mean of a set of values we
add together the values and divide by the
total number of values.
Mean = Sum of values / Number of values
Measures of central tendency
• Mean: the average score; it is sensitive to
extreme scores

• Ex: 1 2 3 4 5 6 7 8 9 10 (mean = 5.5)
• Ex: 1 2 3 4 5 6 7 8 9 20 (mean = 6.5)
• Ex: 1 2 3 4 5 6 7 8 9 100 (mean = 14.5)
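The effect of an extreme score can be checked with a short sketch. It uses Python's standard `statistics` module on the three example sets above; the helper name `summarise` is only illustrative. The mean moves with the last value while the median stays put.

```python
from statistics import mean, median

# The three example sets from the slide: only the last value changes.
datasets = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 20],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 100],
]


def summarise(values):
    """Return (mean, median) of a list of numbers."""
    return mean(values), median(values)


for values in datasets:
    m, med = summarise(values)
    print(f"largest value = {values[-1]:>3}  mean = {m}  median = {med}")
```

The median is 5.5 for all three sets, which is why it is preferred when extreme scores are present.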
Measures of central tendency
Median
• Is not sensitive to extreme scores
• Use it when you are unable to use the mean
because of extreme scores
• Def: It is the middle value of a set of numbers
arranged in order.
Measures of central tendency
• Find the median of:
10 , 7 , 9, 12, 7, 8, 6

Write the values in order:


6, 7, 7, 8, 9, 10, 12

The median is the middle value = 8


Measures of central tendency
• When there is an even number of values,
there will be two values in the middle.
• In this case, we have to find the mean of the
two middle values.
• Find the median : 56, 42, 47, 51, 65, 43
• The values in order: 42, 43, 47, 51, 56, 65
• There are two middle values , 47 and 51
• Median = (47 + 51) / 2 = 98 / 2 = 49
Measures of central tendency
Mode
• Does not involve any calculation or ordering of
data.
• The mode in a set of data is the data value
that appears the most often.
• Ex: 2, 1, 2, 0, 0, 2, 3, 1, 2, 1
• The mode is : 2
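The median and mode examples above can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative only.

```python
from statistics import median, mode

odd_set = [10, 7, 9, 12, 7, 8, 6]            # odd number of values
even_set = [56, 42, 47, 51, 65, 43]          # even number of values
mode_set = [2, 1, 2, 0, 0, 2, 3, 1, 2, 1]

# median() sorts the data itself; for an even count it averages
# the two middle values.
print("median (odd count): ", median(odd_set))
print("median (even count):", median(even_set))

# mode() returns the most frequently occurring value.
print("mode:", mode(mode_set))
```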
Exercise
• Find the mean, median and mode from the
following set of data values:

13, 2, 9, 16, 22, 4

Mean:
Median:
Mode:
Exercise
• This is the frequency distribution of the ages of
students in a class.

Age Frequency
11 7
12 19
13 10
14 4

Find out their mean age:


A. 12 B. 12.3 C. 12.5 D. 19
Exercise
• The table shows the number of mobile phones owned
by 100 families.
No. of mobile phone Frequency
2 2
3 6
4 38
5 16
6 20
7 18

Find the mode:


A. 4 B. 4.5 C. 5 D. 38
MEASURES OF DISPERSION
Measures of Dispersion / Variability
• Range
• Interquartile Range
• Variance
• Standard Deviation
Range
• The range of a set of data is a measure of how
the data is spread across the distribution.
• Range = highest value – lowest value

• Ex: 2, 5, 9, 17, 8, 27, 1, 34


• Range : 34 – 1 = 33
Quartile
• Split the ordered data into 4 quarters
• Each quarter contains 25% of the observations

Minimum | Q1 | Median | Q3 | Maximum
Inter Quartile Range
• IQR is the difference between Q3 and Q1 (IQR = Q3 – Q1)

Minimum | Q1 (25%) | Median | Q3 (75%) | Maximum
IQR: from Q1 to Q3
Range: from Minimum to Maximum
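As a rough illustration of range, quartiles and IQR, the sketch below applies Python's `statistics.quantiles` to the data from the Range slide. The helper name `spread` is an assumption for this example, and quartile conventions differ between software packages, so other tools may report slightly different Q1 and Q3 values for the same data.

```python
from statistics import quantiles


def spread(values):
    """Return (range, Q1, median, Q3, IQR) for a list of numbers."""
    ordered = sorted(values)
    # n=4 asks for the three cut points that split the data into quarters.
    q1, q2, q3 = quantiles(ordered, n=4)
    return ordered[-1] - ordered[0], q1, q2, q3, q3 - q1


data = [2, 5, 9, 17, 8, 27, 1, 34]  # data from the Range slide
data_range, q1, q2, q3, iqr = spread(data)
print(f"range = {data_range}, Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
```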
Variance
• The variance is a measure of variability. It
is calculated by taking the average of
squared deviations from the mean.
• Variance tells you the degree of spread in
your data set. The more spread the data,
the larger the variance is in relation to
the mean.
Variance
• 6, 5, 8, 6, 4, 6, 7, 6, 8, 4
• Mean = (6 + 5 + 8 + 6 + 4 + 6 + 7 + 6 + 8 + 4) / 10
= 60 / 10 = 6
Variance = [(6-6)² + (6-5)² + (6-8)² + (6-6)² + (6-4)²
+ (6-6)² + (6-7)² + (6-6)² + (6-8)² + (6-4)²] / (n - 1)
Variance = 18 / 9 = 2
Standard Deviation (SD)
• Square root of variance
• Most widely used and better measure of
variability
• A smaller SD indicates that the values lie closer to the mean
• Like the mean, SD is sensitive to extreme values
Example
• Let’s calculate the variance of the following
data set: 2, 7, 3, 12, 9.
Example
• The first step is to calculate the mean. The
sum is 33 and there are 5 data points.
• Therefore, the mean is 33 ÷ 5 = 6.6.
• Then take each value in the data set,
subtract the mean and square the
difference. For instance, for the first value:
• (2 - 6.6)² = 21.16
Example
• The squared differences for all values are
added:
• 21.16 + 0.16 + 12.96 + 29.16 + 5.76 = 69.20
• The sum is then divided by the number of
data points, which gives the population variance:
• 69.20 ÷ 5 = 13.84
• The variance is 13.84.
• To get the standard deviation, calculate
the square root of the variance, which is SD
= 3.72.
• Dividing by n - 1 instead, as in the previous
variance example, gives the sample variance:
69.20 ÷ 4 = 17.30, SD = 4.16.
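The worked example can be checked with Python's standard `statistics` module. This is a small sketch: `pvariance` divides by the number of data points (as in the example above), while `variance` divides by n - 1; the variable names are illustrative.

```python
from math import sqrt
from statistics import pvariance, variance

data = [2, 7, 3, 12, 9]

pop_var = pvariance(data)   # sum of squared deviations divided by n
samp_var = variance(data)   # sum of squared deviations divided by n - 1

print(f"population variance = {pop_var:.2f}, SD = {sqrt(pop_var):.2f}")
print(f"sample variance     = {samp_var:.2f}, SD = {sqrt(samp_var):.2f}")
```

When reading software output, it helps to know which divisor was used, since the two versions differ noticeably for small samples.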
Small & High SD
Statistical Test
NORMAL DISTRIBUTION
Characteristic
• Shape: bell-shaped, smooth and symmetrical
• Mean = median = mode
• Skewness within the normal range (-2 to 2)
• Kurtosis within the normal range (-2 to 2)
Checking for normality
• Test statistic
- Kolmogorov-Smirnov (n > 50) or Shapiro-Wilk
(n < 50)

• Graphical
- Normal curve : bell shaped, symmetrical

• Descriptive data
- Mean=mode=median
- Skewness
- Kurtosis
SKEWNESS
KURTOSIS
Test of Normality

• P value < 0.05 indicates that the distribution is not normal.

• P value > 0.05 indicates that the distribution of the sample is
not significantly different from the normal distribution
HOW TO ENTER DATA IN SPSS?
• BMI

BMI = Wt / ((ht/100) * (ht/100))
(Wt in kg, ht in cm)
Normal : 18.5 – 22.9
Overweight : 23 – 27.5
Obese : > 27.5
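The BMI formula and cut-offs above can be sketched in Python as follows. The function names are illustrative. Two choices here are assumptions that the slide does not state: values below 18.5 are labelled "below normal", and values between 22.9 and 23 are treated as normal.

```python
def bmi(weight_kg, height_cm):
    """BMI = weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100
    return weight_kg / (height_m * height_m)


def bmi_category(value):
    """Classify a BMI value using the cut-offs on the slide."""
    if value < 18.5:
        return "below normal"  # assumption: this band is not on the slide
    if value < 23:
        return "normal"        # assumption: 22.9 to 23 counted as normal
    if value <= 27.5:
        return "overweight"
    return "obese"


# Hypothetical respondent: 70 kg, 175 cm
value = bmi(70, 175)
print(f"BMI = {value:.1f} ({bmi_category(value)})")
```

In SPSS the same calculation is typically done with a computed variable followed by recoding into categories.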
Table 2: Age statistics of respondents

Age (years)                          Statistic   Std. Error
Mean                                 42.16       0.722
95% Confidence Interval for Mean
  Lower Bound                        40.74
  Upper Bound                        43.59
5% Trimmed Mean                      42.08
Median                               42
Variance                             79.782
Std. Deviation                       8.932
Minimum                              21
Maximum                              64
Range                                43
Interquartile Range                  12
Skewness                             0.163       0.196
Kurtosis                             -0.184      0.39
Figure 1: Age distribution of respondents
Question
Table 2 and Figure 1 show the descriptive
statistics for the age of respondents in a
study. Answer the questions below based on
Table 2 and Figure 1 above.

i- Interpret the data in Table 2 (4 marks)

ii- Interpret the data in Figure 1 (2 marks)
i- Interpret the data in Table 2
Sample =
Minimum age =
Maximum age =
Mean age =
Standard deviation of age =
Skewness =
Kurtosis =
ii- Interpret the data in Figure 1 (2 marks)
Shape of histogram =
Distribution = ?
PEARSON’S CHI-SQUARE
bmicat * blood pressure Crosstabulation

                                      Blood pressure
bmicat                                Normal    High      Total
Normal       Count                    48        13        61
             % within bmicat          78.7%     21.3%     100.0%
Overweight   Count                    20        25        45
             % within bmicat          44.4%     55.6%     100.0%
Obese        Count                    20        27        47
             % within bmicat          42.6%     57.4%     100.0%
Total        Count                    88        65        153
             % within bmicat          57.5%     42.5%     100.0%
Chi-Square Tests

                               Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square             18.644a   2    .000
Likelihood Ratio               19.494    2    .000
Linear-by-Linear Association   15.159    1    .000
N of Valid Cases               153

a. 0 cells (0.0%) have expected count less than 5. The minimum
expected count is 19.12.
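To see where the Pearson Chi-Square value comes from, the sketch below recomputes it from the observed counts in the crosstabulation using only the Python standard library. The function name `chi_square` is illustrative. The p-value line relies on the fact that, for 2 degrees of freedom only, the upper-tail chi-square probability equals exp(-x/2); it is not a general-purpose p-value routine.

```python
from math import exp

# Observed counts: rows = BMI (normal, overweight, obese),
# columns = blood pressure (normal, high).
observed = [[48, 13], [20, 25], [20, 27]]


def chi_square(table):
    """Return (statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected


stat, df, expected = chi_square(observed)
min_expected = min(min(row) for row in expected)
# Valid for df == 2 only: P(X > x) = exp(-x / 2)
p_value = exp(-stat / 2) if df == 2 else None

print(f"chi-square = {stat:.3f}, df = {df}, min expected = {min_expected:.2f}")
print(f"p-value = {p_value}")
```

The statistic, degrees of freedom and minimum expected count should agree with the SPSS output above, and the p-value falls below 0.001, which SPSS displays as .000.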
Presenting data for Chi-Square

               Blood Pressure Status
Variables      Normal          High
               N      %        N      %        X²       P
BMI                                            18.644   <0.001
  Normal       48     78.7     13     21.3
  Overweight   20     44.4     25     55.6
  Obese        20     42.6     27     57.4
Reporting result:
• Based on the table above, among respondents with
normal BMI, 48 (78.7%) had normal blood pressure
status. Among obese respondents, 27 (57.4%) had
high blood pressure status compared with
20 (42.6%) who had normal blood pressure.

• The Chi-Square analysis showed a significant
association between blood pressure status and
BMI of the respondents, X²(2, N = 153) =
18.644, p < 0.001
INDEPENDENT SAMPLE T-TEST
Data presentation

Variable   BP status   N    Mean (SD)     t        df    p-value
Age        Normal      88   41.31 (9.7)   -1.384   151   0.168
           High        65   43.32 (7.6)


Reporting result
t-Test findings
• The independent sample t-test showed no
significant difference between the mean age
of respondents with normal blood pressure
status (M = 41.31, SD = 9.730) and
respondents with high blood pressure
status (M = 43.32, SD = 7.643),
t(151) = -1.384, p = 0.168.
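As a rough check of the reported t value, the sketch below computes an equal-variance (pooled) independent-samples t statistic from the summary figures in the table: N, mean and SD per group. The function name `pooled_t` is illustrative, and the use of the pooled form is an assumption suggested by df = 151 = 88 + 65 - 2. Because the means are rounded to two decimals, the result only approximates the SPSS value of -1.384.

```python
from math import sqrt


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance t statistic and degrees of freedom from summary data."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two group variances
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Normal BP group first, high BP group second (values from the table)
t, df = pooled_t(88, 41.31, 9.730, 65, 43.32, 7.643)
print(f"t({df}) = {t:.3f}")
```

The sign of t depends only on which group is entered first; the p-value is the same either way.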
ONE WAY ANOVA
Data presentation

HbA1c by BMI category

BMI          Mean (SD)     F       df   P value
Normal       6.78 (2.13)   0.517   2    0.597
Overweight   6.98 (2.29)
Obese        7.23 (2.36)
PEARSON’S CORRELATION
COEFFICIENT TEST
Data presentation

Variables   Mean ± SD      r       Sig. (2-tailed)
HbA1c       6.976 ± 2.24   0.075   0.356
Age         42.16 ± 8.93


Association between amount of water that
one consumed and rating of skin elasticity.
• Which statistical test was used?
• State the p-value from the table.
• State the value of the correlation coefficient (r) from the
table.
• Explain the result.
• There was a positive correlation between
the two variables, r = 0.985, n = 5, p =
0.002.

• There was a strong, positive correlation
between water consumption and skin
elasticity.
• Increases in water consumption were
correlated with increases in rating of skin
elasticity.
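For readers who want to see how a correlation coefficient is obtained, here is a minimal sketch of Pearson's r in plain Python. The function name `pearson_r` and the five pairs of values are hypothetical and chosen only for illustration; they are not the data behind the r = 0.985 reported above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical values for illustration only (not the study data)
water_glasses = [2, 4, 6, 8, 10]
elasticity_rating = [1, 2, 4, 4, 5]

r = pearson_r(water_glasses, elasticity_rating)
print(f"r = {r:.3f}")
```

A value of r close to +1 indicates a strong positive linear relationship, and a value close to 0 indicates little or no linear relationship.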
