
LOGISTIC REGRESSION

Compiled by Renti Mahkota, SKM, M.Epid


LOGISTIC REGRESSION
• A powerful statistical analysis for estimating an odds ratio or risk ratio that has been adjusted for specified confounding variables and specified effect modifiers (interaction)
• Logistic regression is one form of the logistic-linear model; it is widely used by epidemiologists and biostatisticians
LOGISTIC REGRESSION
• Used to model the risk (probability) of developing a disease (or problem) within a given period as a function of several independent variables that are known or suspected to be related to the development of the disease (problem) under study
• Can be used for the following designs:
  - Cross-sectional
  - Case-control, matched or unmatched
  - Cohort
[Diagram: E (independent variable) → D (dependent variable)]
The dependent variable is dichotomous (binary), for example: dead vs. alive, ill vs. not ill
Logistic regression
• Models the relationship between a set of variables xi
  - dichotomous (yes/no)
  - categorical (social class, ... )
  - continuous (age, ...)
  and
  - a dichotomous (binary) variable Y
• A dichotomous outcome is the most common situation in biology and epidemiology
Logistic regression (1)
Table 2 Age and signs of coronary heart disease (CD)
How can we analyse these data?
• Compare the mean age of diseased and non-diseased
  - Non-diseased: 38.6 years
  - Diseased: 58.7 years (p<0.0001)
• Linear regression?
Dot-plot: Data from Table 2
Logistic regression (2)
Table 3 Prevalence (%) of signs of CD according to age group
Dot-plot: Data from Table 3
[Figure: Diseased % (y-axis, 0 to 100) against Age group (x-axis, 0 to 8)]
Logistic function (1)
[Figure: Probability of disease (y-axis, 0.0 to 1.0) against x]
LOGISTIC MODEL
• Probability that a disease occurs:
$$P(X) = \frac{1}{1 + e^{-(\alpha + \sum_i \beta_i X_i)}}$$
• α, β1, β2, …, βi: parameters (constants) estimated from the data
• X1, X2, …, Xi: exposure variables, confounders, and/or effect modifiers
• 0 < probability (individual risk) < 1
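As a minimal sketch of the logistic model above, the function below evaluates P(X) for given parameters. The function name `logistic_probability` and the coefficient values in the usage lines are illustrative assumptions, not values from the slides.

```python
import math

def logistic_probability(alpha, betas, xs):
    """Return P(X) = 1 / (1 + exp(-(alpha + sum(beta_i * x_i))))."""
    if len(betas) != len(xs):
        raise ValueError("betas and xs must have the same length")
    linear_predictor = alpha + sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical parameters, chosen only to illustrate the call.
p = logistic_probability(alpha=-2.0, betas=[0.5, 1.0], xs=[1.0, 0.0])
print(round(p, 3))
```

Whatever values are supplied, the result stays strictly between 0 and 1, which is the property stated in the last bullet.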
Logistic transformation
[Equation annotation: logit of P(y|x)]
Logit transformation
$$\operatorname{logit} P(X) = \ln\left[\frac{P(X)}{1 - P(X)}\right]$$
$$\operatorname{logit} P(X) = \alpha + \sum_i \beta_i X_i$$
• Logit = log odds of disease
$$\mathrm{OR} = e^{\beta_i}$$
Advantages of Logit
• Properties of a linear regression model
• Logit ranges between -∞ and +∞
• Probability (P) constrained between 0 and 1
• Directly related to the notion of odds of disease
$$\ln\left(\frac{P}{1-P}\right) = \alpha + \beta x \quad\Longleftrightarrow\quad \frac{P}{1-P} = e^{\alpha + \beta x}$$
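The relationship between probability, odds and logit can be illustrated with a short sketch. The helper names `logit` and `inverse_logit` are assumptions made for this example; the formulas are the ones given on the slide.

```python
import math

def logit(p):
    """Log odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Map any real-valued logit z back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    p = inverse_logit(z)
    # logit(inverse_logit(z)) recovers z up to floating-point error
    print(f"logit={z:+.1f}  P={p:.4f}  odds={p / (1.0 - p):.4f}  back={logit(p):+.4f}")
```

The logit can take any real value, while the probability it maps to never leaves the interval between 0 and 1.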
Interpretation of coefficient β
$$\frac{P}{1-P} = e^{\alpha + \beta x}$$
Interpretation of coefficient β
• β = increase in the log odds (that is, the logarithm of the odds ratio) for a one-unit increase in x
• Test of the hypothesis that β = 0 (Wald test):
$$\chi^2 = \frac{\beta^2}{\operatorname{Variance}(\beta)} \quad (1\ \text{df})$$
• Interval testing
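A short derivation, using only the model $\ln\frac{P}{1-P} = \alpha + \beta x$ from the slides, shows why $e^{\beta}$ can be read as the odds ratio for a one-unit increase in x:

```latex
\begin{align*}
\text{odds}(x)   &= \frac{P(x)}{1-P(x)} = e^{\alpha + \beta x} \\
\text{odds}(x+1) &= e^{\alpha + \beta (x+1)} = e^{\alpha + \beta x}\, e^{\beta} \\
\text{OR} &= \frac{\text{odds}(x+1)}{\text{odds}(x)} = e^{\beta}
\quad\Longrightarrow\quad \ln(\text{OR}) = \beta
\end{align*}
```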
Example
• Risk of developing coronary heart disease (CD) by age (<55 and 55+ years)
• Logistic regression model:
$$\ln\left(\frac{P}{1-P}\right) = \alpha + \beta_1 \times \text{Age} = -0.841 + 2.094 \times \text{Age}$$
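The fitted model can be turned into an odds ratio and into predicted probabilities, as in the sketch below. It assumes Age is coded 0 for <55 years and 1 for 55+ years; the slide does not state the coding, so treat this as an assumption. The function name is hypothetical.

```python
import math

ALPHA = -0.841     # intercept from the slide
BETA_AGE = 2.094   # coefficient for Age from the slide

def predicted_probability(age):
    """P = 1 / (1 + exp(-(alpha + beta * age))), with age coded 0 or 1 (assumed coding)."""
    return 1.0 / (1.0 + math.exp(-(ALPHA + BETA_AGE * age)))

odds_ratio = math.exp(BETA_AGE)  # odds of CD for Age = 1 relative to Age = 0
print(f"OR = {odds_ratio:.2f}")
for age in (0, 1):
    print(f"Age = {age}: P = {predicted_probability(age):.3f}")
```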
Multiple logistic regression
• More than one independent variable
  - Dichotomous, ordinal, nominal, continuous …
$$\ln\left(\frac{P}{1-P}\right) = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_i x_i$$
• Interpretation of βi
  - Increase in log-odds for a one-unit increase in xi with all the other xi held constant
  - Measures the association between xi and the log-odds, adjusted for all other xi
Effect modification
• Can be modelled by including interaction terms
$$\ln\left(\frac{P}{1-P}\right) = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 \times x_2$$
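With an interaction term, the odds ratio for x1 is no longer a single number: the difference in log-odds for a one-unit increase in x1 is β1 + β3·x2, so the odds ratio depends on the level of x2. The sketch below illustrates this with hypothetical coefficients that are not taken from the slides.

```python
import math

def odds_ratio_x1(beta1, beta3, x2):
    """OR for a one-unit increase in x1 at a fixed x2, under
    logit(P) = alpha + beta1*x1 + beta2*x2 + beta3*x1*x2."""
    return math.exp(beta1 + beta3 * x2)

# Hypothetical coefficients, for illustration only.
beta1, beta3 = 0.7, -0.4
for x2 in (0, 1):
    print(f"x2 = {x2}: OR for x1 = {odds_ratio_x1(beta1, beta3, x2):.2f}")
```

When β3 is zero the two odds ratios coincide, which is the situation of no effect modification.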
Likelihood ratio statistic
• Compares two nested models
  Log(odds) = α + β1x1 + β2x2 + β3x3 + β4x4 (model 1)
  Log(odds) = α + β1x1 + β2x2 (model 2)
• LR statistic
  -2 log (likelihood model 2 / likelihood model 1) = [-2 log (likelihood model 2)] - [-2 log (likelihood model 1)]
• The LR statistic follows a χ² distribution with DF = number of extra parameters in the larger model (model 1)
• Interaction between smoking and exercise?
$$\ln\left(\frac{P}{1-P}\right) = \alpha + \beta_1\,\text{Exc} + \beta_2\,\text{Smk} + \beta_3\,\text{Smk} \times \text{Exc}$$
• Product term: β3 = -0.4604 (SE 0.5332)
  - Wald test = 0.75 (1 df)
  - -2log(L) = 342.092 with interaction term
  - -2log(L) = 342.836 without interaction term
• LR statistic = 0.74 (1 df), p = 0.39
• No evidence of any interaction
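Both test statistics in this example can be recomputed from the figures on the slide. The sketch below uses only the Python standard library; the helper `chi2_sf_1df` is a name introduced here, and it relies on the identity that the upper-tail probability of a chi-square variable with 1 df equals erfc(sqrt(x/2)).

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df.
    For 1 df, P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

beta3, se_beta3 = -0.4604, 0.5332   # product term and its SE from the slide
wald = (beta3 / se_beta3) ** 2       # beta^2 / variance(beta)

minus2logl_without = 342.836         # model without the interaction term
minus2logl_with = 342.092            # model with the interaction term
lr = minus2logl_without - minus2logl_with

print(f"Wald = {wald:.2f}, p = {chi2_sf_1df(wald):.2f}")
print(f"LR   = {lr:.2f}, p = {chi2_sf_1df(lr):.2f}")
```

The two statistics are close to each other here, and neither approaches the 5% critical value of 3.84 for 1 df.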
Example
• Study on the relation of estrogen to depression in premenopausal women

Variables in the Equation

Step 1(a)     B        S.E.   Wald     df   Sig.   Exp(B)
ESTR3(1)      1.543    .584   6.980    1    .008   4.680
HOLMES3(1)    1.247    .580   4.622    1    .032   3.480
Constant      -1.905   .569   11.223   1    .001   .149
a. Variable(s) entered on step 1: ESTR3, HOLMES3.
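The Wald and Exp(B) columns follow from B and S.E., as the sketch below illustrates. It also adds the usual Wald-type 95% interval, exp(B ± 1.96 × S.E.), which the output shown does not report; that interval is an addition made here for illustration. Because the printed B and S.E. are rounded, the recomputed values agree with the table only up to rounding.

```python
import math

# (B, S.E.) pairs copied from the "Variables in the Equation" output.
coefficients = {
    "ESTR3(1)": (1.543, 0.584),
    "HOLMES3(1)": (1.247, 0.580),
    "Constant": (-1.905, 0.569),
}

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval

def summarize(b, se):
    """Return (Wald chi-square, Exp(B), lower 95% limit, upper 95% limit)."""
    wald = (b / se) ** 2
    return wald, math.exp(b), math.exp(b - Z_95 * se), math.exp(b + Z_95 * se)

for name, (b, se) in coefficients.items():
    wald, exp_b, lower, upper = summarize(b, se)
    print(f"{name:<11} Wald={wald:6.3f}  Exp(B)={exp_b:5.3f}  95% CI=({lower:.2f}, {upper:.2f})")
```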
Example

Logit(depression) = -1.905 + 1.543 ESTR + 1.247 HOLMES

$$p(y = 1 \mid x) = \frac{1}{1 + e^{-(-1.905 + 1.543\,\text{ESTR} + 1.247\,\text{HOLMES})}}$$
Example
• For a woman with a low estrogen level (ESTR=1) and low psychosocial stress (HOLMES=0):
$$p(y = 1 \mid x) = \frac{1}{1 + e^{-(-1.905 + 1.543 \times 1 + 1.247 \times 0)}} = \frac{1}{1 + e^{0.362}} \approx 0.41$$
• This woman has a probability of depression of about 0.41
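The substitution in this example can be checked numerically. The sketch below evaluates the fitted model, Logit(depression) = -1.905 + 1.543 ESTR + 1.247 HOLMES, for every combination of the two binary predictors; the function name `p_depression` is an assumption made for this illustration.

```python
import math

def p_depression(estr, holmes):
    """Predicted probability from
    Logit(depression) = -1.905 + 1.543*ESTR + 1.247*HOLMES."""
    logit = -1.905 + 1.543 * estr + 1.247 * holmes
    return 1.0 / (1.0 + math.exp(-logit))

for estr in (0, 1):
    for holmes in (0, 1):
        print(f"ESTR={estr}, HOLMES={holmes}: p = {p_depression(estr, holmes):.2f}")
```

A negative logit always corresponds to a probability below 0.5, which gives a quick sanity check on any hand calculation.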
Example

ODDS RATIO
• Use the prediction model only for a cohort design
• For other designs, use the odds ratio
Design            Logistic model   P(X)   OR
Follow-up         yes              yes    yes
Case-control      yes              no     yes
Cross-sectional   yes              no     yes
Reasons for performing a logistic regression analysis
1. In a stratified analysis, the more strata are created, the fewer observations fall into each cell of a table, and some cells may even be empty
2. To control for other variables that were not matched
3. For individual prediction
4. When there is more than one exposure variable
5. When some control variables are not discrete
6. When software for this analysis is available and the analyst knows how to use it
