Regression Models
Chapter 15
QUALITATIVE MODEL
$Y_i = a + bX_i + e_i$ (1)
$Y_i = 1$ (yes) with probability $p_i$
$Y_i = 0$ (no) with probability $1 - p_i$
$Y_i$ follows a binomial distribution, not a normal distribution.
[Figure: actual predictions and the true regression line $E(Y_i) = a + bX_i$, plotted against $X_i$]
LPM Estimation Problems
[Figure: the LPM line and the probit curve, plotted against Z from -3 to 3]
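Logit Model
Cumulative logistic probability:
$P_i = F(Z_i) = F(a + bX_i) = \frac{1}{1 + e^{-Z_i}} = \frac{1}{1 + e^{-(a + bX_i)}}$
$P_i (1 + e^{-Z_i}) = 1$
$P_i + P_i e^{-Z_i} = 1$
$e^{-Z_i} = \frac{1 - P_i}{P_i}$
$e^{Z_i} = \frac{P_i}{1 - P_i}$
$L_i = Z_i = \ln\frac{P_i}{1 - P_i} = a + bX_i$, where $P_i = \frac{n_i}{N_i}$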
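The algebra above can be checked numerically. The following is a minimal Python sketch; the values of a, b and X are illustrative assumptions (not estimates from these slides), and `logistic_cdf` and `logit` are hypothetical helper names.

```python
import math

def logistic_cdf(z):
    """Cumulative logistic probability P = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log odds L = ln(P / (1 - P)), the inverse of logistic_cdf."""
    return math.log(p / (1.0 - p))

# Illustrative values only (assumptions, not estimates from the slides).
a, b, x = -1.5, 0.08, 20.0

z = a + b * x            # linear index Z = a + bX
p = logistic_cdf(z)      # probability P = F(Z)
odds = p / (1.0 - p)     # odds P / (1 - P), which should equal e^Z

print(f"Z = {z:.4f}  P = {p:.4f}")
print(f"odds = {odds:.4f}  e^Z = {math.exp(z):.4f}")
print(f"logit(P) = {logit(p):.4f}")
```

The last two lines print matching pairs, which is the point of the derivation: the odds equal $e^{Z_i}$ and the log odds recover the linear index.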
Example:
I (000 USD)   Ni    ni   Pi     Zi = F^-1(Pi)
6             40    8    0.20   -0.84
8             50    12   0.24   -0.70
10            60    18   0.30   -0.52
13            80    28   0.35   -0.38
15            100   45   0.45   -0.12
20            70    36   0.51   0.03
25            65    39   0.60   0.25
30            50    33   0.66   0.40
35            40    30   0.75   0.67
40            25    20   0.80   0.84
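As a rough check of this table, the relative frequencies and their inverse normal values can be recomputed with Python's standard library. This is a sketch that assumes F is the standard normal CDF; the variable names are illustrative. Because it works from the unrounded ratios ni/Ni, the last digit can differ slightly from the two-decimal figures shown above.

```python
from statistics import NormalDist

income = [6, 8, 10, 13, 15, 20, 25, 30, 35, 40]   # thousands of USD
N = [40, 50, 60, 80, 100, 70, 65, 50, 40, 25]     # families at each income level
n = [8, 12, 18, 28, 45, 36, 39, 33, 30, 20]       # families owning a house

std_normal = NormalDist()  # mean 0, standard deviation 1

rows = []
for x, big_n, small_n in zip(income, N, n):
    p = small_n / big_n            # relative frequency P_i = n_i / N_i
    z = std_normal.inv_cdf(p)      # Z_i = F^-1(P_i) for the standard normal
    rows.append((x, big_n, small_n, p, z))
    print(f"{x:>3} {big_n:>4} {small_n:>3} {p:6.2f} {z:7.2f}")
```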
Values of Cumulative Probability Functions
Zi    P(Zi), Normal dist.   P(Zi), Logistic dist.
-3    0.0013                0.0474
-2    0.0228                0.1192
-1    0.1587                0.2689
0     0.5000                0.5000
1     0.8413                0.7311
2     0.9772                0.8808
3     0.9987                0.9526
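The two columns can be regenerated with a short Python sketch, assuming the standard normal distribution and the standard logistic function $1/(1 + e^{-Z})$; `logistic_cdf` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def logistic_cdf(z):
    """Standard logistic CDF, 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

std_normal = NormalDist()  # standard normal distribution

# Map each Z to (normal CDF, logistic CDF).
table = {z: (std_normal.cdf(z), logistic_cdf(z)) for z in range(-3, 4)}

for z, (p_normal, p_logistic) in table.items():
    print(f"{z:>3} {p_normal:8.4f} {p_logistic:8.4f}")
```

The logistic column approaches 0 and 1 more slowly than the normal column, which is the heavier-tail property that distinguishes logit from probit.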
[Figure: the logistic CDF $y = \frac{1}{1 + e^{-x}}$ and the standard normal density $y = (2\pi)^{-0.5} e^{-0.5x^2}$, plotted for $x$ from -4 to 4]
[Figure: $F(Z)$ against $Z$ from -4 to 4 for the probit and logit models]
Table 15.1 (Gujarati, 2003)
FAMILY = Family
Y = Home Ownership, where 1 = Owns a House; 0 = Does Not Own a House
X = Family Income, Thousands of $
FAM Y X FAM Y X FAM Y X
1 0 8 15 0 6 28 1 18
2 1 16 16 1 19 29 0 11
3 1 18 17 1 16 30 0 10
4 0 11 18 0 10 31 1 17
5 0 12 19 0 8 32 0 13
6 1 19 20 1 18 33 1 21
7 1 20 21 1 22 34 1 20
8 0 13 22 1 16 35 0 11
9 0 9 23 0 12 36 0 8
10 0 10 24 0 11 37 1 17
11 1 17 25 1 16 38 1 16
12 1 18 26 0 11 39 0 7
13 0 14 27 1 20 40 1 17
14 1 20
The LPM estimated by OLS
Dependent Variable: Y
Method: Least Squares
Sample: 1 40
Included observations: 40
Variable Coefficient Std. Error t-Statistic Prob.
C -0.945686 0.122841 -7.698428 0.0000
X 0.102131 0.008160 12.51534 0.0000
R-squared 0.804761 Mean dependent var 0.525000
Adjusted R-squared 0.799624 S.D. dependent var 0.505736
S.E. of regression 0.226385 Akaike info criterion -0.084453
Sum squared resid 1.947505 Schwarz criterion -9.31E-06
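The OLS estimates above can be reproduced from the Table 15.1 data with the textbook two-variable formulas. The sketch below uses only the Python standard library; the variable names are illustrative. It also counts how many fitted values fall outside the 0 to 1 range, one of the known problems of the LPM.

```python
# Table 15.1 data in family order 1..40.
Y = [0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1,
     0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1,
     1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1]
X = [8, 16, 18, 11, 12, 19, 20, 13, 9, 10, 17, 18, 14, 20,
     6, 19, 16, 10, 8, 18, 22, 16, 12, 11, 16, 11, 20,
     18, 11, 10, 17, 13, 21, 20, 11, 8, 17, 16, 7, 17]

n_obs = len(Y)
mean_x = sum(X) / n_obs
mean_y = sum(Y) / n_obs

# Sums of squares and cross products in deviation form.
s_xx = sum((x - mean_x) ** 2 for x in X)
s_yy = sum((y - mean_y) ** 2 for y in Y)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
r_squared = s_xy ** 2 / (s_xx * s_yy)

# Fitted "probabilities" that are not valid probabilities.
fitted = [intercept + slope * x for x in X]
outside = sum(1 for f in fitted if f < 0 or f > 1)

print(f"intercept = {intercept:.6f}, slope = {slope:.6f}")
print(f"R-squared = {r_squared:.6f}")
print(f"fitted values outside [0, 1]: {outside} of {n_obs}")
```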
$w_i = \hat{Y}_i (1 - \hat{Y}_i)$
$Y_i^* = \frac{Y_i}{\sqrt{w_i}}$
$X_i^* = \frac{X_i}{\sqrt{w_i}}$
The WLS
Dependent Variable: Y/SW
Method: Least Squares
Sample(adjusted): 2 40
Included observations: 28
Excluded observations: 11 after adjusting endpoints
Variable Coefficient Std. Error t-Statistic Prob.
1/SW -1.245592 0.120555 -10.33211 0.0000
X/SW 0.119589 0.006852 17.45438 0.0000
R-squared 0.981050 Mean dependent var 2.191518
Adjusted R-squared 0.980321 S.D. dependent var 3.556681
S.E. of regression 0.498942 Akaike info criterion 1.516095
Sum squared resid 6.472517 Schwarz criterion 1.611252
Log likelihood -19.22533 F-statistic 1345.999
Durbin-Watson stat 1.882836 Prob(F-statistic) 0.000000
LPM fitted values greater than 1 or less than 0 are dropped from the sample.
$\frac{Y_i}{\sqrt{w_i}} = -1.245592 \frac{1}{\sqrt{w_i}} + 0.119589 \frac{X_i}{\sqrt{w_i}}$
t = (-10.33211) (17.45438)
$R^2 = 0.981050$
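The weighting step can be sketched in Python as follows. This sketch assumes the weights are built from the OLS estimates reported in the LPM output, $w_i = \hat{Y}_i(1 - \hat{Y}_i)$, and that the transformed regression is run through the origin on $1/\sqrt{w_i}$ and $X_i/\sqrt{w_i}$. The variable names are illustrative, and small differences from the printed output can arise from rounding.

```python
import math

# Table 15.1 data in family order 1..40.
Y = [0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1,
     0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1,
     1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1]
X = [8, 16, 18, 11, 12, 19, 20, 13, 9, 10, 17, 18, 14, 20,
     6, 19, 16, 10, 8, 18, 22, 16, 12, 11, 16, 11, 20,
     18, 11, 10, 17, 13, 21, 20, 11, 8, 17, 16, 7, 17]

a_ols, b_ols = -0.945686, 0.102131   # OLS estimates from the LPM output

# Keep only observations whose fitted value lies strictly between 0 and 1,
# then divide Y, the constant and X by sqrt(w).
rows = []
for y, x in zip(Y, X):
    y_hat = a_ols + b_ols * x
    if 0 < y_hat < 1:
        sw = math.sqrt(y_hat * (1 - y_hat))
        rows.append((y / sw, 1 / sw, x / sw))

# Normal equations for a regression through the origin on two regressors.
s11 = sum(c * c for _, c, _ in rows)
s12 = sum(c * xs for _, c, xs in rows)
s22 = sum(xs * xs for _, _, xs in rows)
s1y = sum(c * ys for ys, c, _ in rows)
s2y = sum(xs * ys for ys, _, xs in rows)

det = s11 * s22 - s12 * s12
coef_const = (s1y * s22 - s2y * s12) / det   # coefficient on 1/sqrt(w)
coef_x = (s11 * s2y - s12 * s1y) / det       # coefficient on X/sqrt(w)

print(f"observations kept: {len(rows)}")
print(f"1/SW coefficient = {coef_const:.6f}, X/SW coefficient = {coef_x:.6f}")
```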
Logit
Two types of data are distinguished:
1. Individual data (data at the individual or micro level)
2. Grouped data (grouped or replicated data)
Interpretation
$b_1 = 0.079066$, $e^{0.079066} = 1.0823$
For each one-unit increase in income, the odds of a household owning a house are multiplied by 1.0823, that is, they rise by about 8.23%.
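The antilog step is a one-line calculation. A minimal Python sketch, with illustrative variable names, reads the exponentiated slope as a multiplicative change in the odds:

```python
import math

b1 = 0.079066                      # logit slope coefficient from the slide

odds_ratio = math.exp(b1)          # factor applied to the odds per unit of X
pct_change = (odds_ratio - 1.0) * 100.0

print(f"odds ratio = {odds_ratio:.4f}")
print(f"percentage change in the odds = {pct_change:.2f}%")
```

Note that the factor applies to the odds $P/(1-P)$, not to the probability itself; the change in probability depends on the starting level of $P$.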
Weighted logit with grouped data
Xi Ni ni pi 1-pi pi/(1-pi)
6 40 8 0.20 0.80 0.25
8 50 12 0.24 0.76 0.32
10 60 18 0.30 0.70 0.43
13 80 28 0.35 0.65 0.54
15 100 45 0.45 0.55 0.82
20 70 36 0.51 0.49 1.06
25 65 39 0.60 0.40 1.50
30 50 33 0.66 0.34 1.94
35 40 30 0.75 0.25 3.00
40 25 20 0.80 0.20 4.00
Continued…
$\frac{P_i}{1 - P_i} = e^{-1.593238\sqrt{w_i}} \cdot e^{0.078669 X_{1i}}$
$e^{0.078669} = 1.0818$: for each one-unit increase in weighted income, the weighted odds of home ownership are multiplied by 1.0818, a rise of 8.18%.
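The weighted estimates can be approximated from the grouped table with a short Python sketch. It assumes the weights are $w_i = N_i P_i (1 - P_i)$, which is consistent with the value $\sqrt{w_i} = 4.1816$ used for X = 20; weighted least squares of the logit on X with these weights is equivalent to regressing $\sqrt{w_i} L_i$ on $\sqrt{w_i}$ and $\sqrt{w_i} X_i$ through the origin. Variable names are illustrative.

```python
import math

X = [6, 8, 10, 13, 15, 20, 25, 30, 35, 40]      # income, thousands of USD
N = [40, 50, 60, 80, 100, 70, 65, 50, 40, 25]   # families at each income level
n = [8, 12, 18, 28, 45, 36, 39, 33, 30, 20]     # families owning a house

rows = []
for x, big_n, small_n in zip(X, N, n):
    p = small_n / big_n
    w = big_n * p * (1 - p)            # assumed weight N_i * P_i * (1 - P_i)
    log_odds = math.log(p / (1 - p))   # L_i = ln(P_i / (1 - P_i))
    rows.append((x, w, log_odds))

# Weighted least squares of L on X with weights w.
sw = sum(w for _, w, _ in rows)
swx = sum(w * x for x, w, _ in rows)
swxx = sum(w * x * x for x, w, _ in rows)
swl = sum(w * l for _, w, l in rows)
swxl = sum(w * x * l for x, w, l in rows)

slope = (sw * swxl - swx * swl) / (sw * swxx - swx ** 2)
intercept = (swl - slope * swx) / sw
sqrt_w_20 = math.sqrt(rows[5][1])      # sqrt(w) for the X = 20 group

print(f"intercept = {intercept:.6f}, slope = {slope:.6f}")
print(f"odds multiplier e^slope = {math.exp(slope):.4f}")
print(f"sqrt(w) at X = 20: {sqrt_w_20:.4f}")
```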
Computing the probability
At X = 20 ($20,000), with $\sqrt{w_i} = 4.1816$ and $X_{1i} = 20\sqrt{w_i}$:
$L_{1i} = -1.593238\sqrt{w_i} + 0.078669 X_{1i} = -0.0830$
Dividing by $\sqrt{w_i} = 4.1816$ gives $-0.019858 = \ln\frac{p}{1 - p}$
$e^{-0.019858} = \frac{p}{1 - p} = 0.9803$
$(1 - p)\,0.9803 = p$
$0.9803 = p\,(1.9803)$
$p = 0.9803 / 1.9803 = 0.4950$
A household with an income of $20,000 has a 49.50% probability of owning a house.
Probit (data grup)
$Z_i = -1.0088 + 0.0481 X_i$
t = (-17.330) (19.105), $R^2 = 0.9786$
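To turn the estimated probit index into a probability, the index is passed through the standard normal CDF. The sketch below is illustrative: the choice of X = 20 is an assumption made to mirror the logit example, and `probit_probability` is a hypothetical helper name.

```python
from statistics import NormalDist

a, b = -1.0088, 0.0481          # grouped probit estimates from the slide

def probit_probability(x):
    """P = Phi(a + b * x), with Phi the standard normal CDF."""
    return NormalDist().cdf(a + b * x)

x = 20                          # income in thousands of USD (illustrative choice)
z = a + b * x                   # probit index Z
p = probit_probability(x)

print(f"Z = {z:.4f}, P = {p:.4f}")
```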
$Y_i = E(Y_i) + u_i = \mu_i + u_i$
where the Y's are independently distributed as Poisson random variables with mean $\mu_i$:
$\mu_i = E(Y_i) = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki}$
where the X's are some of the variables that might affect the mean value.
Modeling Count Data: the Poisson
Regression Model
For estimation purposes, we write the model as
$Y_i = \frac{\mu^{Y} e^{-\mu}}{Y!} + u_i$
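The first term of this expression is the Poisson probability of observing a given count. A minimal Python sketch follows; the mean of 2.0 is a hypothetical value chosen only for illustration, and `poisson_pmf` is a hypothetical helper name.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability P(Y = y) = mu^y * e^(-mu) / y!."""
    return mu ** y * math.exp(-mu) / math.factorial(y)

mu = 2.0  # hypothetical mean number of events

# Probabilities for counts 0..49; the tail beyond that is negligible here.
probs = [poisson_pmf(y, mu) for y in range(50)]
total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))

for y in range(5):
    print(f"P(Y = {y}) = {probs[y]:.4f}")
print(f"sum of probabilities = {total:.6f}, mean = {mean:.6f}")
```

The probabilities sum to one and their mean equals the parameter, which is why modeling the mean through the regressors is enough to pin down the whole distribution.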
Example: The data relate to 100 individuals 65
years of age and older. The objective of the
study was to record the number of falls (Y) in
relation to gender (X2 = 0 for female and 1 for male),
a balance index (X3/+), a strength index
(X4/+), and an intervention variable (X1 = 0 for education
and 1 for education plus aerobic exercise
training).
Table 15.18
Dependent Variable: Y
Sample: 1 100
Convergence achieved after 7 iterations
Y=EXP(C(0)+C(1)*X1+C(2)*X2+C(3)*X3+C(4)*X4)
Coefficient Std. Error t-Statistic Prob.
C(0) 0.37020 0.34590 1.0701 0.2873
C(1) -1.10036 0.17050 -6.4525 0.0000
C(2) -0.02194 0.11050 -0.1985 0.8430
C(3) 0.01066 0.00270 3.9483 0.0001
C(4) 0.00927 0.00414 2.2380 0.0275
R-squared = 0.4857 Adjusted R-squared = 0.4640
Log likelihood = -197.2096 Durbin-Watson statistic = 1.7358
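Interpretation of results
What we have obtained in Table 15.18 is the estimated
mean value for the ith individual, that is:
$\hat{\mu}_i = e^{0.37020 - 1.10036 X_{1i} - 0.02194 X_{2i} + 0.01066 X_{3i} + 0.00927 X_{4i}}$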
3! 4!
0.7491
Interpretation of results
We can also find the marginal effect of a
regressor on the mean value of Y. Suppose we
want to find the effect of a unit increase in
the strength index (X4) on the mean:
$\frac{\partial \mu_i}{\partial X_{4i}} = C_4 e^{C_0 + C_1 X_{1i} + C_2 X_{2i} + C_3 X_{3i} + C_4 X_{4i}} = C_4 \mu_i$
The intercept and variable X2 are individually
statistically insignificant.
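The fitted mean and the marginal effect of the strength index can be sketched in Python using the Table 15.18 coefficients. The regressor values for the individual below are hypothetical, chosen only to illustrate the calculation, and `fitted_mean` is a hypothetical helper name.

```python
import math

# Coefficients from Table 15.18.
coef = {"C0": 0.37020, "C1": -1.10036, "C2": -0.02194,
        "C3": 0.01066, "C4": 0.00927}

def fitted_mean(x1, x2, x3, x4):
    """Estimated mean number of falls, exp(C0 + C1*X1 + ... + C4*X4)."""
    index = (coef["C0"] + coef["C1"] * x1 + coef["C2"] * x2
             + coef["C3"] * x3 + coef["C4"] * x4)
    return math.exp(index)

# Hypothetical individual (assumed values, not taken from the data set).
x1, x2, x3, x4 = 1, 0, 50, 60

mu = fitted_mean(x1, x2, x3, x4)
marginal_x4 = coef["C4"] * mu      # analytic marginal effect C4 * mu

# Numerical check with a central finite difference.
h = 1e-6
numeric = (fitted_mean(x1, x2, x3, x4 + h)
           - fitted_mean(x1, x2, x3, x4 - h)) / (2 * h)

print(f"fitted mean = {mu:.4f}")
print(f"marginal effect of X4: analytic {marginal_x4:.6f}, numeric {numeric:.6f}")
```

Because the marginal effect is proportional to the fitted mean, it differs across individuals even though the coefficient is constant.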
Interpretation of results
Concluding:
The model makes restrictive assumptions in that