presented by:
Dudi Barmana, M.Si.
Agenda
• Consequences to be faced
• Identification/detection (examining the pattern …)
Today's Quote
People often throw stones in our path. It is up to us whether we turn those stones into a wall or a bridge. (Chinese book of wisdom)
• When the error distribution has thicker or heavier tails than the normal, the LS fit may be sensitive to a small subset of the data.
• Heavy-tailed error distributions often generate outliers that pull the LS fit too much in their direction.
• Predictions could be invalid.
• Individual t-tests and the model F-test could be misleading.
Identification/Detection
Residual Plot
• Graphical analysis is a very effective way to investigate the adequacy of the fit of a regression model and to check the underlying assumptions.
• Normal probability plot: a simple way to check the normality assumption.
• Ranked residuals: e[1] < ... < e[n]
• Plot e[i] against P_i = (i - 1/2)/n
• Sometimes plot e[i] against Φ^{-1}[(i - 1/2)/n]
• The plot is nearly a straight line for a large sample (n > 32) if the e[i] are normal.
• A small sample (n <= 16) may deviate from a straight line even if the e[i] are normal.
• Usually about 20 points are required to produce a normal probability plot.
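The plotting positions described above can be sketched in a few lines of Python. This is only an illustration: the residual values are hypothetical, the function name `normal_plot_points` is made up for this example, and the inverse normal cdf comes from the standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

def normal_plot_points(residuals):
    """Return (Phi^{-1}(P_i), e[i]) pairs for a normal probability plot."""
    ranked = sorted(residuals)            # e[1] <= ... <= e[n]
    n = len(ranked)
    std_normal = NormalDist()             # standard normal, mean 0 and sd 1
    points = []
    for i, e in enumerate(ranked, start=1):
        p = (i - 0.5) / n                 # cumulative probability P_i = (i - 1/2)/n
        z = std_normal.inv_cdf(p)         # theoretical quantile Phi^{-1}(P_i)
        points.append((z, e))
    return points

# Hypothetical residuals, for illustration only
residuals = [0.3, -1.2, 0.8, -0.1, 1.5, -0.6, 0.2, -0.4]
for z, e in normal_plot_points(residuals):
    print(f"{z:7.3f}  {e:6.2f}")
```

Plotting the second column against the first should give roughly a straight line when the residuals are close to normal.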
Kolmogorov-Smirnov Test
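Let $F(x) = P(X_i \le x)$ be the cdf of the proposed distribution. For the uniform(0,1) distribution, for example,

$$F(x) = x, \quad 0 \le x \le 1$$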
If $X_1, X_2, \ldots, X_n$ really come from the distribution with cdf $F$, the distance

$$D_n = \sup_x \left| F_n(x) - F(x) \right|$$

between the empirical cdf $F_n$ and $F$ should be small.
Computing the test statistic: Suppose we simulate 7 uniform(0,1)s and get:

0.6  0.2  0.5  0.9  0.1  0.4  0.2

(obviously simplified)
Put them in order: 0.1  0.2  0.2  0.4  0.5  0.6  0.9

The empirical cdf is

$$
F_7(x) =
\begin{cases}
0 & \text{for } x < 0.1 \\
1/7 & \text{for } 0.1 \le x < 0.2 \\
3/7 & \text{for } 0.2 \le x < 0.4 \\
4/7 & \text{for } 0.4 \le x < 0.5 \\
5/7 & \text{for } 0.5 \le x < 0.6 \\
6/7 & \text{for } 0.6 \le x < 0.9 \\
1 & \text{for } x \ge 0.9
\end{cases}
$$

The largest gap between $F_7(x)$ and $F(x) = x$ occurs at $x = 0.6$:

$$D_7 = \frac{6}{7} - 0.6 = \frac{9}{35} \approx 0.2571429$$
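As a rough check of the step function for this sample, the empirical cdf can be evaluated directly from the sorted data. The helper name `ecdf` is an assumption made for this sketch; `bisect_right` from the standard library counts how many observations are less than or equal to x.

```python
from bisect import bisect_right

sample = [0.6, 0.2, 0.5, 0.9, 0.1, 0.4, 0.2]
ordered = sorted(sample)                  # 0.1 0.2 0.2 0.4 0.5 0.6 0.9

def ecdf(x, data=ordered):
    """Empirical cdf: fraction of observations that are <= x."""
    return bisect_right(data, x) / len(data)

# Print the height of the step function at each distinct observation
for x in sorted(set(ordered)):
    print(f"F7({x}) = {bisect_right(ordered, x)}/7")
```

The tied value 0.2 produces a jump of 2/7, which is why the step function goes from 1/7 straight to 3/7.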
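Let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ be the ordered sample. Then $D_n$ can be computed as

$$D_n = \max\{D_n^+, D_n^-\}$$

where

$$D_n^+ = \max_{1 \le i \le n} \left[ \frac{i}{n} - F(X_{(i)}) \right], \qquad D_n^- = \max_{1 \le i \le n} \left[ F(X_{(i)}) - \frac{i-1}{n} \right]$$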
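A minimal sketch of this computation, assuming the usual one-sided forms D+ = max[i/n - F(X(i))] and D- = max[F(X(i)) - (i-1)/n]. The function name `ks_statistic` and the helper `uniform_cdf` are hypothetical names chosen for this example; the data are the seven values used earlier.

```python
def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance D_n = max(D+, D-) against a proposed cdf."""
    xs = sorted(sample)                                   # X(1) <= ... <= X(n)
    n = len(xs)
    # D+: empirical cdf just after each jump minus the proposed cdf
    d_plus = max(i / n - cdf(x) for i, x in enumerate(xs, start=1))
    # D-: proposed cdf minus the empirical cdf just before each jump
    d_minus = max(cdf(x) - (i - 1) / n for i, x in enumerate(xs, start=1))
    return max(d_plus, d_minus)

def uniform_cdf(x):
    """cdf of the uniform(0,1) distribution."""
    return min(max(x, 0.0), 1.0)

d7 = ks_statistic([0.6, 0.2, 0.5, 0.9, 0.1, 0.4, 0.2], uniform_cdf)
print(d7)   # agrees with 9/35 up to floating-point rounding
```

Only the order statistics need to be inspected because the supremum of the gap can only occur at a jump of the empirical cdf.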
We reject the hypothesis that this sample came from the proposed distribution if the empirical cdf is too far from the true cdf of the proposed distribution.
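In the 1930s, Kolmogorov and Smirnov showed that

$$\lim_{n \to \infty} P\left( n^{1/2} D_n \le t \right) = 1 - 2 \sum_{i=1}^{\infty} (-1)^{i-1} e^{-2 i^2 t^2}$$

So, for large n,

$$P\left( n^{1/2} D_n \le t \right) \approx 1 - 2 \sum_{i=1}^{\infty} (-1)^{i-1} e^{-2 i^2 t^2}$$

and we find the value of t that makes the right-hand side equal to $1 - \alpha$ for an $\alpha$-level test.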
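One way to find that value of t numerically is to truncate the series and solve by bisection. The sketch below is illustrative only: the function names, the 100-term truncation, and the bracketing interval [0.2, 3.0] are choices made for this example, not part of the original slides.

```python
import math

def kolmogorov_cdf(t, terms=100):
    """Limiting P(sqrt(n) * D_n <= t) = 1 - 2 * sum_{i>=1} (-1)^(i-1) * exp(-2 i^2 t^2)."""
    if t <= 0:
        return 0.0
    s = sum((-1) ** (i - 1) * math.exp(-2 * i * i * t * t)
            for i in range(1, terms + 1))
    return 1.0 - 2.0 * s

def critical_t(alpha, lo=0.2, hi=3.0, tol=1e-10):
    """Bisection for the t that makes kolmogorov_cdf(t) equal to 1 - alpha."""
    target = 1.0 - alpha
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if kolmogorov_cdf(mid) < target:
            lo = mid          # cdf too small, so t must be larger
        else:
            hi = mid
    return (lo + hi) / 2

t05 = critical_t(0.05)
print(round(t05, 3))                      # about 1.358
print(t05 / math.sqrt(100_000))           # approximate 0.05 critical value for n = 100,000
```

Dividing t by the square root of n converts the limiting quantile into an approximate critical value for D_n itself.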
For small samples, people have worked out and tabulated critical values, but there is no nice closed-form solution.

[Table: critical values (cv) by significance level α = 0.20, 0.10, 0.05, 0.02, 0.01]
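For our small sample of size 7,

$$D_7 = \frac{9}{35} \approx 0.2571429$$

From a table, the critical value for a 0.05 level test for n = 7 is 0.483. Since 0.2571 < 0.483, we do not reject the hypothesis that the sample came from the uniform(0,1) distribution.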
For our large sample of size 100,000,

$$D_{100000} = 0.00152392$$

The approximate critical value for a 0.05 level test for n = 100,000 is $1.36/\sqrt{100000} \approx 0.0043$.
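The coefficients of skewness and kurtosis can be expressed as

$$b_1 = \frac{E[u^3]}{(\sigma^2)^{3/2}}, \qquad b_2 = \frac{E[u^4]}{(\sigma^2)^2}$$

• The Bera-Jarque test statistic is given by

$$W = T \left[ \frac{b_1^2}{6} + \frac{(b_2 - 3)^2}{24} \right] \sim \chi^2(2)$$

• We estimate $b_1$ and $b_2$ using the residuals from the OLS regression, $\hat{u}$.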
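A small sketch of how the statistic might be computed from a list of residuals. It assumes that the expectations are replaced by sample moments about the mean (OLS residuals from a model with an intercept already have mean zero, so the centering is harmless); the function name `bera_jarque` and the toy residuals are hypothetical.

```python
def bera_jarque(residuals):
    """Return (b1, b2, W) where W = T * (b1^2 / 6 + (b2 - 3)^2 / 24)."""
    T = len(residuals)
    mean = sum(residuals) / T
    centered = [u - mean for u in residuals]
    m2 = sum(c ** 2 for c in centered) / T      # estimate of sigma^2
    m3 = sum(c ** 3 for c in centered) / T      # estimate of E[u^3]
    m4 = sum(c ** 4 for c in centered) / T      # estimate of E[u^4]
    b1 = m3 / m2 ** 1.5                         # skewness coefficient
    b2 = m4 / m2 ** 2                           # kurtosis coefficient
    W = T * (b1 ** 2 / 6 + (b2 - 3) ** 2 / 24)
    return b1, b2, W

# Toy residuals, for illustration only
b1, b2, W = bera_jarque([-2.0, -1.0, 0.0, 1.0, 2.0])
print(b1, b2, W)
```

W is compared with a chi-square critical value on 2 degrees of freedom (about 5.99 at the 5% level); normality is rejected when W exceeds it.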
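• The parameters of the regression model and λ can be estimated simultaneously using the method of maximum likelihood.
• Use

$$
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda \dot{y}^{\lambda - 1}}, & \lambda \ne 0 \\
\dot{y} \ln y, & \lambda = 0
\end{cases}
$$

where $\dot{y} = \ln^{-1}\left[ \frac{1}{n} \sum_{i=1}^{n} \ln y_i \right]$ is the geometric mean of the observations, and fit the model

$$y^{(\lambda)} = X\beta + \varepsilon$$

• $\dot{y}^{\lambda - 1}$ is related to the Jacobian of the transformation converting the response variable $y$ into $y^{(\lambda)}$.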
• Computation procedure:
• Choose λ to minimize SSRes(λ).
• Use 10-20 values of λ to compute SSRes(λ). Then plot SSRes(λ) vs. λ. Finally, read the value of λ that minimizes SSRes(λ) from the graph.
• A second iteration can be performed using a finer mesh of values if desired.
• We cannot select λ by directly comparing residual sums of squares from the regressions of y^λ on x, because each λ puts the response on a different scale.
• Once λ is selected, the analyst is free to fit the model using y^λ (λ ≠ 0) or ln y (λ = 0).
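The grid search above can be sketched as follows. This is a simplified illustration under stated assumptions: a single predictor x with an intercept (the slides use a general X), a geometric-mean-scaled power transformation, and made-up data constructed so that the square root of y is exactly linear in x. All function names are hypothetical.

```python
import math

def boxcox(y, lam, gm):
    """Scaled power transformation y^(lambda); gm is the geometric mean of y."""
    if abs(lam) < 1e-12:
        return [gm * math.log(v) for v in y]
    return [(v ** lam - 1.0) / (lam * gm ** (lam - 1.0)) for v in y]

def ss_res(x, z):
    """Residual sum of squares from a simple linear regression of z on x."""
    n = len(x)
    xbar, zbar = sum(x) / n, sum(z) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxz = sum((a - xbar) * (b - zbar) for a, b in zip(x, z))
    slope = sxz / sxx
    intercept = zbar - slope * xbar
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, z))

def best_lambda(x, y, grid):
    """Return the grid value of lambda that minimizes SSRes(lambda)."""
    gm = math.exp(sum(math.log(v) for v in y) / len(y))
    return min(grid, key=lambda lam: ss_res(x, boxcox(y, lam, gm)))

# Hypothetical data: y is the square of a linear function of x
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [(2 + 0.5 * a) ** 2 for a in x]
grid = [k / 10 for k in range(-20, 21)]    # lambda from -2.0 to 2.0
print(best_lambda(x, y, grid))
```

Because the transformation is scaled by the geometric mean, the SSRes(λ) values are comparable across the grid, which is what makes taking the minimum meaningful.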
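• An approximate confidence interval (C.I.) for λ helps in choosing the final value: λ̂ may be the value that minimizes SSRes(λ), but if 0.5 is in the C.I., then we would prefer to choose λ = 0.5. If 1 is in the C.I., then no transformation may be necessary.
• Maximize $L(\lambda) = -\frac{n}{2} \ln\left[ SS_{Res}(\lambda) \right]$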
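• Let $SS^{*} = SS_{Res}(\hat{\lambda}) \, e^{\chi^2_{\alpha,1}/n}$
• $\exp(\chi^2_{\alpha,1}/n)$ can be approximated by $1 + z^2_{\alpha/2}/n$ or $1 + t^2_{\alpha/2,\nu}/n$ or $1 + \chi^2_{\alpha,1}/n$, where $\nu$ is the number of residual degrees of freedom.
• This is based on $\exp(x) = 1 + x + x^2/2! + \cdots$ and $\chi^2_1 = z^2 \approx t^2_{\nu}$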
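To get a feel for how close the first-order approximation is, one can compare exp(x) with 1 + x at x equal to the chi-square quantile divided by n. The sketch below assumes α = 0.05 and uses the identity that the chi-square quantile on 1 degree of freedom equals the squared normal quantile; the sample sizes are arbitrary illustration values.

```python
import math
from statistics import NormalDist

alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96
chi2 = z * z                              # chi-square_{alpha,1} = z_{alpha/2}^2

for n in (10, 30, 100):
    exact = math.exp(chi2 / n)            # exp(chi-square_{alpha,1} / n)
    approx = 1 + chi2 / n                 # first two terms of the series for exp(x)
    print(n, round(exact, 4), round(approx, 4))
```

The two columns move closer together as n grows, since the neglected terms of the series are of order (χ²/n)².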
Questions