Excel Econometrics Models

EXCEL AND ECONOMETRICS
Edward Omey
HUB
Edward.omey@hubrussel.be
www.edwardomey.com
Introduction
1. Preparation
2. Linear models
3. Selection of variables part 1
3.1. Example and first order QMC
3.2. Higher order QMC
4. Regression analysis
4.1. Example 1 (Ctd)
4.1.1. MODEL 1: Y^ = a + bX1
4.1.2. MODEL 2: Y^ = a + bX1 + cX2
4.2. Basic Assumptions
4.2.1. BA1: E(ε(i)) = 0
4.2.2. Homoscedasticity
4.2.3. BA3: no autocorrelation
4.2.4. BA4: ε(i) ~ N(0, σ²)
Econometrics and Excel
E.Omey
HUB BBA
Introduction
This text is a guide to using EXCEL for calculations related to linear models in econometrics.
The data I use here (EDUCATION DATA) are available as an Excel sheet on my webpage.
1. Preparation
We need a tool to make the calculations. The tool is called DATA ANALYSIS and it is activated
via TOOLS ==> ADD-INS ==> DATA ANALYSIS:
Now we find the DATA ANALYSIS as one of the options in TOOLS:
Selecting DATA ANALYSIS, we see the following screen:
We are going to use:

- Correlation: allows us to calculate the correlation coefficients of the data;
- Descriptive Statistics;
- Regression: regression analysis for linear models.
2. Linear models
We will study linear models or models that can be made linear. By linear we mean that the
model is linear in the parameters.
linear:       Y^ = a + bX1 + cX2 + dX3
exponential:  Y^ = a * e^(bX)      (==> ln(Y^) = ln(a) + bX, or Y*^ = a* + bX)
loglinear:    Y^ = a * X^b * Z^c   (==> ln(Y^) = ln(a) + b ln(X) + c ln(Z))
etc.
We can transform variables by using the FUNCTION WIZARD:
Complicated and non-linear models should be treated by using more sophisticated software.
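The transformation of the exponential model can be illustrated outside EXCEL. The following Python sketch is only an illustration: the data are hypothetical (generated from Y = 2 * e^(0.5X), not taken from the education data) and the helper fit_line is a name introduced here. It fits a straight line to ln(Y) and transforms the intercept back.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data from Y = 2 * exp(0.5 * X); not the education data.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# ln(Y) = ln(a) + b*X is linear in the parameters a* = ln(a) and b.
ln_y = [math.log(yi) for yi in y]
a_star, b = fit_line(x, ln_y)
a = math.exp(a_star)  # back-transform the intercept a* to a

print(a, b)
```

Because the hypothetical data follow the exponential model exactly, the fit recovers a and b up to rounding error.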
In EXCEL we estimate parameters by using OLS, this is the METHOD OF LEAST SQUARES.
To construct and evaluate a good econometric model, we need to discuss several topics:
* selection of variables
- to avoid QMC-problems
- to find the marginal contribution of variables.
* the determination coefficient or R²: do the results of the analysis correspond to reality?
* theoretical consistency: is the sign of the estimated parameters theoretically correct?
* predictive power: can we use the model to make predictions?
* the basic assumptions: can we assume that they hold?
3. Selection of variables part 1

3.1. Example and first order QMC
We try to explain Y = the expenses for education in the states of the US. To this end we use a
small number of variables:
X1 = number of people that live in cities per 1000;
X2 = mean income;
X3 = number of young people (under 18) per 1000.
These are the data:
state   Y     X1    X2      X3
1       235   508   394,4   325
2       231   564   457,8   323
3       270   322   401,1   328
4       261   846   523,3   305
5       300   871   478     303
6       317   774   588,9   307
7       387   856   566,3   301
8       285   889   575,9   310
9       300   715   489,4   300
10      221   753   501,2   324
11      264   649   490,8   329
12      308   830   575,3   320
13      379   738   543,9   337
14      342   659   463,4   328
15      378   664   492,1   330
16      232   572   486,9   318
17      231   701   467,2   309
18      246   443   478,2   333
19      230   446   429,6   330
20      268   615   482,7   318
21      337   661   505,7   304
22      344   722   554     328
23      330   766   533,1   323
24      261   631   741,5   317
25      214   390   382,8   310
26      245   450   412     321
27      233   576   381,7   342
28      250   603   424,3   339
29      243   805   464,7   287
30      216   523   396,7   325
31      212   588   394,6   315
32      208   584   372,4   332
33      215   445   344,8   358
34      221   500   368     320
35      244   661   382,5   355
36      234   680   418,9   306
37      269   797   433,6   335
38      269   534   441,8   335
39      268   541   432,3   344
40      323   605   482     331
41      304   785   504,6   324
42      317   698   376,4   366
43      332   796   450,4   340
44      315   804   400,5   378
45      291   809   556     330
46      312   726   498,9   313
47      316   671   469,7   305
48      332   909   543,8   307
49      311   831   530,9   333
We expect that each of the variables has a positive influence on Y.

To find the order in which we will include the variables in our model, we calculate all
correlation coefficients.
We use TOOLS ==> DATA ANALYSIS ==> CORRELATION:
As INPUT RANGE we select all the data and the labels.

As OUTPUT RANGE we first select the option button (the white ball) and then click in the white
bar that appears. Then we select an empty cell where we want our output. Then we click OK.
The result is:

      Y          X1         X2         X3
Y     1
X1    0,559354   1
X2    0,506961   0,55568    1
X3    -0,01823   -0,22613   -0,41237   1
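The CORRELATION tool reports Pearson correlation coefficients. As a cross-check outside EXCEL, a coefficient can be computed with a few lines of Python. This is a minimal sketch: the function name pearson_r is introduced here, and the small sample uses only the first five states of the data table, so its result is not the value in the table above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# First five states only (illustration; the full table has 49 states).
y_sample = [235, 231, 270, 261, 300]
x1_sample = [508, 564, 322, 846, 871]
r_sample = pearson_r(y_sample, x1_sample)
print(r_sample)
```

Passing the full Y and X1 columns instead of the five-state sample gives the coefficient for all 49 states.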
The table serves for several purposes:

a. We can sort the variables w.r.t. their importance for Y. The order is:
X1 with r(Y, X1) = 0.559
X2 with r(Y, X2) = 0.507
X3 with r(Y, X3) = -0.018
Our first choice will be X1, then possibly X2 and then possibly X3.
b. We can check the sign of r(Y, .) and check if it is consistent with our theoretical
expectations. Here we find:
X1: r(Y, X1) > 0: OK;
X2: r(Y, X2) > 0: OK;
X3: r(Y, X3) < 0: problem!
For X3 we expected a positive sign! There are several possible explanations for this problem:
- we made a mistake when entering the data;
- the theoretical expectations were wrong;
- the correlation coefficient is not significantly different from zero.
In our example the third reason applies!
We confirm this by using the t-test for correlation coefficients.
We calculate the t-value:
t(r) = r * sqrt(n - 2) / sqrt(1 - r²)
Here n = 49 and r = -0.018, and we find t(r) = -0.125.
The P-value (using the t-distribution with parameter n - 2) can be found using
FUNCTION WIZARD ==> TDIST:
As x-value we take the absolute value of the calculated t-value (TDIST does not accept a
negative x); the degrees of freedom are n - 2; for tails we choose 1 because we want the
one-sided P-value.
We find a P-value of 0.45 or 45%.

This large P-value shows that r(Y, X3) is not significantly different from ρ = 0.
But then the sign (+ or -) is unimportant.
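The t-value and the one-sided P-value can be reproduced with the Python sketch below. It is an illustration under stated assumptions: instead of calling a statistics library, it integrates the density of the t-distribution numerically with Simpson's rule, and the function names t_pdf and one_tailed_p are introduced here.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def one_tailed_p(t, df, steps=2000):
    """P(T > |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 0.5
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3          # P(0 < T < |t|)
    return 0.5 - area

n = 49
r = -0.01823                      # r(Y, X3) from the correlation table
t_value = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
p_value = one_tailed_p(t_value, n - 2)
print(round(t_value, 3), round(p_value, 2))
```

With these inputs the sketch gives a t-value of about -0.125 and a one-sided P-value of about 0.45, in line with the values found with TDIST.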
c. As a third use, we take a closer look at the small correlation coefficients. It is
possible that the relationship with Y is not linear but of another form.
We check this in a graph. We make an X-Y-scatter with X3 on the horizontal axis.
We get the following result:
[Figure: raw XY scatter of Y against X3 (Series1)]
Some make-up leads to the following graph:
[Figure: XY scatter of Y against X3 after make-up]
We change the scale and set the minimum to 250 (horizontal axis) resp. 200 (vertical axis):
[Figure: XY scatter of Y against X3, horizontal axis from 250 to 400, vertical axis from 200 to 400]
In the graph, no non-linear relationship is visible either!

The correlation r(Y, X3) is small and we do not see a good way to transform X3.
So, possibly X3 was a bad choice of variable.
d. The table with correlations can be used to select 2 variables.
In our example we choose:
FIRST VARIABLE: we take X1 because r(Y, X1) is the largest number;
SECOND VARIABLE: we check X2.
* We are allowed to choose X2 if there are no QMC-problems. In our example we have
r(X1, X2) = 0.556. Since this is less than 0.60 (cf. classroom), we have no QMC-problem.
Conclusion
The analysis shows that we are ready to examine 2 models:
Model 1: Y^ = a + bX1
Model 2: Y^ = a + bX1 + cX2
3.2. Higher order QMC

The next variable to use is X3. To decide whether or not X3 will be included in the model, we
have to check first order QMC and then higher order QMC.
First order QMC: we look at r(X1, X3) = -0.226 and r(X2, X3) = -0.412. In absolute value,
both correlation coefficients are less than the limit 0.60.
Higher order QMC: we consider the model
X3^ = u + vX1 + wX2     (*)
in which we try to explain X3 (the candidate) with the variables that we already included in
the model. If we can explain X3 well by this model, we have a problem of QMC. If we cannot
explain X3 well with this model, we conclude that there are no QMC-problems.
We use R² of model (*) and take as a limit the value of 36%: if the calculated R² is larger
than 36% we consider this as QMC and we do not include X3 in the model.
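Both checks can be sketched in Python from the correlation table alone. The sketch relies on one assumption that is not in the text: for a model with exactly two explanatory variables, R² can be written in terms of the three pairwise correlations (a standard identity). The function name r2_two_regressors is introduced here. In EXCEL one would instead run REGRESSION with X3 as Y-range and X1, X2 as X-range and read off R Square.

```python
def r2_two_regressors(r_y1, r_y2, r_12):
    """R² of a regression of y on two regressors, from pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Correlations taken from the correlation table
r_x1_x3 = -0.22613
r_x2_x3 = -0.41237
r_x1_x2 = 0.55568

# First order QMC: every |r| with the candidate X3 must stay below 0.60
first_order_ok = max(abs(r_x1_x3), abs(r_x2_x3)) < 0.60

# Higher order QMC: R² of model (*) must stay below 36%
r2_aux = r2_two_regressors(r_x1_x3, r_x2_x3, r_x1_x2)
higher_order_ok = r2_aux < 0.36

print(first_order_ok, round(r2_aux, 2), higher_order_ok)
```

With the correlations of the example, R² of model (*) comes out at roughly 0.17, below the 36% limit, so this sketch signals no higher order QMC for X3.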
4. Regression analysis
4.1. Example 1 (Ctd)
4.1.1. MODEL 1: Y^ = a + bX1
Using COPY - PASTE we select the data that we need. We create a new worksheet:
INSERT ==> NEW WORKSHEET.
Here we put the data (Y and X1) that we want:
state   Y     X1
1       235   508
2       231   564
3       270   322
4       261   846
5       300   871
6       317   774
7       387   856
8       285   889
9       300   715
10      221   753
11      264   649
12      308   830
13      379   738
14      342   659
15      378   664
16      232   572
17      231   701
18      246   443
19      230   446
20      268   615
21      337   661
22      344   722
23      330   766
24      261   631
25      214   390
26      245   450
27      233   576
28      250   603
29      243   805
30      216   523
31      212   588
32      208   584
33      215   445
34      221   500
35      244   661
36      234   680
37      269   797
38      269   534
39      268   541
40      323   605
41      304   785
42      317   698
43      332   796
44      315   804
45      291   809
46      312   726
47      316   671
48      332   909
49      311   831
Now we choose: TOOLS ==> DATA ANALYSIS ==> REGRESSION.
* Input Y-range: we select the data about Y (with LABEL);

* Input X-range: we select the variables (with LABEL);
* We select labels;
* Confidence level: EXCEL gives 95% confidence statements. If we also want 99%
statements, we change 95% into 99% (then we get 95% and 99% confidence statements);
* Output range: click the option button (the white ball) and then select an empty cell in the
white space;
* Residuals: the residuals are given by e(i) = Y(i) - Y^(i);
* We can ask for graphs but the quality of the graphs is bad.
In our example we choose:
We get the following output:

PART 1
Regression Statistics
Multiple R           0,55935389
R Square             0,31287678
Adjusted R Square    0,29825713
Standard Error       40,862049
Observations         49

ANOVA
             df   SS            MS         F           Significance F
Regression   1    35733,60563   35733,61   21,401123   2,94097E-05
Residual     47   78476,23111   1669,707
Total        48   114209,8367

            Coefficients   Standard Error   t Stat     P-value     Lower 95%     Upper 95%   Lower 99,0%   Upper 99,0%
Intercept   151,419701     28,10288381      5,388049   2,24E-06    94,88404472   207,95536   75,9759497    226
X1          0,19170106     0,041438711      4,626135   2,941E-05   0,108337205   0,2750649   0,08045654    0,30
(the last column is cut off in the source)
* First we get general information about the model:

R² = 31% and R = 56%
Adjusted R² (not used here)
standard error = s(e) = sqrt(SSE/(n - p))
where SSE = e(1)² + e(2)² + ... and p is the number of estimated parameters (here p = 2)
* s²(e) is a good estimator for the variance σ² = Var(ε).
* The number of observations is n = 49.
* ANOVA = the analysis of variations or variances
SSRegression = SSR = variation of Y^ = the explained variation = 35733.6

SSResidual = SSE = variation of the errors e = the unexplained variation = 78476.2
SSTotal = SST = variation of Y = the variation we have to explain = 114209.8
In the column MS, we find the mean squares (SS divided by df).
For linear models with a constant term, we always have:
SST = SSR + SSE and R² = SSR/SST = 1 - SSE/SST = (r(Y, Y^))²
* The value of R² can be evaluated with an F-value

F = (R² / (p - 1)) / ((1 - R²) / (n - p))
and its P-value can be found by using the F(p - 1, n - p)-distribution.

In EXCEL the P-value is given by Significance F.
In the example, the P-value is small enough to conclude that R² is significantly different from
zero. Our model makes sense!
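The relations between the ANOVA numbers can be checked with a short Python sketch. It only uses the sums of squares printed in the output of MODEL 1; the variable names are introduced here for illustration.

```python
import math

n = 49                       # number of observations
p = 2                        # estimated parameters: a and b
ss_regression = 35733.60563  # SSR from the ANOVA table
ss_residual = 78476.23111    # SSE from the ANOVA table

ss_total = ss_regression + ss_residual         # SST = SSR + SSE
r_squared = ss_regression / ss_total           # R² = SSR/SST
std_error = math.sqrt(ss_residual / (n - p))   # s(e) = sqrt(SSE/(n - p))
f_value = (r_squared / (p - 1)) / ((1 - r_squared) / (n - p))

print(round(r_squared, 4), round(std_error, 3), round(f_value, 2))
```

The results agree with R Square, Standard Error and F in the EXCEL output (about 0.3129, 40.862 and 21.40).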
* Next we get information about the parameter estimates (a^ for a, and b^ for b) and their
statistical properties.
For the parameter a we find:
a^ = least squares estimate = 151.4
s(a^) = the estimated standard error of a^ = 28.1
the t-value is t = a^/s(a^) = 5.39
the P-value of a^ = 2 times the one-sided prob-value of the t-value 5.39
95% c.i. for the parameter a
99% c.i. for the parameter a
For parameter b we get similar information.
In this example we conclude that the parameter estimates significantly differ from zero
(high t-value; small P-value; 0 is NOT in the c.i.).
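As an illustration, the t-values and the check "0 is not in the c.i." can be redone in Python from the coefficient table of MODEL 1. The dictionary layout and the variable names are choices made for this sketch; the numbers are copied from the output.

```python
# name: (estimate, standard error, lower 95%, upper 95%) from the EXCEL output
coefficients = {
    "Intercept": (151.419701, 28.10288381, 94.88404472, 207.95536),
    "X1": (0.19170106, 0.041438711, 0.108337205, 0.2750649),
}

t_stats = {}
zero_in_ci = {}
for name, (estimate, std_err, lower95, upper95) in coefficients.items():
    t_stats[name] = estimate / std_err             # t = estimate / s(estimate)
    zero_in_ci[name] = lower95 <= 0.0 <= upper95   # True would mean "not significant"

for name in coefficients:
    print(name, round(t_stats[name], 3), zero_in_ci[name])
```

In a model with one explanatory variable, the square of the t-value of b equals the F-value of the model: 4.626² is about 21.40.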
PART 2: the residuals
We print a part of the residual output:
RESIDUAL OUTPUT
Observation   Predicted Y   Residuals
1             248,803839    -13,80383946
2             259,539099    -28,53909879
3             213,147442    56,85255759
4             313,598798    -52,59879754
5             318,391324    -18,39132403
6             299,796321    17,20367874
7             315,515808    71,48419186
8             321,841943    -36,8419431
9             288,485959    11,51404124
10            295,770599    -74,77059902
11            275,833689    -11,83368884
12            310,531581    -2,531580591
13            292,895083    86,10491687
14            277,750699    64,24930057
15            278,709205    99,29079527
16            261,072707    -29,07270727
17            285,802144    -54,80214393
In the second column, we find the Y^-values.

In the last column, we find the errors e(i) = Y(i) - Y^(i).
The residuals are needed to check the basic assumptions later.
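The two columns of the residual output follow directly from the estimated model. A minimal Python sketch, using the estimates of MODEL 1 and the first four observations (the variable names are introduced here):

```python
a_hat = 151.419701   # estimated intercept of MODEL 1
b_hat = 0.19170106   # estimated coefficient of X1

# First four observations of the data table
y = [235, 231, 270, 261]
x1 = [508, 564, 322, 846]

predicted = [a_hat + b_hat * x for x in x1]               # Y^(i) = a^ + b^ X1(i)
residuals = [yi - yhat for yi, yhat in zip(y, predicted)]  # e(i) = Y(i) - Y^(i)

for obs, (yhat, e) in enumerate(zip(predicted, residuals), start=1):
    print(obs, round(yhat, 4), round(e, 4))
```

The values agree with the first four rows of the residual output.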
PART 3:
We get some bad graphs and we are not going to use these graphs: we will make our own
graphs!
[Figure: X1 Residual Plot (Residuals against X1)]
[Figure: X1 Line Fit Plot (Y and Predicted Y against X1)]
It is better to make our own graphs. In our example we make a scatter plot of (Y, Y^).
By using COPY - PASTE we place Y and Y^ in separate columns, and then use
CHART WIZARD ==> XY scatter
Y     Predicted Y
235   248,803839
231   259,539099
270   213,147442
261   313,598798
300   318,391324
317   299,796321
387   315,515808
285   321,841943
300   288,485959
221   295,770599
264   275,833689
308   310,531581
379   292,895083
342   277,750699
378   278,709205
...   ...
212   264,139924
208   263,37312
215   236,726673
221   247,270231
244   278,134102
234   281,776422
269   304,205446
269   253,788067
268   255,129974
323   267,398842
304   301,905033
317   285,227041
332   304,013745
315   305,547353
291   306,505858
312   290,59467
316   280,051112
332   325,675964
311   310,723282
We get:
[Figure: raw XY scatter of (Y, Y^) (Series1)]
After make-up we get:
[Figure: XY scatter with Y on the horizontal axis (200 to 400) and Y^ on the vertical axis (200 to 340)]
In the ideal situation, the points (Y, Y^) lie on the first diagonal (the 45° line).
4.1.2. MODEL 2: Y^ = a + bX1 + cX2

We COPY - PASTE the data that we need into a new worksheet.
state   Y     X1    X2
1       235   508   394,4
2       231   564   457,8
3       270   322   401,1
4       261   846   523,3
5       300   871   478
6       317   774   588,9
7       387   856   566,3
8       285   889   575,9
9       300   715   489,4
10      221   753   501,2
11      264   649   490,8
12      308   830   575,3
13      379   738   543,9
14      342   659   463,4
15      378   664   492,1
16      232   572   486,9
17      231   701   467,2
18      246   443   478,2
19      230   446   429,6
20      268   615   482,7
21      337   661   505,7
22      344   722   554
23      330   766   533,1
24      261   631   741,5
25      214   390   382,8
26      245   450   412
27      233   576   381,7
28      250   603   424,3
29      243   805   464,7
30      216   523   396,7
31      212   588   394,6
32      208   584   372,4
33      215   445   344,8
34      221   500   368
35      244   661   382,5
36      234   680   418,9
37      269   797   433,6
38      269   534   441,8
39      268   541   432,3
40      323   605   482
41      304   785   504,6
42      317   698   376,4
43      332   796   450,4
44      315   804   400,5
45      291   809   556
46      312   726   498,9
47      316   671   469,7
48      332   909   543,8
49      311   831   530,9
In DATA ANALYSIS we choose regression and then fill in the form (without choosing plots)
We get the following output:
SUMMARY OUTPUT
Regression Statistics
Multiple R           0,607069231
R Square             0,368533051
Adjusted R Square    0,341077966
Standard Error       39,59571155
Observations         49

ANOVA
             df   SS            MS          F          Significance F
Regression   2    42090,09956   21045,05    13,42313   2,559E-05
Residual     46   72119,73717   1567,8204
Total        48   114209,8367

            Coefficients   Standard Error   t Stat      P-value    Lower 95%   Upper 95%   Lower 99,0%   Upper 99,0%
Intercept   100,1897013    37,26811096      2,6883493   0,009966   25,172956   175,20645   0,0498717     200,32
X1          0,137661559    0,04829766       2,8502739   0,006517   0,0404435   0,2348796   0,0078852     0,2674
X2          0,184833355    0,091795138      2,0135419   0,049929   5,95E-05    0,3696072   -0,061821     0,4314
(the last column is cut off in the source)
* R² = 0.368 with an F-value = 13.42 with Prob-value = 2.559E-5: OK

* The parameter estimates show that the parameters are different from 0 at the 95% level; at
the 99% level, the 3rd estimate is not significantly different from 0.
* The marginal contribution of X2 is equal to
MC(X2) = R²(NEW model 2) - R²(OLD model 1) = 0.3685 - 0.3129 = 0.0557.
The corresponding F-value is
F = MC(X2) * (n - p(NEW model 2)) / (1 - R²(NEW model 2)) = 0.0557 * 46 / 0.6315 = 4.05
which equals the square of the t Stat of X2 (2.0135² = 4.05).
The P-value of this number can be found using the F(1, n - p(new))-distribution.
Using the FUNCTION WIZARD and FDIST we find a P-value = 5.0%, the same as the P-value of X2
in the output.
If we decide to use α = 10%, this P-value shows that the marginal contribution of X2 is
acceptable. If we choose another level, the decision can be different.
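The marginal contribution and its F-value can be evaluated with a few lines of Python. The sketch plugs the two R Square values from the EXCEL outputs into the formula as written, with 1 - R²(NEW model 2) in the denominator; the variable names are introduced here.

```python
n = 49
p_new = 3                 # parameters of MODEL 2: a, b and c
r2_old = 0.31287678       # R Square of MODEL 1
r2_new = 0.368533051      # R Square of MODEL 2
t_stat_x2 = 2.0135419     # t Stat of X2 in the output of MODEL 2

mc_x2 = r2_new - r2_old                          # marginal contribution of X2
f_mc = mc_x2 * (n - p_new) / (1 - r2_new)        # F-value of the marginal contribution

print(round(mc_x2, 4), round(f_mc, 2), round(t_stat_x2 ** 2, 2))
```

With these inputs the F-value is about 4.05. When a single variable is added, this F-value coincides with the square of the t Stat of that variable, so its P-value is the P-value reported for X2 in the output (0,049929).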
4.2. Basic Assumptions

In order to use the statistics of the previous section, several basic assumptions (BA) must hold.
It is necessary to check whether or not we can assume that these BA hold.
We use the example and MODEL 2: Y = a + bX1 + cX2 + ε
4.2.1. BA1: E(ε(i)) = 0

We make an XY-scatter with Y on the horizontal axis and the residuals on the vertical axis. If
we see CLUSTERS or OUTLIERS in the graph, we might have problems with BA1.
Using COPY - PASTE we collect the data we need (only part of the data is shown here):
Y     e
235   -8,02005
231   -31,4475
270   51,34662
261   -52,3747
300   -8,44326
317   1,411889
387   64,30088
285   -44,0164
300   10,92484
221   -75,4873
264   -16,2483
308   -12,7834
379   76,68521
342   65,43955
378   95,44653
232   -36,9275
231   -52,0446
246   -3,56108
230   -10,9912
268   -6,07062
As a graph we find (after make-up):
[Figure: XY scatter of the residuals e (vertical axis, -100 to 150) against Y (horizontal axis, 200 to 400)]
We do not see outliers.

A closer look shows that we have about 3 clusters or groups of data. We need to investigate
the origin of these clusters and possibly adapt the model by adding new variables.
4.2.2. Homoscedasticity
We have to check whether or not Var(ε) is a constant. We make graphs to check this
assumption.
By COPY - PASTE, we make the following table and we add a new column in which we
calculate e². We find (part of the table is shown here):
e          Y     X1    X2      e²
-8,0200    235   508   394,4   64,32118
-31,4475   231   564   457,8   988,9472
51,3466    270   322   401,1   2636,475
-52,3747   261   846   523,3   2743,107
-8,4433    300   871   478     71,28869
1,4119     317   774   588,9   1,993431
64,3009    387   856   566,3   4134,603
-44,0164   285   889   575,9   1937,44
10,9248    300   715   489,4   119,3521
-75,4873   221   753   501,2   5698,337
-16,2483   264   649   490,8   264,0061
-12,7834   308   830   575,3   163,4159
76,6852    379   738   543,9   5880,621
65,4396    342   659   463,4   4282,335
95,4465    378   664   492,1   9110,04
-36,9275   232   572   486,9   1363,638
-52,0446   231   701   467,2   2708,64
-3,5611    246   443   478,2   12,68131
-10,9912   230   446   429,6   120,8057
-6,0706    268   615   482,7   36,85243
Method 1: graphs
We make XY-scatters with e² on the vertical axis and with Y or X1 or X2 or the index i
(for time series) on the horizontal axis. The ideal graph is a horizontal box.
We find (after make-up):
[Figure: XY scatter of e² (0 to 10000) against Y (200 to 400)]
[Figure: XY scatter of e² (0 to 10000) against X1 (200 to 1000)]
[Figure: XY scatter of e² (0 to 10000) against X2 (200 to 800)]
Clearly the graphs do not really show a horizontal box. Moreover, the clusters can be seen
more clearly now.
Method 2: correlations
Under ideal situations, we find r(e², Y) = r(e², X1) = r(e², X2) = 0.
We calculate the correlations (using the same table as in Method 1) and find:
      e²
Y     0,38436
X1    0,15507
X2    0,2688
e²    1
We see that r(e², Y) is not small. More investigation is needed.
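The steps of this method (square the residuals, then correlate them with Y) can be sketched in Python as follows. The sketch uses only the first eight rows of the table of Method 1, so its correlation is an illustration and not the value reported for all 49 observations; the function name pearson_r is introduced here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# First eight rows of the table of Method 1 (residuals of MODEL 2 and Y)
e = [-8.02005, -31.4475, 51.34662, -52.3747, -8.44326, 1.411889, 64.30088, -44.0164]
y = [235, 231, 270, 261, 300, 317, 387, 285]

e_squared = [ei ** 2 for ei in e]      # the extra column e²
r_e2_y = pearson_r(e_squared, y)       # sample analogue of r(e², Y)
print(round(e_squared[0], 3), r_e2_y)
```

The squared residuals, not the residuals themselves, are used because least-squares residuals are by construction uncorrelated with X1 and X2.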

Method 3: Bartlett + Park + Feldstein + ...

See handbook
4.2.3. BA3: no autocorrelation

See any handbook. We do not treat this here.
4.2.4. BA4: ε(i) ~ N(0, σ²)

We use the test of Kolmogorov-Smirnov and compare the TDF (theoretical distribution function)
and the EDF (empirical distribution function).
TDF: this is N(0, σ²).
We estimate σ² by s²(e) = (39.59)² (cf. OUTPUT of model 2).
For each error e(i), we calculate P(ε ≤ e(i)) by using the FUNCTION WIZARD and the
function NORMDIST, cf. below.
EDF:
For each error e(i) we calculate the proportion of errors ≤ e(i).
In practice, we proceed as follows:
- using COPY - PASTE, we copy the errors;
- then we sort the errors from the smallest to the largest (DATA ==> SORT) and number
them with 1, 2, 3, ...
We get columns 1 and 2 in the following table:
number   e          EDF        TDF        |EDF-TDF|
1        -75,4873   0,020408   0,028296   0,007888
2        -63,1081   0,040816   0,055489   0,014673
3        -53,8993   0,061224   0,086719   0,025494
4        -52,3747   0,081633   0,092962   0,011329
5        -52,0446   0,102041   0,094356   0,007685
6        -44,0164   0,122449   0,133146   0,010697
7        -42,0699   0,142857   0,144007   0,00115
8        -41,416    0,163265   0,147787   0,015478
9        -37,2263   0,183673   0,173568   0,010105
10       -36,9275   0,204082   0,17551    0,028572
11       -31,4475   0,22449    0,213535   0,010955
12       -29,5101   0,244898   0,22805    0,016848
13       -23,3252   0,265306   0,277902   0,012596
14       -21,0497   0,285714   0,297496   0,011782
15       -17,8828   0,306122   0,325767   0,019644
16       -17,0337   0,326531   0,333529   0,006998
17       -16,2483   0,346939   0,340773   0,006166
18       -16,0392   0,367347   0,342712   0,024635
19       -12,7834   0,387755   0,373405   0,01435
20       -11,6244   0,408163   0,38454    0,023623
21       -10,9912   0,428571   0,390666   0,037906
22       -10,6319   0,44898    0,394153   0,054827
23       -10,1796   0,469388   0,398555   0,070833
24       -8,44326   0,489796   0,415571   0,074225
25       -8,02005   0,510204   0,419744   0,09046
26       -6,07062   0,530612   0,439075   0,091537
27       -3,56108   0,55102    0,464169   0,086851
28       -1,71448   0,571429   0,482731   0,088697
29       1,411889   0,591837   0,514222   0,077614
30       2,479064   0,612245   0,524961   0,087284
31       6,163563   0,632653   0,56185    0,070803
32       6,711255   0,653061   0,567296   0,085765
33       10,92484   0,673469   0,608691   0,064778
34       13,43194   0,693878   0,632781   0,061097
35       13,63965   0,714286   0,634755   0,079531
36       19,65465   0,734694   0,690188   0,044506
37       25,82688   0,755102   0,742885   0,012217
38       30,10465   0,77551    0,776463   0,000952
39       36,62317   0,795918   0,822498   0,02658
40       38,98275   0,816327   0,83757    0,021243
41       42,02097   0,836735   0,855712   0,018977
42       50,43538   0,857143   0,898625   0,041483
43       51,15126   0,877551   0,901793   0,024242
44       51,34662   0,897959   0,902645   0,004686
45       52,34578   0,918367   0,906917   0,01145
46       64,30088   0,938776   0,947805   0,00903
47       65,43955   0,959184   0,950803   0,00838
48       76,68521   0,979592   0,973609   0,005982
49       95,44653   1          0,992035   0,007965
- In Column 3 we calculated EDF. Because of our method, we get EDF(e(i)) = i/n.

- In Column 4, for each e(i) we calculated TDF with NORMDIST (mean 0, standard deviation
s(e) = 39.59, cumulative = TRUE):
We enter this formula once and then copy it down to get the TDF.

The graph of EDF and TDF is the following (select the errors, EDF and TDF; graph 2 is
graph 1 after make-up):
[Figure: EDF and TDF against the sorted errors, raw chart (Series1 and Series2)]
[Figure: EDF and TDF against the sorted errors, after make-up]
In Column 5 of the table, we calculated |EDF - TDF| and then the maximum of these
numbers: KS = MAX(|EDF - TDF|) = 0.0915.
This value should be compared with the theoretical values of KS. In this example, we find
that KS = 0.0915 is sufficiently small and we do not reject BA4.
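The construction of the KS statistic can be sketched in Python. Two points are assumptions of this sketch and not part of the text: the normal distribution function is computed with math.erf instead of NORMDIST, and the comparison value 1.36/sqrt(n) is the usual large-sample approximation of the 5% critical value, which is only a rough guide when σ is estimated from the same data. The function names are introduced here.

```python
import math

def normal_cdf(x, sigma):
    """P(eps <= x) for eps ~ N(0, sigma²); plays the role of NORMDIST."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def ks_statistic(errors, sigma):
    """Maximum of |EDF - TDF| with EDF(e(i)) = i/n after sorting."""
    ordered = sorted(errors)
    n = len(ordered)
    return max(abs((i + 1) / n - normal_cdf(e, sigma))
               for i, e in enumerate(ordered))

s_e = 39.59571155                        # Standard Error of MODEL 2
n = 49

# Row 1 of the table: the smallest error
edf_first = 1 / n
tdf_first = normal_cdf(-75.4873, s_e)

ks_reported = 0.091537                   # maximum of Column 5 in the table
critical_5pct = 1.36 / math.sqrt(n)      # assumption: approximate 5% value

print(round(edf_first, 6), round(tdf_first, 4), ks_reported < critical_5pct)
```

Calling ks_statistic with all 49 errors of MODEL 2 and s_e reproduces the maximum of Column 5.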
As an extra, we can make a histogram of the errors e(i). If BA4 holds, we should find a
bell-shaped curve. Using the other Excel manual (descriptive statistics) we find the following
graph:
[Figure: Histogram of the errors e(i), classes from -75 to 112]
