Edward Omey
HUB
Edward.omey@hubrussel.be
www.edwardomey.com
Introduction
1. Preparation
2. Linear models
3. Selection of variables part 1
3.1. Example and first order QMC
3.2. Higher order QMC
4. Regression analysis
4.1. Example 1 (Ctd)
4.1.1. MODEL 1: Y^ = a + bX1
4.1.2. MODEL 2: Y^ = a + bX1 + cX2
4.2. Basic Assumptions
4.2.1. BA1: E(ε_i) = 0
4.2.2. Homoscedasticity
4.2.3. BA3: no autocorrelation
4.2.4. BA4: ε_i ~ N(0, σ²)
E.Omey
HUB BBA
Introduction
This text is a guide to using EXCEL for some calculations related to
linear models in econometrics.
The data I use here (EDUCATION DATA) are available as an Excel sheet on my webpage.
1. Preparation
We need a TOOL to make the calculations. The tool is called DATA ANALYSIS and it can
be found by using TOOLS ==> ADD-INS ==> DATA ANALYSIS:
2. Linear models
We will study linear models or models that can be made linear. By linear we mean that the
model is linear in the parameters.
linear:
exponential:
loglinear:
etc.
Complicated and nonlinear models should be treated with more sophisticated software.
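As a minimal sketch of what "can be made linear" means, assume that the exponential model has the form Y = a·exp(bX). This form is an assumption, since the formula itself is not shown above. Taking logarithms gives ln Y = ln a + bX, which is linear in the parameters ln a and b, so ordinary least squares can be applied. The function names and the data are hypothetical.

```python
import math

def simple_ols(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_exponential(x, y):
    """Fit Y = a*exp(b*X) by regressing ln(Y) on X (requires Y > 0)."""
    ln_a, b = simple_ols(x, [math.log(yi) for yi in y])
    return math.exp(ln_a), b

# Made-up data generated exactly from Y = 2*exp(0.5*X), for illustration only
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [2.0 * math.exp(0.5 * xi) for xi in x_demo]
a_hat, b_hat = fit_exponential(x_demo, y_demo)
print(a_hat, b_hat)
```

In a worksheet the same idea amounts to adding a column with LN(Y) and running the linear regression on that column.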
state   Y     X1    X2      X3
1       235   508   394,4   325
2       231   564   457,8   323
3       270   322   401,1   328
4       261   846   523,3   305
5       300   871   478     303
6       317   774   588,9   307
7       387   856   566,3   301
8       285   889   575,9   310
9       300   715   489,4   300
10      221   753   501,2   324
11      264   649   490,8   329
12      308   830   575,3   320
13      379   738   543,9   337
14      342   659   463,4   328
15      378   664   492,1   330
16      232   572   486,9   318
17      231   701   467,2   309
18      246   443   478,2   333
19      230   446   429,6   330
20      268   615   482,7   318
21      337   661   505,7   304
22      344   722   554     328
23      330   766   533,1   323
24      261   631   741,5   317
25      214   390   382,8   310
26      245   450   412     321
27      233   576   381,7   342
28      250   603   424,3   339
29      243   805   464,7   287
30      216   523   396,7   325
31      212   588   394,6   315
32      208   584   372,4   332
33      215   445   344,8   358
34      221   500   368     320
35      244   661   382,5   355
36      234   680   418,9   306
37      269   797   433,6   335
38      269   534   441,8   335
39      268   541   432,3   344
40      323   605   482     331
41      304   785   504,6   324
42      317   698   376,4   366
43      332   796   450,4   340
44      315   804   400,5   378
45      291   809   556     330
46      312   726   498,9   313
47      316   671   469,7   305
48      332   909   543,8   307
49      311   831   530,9   333
      Y          X1         X2         X3
Y     1
X1    0,559354   1
X2    0,506961   0,55568    1
X3    -0,01823   -0,22613   -0,41237   1
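The correlation table above can also be computed outside Excel. The following is a minimal sketch in plain Python; the function names are hypothetical, and only the first eight observations of the data are used, so the printed values will not match the full-sample correlations.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return {row: {col: r}} for every pair of named columns."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}

# First eight observations of the data set (illustration only)
sample = {
    "Y":  [235, 231, 270, 261, 300, 317, 387, 285],
    "X1": [508, 564, 322, 846, 871, 774, 856, 889],
    "X2": [394.4, 457.8, 401.1, 523.3, 478.0, 588.9, 566.3, 575.9],
    "X3": [325, 323, 328, 305, 303, 307, 301, 310],
}
matrix = correlation_matrix(sample)
for row in sample:
    print(row, [round(matrix[row][col], 4) for col in sample])
```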
As x-value we take the calculated t-value; the degrees of freedom are n − 2; for tails we
choose 1 because we want the P-value.
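The one-tailed P-value can be sketched in plain Python by integrating the Student t density numerically. This is an illustration, not the Excel routine: the function names are hypothetical, and the example assumes that the t-value is the usual statistic for testing a correlation coefficient, t = r·sqrt(n − 2)/sqrt(1 − r²), which the text above does not state explicitly.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def t_upper_tail(t_value, df, steps=2000):
    """One-tailed P-value P(T > |t|), using Simpson's rule on [0, |t|]."""
    t_abs = abs(t_value)
    if t_abs == 0:
        return 0.5
    if steps % 2:          # Simpson's rule needs an even number of intervals
        steps += 1
    h = t_abs / steps
    total = t_pdf(0.0, df) + t_pdf(t_abs, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 0.5 - total * h / 3)

# Assumed example: r(Y, X1) from the correlation table, n = 49 observations
r, n = 0.559354, 49
t_value = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
print(t_value, t_upper_tail(t_value, n - 2))
```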
and:
[Figure: XY scatter plots (Series1) with the default axis scales, starting at 0]
We change the axis scales and set the minimum to 250 and 200, respectively:
[Figure: the same scatter plot with rescaled axes, starting at 200 and 250]
(*)
In model (*) we try to explain X3 (the candidate) with the variables that we have already
included in the model. If we can explain X3 well by this model, we have a QMC problem. If we
cannot explain X3 well with this model, we conclude that there are no QMC problems.
We use R² of model (*) and take 36% as the limit: if the calculated R² is larger than
36%, we consider this to be QMC and we do not include X3 in the model.
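The rule above can be sketched as an auxiliary regression: regress the candidate on the variables already in the model and compare R² with the 36% limit. The code below is a plain-Python illustration under that reading; the function names are hypothetical and the data are made up.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def r_squared(y, regressors):
    """R^2 of the OLS regression of y on an intercept and the regressors."""
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in col] for col in regressors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(bk * ck[i] for bk, ck in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst

def has_qmc_problem(candidate, included, limit=0.36):
    """True if the included variables explain the candidate too well."""
    return r_squared(candidate, included) > limit

# Made-up data: the candidate is an exact linear combination of x1 and x2
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
candidate = [1 + 2 * a - b for a, b in zip(x1, x2)]
print(has_qmc_problem(candidate, [x1, x2]))
```

With a single included variable, the R² of the auxiliary regression is simply the squared correlation coefficient, so a limit of 36% corresponds to |r| = 0.6.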
4. Regression analysis
4.1. Example 1 (Ctd)
4.1.1. MODEL 1: Y^ = a + bX1
Using COPY PASTE we select the data that we need. We create a NEW Worksheet:
INSERT NEW WORKSHEET.
Here we put the data (Y and X1) that we want:
state   Y     X1
1       235   508
2       231   564
3       270   322
4       261   846
5       300   871
6       317   774
7       387   856
8       285   889
9       300   715
10      221   753
11      264   649
12      308   830
13      379   738
14      342   659
15      378   664
16      232   572
17      231   701
18      246   443
19      230   446
20      268   615
21      337   661
22      344   722
23      330   766
24      261   631
25      214   390
26      245   450
27      233   576
28      250   603
29      243   805
30      216   523
31      212   588
32      208   584
33      215   445
34      221   500
35      244   661
36      234   680
37      269   797
38      269   534
39      268   541
40      323   605
41      304   785
42      317   698
43      332   796
44      315   804
45      291   809
46      312   726
47      316   671
48      332   909
49      311   831
ANOVA
             df   SS            MS         F           Significance F
Regression   1    35733,60563   35733,61   21,401123   2,94097E-05
Residual     47   78476,23111   1669,707
Total        48   114209,8367
$$F = \frac{R^2/(p-1)}{(1-R^2)/(n-p)}$$
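As a quick arithmetic check of this ratio, the sketch below recomputes F for MODEL 1 from the sums of squares reported in the Excel output. It assumes that p counts the estimated parameters including the intercept (p = 2 here), which is consistent with the degrees of freedom 1 and 47; the function name is hypothetical.

```python
def f_statistic(r2, n, p):
    """F = (R^2 / (p - 1)) / ((1 - R^2) / (n - p)), with p including the intercept."""
    return (r2 / (p - 1)) / ((1.0 - r2) / (n - p))

# Values from the MODEL 1 output: regression and total sums of squares
ss_regression = 35733.60563
ss_total = 114209.8367
r2_model1 = ss_regression / ss_total
f_model1 = f_statistic(r2_model1, n=49, p=2)
print(round(r2_model1, 4), round(f_model1, 3))  # F should be close to 21,40
```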
Observation   Predicted Y   Residuals
1             248,803839    -13,80383946
2             259,539099    -28,53909879
3             213,147442    56,85255759
4             313,598798    -52,59879754
5             318,391324    -18,39132403
6             299,796321    17,20367874
7             315,515808    71,48419186
8             321,841943    -36,8419431
9             288,485959    11,51404124
10            295,770599    -74,77059902
11            275,833689    -11,83368884
12            310,531581    -2,531580591
13            292,895083    86,10491687
14            277,750699    64,24930057
15            278,709205    99,29079527
16            261,072707    -29,07270727
17            285,802144    -54,80214393
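The predicted values and residuals can be reproduced with a few lines of plain Python. This is a sketch of ordinary least squares for MODEL 1 using the Y and X1 columns listed earlier (decimal points instead of commas); the function name is hypothetical, and the output can be compared with the Predicted Y and Residuals columns from Excel.

```python
Y = [235, 231, 270, 261, 300, 317, 387, 285, 300, 221,
     264, 308, 379, 342, 378, 232, 231, 246, 230, 268,
     337, 344, 330, 261, 214, 245, 233, 250, 243, 216,
     212, 208, 215, 221, 244, 234, 269, 269, 268, 323,
     304, 317, 332, 315, 291, 312, 316, 332, 311]
X1 = [508, 564, 322, 846, 871, 774, 856, 889, 715, 753,
      649, 830, 738, 659, 664, 572, 701, 443, 446, 615,
      661, 722, 766, 631, 390, 450, 576, 603, 805, 523,
      588, 584, 445, 500, 661, 680, 797, 534, 541, 605,
      785, 698, 796, 804, 809, 726, 671, 909, 831]

def fit_line(x, y):
    """Least-squares intercept a and slope b for Y^ = a + b*X."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

a, b = fit_line(X1, Y)
predicted = [a + b * x for x in X1]
residuals = [y - p for y, p in zip(Y, predicted)]

# Print the first rows in the same layout as the Excel residual output
for i in range(8):
    print(i + 1, round(predicted[i], 6), round(residuals[i], 6))
```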
[Figure: X1 Residual Plot, residuals (vertical axis) against X1 (horizontal axis)]
[Figure: Y and Predicted Y (vertical axis) against X1 (horizontal axis)]
It is better to make our own graphs. In our example we make a scatter plot of (Y, Y^).
By using COPY - PASTE we place Y and Y^ in separate columns, and then use
CHART WIZARD ==> XY scatter
Y     Predicted Y
235   248,803839
231   259,539099
270   213,147442
261   313,598798
300   318,391324
317   299,796321
387   315,515808
285   321,841943
300   288,485959
221   295,770599
264   275,833689
308   310,531581
379   292,895083
342   277,750699
378   278,709205
...   ...
212   264,139924
208   263,37312
215   236,726673
221   247,270231
244   278,134102
234   281,776422
269   304,205446
269   253,788067
268   255,129974
323   267,398842
304   301,905033
317   285,227041
332   304,013745
315   305,547353
291   306,505858
312   290,59467
316   280,051112
332   325,675964
311   310,723282
We get
[Figure: XY scatter of Y^ against Y (Series1) with the default axis scales, starting at 0]
[Figure: the same scatter with rescaled axes, Y^ from 200 to 340 (vertical) and Y from 200 to 400 (horizontal)]
4.1.2. MODEL 2: Y^ = a + bX1 + cX2
state   Y     X1    X2
1       235   508   394,4
2       231   564   457,8
3       270   322   401,1
4       261   846   523,3
5       300   871   478
6       317   774   588,9
7       387   856   566,3
8       285   889   575,9
9       300   715   489,4
10      221   753   501,2
11      264   649   490,8
12      308   830   575,3
13      379   738   543,9
14      342   659   463,4
15      378   664   492,1
16      232   572   486,9
17      231   701   467,2
18      246   443   478,2
19      230   446   429,6
20      268   615   482,7
21      337   661   505,7
22      344   722   554
23      330   766   533,1
24      261   631   741,5
25      214   390   382,8
26      245   450   412
27      233   576   381,7
28      250   603   424,3
29      243   805   464,7
30      216   523   396,7
31      212   588   394,6
32      208   584   372,4
33      215   445   344,8
34      221   500   368
35      244   661   382,5
36      234   680   418,9
37      269   797   433,6
38      269   534   441,8
39      268   541   432,3
40      323   605   482
41      304   785   504,6
42      317   698   376,4
43      332   796   450,4
44      315   804   400,5
45      291   809   556
46      312   726   498,9
47      316   671   469,7
48      332   909   543,8
49      311   831   530,9
In DATA ANALYSIS we choose regression and then fill in the form (without choosing plots).
SUMMARY OUTPUT

Regression Statistics
Multiple R          0,607069231
R Square            0,368533051
Adjusted R Square   0,341077966
Standard Error      39,59571155
Observations        49

ANOVA
             df   SS            MS          F          Significance F
Regression   2    42090,09956   21045,05    13,42313   2,559E-05
Residual     46   72119,73717   1567,8204
Total        48   114209,8367
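The figures in this output are linked by simple formulas, which the following sketch recomputes from the three sums of squares. It assumes p = 3 estimated parameters (intercept, X1, X2), in line with the degrees of freedom 2 and 46; the variable names are hypothetical.

```python
import math

n, p = 49, 3                      # observations, parameters incl. intercept
ss_regression = 42090.09956
ss_residual = 72119.73717
ss_total = 114209.8367

r_square = ss_regression / ss_total
multiple_r = math.sqrt(r_square)
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - p)
standard_error = math.sqrt(ss_residual / (n - p))        # sqrt of MS residual
f_value = (ss_regression / (p - 1)) / (ss_residual / (n - p))

print(round(multiple_r, 6), round(r_square, 6), round(adjusted_r_square, 6))
print(round(standard_error, 5), round(f_value, 5))
```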
e
-8,02005
-31,4475
51,34662
-52,3747
-8,44326
1,411889
64,30088
-44,0164
10,92484
-75,4873
-16,2483
-12,7834
76,68521
65,43955
95,44653
-36,9275
-52,0446
-3,56108
-10,9912
-6,07062
[Figure: scatter plot of the residuals e (vertical axis, -100 to 150) with a horizontal axis from 200 to 400]
4.2.2. Homoscedasticity
We have to check whether or not Var(ε_i) is constant. We make graphs to check this
assumption.
By COPY - PASTE we make the following table and add a new column in which we
calculate e². We find (part of the table is shown here):
e          Y     X1    X2      e²
-8,0200    235   508   394,4   64,32118
-31,4475   231   564   457,8   988,9472
51,3466    270   322   401,1   2636,475
-52,3747   261   846   523,3   2743,107
-8,4433    300   871   478     71,28869
1,4119     317   774   588,9   1,993431
64,3009    387   856   566,3   4134,603
-44,0164   285   889   575,9   1937,44
10,9248    300   715   489,4   119,3521
-75,4873   221   753   501,2   5698,337
-16,2483   264   649   490,8   264,0061
-12,7834   308   830   575,3   163,4159
76,6852    379   738   543,9   5880,621
65,4396    342   659   463,4   4282,335
95,4465    378   664   492,1   9110,04
-36,9275   232   572   486,9   1363,638
-52,0446   231   701   467,2   2708,64
-3,5611    246   443   478,2   12,68131
-10,9912   230   446   429,6   120,8057
-6,0706    268   615   482,7   36,85243
Method 1: graphs
We make XY scatters with e² on the vertical axis and with Y or X1 or X2 or the index i
(for time series) on the horizontal axis. The ideal graph is a horizontal box.
We find (after formatting):
Econometrics and Excel
[Figure: three XY scatters of e² (vertical axis, 0 to 10000): the first with a horizontal axis from 200 to 400, the second against X1, the third against X2]
Clearly the graphs do not really show a horizontal box. Moreover, the clusters can now be
seen more clearly.
Method 2: correlations
Under ideal conditions, we find r(e², Y) = r(e², X1) = r(e², X2) = 0.
We calculate the correlations (using the same table as in Method 1) and find:
      Y         X1        X2       e²
e²    0,38436   0,15507   0,2688   1
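The two steps of Method 2 (square the residuals, then correlate the squares with Y, X1 and X2) can be sketched in plain Python. Only the 20 rows shown in the partial table are used here, so the printed correlations will differ from the full-sample values reported above; the function name is hypothetical.

```python
import math

e = [-8.02005, -31.4475, 51.34662, -52.3747, -8.44326, 1.411889, 64.30088,
     -44.0164, 10.92484, -75.4873, -16.2483, -12.7834, 76.68521, 65.43955,
     95.44653, -36.9275, -52.0446, -3.56108, -10.9912, -6.07062]
Y = [235, 231, 270, 261, 300, 317, 387, 285, 300, 221,
     264, 308, 379, 342, 378, 232, 231, 246, 230, 268]
X1 = [508, 564, 322, 846, 871, 774, 856, 889, 715, 753,
      649, 830, 738, 659, 664, 572, 701, 443, 446, 615]
X2 = [394.4, 457.8, 401.1, 523.3, 478.0, 588.9, 566.3, 575.9, 489.4, 501.2,
      490.8, 575.3, 543.9, 463.4, 492.1, 486.9, 467.2, 478.2, 429.6, 482.7]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

e2 = [value ** 2 for value in e]          # the added column of squared residuals
correlations = {name: pearson(e2, column)
                for name, column in (("Y", Y), ("X1", X1), ("X2", X2))}
print(correlations)
```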
i    e          EDF        TDF        |EDF-TDF|
1    -75,4873   0,020408   0,028296   0,007888
2    -63,1081   0,040816   0,055489   0,014673
3    -53,8993   0,061224   0,086719   0,025494
4    -52,3747   0,081633   0,092962   0,011329
5    -52,0446   0,102041   0,094356   0,007685
6    -44,0164   0,122449   0,133146   0,010697
7    -42,0699   0,142857   0,144007   0,00115
8    -41,416    0,163265   0,147787   0,015478
9    -37,2263   0,183673   0,173568   0,010105
10   -36,9275   0,204082   0,17551    0,028572
11   -31,4475   0,22449    0,213535   0,010955
12   -29,5101   0,244898   0,22805    0,016848
13   -23,3252   0,265306   0,277902   0,012596
14   -21,0497   0,285714   0,297496   0,011782
15   -17,8828   0,306122   0,325767   0,019644
16   -17,0337   0,326531   0,333529   0,006998
17   -16,2483   0,346939   0,340773   0,006166
18   -16,0392   0,367347   0,342712   0,024635
19   -12,7834   0,387755   0,373405   0,01435
20   -11,6244   0,408163   0,38454    0,023623
21   -10,9912   0,428571   0,390666   0,037906
22   -10,6319   0,44898    0,394153   0,054827
23   -10,1796   0,469388   0,398555   0,070833
24   -8,44326   0,489796   0,415571   0,074225
25   -8,02005   0,510204   0,419744   0,09046
26   -6,07062   0,530612   0,439075   0,091537
27   -3,56108   0,55102    0,464169   0,086851
28   -1,71448   0,571429   0,482731   0,088697
29   1,411889   0,591837   0,514222   0,077614
30   2,479064   0,612245   0,524961   0,087284
31   6,163563   0,632653   0,56185    0,070803
32   6,711255   0,653061   0,567296   0,085765
33   10,92484   0,673469   0,608691   0,064778
34   13,43194   0,693878   0,632781   0,061097
35   13,63965   0,714286   0,634755   0,079531
36   19,65465   0,734694   0,690188   0,044506
37   25,82688   0,755102   0,742885   0,012217
38   30,10465   0,77551    0,776463   0,000952
39   36,62317   0,795918   0,822498   0,02658
40   38,98275   0,816327   0,83757    0,021243
41   42,02097   0,836735   0,855712   0,018977
42   50,43538   0,857143   0,898625   0,041483
43   51,15126   0,877551   0,901793   0,024242
44   51,34662   0,897959   0,902645   0,004686
45   52,34578   0,918367   0,906917   0,01145
46   64,30088   0,938776   0,947805   0,00903
47   65,43955   0,959184   0,950803   0,00838
48   76,68521   0,979592   0,973609   0,005982
49   95,44653   1          0,992035   0,007965
[Figure: Series1 and Series2 (vertical axis, 0 to 1) plotted against e (horizontal axis, -100 to 150)]
In column 5 of the table we calculated |EDF − TDF| and then the maximum of these
numbers: KS = MAX |EDF − TDF| = 0.091.
This value should be compared with the theoretical values of KS. In this example, we find
that KS = 0.091 is sufficiently small and we do not reject BA4.
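The calculation of KS can be sketched in plain Python from the sorted residuals in the table. Two assumptions are made: the empirical distribution function at the i-th sorted residual is i/n, and the theoretical distribution function is the normal distribution with mean 0 and standard deviation equal to the Standard Error of the regression (39,59571155). The second assumption is an inference from the tabulated values, not something stated in the text.

```python
from statistics import NormalDist

sorted_e = [
    -75.4873, -63.1081, -53.8993, -52.3747, -52.0446, -44.0164, -42.0699,
    -41.416, -37.2263, -36.9275, -31.4475, -29.5101, -23.3252, -21.0497,
    -17.8828, -17.0337, -16.2483, -16.0392, -12.7834, -11.6244, -10.9912,
    -10.6319, -10.1796, -8.44326, -8.02005, -6.07062, -3.56108, -1.71448,
    1.411889, 2.479064, 6.163563, 6.711255, 10.92484, 13.43194, 13.63965,
    19.65465, 25.82688, 30.10465, 36.62317, 38.98275, 42.02097, 50.43538,
    51.15126, 51.34662, 52.34578, 64.30088, 65.43955, 76.68521, 95.44653,
]

def ks_statistic(sorted_values, sigma):
    """Largest absolute gap between i/n and the N(0, sigma) distribution function."""
    n = len(sorted_values)
    normal = NormalDist(mu=0.0, sigma=sigma)
    gaps = []
    for i, value in enumerate(sorted_values, start=1):
        edf = i / n                 # empirical distribution function
        tdf = normal.cdf(value)     # theoretical distribution function
        gaps.append(abs(edf - tdf))
    return max(gaps)

ks = ks_statistic(sorted_e, sigma=39.59571155)
print(round(ks, 3))  # should be close to the 0.091 found above
```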
As an extra, we can make a histogram of the errors e(i). If BA4 holds, we should find a
bell-shaped curve. Using the other Excel manual (descriptive statistics) we find the following
graph:
[Figure: Histogram, axis labels 10, 27, 44, 61, 78, 95, 112]