INTERPRETATION OF A REGRESSION EQUATION

[Figure: scatter diagram of hourly earnings ($) against years of schooling]

The scatter diagram shows hourly earnings in 2002 plotted against years of schooling,
defined as highest grade completed, for a sample of 540 respondents from the National
Longitudinal Survey of Youth.

Highest grade completed means just that for elementary and high school. Grades 13, 14,
and 15 mean completion of one, two and three years of college.

Grade 16 means completion of four-year college. Higher grades indicate years of
postgraduate education.

. reg EARNINGS S

      Source |       SS       df       MS              Number of obs =     540
-------------+------------------------------           F(  1,   538) =  112.15
       Model |  19321.5589     1  19321.5589           Prob > F      =  0.0000
    Residual |  92688.6722   538  172.283777           R-squared     =  0.1725
-------------+------------------------------           Adj R-squared =  0.1710
       Total |  112010.231   539  207.811189           Root MSE      =  13.126

------------------------------------------------------------------------------
    EARNINGS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S |   2.455321   .2318512    10.59   0.000     1.999876    2.910765
       _cons |  -13.93347   3.219851    -4.33   0.000    -20.25849   -7.608444
------------------------------------------------------------------------------

This is the output from a regression of earnings on years of schooling, using Stata.
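As a quick cross-check on the output, several of the summary statistics can be recomputed from other entries in the same table. The following is a minimal Python sketch, not part of the original slides: the variable names are illustrative, and the numbers are simply copied from the Stata output.

```python
import math

# Values copied from the Stata output.
ss_model, ss_total = 19321.5589, 112010.231
ms_model, ms_residual = 19321.5589, 172.283777
coef_s, se_s = 2.455321, 0.2318512

r_squared = ss_model / ss_total     # share of total sum of squares explained
f_stat = ms_model / ms_residual     # F(1, 538)
t_stat = coef_s / se_s              # t statistic for the coefficient of S
root_mse = math.sqrt(ms_residual)   # square root of the residual mean square

print(f"R-squared = {r_squared:.4f}")
print(f"F         = {f_stat:.2f}")
print(f"t         = {t_stat:.2f}")
print(f"Root MSE  = {root_mse:.3f}")
```

Each recomputed figure should agree with the corresponding entry in the output up to rounding.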

For the time being, we will be concerned only with the estimates of the parameters. The
variables in the regression are listed in the first column and the second column gives the
estimates of their coefficients.

In this case there is only one variable, S, and its coefficient is 2.46. _cons, in Stata, refers to
the constant. The estimate of the intercept is -13.93.

[Figure: scatter diagram of hourly earnings ($) against years of schooling, with the fitted regression line \widehat{EARNINGS} = -13.93 + 2.46 S]

Here is the scatter diagram again, with the regression line shown.

What do the coefficients actually mean?

To answer this question, you must refer to the units in which the variables are measured.

S is measured in years (strictly speaking, grades completed), EARNINGS in dollars per
hour. So the slope coefficient implies that hourly earnings increase by $2.46 for each extra
year of schooling.
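The reading of the slope follows directly from the fitted equation. Using the rounded estimates reported in the output (intercept -13.93, slope 2.46), a one-year increase in S changes the fitted value by exactly the slope, whatever the starting level of S:

```latex
\widehat{EARNINGS}(S+1) - \widehat{EARNINGS}(S)
  = \bigl[-13.93 + 2.46\,(S+1)\bigr] - \bigl[-13.93 + 2.46\,S\bigr]
  = 2.46
```

The intercept cancels in the difference, which is why the slope can be interpreted without reference to the constant.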

We will look at a geometrical representation of this interpretation. To do this, we will
enlarge the marked section of the scatter diagram.

[Figure: enlarged section of the scatter diagram, 10.8 to 12.2 years of schooling, with the regression line rising from $13.07 at S = 11 to $15.53 at S = 12: a step of $2.46 over one year]

The regression line indicates that completing 12th grade instead of 11th grade would
increase earnings by $2.46, from $13.07 to $15.53, as a general tendency.
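The two fitted values can be reproduced from the coefficient estimates. This is a minimal Python sketch under the assumption that the unrounded coefficients from the Stata output are used; the function name `fitted_earnings` is illustrative and does not come from the slides.

```python
# Coefficients copied from the Stata output (unrounded).
B1 = -13.93347   # intercept (_cons)
B2 = 2.455321    # slope on S

def fitted_earnings(s):
    """Fitted hourly earnings in dollars for s years of schooling."""
    return B1 + B2 * s

grade_11 = fitted_earnings(11)
grade_12 = fitted_earnings(12)

print(f"S = 11: {grade_11:.3f}")
print(f"S = 12: {grade_12:.3f}")
print(f"difference: {grade_12 - grade_11:.2f}")
```

With the unrounded coefficients the fitted value at S = 11 is about 13.075, right on the boundary between $13.07 and $13.08. The figure's $13.07 is consistent with subtracting the rounded slope, 2.46, from 15.53.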

You should ask yourself whether this is a plausible figure. If it is implausible, this could be
a sign that your model is misspecified in some way.

For low levels of education it might be plausible. But for high levels it would seem to be an
underestimate.

What about the constant term? (Try to answer this question yourself before continuing with
this sequence.)

Literally, the constant indicates that an individual with no years of education would have to
pay $13.93 per hour to be allowed to work.
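The literal reading is simply the fitted value at S = 0. As a rough illustration (a sketch, not part of the original slides, with illustrative variable names), the same coefficients also show how far the fitted line stays below zero:

```python
# Coefficients copied from the Stata output (unrounded).
B1 = -13.93347   # intercept (_cons)
B2 = 2.455321    # slope on S

# With S = 0 the slope term vanishes, so the fitted value is the intercept.
fitted_at_zero = B1 + B2 * 0

# Years of schooling at which the fitted line crosses zero earnings.
break_even_s = -B1 / B2

print(f"fitted earnings at S = 0: {fitted_at_zero:.2f}")
print(f"fitted earnings reach zero at S = {break_even_s:.2f}")
```

Under these estimates the line implies negative earnings for anyone with fewer than roughly 5.7 years of schooling, which is the implausible region the constant term refers to.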

This does not make any sense at all. In former times craftsmen might require an initial
payment when taking on an apprentice, and might pay the apprentice little or nothing for
quite a while, but an interpretation of negative payment is impossible to sustain.

A safe solution to the problem is to limit the interpretation to the range of the sample data,
and to refuse to extrapolate on the ground that we have no evidence outside the data range.
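One way to make this rule concrete is to refuse to compute a fitted value for any S outside the observed range. The sketch below is an illustration only: the slides do not state the smallest and largest values of S in the sample, so `S_MIN` and `S_MAX` are placeholder assumptions, and the function name is hypothetical.

```python
# Coefficients copied from the Stata output (unrounded).
B1 = -13.93347   # intercept (_cons)
B2 = 2.455321    # slope on S

# Placeholder bounds, NOT taken from the slides: replace them with the
# smallest and largest values of S observed in your own sample.
S_MIN = 10
S_MAX = 20

def fitted_earnings_in_sample(s, s_min=S_MIN, s_max=S_MAX):
    """Return fitted hourly earnings, refusing to extrapolate."""
    if not s_min <= s <= s_max:
        raise ValueError(
            f"S = {s} lies outside the sample range [{s_min}, {s_max}]; "
            "the regression provides no evidence there."
        )
    return B1 + B2 * s

print(round(fitted_earnings_in_sample(12), 2))   # inside the assumed range

try:
    fitted_earnings_in_sample(0)                  # the intercept case
except ValueError as err:
    print("refused:", err)
```

Raising an error rather than returning a number keeps the negative intercept from being read as a prediction.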

With this explanation, the only function of the constant term is to enable you to draw the
regression line at the correct height on the scatter diagram. It has no meaning of its own.

Another solution is to explore the possibility that the true relationship is nonlinear and that
we are approximating it with a linear regression. We will soon extend the regression
technique to fit nonlinear models.

Copyright Christopher Dougherty 2011.


These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 1.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own and who feel that they might
benefit from participation in a formal course should consider the London School
of Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
20 Elements of Econometrics
www.londoninternational.ac.uk/lse.

11.07.25
