z_y = b̂₁z₁ + b̂₂z₂ + ... + b̂ₖzₖ + error
-Where:
-z_j is the standardized value (z-score) of x_j
-b̂_j = (σ̂_j/σ̂_y)β̂_j, for j = 1, ..., k
-z_y is the standardized value of y
6.1 Beta Coefficients
These new coefficients are called STANDARDIZED
COEFFICIENTS or BETA COEFFICIENTS (which
is confusing, as the typical OLS regression also
uses betas).
-This regression estimates the change in y (in
standard deviations) when x_k increases by one
standard deviation
-Magnitudes of coefficients can now be compared
across explanatory variables
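As a sketch of this standardization (all data and coefficient values below are hypothetical), the following simulation regresses z-scores on z-scores and confirms the identity b̂_j = (σ̂_j/σ̂_y)β̂_j:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)        # hypothetical regressors on
x2 = 3.0 * rng.normal(size=n)  # very different scales
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)

# OLS in original units (intercept included)
X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

def z(v):
    """z-score: subtract the sample mean, divide by the sample standard deviation."""
    return (v - v.mean()) / v.std()

# Regress z-scores on z-scores; no intercept needed since every mean is zero
b = np.linalg.lstsq(np.column_stack([z(x1), z(x2)]), z(y), rcond=None)[0]

# Beta-coefficient identity: b_j = (sd(x_j) / sd(y)) * beta_j
print(b[0], x1.std() / y.std() * beta[1])
print(b[1], x2.std() / y.std() * beta[2])
```

The x variables were given deliberately different scales; the beta coefficients put them on a common (standard deviation) footing.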
%Δŷ = 100[exp(β̂₂Δx₂) − 1]   (6.8)
-when percentage changes are large, this is a
more accurate calculation than the usual
approximation 100·β̂₂Δx₂
-in our above example:
%Δŷ = 100[exp(0.21) − 1] ≈ 23.4%
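The exact and approximate percentage changes are easy to check directly (0.21 is the coefficient from the example above; the rest is arithmetic):

```python
import math

beta_hat = 0.21  # estimated coefficient from the example above

approx = 100 * beta_hat                 # usual log-approximation: 21.0%
exact = 100 * (math.exp(beta_hat) - 1)  # exact percentage change, eq. (6.8)

print(round(approx, 1), round(exact, 1))
```

The gap between the two grows quickly as the coefficient (times the change in x) gets larger, which is why the exact formula matters for large percentage changes.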
-For a quadratic model ŷ = β̂₀ + β̂₁x + β̂₂x², the
effect of x must be considered jointly:
Δŷ ≈ (β̂₁ + 2β̂₂x)Δx
-as it makes no sense to analyze the effect of a
change in x while keeping x² constant
6.2 Dynamic Quadratic Functions
If β̂₁ is positive and β̂₂ is
negative,
-x has a diminishing effect on y
-the graph is an inverted u-shape
-ie: conflict resolution
-talking through a problem can work to solve
it up to a certain point, where more talking is
extraneous and only creates more problems
-ie: Pizza and utility
-eating pizza will increase utility up to a point
where additional pieces make one sick
6.2 Dynamic Quadratic Functions
-The maximum point on the graph (where y is
maximized) is always at the point:
x* = |β̂₁ / (2β̂₂)|   (6.13)
-after this point, the graph is decreasing, which is
of little concern if it only occurs for a small
portion of the sample (ie: very few people force
themselves to eat too much pizza)
-this downward effect could also be found due to
omitting certain variables
-a similar argument goes for a u-shaped curve
with a minimum point
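A quick sketch with hypothetical coefficient values shows the turning point of eq. (6.13) and the sign change of the marginal effect on either side of it:

```python
b1, b2 = 0.30, -0.01  # hypothetical estimates: positive first-order, negative quadratic term

def marginal_effect(x):
    """Approximate change in y-hat per unit change in x: b1 + 2*b2*x."""
    return b1 + 2 * b2 * x

x_star = abs(b1 / (2 * b2))  # turning point, eq. (6.13); here ≈ 15
print(x_star, marginal_effect(10), marginal_effect(20))
```

Before x* the effect is positive (more talking helps, more pizza adds utility); past x* it turns negative.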
6.2 More Quadratics and Logs
-If both slope coefficients in a quadratic model
are positive (or both negative), the model
increases (or decreases) at an increasing rate
-combining quadratics and logs allows for
dynamic relationships including increasing or
decreasing percentage changes:
-For example, if
log(utility) = β₀ + β₁log(shrimp) + β₂[log(shrimp)]² + u
-Then
%Δutility ≈ [β₁ + 2β₂log(shrimp)] %Δshrimp
6.2 Interaction Terms
-Often an explanatory variable's impact on y (partial
effect, elasticity, semi-elasticity) depends on
the value of another explanatory variable
-In these cases variables are included
multiplicatively
-For example, if you get a better night's sleep on
a comfortable bed,
rest = β₀ + β₁sleep + β₂comfort·sleep + u
-Then
∂rest/∂sleep = β₁ + β₂comfort
6.2 Interaction Terms
-If there is an INTERACTION EFFECT between
two variables, they are often included
multiplicatively
-in order to summarize one variable's effect on y,
one must examine interesting values of the
other variable (mean, lower and upper
quartiles)
-this can be tedious
-often the examination of only one coefficient is
meaningless if the interaction variable cannot
be zero (ie: if comfort can't be zero)
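As a sketch of that "tedious" summary (the coefficients and the comfort sample below are hypothetical), the partial effect of sleep can be tabulated at the quartiles and the mean:

```python
import numpy as np

# Hypothetical estimates from rest = b0 + b1*sleep + b2*comfort*sleep + u
b1, b2 = 0.8, 0.05
comfort = np.array([1, 2, 2, 3, 4, 4, 5, 5, 6, 7])  # hypothetical sample of comfort ratings

# The partial effect of sleep is b1 + b2*comfort; evaluate it at interesting values
for label, c in [("lower quartile", np.percentile(comfort, 25)),
                 ("mean", comfort.mean()),
                 ("upper quartile", np.percentile(comfort, 75))]:
    print(f"{label}: {b1 + b2 * c:.3f}")
```

Since b2 > 0 here, the effect of sleep rises monotonically with comfort across the three reference points.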
6.2 Reparameterization
-Since the coefficients are going to be examined
at their means, it is often useful to
reparameterize the model to take means into
account initially:
y = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + u
-Becomes:
y = α₀ + δ₁x₁ + δ₂x₂ + β₃(x₁ − x̄₁)(x₂ − x̄₂) + u
-In this new model, δ₂ becomes the partial
effect of x₂ on y at the mean value of x₁
6.2 Reparameterization
-In other words:
δ₂ = β₂ + β₃x̄₁
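A small simulation (all data hypothetical) confirms that the reparameterized regression delivers δ₂ = β₂ + β₃x̄₁ directly, while the interaction coefficient is unchanged:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
x1 = rng.normal(2.0, 1.0, n)
x2 = rng.normal(-1.0, 2.0, n)
y = 1 + 0.5 * x1 + 0.3 * x2 + 0.2 * x1 * x2 + rng.normal(size=n)

ols = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]

# Original model: y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + u
b = ols(np.column_stack([np.ones(n), x1, x2, x1 * x2]), y)

# Reparameterized: y = a0 + d1*x1 + d2*x2 + b3*(x1 - mean(x1))*(x2 - mean(x2)) + u
d = ols(np.column_stack([np.ones(n), x1, x2,
                         (x1 - x1.mean()) * (x2 - x2.mean())]), y)

# d2 is the partial effect of x2 at the mean of x1: d2 = b2 + b3*mean(x1)
print(d[2], b[2] + b[3] * x1.mean())  # equal
print(d[3], b[3])                     # interaction coefficient unchanged
```

The two design matrices span the same column space, so the fits are identical; only the interpretation of the level coefficients changes.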
6.3 Adjusted R-squared
-R² = 1 − (SSR/n)/(SST/n) estimates the
population R-squared, ρ² = 1 − σ_u²/σ_y²
-However SSR/n is a biased estimate of σ_u², and
can be replaced by the unbiased estimator
SSR/(n − k − 1)
-Likewise SST/n is a biased estimate of σ_y², and
can be replaced by the unbiased estimator
SST/(n − 1)
-These substitutions give us our adjusted R²:
R̄² = 1 − [SSR/(n − k − 1)] / [SST/(n − 1)]   (6.21)
-or equivalently, with σ̂² = SSR/(n − k − 1):
R̄² = 1 − σ̂² / [SST/(n − 1)]
6.3 Adjusted R-squared
-Unfortunately, adjusted R² is not proven to be a
better estimator of the population R-squared
-the ratio of two unbiased estimators is not
itself necessarily unbiased
-adjusted R2 does add a penalty for including
additional independent variables:
-SSR will fall, but so will n-k-1
-therefore adjusted R2 cannot be artificially
inflated by added variables
6.3 Adjusted R-squared
-When adding a variable, adjusted R2 will
increase only if that variables t-stat is greater
than one (in absolute value)
-Likewise, adding a group of variables increases
adjusted R² only if the F-stat for adding those
variables is greater than unity
-adjusted R² can therefore give a different answer
on including/excluding variables than typical
significance testing
6.3 Adjusted R-squared
-Adjusted R2 can also be written in terms of R2:
R̄² = 1 − (1 − R²)(n − 1) / (n − k − 1)   (6.22)
-From this equation we see that adjusted R2 can
be negative
-a negative adjusted R2 indicates a very poor
model fit relative to the number of degrees of
freedom
-note that the NORMAL R2 must be used in the F
formula of (4.41)
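Equation (6.22) is simple enough to sketch as a small helper; the inputs below are made-up values chosen to show the penalty and the possibility of a negative result:

```python
def adj_r2(r2, n, k):
    """Adjusted R-squared from R-squared, sample size n, and k slope coefficients (eq. 6.22)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Made-up values: modest fit, many regressors -> heavy penalty
print(adj_r2(0.30, 51, 10))  # ≈ 0.125
print(adj_r2(0.10, 51, 10))  # negative: poor fit relative to degrees of freedom
```

With 10 regressors and only 51 observations, an R² of 0.30 shrinks to about 0.125, and an R² of 0.10 produces a negative adjusted R².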
6.3 Nonnested Models
-Sometimes it is the case that we cannot decide
between two (generally highly correlated)
independent variables
-Perhaps they both test insignificant separately
yet significant together
-In deciding between the two variables (A and B),
we can examine two competing models:
y = β₀ + β₁x₁ + β₂x₂ + β₃A + u
y = β₀ + β₁x₁ + β₂x₂ + β₃B + u
6.3 Nonnested Models
-These are NONNESTED MODELS as neither is a
special case of the other (as compared to
nested restricted models in F tests)
-ADJUSTED R²s can be compared, with a large
difference in ADJUSTED R²s making a case for
one variable over the other
-a similar comparison can be done with functional
forms:
memory = β₀ + β₁log(time) + u
memory = β₀ + β₁time + β₂time² + u
6.3 Nonnested Models
-In this case, adjusted R2s are a better
comparison than typical R2s as the number of
parameters has changed
-Note that adjusted R2s CANNOT be used to
choose between different functional forms of
the dependent (y) variable
-R2 deals with variation in y, and by changing
the functional form of y the amount of variation
is also changed
-6.4 will deal with ways to compare y and log(y)
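A sketch of such a functional-form comparison, using simulated data in which the logarithmic model is the true one (everything below is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
time = rng.uniform(1, 10, n)
memory = 2 + 3 * np.log(time) + rng.normal(0, 0.3, n)  # the log model is the true one here

def fit_adj_r2(X, y):
    """Fit OLS (X includes the intercept column) and return adjusted R-squared."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    ssr, sst = resid @ resid, ((y - y.mean()) ** 2).sum()
    k = X.shape[1] - 1  # number of slope parameters
    return 1 - (ssr / (n - k - 1)) / (sst / (n - 1))

log_fit = fit_adj_r2(np.column_stack([np.ones(n), np.log(time)]), memory)
quad_fit = fit_adj_r2(np.column_stack([np.ones(n), time, time ** 2]), memory)
print(log_fit, quad_fit)  # the true (log) specification should come out ahead
```

Both models fit well, but the adjusted R² comparison favors the log model despite the quadratic having an extra parameter; the dependent variable is the same in both, so the comparison is legitimate.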
6.3 Over Controlling
-in the attempt to avoid omitting important
variables from a model, or by overemphasizing
goodness-of-fit, it is often possible to control
for too many variables
-in general, if changing the variable A will
naturally change both the variables B and C,
including all three variables would amount to
OVER CONTROLLING for factors in the model
A → B → C
6.3 Over Controlling Examples
-If one wanted to investigate the impact of
reduced TV on school grades, study time
should NOT be included, as
Less TV → More Studying → Better Grades
-And it may be nonsensical to expect less TV not
to result in more studying
-If one wanted to examine the impact of
increased income on recreational expenses,
travel expenses should NOT be included, as
they are part of recreational expenses
Income → Recreation Expenses (= Travel Expenses + Other Expenses)
6.3 Reducing Error Variance
In Ch. 3, we saw that adding a new x variable:
1) Increases multicollinearity (due to increased
correlation between more independent
variables)
2) Decreases error variance (due to removing
variation from the error term)
From this, we should ALWAYS include variables
that affect y yet are uncorrelated with all of
the explanatory variables OF INTEREST
-This will not affect the unbiasedness (of the variables
of interest) but will reduce sampling variance
6.3 Example
Assume we are examining the effects of random
Customs baggage searches on imports of coral
from Hawaii
-since the baggage searches are random
(assumed), they are uncorrelated with any
descriptive variables (age, gender, income,
etc.)
-Since these descriptive variables may have
an impact on y (coral imports), they can be
included to reduce error variance without
biasing the estimated effect of baggage
searches
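A simulation along these lines (all numbers hypothetical) shows both estimators centered on the true search effect, with the regression that controls for income having the smaller sampling spread:

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 200, 500
short_est, long_est = [], []

for _ in range(reps):
    search = rng.binomial(1, 0.5, n).astype(float)  # random searches: independent of income
    income = rng.normal(50, 10, n)                  # affects coral imports, not the searches
    coral = 5.0 - 2.0 * search + 0.1 * income + rng.normal(0, 2, n)

    X_short = np.column_stack([np.ones(n), search])
    X_long = np.column_stack([np.ones(n), search, income])
    short_est.append(np.linalg.lstsq(X_short, coral, rcond=None)[0][1])
    long_est.append(np.linalg.lstsq(X_long, coral, rcond=None)[0][1])

# Both estimators are unbiased for the true effect (-2), but controlling for
# income moves its variation out of the error term, shrinking the sampling spread
print(np.mean(short_est), np.mean(long_est))
print(np.std(short_est), np.std(long_est))
```

Because search is random, omitting income causes no bias; including it simply soaks up error variance, which is exactly the argument on this slide.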