Building Multiple Regression Models
Copyright 2012 Pearson Education. All rights reserved.
19-1
Coeff      SE(Coeff)   t-ratio   P-value
139.104    16.69       8.33      0.0001
1.07347    0.2474      4.34      0.0001
2.04836    0.6672      3.07      0.0032
3.58970    2.953       1.22      0.2287
5.00967    2.104       2.38      0.0203
3.41058    3.230       1.06      0.2951
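Each t-ratio in the table above is the coefficient divided by its standard error. The short check below recomputes that column from the Coeff and SE(Coeff) values. The rows are unlabeled in the source, so they are treated here as plain lists, and the function name t_ratios is only an illustrative choice.

```python
# Values copied from the regression table above (row labels are not
# available in the source, so the rows are kept in table order).
coeffs = [139.104, 1.07347, 2.04836, 3.58970, 5.00967, 3.41058]
std_errors = [16.69, 0.2474, 0.6672, 2.953, 2.104, 3.230]


def t_ratios(coefficients, standard_errors):
    """Return each coefficient divided by its standard error."""
    return [b / se for b, se in zip(coefficients, standard_errors)]


# Round to two decimals to match the precision shown in the table.
ratios = [round(t, 2) for t in t_ratios(coeffs, std_errors)]
print(ratios)
```

The rounded values agree with the t-ratio column, which is a quick way to confirm that the columns of the flattened table line up row by row.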
19-14
[Figure legend: Meat-based dishes, Non-meat dishes]
19-17
Simple regressions of
the meat group and the
non-meat group.
19-21
[Formula residue: $h_i$, $1 - h_i$]
19-30
The P-values confirm that the blast coasters don't fit with the
other coasters.
Note: The coefficient for Drop is the same as in the model fit without the
three blast coasters.
19-36
19.5 Collinearity
Predictor variables exhibit collinearity when one of the
predictors can be predicted well from the others.
Consequences of Collinearity:
Coefficients in a multiple regression model can be surprising,
taking on an unanticipated sign or being unexpectedly large or
small.
The stronger the correlation between two predictors, the more
the variance of their coefficients increases when both are
included in the model (variance inflation). This can lead to a
smaller t-statistic and a correspondingly larger P-value.
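One way to make variance inflation precise is the standard textbook expression for the sampling variance of a coefficient. It is not shown on the slide. The symbols are assumptions for this sketch: $\sigma^2$ is the error variance, $x_{ij}$ is the value of predictor $j$ for case $i$, and $R_j^2$ is the $R^2$ from regressing predictor $j$ on all of the other predictors.

```latex
\operatorname{Var}(b_j)
  = \frac{\sigma^2}{\sum_{i}\left(x_{ij} - \bar{x}_j\right)^2}
    \cdot \frac{1}{1 - R_j^2},
\qquad
t_j = \frac{b_j}{SE(b_j)}
```

For example, if $R_j^2 = 0.9$, the second factor is $1/(1 - 0.9) = 10$, so the variance is 10 times what it would be with uncorrelated predictors and the standard error is about $\sqrt{10} \approx 3.16$ times larger. The t-ratio shrinks by the same factor.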
19-46
19.5 Collinearity
Recall Housing Prices based on Living Area and Bedrooms
(Simple Regression Models):
19-47
19.5 Collinearity
Recall Housing Prices based on Living Area and Bedrooms
(Multiple Regression Model):
19-48
19.5 Collinearity
Recall Housing Prices based on Living Area and Rooms
(Simple Regression Models):
19-49
19.5 Collinearity
Recall Housing Prices based on Living Area and Rooms
(Multiple Regression Model):
19-50
19.5 Collinearity
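The statistic that measures the degree of collinearity
of the jth predictor with the others is called the Variance
Inflation Factor (VIF).
It is found as
$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2}$$
where $R_j^2$ is the $R^2$ from regressing the jth predictor on all of the
other predictors.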
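The VIF formula above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the method of any particular statistics package: the helper names (solve_linear_system, r_squared, vif) and the data are made up for this example, and the regression is fit by solving the normal equations directly, which is adequate for small, well-behaved inputs like these.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, predictors):
    """R^2 from a least-squares fit of y on the predictors plus an intercept."""
    n = len(y)
    columns = [[1.0] * n] + [list(p) for p in predictors]  # intercept first
    k = len(columns)
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(columns[i][t] * columns[j][t] for t in range(n))
            for j in range(k)] for i in range(k)]
    xty = [sum(columns[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[i] * columns[i][t] for i in range(k)) for t in range(n)]
    mean_y = sum(y) / n
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_resid = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - ss_resid / ss_total


def vif(predictors, j):
    """VIF_j = 1 / (1 - R_j^2), regressing predictor j on all the others."""
    others = [p for i, p in enumerate(predictors) if i != j]
    return 1.0 / (1.0 - r_squared(predictors[j], others))


# Hypothetical data: x3 is almost exactly x1 + x2, while x1 and x2 are
# uncorrelated with each other.
x1 = [-3, -1, 1, 3, -3, -1, 1, 3]
x2 = [-2, -2, -2, -2, 2, 2, 2, 2]
noise = [0.1, -0.1, -0.1, 0.1, -0.1, 0.1, 0.1, -0.1]
x3 = [a + b + e for a, b, e in zip(x1, x2, noise)]

predictors = [x1, x2, x3]
vifs = [vif(predictors, j) for j in range(len(predictors))]
print([round(v, 1) for v in vifs])
```

With these made-up numbers every VIF is in the hundreds, even though x1 and x2 are uncorrelated and the largest pairwise correlation (between x1 and x3) is only about 0.74. The collinearity shows up only when each predictor is regressed on all of the others together.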
19-51
19.5 Collinearity
Facts about Collinearity
The collinearity of any predictor with the others in the model
can be measured with its Variance Inflation Factor (VIF).
High collinearity leads to the coefficient being poorly estimated
and having a large standard error (and correspondingly low
t-statistic). The coefficient may seem to be the wrong size or
even the wrong sign.
19-52
19.5 Collinearity
Facts about Collinearity
Collinearity is measured in terms of the $R^2$ between a
predictor and all of the other predictors in the model. It is not
measured in terms of the correlation between any two predictors.
19-53
19.5 Collinearity
Dealing with Collinearity:
Simplify the model and improve the t-statistic by removing some
of the predictors. Which should you keep?
Variables that are most reliably measured
Variables that are least expensive to find
Variables that are inherently important to the problem
New variables formed by combining variables
19-54
the residuals
exhibit a bend.
19-56
[Figure: two plots with StartOrder on the axis]
19-60
Would re-expressing
either variable help?
19-62
Coeff         SE(Coeff)   t-ratio   P-value
19208.2       2204        8.71      0.0001
-19.3947      2.229       -8.70     0.0001
4.90774e-3    0.0006      8.71      0.0001
19-64