31.07.2014
Getting RStudio
http://www.rstudio.org/
http://r-project.org/
Inference for the regression model
- Key concepts:
  - Standard errors
  - Confidence intervals for the coefficients
  - Tests of significance
Variability of the regression coefficients
[Figure: plot with axes X1 and X2]
- Variability depends on
Call:
lm(formula = volume ~ diameter + height)
Residuals:
    Min      1Q  Median      3Q     Max
-6.4065 -2.6493 -0.2876  2.2003  8.4847
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -57.9877     8.6382  -6.713 2.75e-07 ***
diameter     56.4979     3.1712  17.816  < 2e-16 ***
height        0.3393     0.1302   2.607   0.0145 *
---
df = n - k - 1, where
n: number of observations,
k: number of explanatory variables (here k = 2: diameter and height).
> confint(cherry.lm)
                   2.5 %      97.5 %
(Intercept) -75.68226247 -40.2930554
diameter     50.00206788  62.9937842
height        0.07264863   0.6058538
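The interval for height can be rebuilt by hand as estimate ± t-quantile × standard error, with the quantile taken from the t distribution on n - k - 1 degrees of freedom. The sketch below is only an illustration: it assumes the cherry tree data correspond to R's built-in `trees` data set with the girth converted from inches to feet, and the names `cherry.df` and `manual_ci` are made up for this example.

```r
# Assumption: the slides' cherry data match the built-in `trees` data,
# with diameter = Girth / 12 (inches converted to feet).
cherry.df <- data.frame(volume   = trees$Volume,
                        diameter = trees$Girth / 12,
                        height   = trees$Height)
cherry.lm <- lm(volume ~ diameter + height, data = cherry.df)

# Estimate and standard error for height, taken from the coefficient table
coef_table <- coef(summary(cherry.lm))
est <- coef_table["height", "Estimate"]
se  <- coef_table["height", "Std. Error"]

# Residual degrees of freedom: n - k - 1
df_resid <- df.residual(cherry.lm)

# 95% confidence interval: estimate +/- t quantile * standard error
manual_ci <- est + c(-1, 1) * qt(0.975, df_resid) * se

print(manual_ci)
print(confint(cherry.lm)["height", ])
```

Both printed intervals should agree, since `confint()` applies the same formula to every coefficient.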
Hypothesis test
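[Figure: density of the t distribution (horizontal axis from -4 to 4) with the observed values -2.607 and 2.607 marked; the area in the two tails beyond them is the p-value = 0.0145]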
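The t value and p-value for height can be recomputed from the printed estimate and standard error. This is a minimal sketch using the rounded numbers from the regression output; the 28 degrees of freedom are an assumption (n = 31 observations and k = 2 explanatory variables), and the variable names are illustrative only.

```r
# Rounded values copied from the coefficient table for height
estimate  <- 0.3393
std_error <- 0.1302

# Assumption: n = 31 observations and k = 2 variables, so df = n - k - 1 = 28
df_resid <- 28

# Test statistic for H0: the height coefficient is zero
t_value <- estimate / std_error

# Two-sided p-value: area in both tails beyond |t|
p_value <- 2 * pt(-abs(t_value), df = df_resid)

print(round(t_value, 3))
print(round(p_value, 4))
```

Because the inputs are rounded, the results match the printed t value and p-value only up to rounding.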
Other hypotheses
- Terminology
- If the full model RSS is not much smaller than the submodel RSS, the submodel is adequate: we do not need the extra variables.
- To do the test, we calculate
\[
F = \frac{RSS_{sub} - RSS_{full}}{s^2 \, (df_{full} - df_{sub})}
\]
[Figure: density of the F distribution (horizontal axis from 0 to 10) with the observed F value marked; the area to its right is the p-value]
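The F test for a submodel can be sketched in R as follows. Several things here are assumptions rather than part of the slides: the cherry data are taken to be R's built-in `trees` data with girth converted to feet, s^2 is taken to be the residual variance estimate from the full model, and df in the formula is read as the number of coefficients in each model, so that the difference is the number of variables dropped. The names `full.lm`, `sub.lm` and `n_dropped` are made up for the example.

```r
# Assumption: the cherry data match the built-in `trees` data (girth in feet)
cherry.df <- data.frame(volume   = trees$Volume,
                        diameter = trees$Girth / 12,
                        height   = trees$Height)

full.lm <- lm(volume ~ diameter + height, data = cherry.df)  # full model
sub.lm  <- lm(volume ~ diameter, data = cherry.df)           # submodel without height

# Residual sums of squares for both models
rss_full <- sum(resid(full.lm)^2)
rss_sub  <- sum(resid(sub.lm)^2)

# s^2: residual variance estimate from the full model (assumed definition)
s2 <- rss_full / df.residual(full.lm)

# Number of variables dropped when going from the full model to the submodel
n_dropped <- length(coef(full.lm)) - length(coef(sub.lm))

# F statistic and its p-value (upper tail of the F distribution)
f_value <- (rss_sub - rss_full) / (s2 * n_dropped)
p_value <- pf(f_value, n_dropped, df.residual(full.lm), lower.tail = FALSE)

print(c(F = f_value, p = p_value))
print(anova(sub.lm, full.lm))  # built-in comparison of the two models
```

When a single variable is dropped, the F value equals the square of that variable's t value in the full model, and the two p-values coincide.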
Example: Free fatty acid data
- Variables are
  - Age: months
  - Weight: pounds
Call:
lm(formula = ffa ~ age + weight + skinfold, data = fatty.df)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.95777    1.40138   2.824  0.01222 *
age         -0.01912    0.01275  -1.499  0.15323
weight      -0.02007    0.00613  -3.274  0.00478 **
skinfold    -0.07788    0.31377  -0.248  0.80714
This suggests:
- age is not required if weight and skinfold are retained
> summary(model.sub)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.01651    0.37578   5.366 4.23e-05 ***
weight      -0.02162    0.00608  -3.555  0.00226 **