Anda di halaman 1dari 16

Chapter 9, Part 1

Regression Wisdom

Copyright 2010 Pearson Education, Inc.

DO NOW COPY DOWN AND ANSWER ALL DURING VIDEO


Watch this video and answer these questions. (12:46) http://www.learner.org/vod/vod_window.html?pid=146 1.What is the best fitting line that fits data by minimizing the sum of the squares of the residuals? 2. What example is used to illustrate the use of the least squares regression line? 3. In the equation y = a + bx, what is the formula for b? What is b in the equation? What is the formula for a? 4. What does y represent? x? What is a in the equation? 5. Even though you can fit a regression line to any set of data, when is the line valid?
Copyright 2010 Pearson Education, Inc.

Slide 9 - 2

DO NOW

Take 5 minutes to discuss answers with your group. I need 5 people to come up and do the answers on the board and explain it to the class.

Copyright 2010 Pearson Education, Inc.

Slide 9 - 3

AIM AND HOMEWORK

How do we use regressions to figure out what the data has to say? Homework #20: Read Chapter 9, #1-9ODD, 8

Copyright 2010 Pearson Education, Inc.

Slide 9 - 4

Getting the Bends

Linear regression only works for linear models. (That sounds obvious, but when you fit a regression, you cant take it for granted.) A curved relationship between two variables might not be apparent when looking at a scatterplot alone, but will be more obvious in a plot of the residuals. Remember, we want to see nothing in a plot of the residuals.

Copyright 2010 Pearson Education, Inc.

Slide 9 - 5

Getting the Bends (cont.)

The scatterplot of residuals against Duration of emperor penguin dives holds a surprise. The Linearity Assumption says we should not see a pattern, but instead there is a bend. Even though it means checking the Straight Enough Condition after you find the regression, its always good to check your scatterplot of the residuals for bends that you might have overlooked in the original scatterplot.

Copyright 2010 Pearson Education, Inc.

Slide 9 - 6

Sifting Residuals for Groups

No regression analysis is complete without a display of the residuals to check that the linear model is reasonable. Residuals often reveal subtleties that were not clear from a plot of the original data.

Copyright 2010 Pearson Education, Inc.

Slide 9 - 7

Sifting Residuals for Groups (cont.)

Sometimes the subtleties we see are additional details that help confirm or refine our understanding. Sometimes they reveal violations of the regression conditions that require our attention.

Copyright 2010 Pearson Education, Inc.

Slide 9 - 8

Sifting Residuals for Groups (cont.)

It is a good idea to look at both a histogram of the residuals and a scatterplot of the residuals vs. predicted values in the regression predicting Calories from Sugar content in cereals:

The small modes in the histogram are marked with different colors and symbols in the residual plot above. What do you see?
Copyright 2010 Pearson Education, Inc.

Slide 9 - 9

Sifting Residuals for Groups (cont.)

An examination of residuals often leads us to discover groups of observations that are different from the rest. When we discover that there is more than one group in a regression, we may decide to analyze the groups separately, using a different model for each group.

Copyright 2010 Pearson Education, Inc.

Slide 9 - 10

Subsets

Heres an important unstated condition for fitting models: All the data must come from the same group. When we discover that there is more than one group in a regression, neither modeling the groups together nor modeling them apart is necessarily correct. You must determine what makes the most sense. In the following example, we see that modeling them apart makes sense.

Copyright 2010 Pearson Education, Inc.

Slide 9 - 11

Subsets (cont.)

The figure shows regression lines fit to calories and sugar for each of the three cereal shelves in a supermarket:

Copyright 2010 Pearson Education, Inc.

Slide 9 - 12

Extrapolation: Reaching Beyond the Data

Linear models give a predicted value for each case in the data. We cannot assume that a linear relationship in the data exists beyond the range of the data. The farther the new x value is from the mean in x, the less trust we should place in the predicted value. Once we venture into new x territory, such a prediction is called an extrapolation.

Copyright 2010 Pearson Education, Inc.

Slide 9 - 13

Extrapolation (cont.)

Extrapolations are dubious because they require the additionaland very questionable assumption that nothing about the relationship between x and y changes even at extreme values of x. Extrapolations can get you into deep trouble. Youre better off not making extrapolations.

Copyright 2010 Pearson Education, Inc.

Slide 9 - 14

Extrapolation (cont.)

Here is a timeplot of the Energy Information Administration (EIA) predictions and actual prices of oil barrel prices. How did forecasters do?

They seemed to have missed a sharp run-up in oil prices in the past few years.
Copyright 2010 Pearson Education, Inc.

Slide 9 - 15

Conclusion
Linear Regression only works for what kind of models? What is an important unstated condition for fitting models? What is an extrapolation?

Copyright 2010 Pearson Education, Inc.

Slide 9 - 16