
STAT 2011

HANDOUT #4
CORRELATION AND SIMPLE REGRESSION ANALYSIS
Instructor: Hernando Burgos-Soto
A) Formulas
1) The Pearson product-moment correlation coefficient:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$
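As a minimal sketch, the correlation coefficient can be computed directly from deviations about the means. The function name `pearson_r` and the data values are illustrative assumptions, not part of the handout or its exercises.

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of deviations about the means (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations.
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the two sums of squares.
    den = math.sqrt(
        sum((xi - x_bar) ** 2 for xi in x) * sum((yi - y_bar) ** 2 for yi in y)
    )
    return num / den

# Illustrative values only (not taken from the exercises).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

A value of r near +1 or -1 indicates a strong linear relationship; a value near 0 indicates little or none.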

2) Equation of a simple regression line

$$\hat{y} = b_0 + b_1 x$$

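3) Slope of the regression line

$$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$$

or

$$SS_{xy} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$$

$$SS_{xx} = \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$$

and

$$b_1 = \frac{SS_{xy}}{SS_{xx}}$$

4) y intercept of the regression line

$$b_0 = \bar{y} - b_1 \bar{x}$$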
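The slope and intercept formulas can be sketched in a few lines using the computational forms of SSxy and SSxx. The function name `fit_line` and the data values are illustrative assumptions, not part of the handout or its exercises.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    # SSxy = sum(xy) - (sum x)(sum y) / n
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    # SSxx = sum(x^2) - (sum x)^2 / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    b1 = ss_xy / ss_xx
    # b0 = y_bar - b1 * x_bar
    b0 = sum(y) / n - b1 * sum(x) / n
    return b0, b1

# Illustrative values only (not taken from the exercises).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(f"y_hat = {b0:.2f} + {b1:.2f} x")
```

For these values SSxy = 6 and SSxx = 10, so the slope is 0.6: each one-unit increase in x raises the predicted y by 0.6.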
5) Sum of squares of errors

$$SSE = \sum (y - \hat{y})^2 = \sum y^2 - b_0 \sum y - b_1 \sum xy$$

6) Standard error of the estimate

$$s_e = \sqrt{\frac{SSE}{n - 2}}$$
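The following sketch computes predicted values, residuals, SSE (by both the definitional and the computational form), and the standard error of the estimate. The variable names and the data values are illustrative assumptions, not part of the handout or its exercises.

```python
import math

# Illustrative values only (not taken from the exercises).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Least squares coefficients from SSxy and SSxx.
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
ss_xy = sum_xy - sum(x) * sum(y) / n
ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
b1 = ss_xy / ss_xx
b0 = sum(y) / n - b1 * sum(x) / n

# Predicted values and residuals (observed minus predicted).
predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# SSE from the definition and from the computational shortcut.
sse = sum(e ** 2 for e in residuals)
sse_short = sum(yi ** 2 for yi in y) - b0 * sum(y) - b1 * sum_xy

# Standard error of the estimate uses n - 2 degrees of freedom.
s_e = math.sqrt(sse / (n - 2))
print(round(sse, 4), round(sse_short, 4), round(s_e, 4))
```

The two SSE values agree up to rounding, which is a useful check on hand calculations.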

7) Coefficient of determination

$$SS_{yy} = \sum (y - \bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$$

$$r^2 = 1 - \frac{SSE}{SS_{yy}} = \frac{b_1^2 \, SS_{xx}}{SS_{yy}}$$
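As a rough check that the two expressions for the coefficient of determination agree, the sketch below computes both. The variable names and the data values are illustrative assumptions, not part of the handout or its exercises.

```python
# Illustrative values only (not taken from the exercises).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

sum_xy = sum(xi * yi for xi, yi in zip(x, y))
ss_xy = sum_xy - sum(x) * sum(y) / n
ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

b1 = ss_xy / ss_xx
b0 = sum(y) / n - b1 * sum(x) / n
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Two equivalent forms of the coefficient of determination.
r2_from_sse = 1 - sse / ss_yy
r2_from_slope = b1 ** 2 * ss_xx / ss_yy
print(round(r2_from_sse, 4), round(r2_from_slope, 4))
```

Here both forms give 0.6, meaning the regression line accounts for 60% of the variability of y in this small illustrative sample.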

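Testing Hypotheses about the Slope

$H_0: \beta_1 = 0$        $H_0: \beta_1 = 0$        $H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$     $H_a: \beta_1 > 0$        $H_a: \beta_1 < 0$

8) $t = \dfrac{b_1 - \beta_1}{s_b}$, where $s_b = \dfrac{s_e}{\sqrt{SS_{xx}}}$ and df $= n - 2$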
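A minimal sketch of the two-tailed t test of the slope follows. The function name `slope_t_test` and the data values are illustrative assumptions, not part of the handout. The critical value is passed in by hand, as it would be read from a t table; 3.182 is the tabled value for a two-tailed test at a 0.05 level of significance with 3 degrees of freedom.

```python
import math

def slope_t_test(x, y, t_crit, beta1=0.0):
    """Return (t, df, reject) for H0: slope = beta1, two-tailed (hypothetical helper)."""
    n = len(x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    ss_xy = sum_xy - sum(x) * sum(y) / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    b1 = ss_xy / ss_xx
    b0 = sum(y) / n - b1 * sum(x) / n
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))       # standard error of the estimate
    s_b = s_e / math.sqrt(ss_xx)         # standard error of the slope
    t = (b1 - beta1) / s_b
    df = n - 2
    return t, df, abs(t) > t_crit

# Illustrative values only (not taken from the exercises).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
t, df, reject = slope_t_test(x, y, t_crit=3.182)
print(round(t, 4), df, reject)
```

For these values the observed t is smaller in absolute value than the critical value, so the null hypothesis of a zero slope is not rejected. For a one-tailed test, compare t (with its sign) against the one-tailed critical value instead.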

B) Exercises
1) Determine the value of r for the following data.

x    158   296    87   110   436
y    349   510   301   322   550

2) In an effort to determine whether any correlation exists between the share prices of airlines,
an analyst sampled six days of activity on the stock market. Using the following share prices
of Air Canada and WestJet, compute the coefficient of correlation. Share prices have been
rounded off to the nearest hundredth for ease of computation.
Air Canada     0.75    0.76    0.84    0.85    0.86    0.86
WestJet       11.92   12.09   12.25   11.85   11.78   11.74

3) Sketch a scatter plot from the following data, and determine the equation of the regression
line.
x    12   21   28    8   20
y    17   15   22   19   24

4) A corporation owns several companies. The strategic planner for the corporation believes
dollars spent on advertising can to some extent be a predictor of total sales dollars. As an
aid in long-term planning, she gathers the following sales and advertising information from
several of the companies for 2009 ($ millions).

Advertising   $12.5    3.7   21.6   60.0   37.6    6.1   16.8   41.2
Sales         $148     55    338    994    541     89    126    379

Develop the equation of the simple regression line to predict sales from advertising
expenditures using these data.
5) Is it possible to predict the annual number of business bankruptcies by the number of firm
births (business starts)? The following table shows the number of business bankruptcies
(1,000s) and the number of firm births (10,000s) for a six-year period. Use these data to
develop the equation of the regression model to predict the number of business bankruptcies
by the number of firm births. Discuss the meaning of the slope.
Business Bankruptcies (1,000s)   34.3   35.0   38.5   40.1   35.5   37.9
Firm Births (10,000s)            58.1   55.4   57.0   58.5   57.4   58.0

6) Investment analysts generally believe the interest rate on bonds is inversely related to the
prime interest rate for loans; that is, bonds perform well when lending rates are down and
perform poorly when interest rates are up. Can the bond rate be predicted by the prime
interest rate? Use the following data to construct a least squares regression line to predict
bond rates by the prime interest rate.
Bond Rate              5%   12    9   15    7
Prime Interest Rate   16%    6    8    4    7

7) Solve for the predicted values of y and the residuals for the following data:
x    12   21   28    8   20
y    17   15   22   19   24

8) Suppose milk is produced in a certain area. Some people might argue that because of
transportation costs, the price of milk in stores increases with the distance of markets from
that area. Suppose the milk prices in eight cities are as follows.
Price of Milk (per 2 L)                  $2.64    2.31    2.45    2.52    2.19    2.55    2.40    2.37
Distance from Milk-Producing Area (km)   1,245     425   1,346     973     255     865   1,080     296

Use the prices along with the distance of each city from the milk-producing area to develop
a regression line to predict the price of 2 L of milk by the number of kilometers the city is
from the milk-producing area. Use the data and the regression equation to compute residuals
for this model. Sketch a graph of the residuals in the order of the x values. Comment on the
shape of the residual graph.

9) In Problem 5, you were asked to develop the equation of a regression model to predict the
number of business bankruptcies by the number of firm births. Using this regression model
and the data given in Problem 5 (and provided here again), solve for the predicted values of
y and the residuals. Comment on the size of the residuals.
Business Bankruptcies (1,000s)   34.3   35.0   38.5   40.1   35.5   37.9
Firm Births (10,000s)            58.1   55.4   57.0   58.5   57.4   58.0

10) Determine the sum of squares of error (SSE) and the standard error of the estimate ($s_e$) for
Problem 8. Determine how many of the residuals computed are within one standard error
of the estimate. If the error terms are normally distributed, approximately how many of
these residuals should be within $\pm 1 s_e$?
11) Determine the SSE and $s_e$ for Problem 9. Use the residuals computed and determine how
many of them are within $\pm 1 s_e$ and $\pm 2 s_e$. How do these numbers compare with what the
empirical rule says should occur if the error terms are normally distributed?
12) Determine the sum of squares of error (SSE) and the standard error of the estimate ($s_e$) for
Problem 7. Determine how many of the residuals computed are within one standard error
of the estimate. If the error terms are normally distributed, approximately how many of
these residuals should be within $\pm 1 s_e$?
13) Compute $r^2$ for Problem 7. Discuss the value of $r^2$ obtained.
14) Compute $r^2$ for Problem 8. Discuss the value of $r^2$ obtained.
15) Compute $r^2$ for Problem 9. Discuss the value of $r^2$ obtained.
16) Test the slope of the regression line determined in Problem 6. Use $\alpha$ = 0.05.
17) Test the slope of the regression line determined in Problem 7. Use $\alpha$ = 0.01.
18) Test the slope of the regression line determined in Problem 8. Use $\alpha$ = 0.10.
19) Test the slope of the regression line determined in Problem 9. Use a 5% level of
significance.
20) Study the following ANOVA table, which was generated from a simple regression analysis.
Discuss the F test of the overall model. Determine the value of t and test the slope of the
regression line.
Analysis of Variance
Source        DF       SS        MS       F       P
Regression     1    116.65    116.65    8.26   0.021
Error          8    112.95     14.12
Total          9    229.60
