# UNIVERSITI TUNKU ABDUL RAHMAN

FOUNDATION IN ARTS

## TUTORIAL 5: REGRESSION AND CORRELATION

1. The following data show the IQ and the score in an English test of a sample of 10
pupils taken from a mixed ability class.
The English test was marked out of 50 and the range of IQ values for the class
was 80 to 140.

| Pupil | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| IQ (x) | 110 | 107 | 127 | 100 | 132 | 130 | 98 | 109 | 114 | 124 |
| English (y) | 26 | 31 | 37 | 20 | 35 | 34 | 23 | 38 | 31 | 36 |

(i) Estimate the product-moment correlation coefficient for the class.
(ii) Find the equation of the regression line of y on x.

2. The relationship between the number of hours of revision per day, x, and the marks,
y, obtained by 9 students for a paper in an examination is shown in the following
table.

| x | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|
| y | 55 | 60 | 53 | 70 | 72 | 68 | 80 | 90 | 85 |

(i) Find the linear correlation coefficient.
(ii) Find the equation of the regression line of y on x.
(iii) Find the expected marks of a student who revises 8.5 hours a day for the
paper.
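
As a way of checking hand calculations for a question like this one, the quantities Sxx, Syy and Sxy can be computed directly from the raw data. The following is a minimal Python sketch, not part of the tutorial itself; the function name `regression_summary` and its return layout are illustrative assumptions.

```python
from math import sqrt


def regression_summary(x, y):
    """Return (a, b, r) for the least squares line y = a + b*x.

    Hypothetical helper for checking hand calculations.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Corrected sums of squares and products
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                # slope
    a = mean_y - b * mean_x      # intercept
    r = sxy / sqrt(sxx * syy)    # product-moment correlation coefficient
    return a, b, r


# Data from question 2
hours = [4, 5, 6, 7, 8, 9, 10, 11, 12]
marks = [55, 60, 53, 70, 72, 68, 80, 90, 85]

a, b, r = regression_summary(hours, marks)
predicted = a + b * 8.5  # expected marks for 8.5 hours of revision a day
print(f"r = {r:.4f}, y = {a:.4f} + {b:.4f}x, prediction = {predicted:.2f}")
```

Without intermediate rounding the intercept comes out as 35.4; an intercept of 35.3997 arises when the slope is first rounded to 4.3667.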

3. A sample of 10 married couples working in a factory was selected randomly. The
daily wage of each husband (RM x) and the daily wage of his wife (RM y) are
summarized below.

$$\sum x = 1680, \quad \sum x^2 = 282280, \quad \sum y^2 = 262450, \quad \sum xy = 272179, \quad S_{YY} = 10$$

(i) Calculate Sxx and Sxy.
(ii) Find the equation of the regression line of y on x.
(iii) Interpret the meaning of a and b from part (ii).
(iv) Calculate the coefficient of determination and interpret the meaning.
(v) If the daily wage of a husband is RM115, estimate the daily wage of his
wife.
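
When only summary totals are available, as in question 3, the same quantities follow from the computational formulas. The sketch below is a hedged illustration in Python. It assumes the summary values read Σx = 1680, Σx² = 282280, Σy² = 262450, Σxy = 272179 and S_YY = 10, and it recovers Σy from S_YY because Σy is not listed (an assumption of this sketch, taking the positive root since wages are positive). All variable names are illustrative.

```python
from math import sqrt

n = 10
sum_x = 1680
sum_x2 = 282280
sum_y2 = 262450
sum_xy = 272179
syy = 10

# Assumption: sum of y is not given, so recover it from
# S_YY = sum_y2 - (sum_y)^2 / n, taking the positive root.
sum_y = sqrt(n * (sum_y2 - syy))

sxx = sum_x2 - sum_x ** 2 / n          # S_XX
sxy = sum_xy - sum_x * sum_y / n       # S_XY

b = sxy / sxx                          # slope
a = sum_y / n - b * sum_x / n          # intercept
r_squared = sxy ** 2 / (sxx * syy)     # coefficient of determination

wife_wage = a + b * 115                # estimate for a husband earning RM115
print(f"Sxx = {sxx}, Sxy = {sxy}, y = {a:.1f} + {b:.3f}x")
print(f"r^2 = {r_squared:.4f}, estimate = RM{wife_wage:.2f}")
```

Note that RM115 lies well below the mean husband wage of RM168 implied by Σx, so an estimate at that value should be treated with caution.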

4. The equation of the line of regression of monthly wage, RM y, on work
experience, x years, of workers in a factory is given by y = a + 42x, with
0 ≤ x ≤ 12.
(i) Give the interpretation of constant a in the equation.
(ii) State whether the equation can be used to predict the monthly wage of a
worker who has a work experience of 20 years. Give reasons for your answer.

5. The following table shows length of employment (in weeks) for a random sample
of 6 persons working at an automobile inspection station and number of cars
checked by them between noon and 1 o’clock on a given day.

| Length of employment (weeks) | Number of cars checked |
|---|---|
| 5 | 16 |
| 1 | 15 |
| 7 | 19 |
| 9 | 23 |
| 2 | 14 |
| 12 | 21 |

(a) Find the regression line that will enable us to predict number of cars
checked in terms of length of employment for a person working at the
automobile inspection station.
(b) If a person has worked at the inspection station for 10 weeks, estimate the
number of cars checked by the person during the given time period.
(c) What percentage of the variation in numbers of cars checked by the 6
persons can be attributed to the differences in their length of employment
at the inspection station?

6. The following table gives information on ages and cholesterol levels for a random
sample of 10 men.

| Age (years) | 58 | 69 | 43 | 39 | 63 | 52 | 47 | 31 | 74 | 36 |
|---|---|---|---|---|---|---|---|---|---|---|
| Cholesterol level | 189 | 235 | 193 | 177 | 154 | 191 | 213 | 175 | 198 | 181 |

(a) Identify the independent variable and dependent variable. Hence compute
SXX, SYY and SXY.
(b) Find the regression line of cholesterol level on age.
(c) Plot the scatter diagram and the regression line.
(d) Calculate r and r² and explain what they mean.
(e) Predict the cholesterol level of a 60-year-old man.

7. The following table gives information on the incomes (in thousands of RM) and
charitable contributions (in hundreds of RM) for a random sample of 10
households.

| Income | 33 | 23 | 82 | 47 | 26 | 71 | 28 | 39 | 58 | 17 |
|---|---|---|---|---|---|---|---|---|---|---|
| Contributions | 10 | 4 | 29 | 23 | 3 | 28 | 8 | 16 | 18 | 1 |

(a) Taking income as an independent variable and charitable contributions as
a dependent variable, compute SXX, SYY and SXY.
(b) Find the regression line of contributions on income.
(c) Plot the scatter diagram and the regression line.
(d) Calculate the product moment correlation coefficient and coefficient of
determination. Explain what they mean.

8. The following table gives information on GPAs and starting salaries (rounded to
the nearest hundred RM) of seven college graduates.

| GPA | 2.90 | 3.81 | 3.20 | 2.42 | 3.94 | 2.05 | 2.25 |
|---|---|---|---|---|---|---|---|
| Starting salary | 23 | 28 | 23 | 21 | 32 | 19 | 22 |

a) With GPA as an independent variable and starting salary as a dependent
variable, compute SXX, SYY and SXY.
b) Find the least squares regression line.
c) Interpret the meaning of the values of a and b calculated in part (b).
d) Plot the scatter diagram and the regression line.
e) Calculate the product moment correlation coefficient.
f) Calculate also the coefficient of determination and briefly explain what it
means.

9. Eight students, randomly selected from a large class, were asked to keep a record
of the hours they spent studying before the midterm examination. The following
table gives the number of hours these eight students studied before the midterm
and their scores on the midterm.

| Hours studied | 15 | 7 | 12 | 8 | 18 | 6 | 9 | 11 |
|---|---|---|---|---|---|---|---|---|
| Midterm score | 97 | 78 | 87 | 92 | 89 | 57 | 74 | 69 |

(a) Do the midterm scores depend on hours studied or do hours studied
depend on the midterm scores? Do you expect a positive or a negative
relationship between these two variables?
(b) Taking hours studied as an independent variable and midterm scores as a
dependent variable, compute SXX, SYY and SXY.
(c) Find the least squares regression line.
(d) Interpret the meaning of the values of a and b calculated in part (c).
(e) Plot the scatter diagram and the regression line.
(f) Calculate Pearson’s correlation coefficient.
(g) Calculate also the coefficient of determination.

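10. Consider the following pairs of measurements.

| x | 5 | 3 | -1 | 2 | 7 | 6 | 4 |
|---|---|---|---|---|---|---|---|
| y | 4 | 3 | 0 | 1 | 8 | 5 | 3 |

$$\sum x = 26, \quad \sum y = 24, \quad \sum x^2 = 140, \quad \sum y^2 = 124, \quad \sum xy = 129$$

(a) Construct a scatter diagram for these data.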

(b) Find the least squares regression line.
(c) Interpret the y-intercept and slope of the least squares regression line.
(d) Predict the value of y if x = 20. Is the predicted value reliable? Give a
reason.
(e) Calculate r. Interpret the results.
(f) Calculate also the coefficient of determination.
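
Part (d) turns on whether x = 20 lies inside the range of the observed x values. The following Python sketch is one hedged way to make that check explicit alongside the prediction; the function name `predict` and its return layout are illustrative assumptions, not part of the tutorial.

```python
# Data from question 10
x = [5, 3, -1, 2, 7, 6, 4]
y = [4, 3, 0, 1, 8, 5, 3]
n = len(x)

# Corrected sums from the raw totals
sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

b = sxy / sxx                    # slope
a = sum(y) / n - b * sum(x) / n  # y-intercept


def predict(x_new):
    """Return (prediction, inside_range) for a hypothetical new x value.

    inside_range is False when x_new lies outside the observed x values,
    meaning the prediction is an extrapolation.
    """
    inside_range = min(x) <= x_new <= max(x)
    return a + b * x_new, inside_range


y_hat, inside_range = predict(20)
print(f"y = {a:.4f} + {b:.4f}x")
print(f"prediction at x = 20: {y_hat:.4f}, inside observed range: {inside_range}")
```

Here the observed x values run from -1 to 7, so x = 20 is an extrapolation and the flag comes back `False`. Without intermediate rounding the prediction is 18.375; a value of 18.3756 arises when the coefficients are first rounded to four decimal places.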

## Answers

1. (i) r = 0.7452 (ii) y = −11.7977 + 0.3727x

2. (i) 0.9219 (ii) y = 35.3997 + 4.3667 x (iii) 72.52
3. (i) Sxx = 40, Sxy = 19 (ii) ŷ = 82.2 + 0.475x (iv) 0.9025 (v) RM136.83
5. (a) ŷ = 13.4318 + 0.7614x (b) 21 (c) 79.71%
6. (a) x = age (years), y = cholesterol level; Sxx = 1895.6, Sxy = 1029.8,
Syy = 4396.4 (b) ŷ = 162.78 + 0.5433x (d) 0.3567, 0.1273 (e) 195.38
7. (a) Sxx = 4248, Sxy = 1920, Syy = 964 (b) ŷ = -5.162 + 0.4519x
(d) 0.9487 , 0.9
8. (a) Sxx = 3.3647, Sxy = 18.65, Syy = 120 (b) ŷ = 7.712 + 5.543x
(e) 0.9282 (f) 0.8618
9. (a) midterm scores depend on hours studied and we expect a positive
relationship between these two variables.
(b) Sxx = 119.5, Sxy = 237.75, Syy = 1251.875 (c) ŷ = 58.98 + 1.99x
(f) 0.6147 (g) 0.3778
10. (b) ŷ = 0.0196 + 0.9178x (d) 18.3756 (e) 0.9364
(f) 0.8768