
Chapter 9: LINEAR REGRESSION

Upon completion of this chapter, you should be able to:

• Explain the difference between linear and non-linear relationships
• Define linear regression
• Explain what the slope and intercept indicate
• Apply the regression equation to make predictions

CHAPTER REVIEW

9.1 Preamble
9.2 What is linear regression?
9.3 Linear regression for prediction
9.4 Linear regression analysis
9.5 Calculating relationship between two variables
9.6 Predicting English performance

9.1 Preamble

This chapter discusses linear regression, which is used to analyse statistically the relationship
between an independent variable and a dependent variable. It is also used to predict one variable
based on what is known about another variable.

9.2 What is a Linear Regression?

There are two key words: ‘linear’ and ‘regression’. When you think of ‘regression’, you are
thinking of ‘prediction’. A regression models the past relationship between variables to
predict their future behaviour. You are predicting the value of variable Y using the value of
variable X. To be able to predict the future value of Y, you need to have data showing the
relationship of variable X and variable Y. For example, to predict future performance in
university using an aptitude test, you need to have a set of data showing the relationship
between ‘Scores on the Aptitude Test’ and university ‘GPA’.

Figure 9.1 Graph showing a linear relationship between the time spent and the
acquisition of a skill

The other word is ‘linear’, which means that the relationship between the two variables should be
a ‘straight-line relationship’ or linear relationship. Figure 9.1 shows a ‘linear’
relationship between ‘time spent’ and ‘skill acquisition’. In other words, the more time a
person spends practicing a skill, the more competent the person becomes in that skill.

The opposite is a ‘non-linear relationship’, where the relationship between the two variables
is represented by a curved line.

Figure 9.2 Graph showing a non-linear relationship between performance and anxiety

Figure 9.2 shows a non-linear relationship between mathematics performance
and anxiety levels, where the line is curved. Very low or non-existent anxiety levels are
associated with low test scores. As anxiety levels go up, so do the test
scores (a little stress may increase concentration), until a point is reached where
very high anxiety levels are again associated with low test scores (very
high stress levels may impair concentration).
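
The contrast between the two figures can be sketched numerically. The example below is only an illustration: both data sets are invented (they are not the data behind Figures 9.1 and 9.2), and the helper name pearson_r is an assumption made for this sketch. It computes the correlation coefficient for a straight-line pattern and for a rise-then-fall pattern.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for illustration only.
hours_practised = [1, 2, 3, 4, 5, 6, 7]
skill_level = [12, 19, 31, 40, 52, 59, 71]   # rises steadily with practice

anxiety_level = [1, 2, 3, 4, 5, 6, 7]
maths_score = [40, 60, 75, 80, 75, 60, 40]   # rises, peaks, then falls

r_linear = pearson_r(hours_practised, skill_level)
r_curved = pearson_r(anxiety_level, maths_score)

print(f"straight-line pattern: r = {r_linear:.3f}")
print(f"curved pattern:        r = {r_curved:.3f}")
```

For the curved pattern the coefficient comes out close to zero. That does not mean the two variables are unrelated, only that a straight line describes the relationship poorly.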

a) What is the meaning of ‘linear’?
b) What do you understand by the term ‘regression’?
c) Explain the difference between a linear and a non-linear relationship.

9.3 Linear Regression for Prediction

The main purpose of linear regression analysis is to assess associations between dependent
and independent variables. The most basic type of regression is simple linear
regression. A simple linear regression uses only one independent variable, and it describes
the relationship between the independent variable (for example, hours spent practicing mathematics) and
the dependent variable (for example, score in a mathematics test) as a straight line.

When two or more independent variables are used, it is called a multiple regression; for
example, using ‘attitude towards English’ and ‘reading books in English’ as predictors of
‘scores in English’. This chapter will focus on simple linear regression.

A simple linear regression models the past relationship between variables to predict their
future behaviour. For example, you collect data on the relationship between ‘attitude towards
English’ and ‘scores obtained in an English test’. Based on these data, you can use the
score a person obtains on an attitude towards English test to predict his or her score on an English test.
Businesses use regression to predict such things as future sales, stock prices, currency
exchange rates, and productivity gains resulting from a training program.

Strictly speaking, linear regression requires variables to be continuous. A variable is
continuous if it is metric, which means you can compare the size of the difference between any two
values obtained. Age measured in years is continuous because the size of the
difference between the ages of two persons can be measured quantitatively in years. Other
examples of continuous variables are income measured in monetary units, scores obtained in
tests, attitudes expressed on a Likert scale and so forth. Categorical variables such as gender,
religion, location and so forth are not used directly as predictors in linear regression.

9.4 EXAMPLE - Linear Regression Analysis

Let’s say you have collected data from 10 subjects on their attitudes towards English and
their English test scores. The results are shown in the table below:

Attitude towards English    English Test Score

34                          80
37                          87
39                          89
29                          70
28                          69
30                          72
33                          79
37                          85
32                          81
32                          79

o Attitude towards English = scores on an attitude scale
o English Score = based on an English performance test

Based on these data, you conduct a linear regression analysis for two main purposes:

• First, you want to test the relationship between Attitudes towards English
and Performance in English. You are expecting a positive relationship between
Attitude and English score. In other words, as Attitude increases, you expect the English
score to also increase. How do you establish this to be true?

• Second, you want to go further than just stating the relationship between Attitudes
towards English and English scores. You want to know whether you can PREDICT
values of one variable if you know or can estimate the other variable. In other words,
can you predict performance in English based on what you know about students’ Attitudes
towards English?

9.5 Calculating the Relationship between Variables

We are saying that English performance depends on Attitude. In the language of Regression
Analysis, English performance is the dependent variable and Attitude towards English is the
independent variable. The distribution of the scores for the 10 students on the x axis
(attitude) and y axis (English performance) is shown in the graph below.

[Scatterplot: English score (y axis) plotted against attitude (x axis). An annotation marks students with high attitude scores who are doing poorly on the performance test.]

What do you observe about the graph or scatterplot above?

You will notice the following:

• There is a positive, linear relationship between attitude and English performance.
It is called a positive relationship because the line has an ‘upward slope’.
• The line drawn shows what the graph would look like if the positive relationship between attitude
and English performance were perfect (a correlation of 1.00).
• However, the graph shows that it is not a perfect positive relationship, as there are
cases of students who are ‘behaving’ differently (see graph above).
• You could also say that a change in x will be accompanied by a certain amount of change
in y.
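
How far a scatterplot departs from a perfect relationship can be summarised with the correlation coefficient. The sketch below computes it by hand for the ten rows of the data table in Section 9.4. The helper name pearson_r is an assumption made for this illustration, and the result describes those ten rows only, so it is not expected to equal the R value reported in any SPSS output.

```python
from math import sqrt

# The ten (attitude, English test score) pairs from the data table in Section 9.4.
attitude = [34, 37, 39, 29, 28, 30, 33, 37, 32, 32]
english = [80, 87, 89, 70, 69, 72, 79, 85, 81, 79]

def pearson_r(xs, ys):
    """Correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = pearson_r(attitude, english)
print(f"r = {r:.3f}")
```

A value of exactly 1.00 would mean every point lies on the line. Anything below that reflects points that sit away from it.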

a) What is the difference between simple linear regression and multiple regression?
b) What do you understand by dependent and independent variables?
c) In the graph above, explain the unusual ‘behaviour’ of the two students.

Now, let us compute the relationship statistically using SPSS.

You make English performance the dependent variable (y) and you ‘enter’ the independent
variable Attitude into the regression equation.
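
SPSS is the tool used in this chapter. As a rough, tool-independent sketch of the same kind of calculation, the example below fits a least-squares line by hand to the ten rows of the data table in Section 9.4. The function name fit_line is an assumption made for this illustration. The fitted values describe those ten rows only and are not expected to reproduce the figures in the SPSS tables shown in this chapter.

```python
# The ten (attitude, English test score) pairs from the data table in Section 9.4.
attitude = [34, 37, 39, 29, 28, 30, 33, 37, 32, 32]   # independent variable (x)
english = [80, 87, 89, 70, 69, 72, 79, 85, 81, 79]    # dependent variable (y)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(attitude, english)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The variables are named slope and intercept rather than with beta symbols so that the sketch does not depend on any particular labelling convention.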

SPSS produces several tables, which show the following:

• TABLE 1 shows that ATTITUDE has been ‘entered’ as the independent variable and
ENGLISHSCORE as the dependent variable.

Table 1:

Variables Entered/Removed (b)

Model   Variables Entered   Variables Removed   Method
1       ATTITUDE (a)        .                   Enter

a. All requested variables entered.
b. Dependent Variable: ENGLISHSCORE

• TABLE 2 shows the Model Summary, which reports a statistic that measures “goodness
of fit”, that is, how well the best-fitting line (the line that fits the points in the graph
better than any other) describes the data.

Table 2:

Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .879 (a)   .772       .743                4.1937

a. Predictors: (Constant), ATTITUDE

• This statistic is called the coefficient of determination, represented by ‘R Square’, R² or
r² (R is the correlation coefficient = 0.879).
• R² is 0.772, which means that Attitude accounts for 77.2% of the variance in English
performance.

 TABLE 3 is the ANOVA table. What do you observe in this table?

Table 3:

ANOVA (b)

Model           Sum of Squares   df   Mean Square   F        Sig.
1  Regression   476.202          1    476.202       27.076   .001 (a)
   Residual     140.698          8    17.587
   Total        616.900          9

a. Predictors: (Constant), ATTITUDE
b. Dependent Variable: ENGLISHSCORE

 When you do a regression analysis, you also want to know whether the linear relationship
between Attitude (x) and English performance (y) is statistically significant; i.e. whether
there is any significant linear relationship between x and y. That is the reason for the
ANOVA table.

 The F statistic shown in Table 3 is 27.076 which is significant at p<.05 and so we reject
the null hypothesis. In other words, there is a linear relationship between English
performance and Attitude and the relationship is significant at p<.05.
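
The figures in Tables 2 and 3 are linked, and the links can be checked with a few lines of arithmetic. The sketch below starts from the sums of squares and degrees of freedom printed in the ANOVA table and recomputes F, R Square, Adjusted R Square and the standard error of the estimate. The variable names are assumptions made for this illustration, and the 5% critical value of 5.32 for F with 1 and 8 degrees of freedom is taken from standard F tables, not from the chapter. Small differences in the last decimal place can arise because the printed inputs are already rounded.

```python
from math import sqrt

# Values copied from the ANOVA table (Table 3).
ss_regression = 476.202
ss_residual = 140.698
df_regression = 1
df_residual = 8

ss_total = ss_regression + ss_residual
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of the two mean squares.
f_statistic = ms_regression / ms_residual

# R Square is the share of the total variation explained by the regression.
r_squared = ss_regression / ss_total

# Adjusted R Square corrects for sample size (n = total df + 1).
n = df_regression + df_residual + 1
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

# The standard error of the estimate is the square root of the residual mean square.
std_error_of_estimate = sqrt(ms_residual)

# Assumption: 5% critical value for F(1, 8), from standard F tables.
F_CRITICAL_1_8 = 5.32

print(f"F = {f_statistic:.2f}")
print(f"R Square = {r_squared:.3f}")
print(f"Adjusted R Square = {adjusted_r_squared:.3f}")
print(f"Std. Error of the Estimate = {std_error_of_estimate:.4f}")
print("significant at p < .05:", f_statistic > F_CRITICAL_1_8)
```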
a) What does the R square tell you?
b) Why is the ANOVA calculated?

9.6 Predicting English Performance

Next, you want to predict English performance based on the score obtained on an Attitude Test. In
other words, you get students to take the Attitude Towards English Test and, based on the
score obtained, you predict their performance in English.

For this you need to use the REGRESSION EQUATION as follows:

ENGLISH Score (Y) = β0X + β1

= Slope (β0) multiplied by Attitude Score (X) + Intercept (β1)

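[Don’t panic seeing these mathematical symbols! We will analyse step-by-step what they
mean.]

• The symbol ‘β’ is pronounced “beta”.
• So β0 is pronounced ‘beta zero’, which is the slope,
• and β1 is pronounced ‘beta one’, which is the intercept.

Many textbooks label these the other way round (β0 for the intercept and β1 for the slope),
so check which convention a source uses.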

What is the INTERCEPT?


The intercept is the point at which a line crosses an axis. See Figure 9.3. The Y intercept is
the value of Y at which the line crosses the Y axis. It is therefore the value of Y when X = 0.
The Y intercept in the figure is 2 since the line crosses the Y axis at the point (0,2).

Figure 9.3 Graph showing relationship between X and Y

What is the SLOPE?


The slope of a line is the steepness of the line. It is measured as the change in Y associated
with a change of one unit on X. Figure 9.4 shows a line with a slope of 0.5. A change of 1 on
the X axis is associated with a change of 0.5 on the Y axis. For example, as X changes from 1
to 2, Y changes from 3.0 to 3.5.

Figure 9.4 Graph showing relationship between X and Y
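
The slope described above is the change in Y divided by the change in X. The short sketch below recomputes it from the two points mentioned in the text (X = 1, Y = 3.0 and X = 2, Y = 3.5). The function name slope_between is an assumption made for this illustration.

```python
def slope_between(point_a, point_b):
    """Change in Y divided by change in X between two points on a line."""
    (x1, y1), (x2, y2) = point_a, point_b
    return (y2 - y1) / (x2 - x1)

# Two points taken from the slope example in the text.
slope = slope_between((1, 3.0), (2, 3.5))
print(slope)  # 0.5
```

Any two distinct points on the same straight line give the same slope, which is why a single number is enough to describe the steepness of the line.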

• Using the ‘Regression Equation’, you can predict the English score (Y) based on the Attitude
Towards English (X) score.

• If Attitudes (x) and English performance (y) have a positive relationship, then the Slope
(β0) will be a positive number. Lines with positive slopes go from the bottom left toward
the upper right.

[Graph: English score (y axis) against Attitude towards English (x axis), showing a line with a positive slope.]

• If Attitudes (x) and English performance (y) have a negative relationship, then the Slope
(β0) will be a negative number. Lines with negative slopes go from the upper left to the
lower right. The following graph has a slope of -1: an increase of 1 on the X axis is
associated with a decrease of 1 on the Y axis.

[Graph: English score (y axis) against Attitude (x axis), showing a line with a slope of -1.]

• If Attitudes (x) and English scores (y) have NO relationship, then the Slope (β0) will
be ZERO.

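
The three cases above (positive, negative and zero slope) can be illustrated with a few made-up lines. In the sketch below the function names and the intercept of 10 are arbitrary choices for illustration and do not come from the chapter's data.

```python
def describe_relationship(slope):
    """Name the direction of a linear relationship from the sign of its slope."""
    if slope > 0:
        return "positive"
    if slope < 0:
        return "negative"
    return "none"

def line_values(slope, intercept, xs):
    """Y values of the line Y = slope * X + intercept at each X in xs."""
    return [slope * x + intercept for x in xs]

xs = [0, 1, 2, 3]
for slope in (0.5, -1, 0):
    # The intercept of 10 is an arbitrary value chosen for this illustration.
    print(describe_relationship(slope), line_values(slope, 10, xs))
```

With a slope of -1, each step of 1 along X lowers Y by 1. With a slope of 0, Y stays the same whatever the value of X.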
Using the Regression Equation to Predict

From the SPSS output, you get what is called the ‘Coefficients’ table (see Table 4).

Table 4:

Coefficients

                              Unstandardized         Standardized
                              Coefficients           Coefficients
Model                         B       Std. Error     Beta            t        Sig.
1  (Constant)                 0.70    0.20           .362            13.54    .000
   Attitude Towards English   1.50    .007

Dependent Variable: Performance in English

Intercept (β1) = 0.70
Slope (β0) = 1.50

• In a regression equation, the slope and the intercept are referred to as the “Coefficients”
in the model (see Table 4).

• Both coefficients for the equation are found in the column labelled “B” (see Table 4), where
the intercept (β1) is identified as the ‘Constant’, and the slope (β0) is identified as
‘Attitude Towards English’.

 For example, a student obtained a score of 30 (identified as X) on the Attitude Test.


What is the predicted English score (identified as Y) that the student is likely to
obtain?

 To predict, you will apply the regression equation with information about the ‘intercept’
and ‘slope’ obtained from the ‘Coefficients’ table, i.e. Y = β0X + β1

o Slope β0 = 1.50
o Intercept β1 = 0.70
o X = 30
o Y = ? (Find ‘Y’)

To predict ENGLISH performance (Y):

Predicted English Performance Score = Slope (β0) × Attitude score (X) + Intercept (β1)
= 1.50 × 30.0 + 0.70
= 45.70

By applying the regression equation, the predicted score is 45.70.
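
The same calculation can be written as a small function. In the sketch below the function name predict_english_score is an assumption made for this illustration; the slope and intercept are the two values read from the ‘B’ column of Table 4.

```python
def predict_english_score(attitude_score, slope, intercept):
    """Apply the regression equation: Y = slope * X + intercept."""
    return slope * attitude_score + intercept

# Coefficients read from Table 4.
SLOPE = 1.50
INTERCEPT = 0.70

predicted = predict_english_score(30, SLOPE, INTERCEPT)
print(f"{predicted:.2f}")  # 45.70
```

Calling the function with a different attitude score gives the predicted English score for that student using the same two coefficients.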

Examples:

1. The formula for a regression equation is Y = 3X + 2. What would be the predicted Y
score for a person scoring 4 on X?
Answer = 3 × 4 + 2 = 14

2. Suppose it is possible to predict a person's score on Test B from the person's score
on Test A. The regression equation is: Y = 2.3 X + 9.5. What is a person's predicted
score on Test B assuming this person got a 40 on Test A?
Answer = 101.5

c) What is the intercept?


d) What is the slope?
e) How do you know there is no relationship between two variables?
f) What does the ‘coefficients table’ tell you?
