Todd Thomas
University of Arkansas
tjt001@uark.edu
5 May 2014
Overview
1
Key Concepts
Simple and Multiple Linear Regression
Linear Correlation
Positive & Negative
Strength of Correlation
Calculating r
Notation for the Linear Correlation Coefficient
Requirements
Formula
Interpretation of r
Properties of r
Example
Multiple Linear Regression
Key Concept
Finding Multiple Regression Equation
Notation
Requirement
Todd Thomas (UARK)
Simple and Multiple Linear Regression
Key Concepts
Linear Correlation
Strength of Correlation
Calculating r
Notation
Requirements
Formula
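\[
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{n\left(\sum x^2\right) - \left(\sum x\right)^2}\,
          \sqrt{n\left(\sum y^2\right) - \left(\sum y\right)^2}} \tag{1}
\]
\[
r = \frac{\sum (z_x z_y)}{n - 1} \tag{2}
\]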
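The two formulas give the same value of r. A minimal Python sketch of both follows, assuming that the z scores in equation (2) are computed with the sample standard deviation (divisor n - 1). The function names are illustrative only, and the data are the shoe print and height values from Example 1.

```python
from math import sqrt
from statistics import mean, stdev

# Paired data from Example 1.
shoe_print = [29.7, 29.7, 31.4, 31.8, 27.6]
height = [175.3, 177.8, 185.4, 175.3, 172.7]


def r_from_sums(xs, ys):
    """Equation (1): r computed from the raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


def r_from_z_scores(xs, ys):
    """Equation (2): r computed from z scores (sample standard deviation assumed)."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = stdev(xs), stdev(ys)
    products = (((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)


print(f"equation (1): r = {r_from_sums(shoe_print, height):.3f}")
print(f"equation (2): r = {r_from_z_scores(shoe_print, height):.3f}")
```

Equation (1) is convenient for hand calculation from a table of sums, while equation (2) makes it clearer that r does not depend on the units of x or y.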
Interpretation of r
Properties of r
Example
Example 1
The paired shoe print/height data from five males are listed below. Find the
value of the linear correlation coefficient r for the paired sample data, then
use a significance level of α = 0.05 to decide whether there is a linear
correlation.
x (Shoe Print)   y (Height)   x^2        y^2         xy
29.7             175.3        882.09     30730.09    5206.41
29.7             177.8        882.09     31612.84    5280.66
31.4             185.4        985.96     34373.16    5821.56
31.8             175.3        1011.24    30730.09    5574.54
27.6             172.7        761.76     29825.29    4766.52
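The derived columns and the column totals needed for equation 1 can be checked with a short script. This is a sketch only; the variable names are illustrative, and values are rounded to two decimals to match the table.

```python
# Paired data from Example 1.
shoe_print = [29.7, 29.7, 31.4, 31.8, 27.6]
height = [175.3, 177.8, 185.4, 175.3, 172.7]

# Derived columns, rounded to two decimals as in the table.
x_sq = [round(x * x, 2) for x in shoe_print]
y_sq = [round(y * y, 2) for y in height]
xy = [round(x * y, 2) for x, y in zip(shoe_print, height)]

# Column totals used in the calculation of r.
totals = {
    "x": round(sum(shoe_print), 2),
    "y": round(sum(height), 2),
    "x^2": round(sum(x_sq), 2),
    "y^2": round(sum(y_sq), 2),
    "xy": round(sum(xy), 2),
}

for name, value in totals.items():
    print(f"sum of {name}: {value}")
```

The totals are 150.2, 886.5, 4523.14, 157271.47, and 26649.69, which are the values substituted into the formula for r.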
Example cont.
Now we calculate the value of r using equation 1:
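\[
\begin{aligned}
r &= \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
          {\sqrt{n\left(\sum x^2\right) - \left(\sum x\right)^2}\,
           \sqrt{n\left(\sum y^2\right) - \left(\sum y\right)^2}} \\
  &= \frac{5(26649.69) - (150.2)(886.5)}
          {\sqrt{5(4523.14) - (150.2)^2}\,\sqrt{5(157271.47) - (886.5)^2}} \\
  &= \frac{96.15}{\sqrt{55.66}\,\sqrt{475.10}} \\
  &= 0.591
\end{aligned}
\]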
With r = 0.591 and a significance level of α = 0.05, is there sufficient
evidence to support a claim that there is a linear correlation between shoe
print length and height?
Example 1 cont.
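One common way to answer the question is a two-tailed t test on r with n - 2 degrees of freedom. The sketch below is an illustration under that assumption and is not necessarily the method used on the slides. The critical value 3.182 (two-tailed, α = 0.05, 3 degrees of freedom) is taken from a standard t table, not from the source.

```python
from math import sqrt

r = 0.591   # linear correlation coefficient from Example 1
n = 5       # number of data pairs
df = n - 2  # degrees of freedom for the test on r

# Test statistic for H0: no linear correlation.
t_stat = r * sqrt(df / (1 - r ** 2))

# Assumption: critical t from a standard t table (two-tailed, alpha = 0.05, df = 3).
T_CRIT = 3.182

# Equivalent critical value expressed on the scale of r.
r_crit = T_CRIT / sqrt(T_CRIT ** 2 + df)

significant = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.2f}, critical t = {T_CRIT}")
print(f"|r| = {abs(r):.3f}, critical r = {r_crit:.3f}")
print("sufficient evidence" if significant else "not sufficient evidence")
```

Under these assumptions |r| = 0.591 is below the critical value of about 0.878, so the five pairs do not provide sufficient evidence of a linear correlation between shoe print length and height.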
Key Concept
Definition
A multiple regression equation expresses a linear relationship between a
response variable y and two or more predictor variables (x_1, x_2, ..., x_k).
The general form of a multiple regression equation obtained from sample
data is:
\[
\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k \tag{3}
\]
where \hat{y} is the predicted value of y.
Notation
Requirement
Example 2
Data Set
Mother Height   Father Height   Daughter Height
63              64              58.6
67              65              64.7
64              67              65.3
60              72              61.0
65              72              65.4
67              72              67.4
59              67              60.9
60              71              63.1
58              66              60.0
72              75              71.1
63              69              62.2
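A least-squares fit with two predictors can be sketched in plain Python by solving the normal equations. The helper names `solve` and `fit_two_predictors` are hypothetical, and the fit uses only the rows listed above, so its coefficients need not match an equation fitted to a larger sample.

```python
mother = [63, 67, 64, 60, 65, 67, 59, 60, 58, 72, 63]
father = [64, 65, 67, 72, 72, 72, 67, 71, 66, 75, 69]
daughter = [58.6, 64.7, 65.3, 61.0, 65.4, 67.4, 60.9, 63.1, 60.0, 71.1, 62.2]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_two_predictors(x1, x2, y):
    """Return [b0, b1, b2] minimizing the sum of squared residuals."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]  # design matrix with intercept
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)


b0, b1, b2 = fit_two_predictors(mother, father, daughter)
residuals = [d - (b0 + b1 * m + b2 * f) for m, f, d in zip(mother, father, daughter)]
print(f"daughter = {b0:.3f} + {b1:.3f} mother + {b2:.3f} father")
```

Solving the normal equations directly is adequate for a small example like this one; statistical software typically uses more numerically stable methods.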
Results
Using technology, we find that the regression equation is:
Daughter = 7.5 + 0.707 Mother + 0.164 Father
R^2 = 67.5%, R^2 (adj) = 63.7%, P = 0.00
Using the notation from the earlier slide, we write the multiple regression
equation as:
\[
\hat{y} = 7.5 + 0.707 x_1 + 0.164 x_2 \tag{4}
\]
where x_1 is the mother's height and x_2 is the father's height.
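Once the coefficients are known, equation (4) can be evaluated for any pair of parent heights. A minimal sketch follows, with a hypothetical function name; the inputs are the first row of the data set (mother 63, father 64, daughter 58.6).

```python
def predict_daughter_height(mother_height, father_height):
    """Evaluate equation (4) with the coefficients reported in the results."""
    return 7.5 + 0.707 * mother_height + 0.164 * father_height


predicted = predict_daughter_height(63, 64)
observed = 58.6
residual = observed - predicted  # observed minus predicted

print(f"predicted = {predicted:.1f}, residual = {residual:.1f}")
# prints: predicted = 62.5, residual = -3.9
```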
Definition
The Adjusted Coefficient of Determination is the multiple coefficient
of determination R^2 modified to account for the number of variables and
the sample size. It is calculated by:
\[
R^2(\text{adj}) = 1 - \frac{(n - 1)(1 - R^2)}{n - (k + 1)} \tag{5}
\]
where n is the sample size and k is the number of predictor variables.
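Equation (5) is easy to check numerically. The sketch below uses R^2 = 0.675 and k = 2 from Example 2. The sample size n = 20 is an assumption, not a figure given in the slides; it is used here only because it reproduces the reported adjusted value of 63.7%.

```python
def adjusted_r_squared(r_squared, n, k):
    """Equation (5): adjust R^2 for sample size n and number of predictors k."""
    return 1 - (n - 1) * (1 - r_squared) / (n - (k + 1))


# R^2 and k come from Example 2; n = 20 is an assumed sample size.
adj = adjusted_r_squared(r_squared=0.675, n=20, k=2)
print(f"adjusted R^2 = {adj:.1%}")
```

Because the adjustment penalizes additional predictors, the adjusted value is always at most R^2, which makes it the better figure for comparing regression equations with different numbers of predictor variables.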