REGRESSION ANALYSIS
Chapter 11
Fandy Valentino
Statistika Industri II
Program Studi Teknik Industri
Institut Teknologi Sumatera
Introduction
Variable Dependency
x (independent/predictor variable) influences y (dependent/response variable).
Relationship: y = α + βx
α: intercept
β: slope
Main requirements:
➢ Variable type
➢ Direction of dependency
Exercise
Given the six data points (x, y) = (−1, −3), (0, −1), (1, 0), (2, 4), (3, 3), (4, 8),
compare several candidate prediction lines by their sum of squared errors.
4c. Prediction ŷ = 2x

xᵢ    yᵢ    ŷᵢ = 2xᵢ    eᵢ = yᵢ − ŷᵢ    eᵢ²
−1    −3    −2          −1              1
0     −1    0           −1              1
1     0     2           −2              4
2     4     4           0               0
3     3     6           −3              9
4     8     8           0               0
Sum of squared errors: 15
4d. Prediction ŷ = (3/2)x

xᵢ    yᵢ    ŷᵢ = (3/2)xᵢ    eᵢ = yᵢ − ŷᵢ    eᵢ²
−1    −3    −1.5            −1.5            2.25
0     −1    0               −1              1
1     0     1.5             −1.5            2.25
2     4     3               1               1
3     3     4.5             −1.5            2.25
4     8     6               2               4
Sum of squared errors: 12.75
4e. Prediction ŷ = 2x − 1

xᵢ    yᵢ    ŷᵢ = 2xᵢ − 1    eᵢ = yᵢ − ŷᵢ    eᵢ²
−1    −3    −3              0               0
0     −1    −1              0               0
1     0     1               −1              1
2     4     3               1               1
3     3     5               −2              4
4     8     7               1               1
Sum of squared errors: 7
4f. Prediction ŷ = (3/2)x − 1

xᵢ    yᵢ    ŷᵢ = (3/2)xᵢ − 1    eᵢ = yᵢ − ŷᵢ    eᵢ²
−1    −3    −2.5                −0.5            0.25
0     −1    −1                  0               0
1     0     0.5                 −0.5            0.25
2     4     2                   2               4
3     3     3.5                 −0.5            0.25
4     8     5                   3               9
Sum of squared errors: 13.75
4g. Conclusion

Prediction line     Sum of squared errors
ŷ = 2x              15
ŷ = (3/2)x          12.75
ŷ = 2x − 1          7
ŷ = (3/2)x − 1      13.75

Among these candidates, ŷ = 2x − 1 gives the smallest sum of squared errors, so it provides the best fit in the least squares sense.
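The comparison in 4g can be reproduced in a few lines of Python; the sketch below (assuming NumPy is installed, with the candidate lines written as plain lambdas) recomputes each sum of squared errors.

```python
import numpy as np

# The six (x, y) pairs from the exercise tables
x = np.array([-1, 0, 1, 2, 3, 4], dtype=float)
y = np.array([-3, -1, 0, 4, 3, 8], dtype=float)

# Candidate prediction lines from parts 4c-4f
candidates = {
    "y = 2x":         lambda x: 2.0 * x,
    "y = (3/2)x":     lambda x: 1.5 * x,
    "y = 2x - 1":     lambda x: 2.0 * x - 1,
    "y = (3/2)x - 1": lambda x: 1.5 * x - 1,
}

for label, line in candidates.items():
    sse = np.sum((y - line(x)) ** 2)  # sum of squared errors
    print(f"{label:<15} SSE = {sse}")
# Prints 15.0, 12.75, 7.0, 13.75 -- so y = 2x - 1 fits best
```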
Inference

Prediction 1: ŷ = a₁ + b₁x
Prediction 2: ŷ = a₂ + b₂x
Different samples give different fitted lines, so we move from sampling to inference:
Population: y = α + βx
Sample: ŷ = α̂ + β̂x = a + bx
11.1 The Method of Least Squares
We introduce the ideas of regression analysis in the simple setting where the
distribution of a random variable Y depends on the value x of one other variable.
where:
x = independent variable, also called the predictor or input variable;
y = dependent variable, also called the response variable.
Regression of Y on x: The relationship between x and the mean E [Y | x ] of the
corresponding distribution of Y.
Linear regression curve of Y on x
➢ That is, for any given x, the mean of the distribution of the Y’s is given by α + βx. In
general, Y will differ from this mean, and we shall denote this difference by ε, writing
Y = α + βx + ε
➢ ε is a random variable. In this model we can choose α so that the mean of the
distribution of this random variable is equal to zero.
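To make the model concrete, one can simulate it; the short Python sketch below (the values of α, β, and the noise scale are illustrative, not from the text) draws Y = α + βx + ε with ε centered at zero.

```python
import numpy as np

rng = np.random.default_rng(0)

alpha, beta = 1.0, 0.5              # illustrative intercept and slope
x = np.linspace(0, 10, 50)
eps = rng.normal(0.0, 1.0, x.size)  # random error with mean zero
Y = alpha + beta * x + eps          # Y = alpha + beta*x + eps

print(Y[:5])
```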
11.1 The Method of Least Squares (Cont)
How do we estimate the parameters α and β of the regression line from the
observed data in a manner that provides the best fit to the data?
There are n paired observations (xᵢ, yᵢ) for which it is reasonable to assume
that the regression of Y on x is linear. We want to determine the line (that is,
the equation of the line) which in some sense provides the best fit. If we
predict y by means of the equation
ŷ = a + bx
where a and b are constants, then eᵢ, the error in predicting the value of y
corresponding to the given xᵢ, is
eᵢ = yᵢ − ŷᵢ
11.1 The Method of Least Squares (Cont)
The least squares criterion: choose a and b so that the sum of the squared errors,
Σᵢ eᵢ² = Σᵢ (yᵢ − a − bxᵢ)²,
is a minimum.
11.1 The Method of Least Squares (Cont)
Before minimizing the sum of squared deviations to obtain the least squares
estimators, it is convenient to introduce some notation for the sums of squares and
sums of cross-products:
Sxx = Σ(xᵢ − x̄)² = Σxᵢ² − (Σxᵢ)²/n
Syy = Σ(yᵢ − ȳ)² = Σyᵢ² − (Σyᵢ)²/n
Sxy = Σ(xᵢ − x̄)(yᵢ − ȳ) = Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)/n
The first expressions are preferred on conceptual grounds because they highlight
deviations from the means, and on computing grounds because they are less
susceptible to roundoff error. The second expressions are suited to handheld calculators.
Least squares estimates: b = Sxy/Sxx and a = ȳ − b·x̄.
11.1 The Method of Least Squares (Cont)
The individual deviations of the observations yᵢ from their fitted values
ŷᵢ = α̂ + β̂xᵢ are called the residuals.
The minimum value of the sum of squares is called the residual sum of
squares or error sum of squares.
11.1 The Method of Least Squares (Cont)
EXAMPLE 1 (Solution)
The structure of the table guides the calculations.
11.1 The Method of Least Squares (Cont)
EXAMPLE 2 data and working columns:

x     y      x²       y²       xy
20    0.18   400      0.0324   3.6
60    0.37   3600     0.1369   22.2
100   0.35   10000    0.1225   35.0
140   0.78   19600    0.6084   109.2
180   0.56   32400    0.3136   100.8
220   0.75   48400    0.5625   165.0
260   1.18   67600    1.3924   306.8
300   1.36   90000    1.8496   408.0
340   1.17   115600   1.3689   397.8
380   1.65   144400   2.7225   627.0
Σ     2000   8.35     532000   9.1097   2175.40
11.1 The Method of Least Squares (Cont)
EXAMPLE 2 (Solution)
1) For these n = 10 pairs (xᵢ, yᵢ) we first calculate the column sums shown above.
2) Then we obtain
Sxx = 532000 − 2000²/10 = 132000
Sxy = 2175.40 − (2000 × 8.35)/10 = 505.40
Syy = 9.1097 − 8.35²/10 = 2.13745
3) Consequently, the estimate of the slope is
b = Sxy/Sxx = 505.40/132000 ≈ 0.00383
4) and then the estimate of the intercept becomes
a = ȳ − b·x̄ = 0.835 − 0.00383 × 200 ≈ 0.069
5) The equation of the straight line that best fits the given data in the sense of
least squares is
ŷ = 0.069 + 0.00383x
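As a cross-check, a minimal Python sketch (NumPy assumed) reproduces the sums-of-squares route of Example 2:

```python
import numpy as np

# The ten (x, y) pairs from the Example 2 table
x = np.array([20, 60, 100, 140, 180, 220, 260, 300, 340, 380], dtype=float)
y = np.array([0.18, 0.37, 0.35, 0.78, 0.56, 0.75,
              1.18, 1.36, 1.17, 1.65])

n = len(x)
Sxx = np.sum(x * x) - np.sum(x) ** 2 / n         # 132000.0
Sxy = np.sum(x * y) - np.sum(x) * np.sum(y) / n  # 505.40
b = Sxy / Sxx                                    # ~0.00383
a = y.mean() - b * x.mean()                      # ~0.069
print(f"y-hat = {a:.3f} + {b:.5f} x")
```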
11.1 The Method of Least Squares (Cont)
An equivalent route to the same estimates is to solve the normal equations
Σy = n·a + b·Σx
Σxy = a·Σx + b·Σx²
This set of two linear equations in the unknowns a and b gives the same values
of α̂ and β̂ for the line which provides the best fit to a given set of paired
data in accordance with the criterion of least squares.
11.1 The Method of Least Squares (Cont)
EXAMPLE 4 (Solution)
Using the calculations in Example 2, the normal equations are
10a + 2000b = 8.35
2000a + 532000b = 2175.40
Solving these reproduces a ≈ 0.069 and b ≈ 0.00383.
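The same estimates fall out of solving the normal equations numerically; a sketch (NumPy assumed):

```python
import numpy as np

# Normal equations from Example 4:
#   10 a +   2000 b = 8.35
# 2000 a + 532000 b = 2175.40
A = np.array([[10.0, 2000.0],
              [2000.0, 532000.0]])
rhs = np.array([8.35, 2175.40])
a, b = np.linalg.solve(A, rhs)
print(a, b)  # ~0.069 and ~0.00383, matching the Sxy/Sxx route
```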
11.2 Inferences Based on the Least Squares Estimators
Inferences
Prediction 1: ŷ = a₁ + b₁x
Prediction 2: ŷ = a₂ + b₂x
Different samples give different fitted lines, so we move from sampling to inference:
Population: y = α + βx
Sample: ŷ = α̂ + β̂x = a + bx
11.2 Inferences Based on the Least Squares Estimators (Cont.)
Note the close relationship between Sxx and Syy and the respective sample
variances of the x's and the y's; in fact
Sxx/(n − 1) = Σᵢ(xᵢ − x̄)²/(n − 1) = sx²
Syy/(n − 1) = Σᵢ(yᵢ − ȳ)²/(n − 1) = sy²
The estimate of σ² is
se² = SSE/(n − 2) = [Syy − (Sxy)²/Sxx]/(n − 2) = Σᵢ(yᵢ − ŷᵢ)²/(n − 2)
Traditionally, se is referred to as the standard error of the estimate. The estimate
se² is the residual sum of squares, or the error sum of squares, divided by n − 2.
11.2 Inferences Based on the Least Squares Estimators (Cont.)
Statistics for inferences about α and β
t = [(a − α)/se] √[ n·Sxx / (Sxx + n·x̄²) ]
t = [(b − β)/se] √Sxx
are random variables having the t distribution with n − 2 degrees of freedom.
11.2 Inferences Based on the Least Squares Estimators (Cont.)
Confidence intervals:
a − t(α/2)·se·√(1/n + x̄²/Sxx) < α < a + t(α/2)·se·√(1/n + x̄²/Sxx)
b − t(α/2)·se/√Sxx < β < b + t(α/2)·se/√Sxx
where t(α/2) has n − 2 degrees of freedom.
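Using the quantities already computed for Example 2, a short sketch (SciPy assumed for the t quantile) evaluates the 95% confidence interval for β:

```python
import numpy as np
from scipy import stats

# Sums of squares from Example 2 (n = 10)
n, Sxx, Sxy, Syy = 10, 132000.0, 505.40, 2.13745

b = Sxy / Sxx                                 # point estimate of beta
se = np.sqrt((Syy - Sxy**2 / Sxx) / (n - 2))  # standard error of the estimate
t = stats.t.ppf(0.975, df=n - 2)              # t(0.025) with n - 2 dof
half = t * se / np.sqrt(Sxx)
print(f"{b - half:.5f} < beta < {b + half:.5f}")
```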
11.2 Inferences Based on the Least Squares Estimators (Cont.)
In connection with tests of hypotheses concerning the regression coefficients
α and β, those concerning β are of special importance because β is the slope
of the regression line; that is, β is the change in the mean of Y corresponding
to a unit increase in x.
If β = 0, the regression line is horizontal and the mean of Y does not depend
linearly on x. For tests of the null hypothesis β = β₀ we use the t statistic
given above, and the criteria are the same as in the table of tests about a mean,
with μ replaced by β.
11.2 Inferences Based on the Least Squares Estimators (Cont.)
Recall from Example 2:
Sxx = 532000 − 2000²/10 = 132000
Sxy = 2175.40 − (2000 × 8.35)/10 = 505.40
Syy = 9.1097 − 8.35²/10 = 2.13745
11.2 Inferences Based on the Least Squares Estimators (Cont.)
EXAMPLE 6 (Solution)
1. Null hypothesis: β = 0; Alternative hypothesis: β ≠ 0
2. Level of significance: α = 0.05
3. Criterion: Reject the null hypothesis if t < −2.306 or t > 2.306, where 2.306
is the value of t(0.025) for 10 − 2 = 8 degrees of freedom.
4. Calculations: Using the quantities obtained in Examples 2 and 5, we get
t = b√Sxx/se = (0.00383 × √132000)/0.1591 ≈ 8.75
5. Decision: Since t ≈ 8.75 exceeds 2.306, the null hypothesis must be rejected;
the slope differs significantly from zero.
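The arithmetic in step 4 can be verified with the same quantities (a sketch, NumPy assumed):

```python
import numpy as np

# Test H0: beta = 0 against beta != 0, using Example 2's sums
n, Sxx, Sxy, Syy = 10, 132000.0, 505.40, 2.13745

b = Sxy / Sxx
se = np.sqrt((Syy - Sxy**2 / Sxx) / (n - 2))
t = (b - 0) / (se / np.sqrt(Sxx))
print(round(t, 2))  # ~8.75, far beyond the critical value 2.306
```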
11.2 Inferences Based on the Least Squares Estimators (Cont.)
If x is held fixed at x₀, the quantity we want to estimate is α + βx₀, and it would
seem reasonable to use α̂ + β̂x₀, where α̂ and β̂ are again the values obtained by
the method of least squares. In fact, it can be shown that this estimator is
unbiased and has variance
σ²[1/n + (x₀ − x̄)²/Sxx]
11.2 Inferences Based on the Least Squares Estimators (Cont.)
EXAMPLE 7 (Solution)
(a) The scatter plot in Figure 11.7 suggests fitting a straight-line model. Using
computer software, we obtain the least squares estimates.
11.2 Inferences Based on the Least Squares Estimators (Cont.)
EXAMPLE 7 (Solution cont)
(b) The estimated regression line is given in the software output above.
➢ We are 95% confident that the mean strength is between 9.79 and 10.09 kN
for all alloy sheets that could undergo a prestrain of 9 percent.
11.3 Curvilinear Regression
For a polynomial fit ŷ = b₀ + b₁x + b₂x² + ⋯ + b_p x^p, the least squares
normal equations take the form
Σy = b₀·n + b₁·Σx + ⋯ + b_p·Σx^p
Σxy = b₀·Σx + b₁·Σx² + ⋯ + b_p·Σx^(p+1)
⋯
Σx^p y = b₀·Σx^p + b₁·Σx^(p+1) + ⋯ + b_p·Σx^(2p)
where the subscripts and limits of summation are omitted for simplicity.
Note that this is a system of p + 1 linear equations in the p + 1 unknowns
b₀, b₁, b₂, …, b_p. If the x's include p + 1 distinct values, then the normal
equations have a unique solution.
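For a concrete feel, np.polyfit solves exactly this kind of normal-equation system; in the sketch below the data are hypothetical, chosen only to show a curve with one relative minimum.

```python
import numpy as np

# Hypothetical data with a single dip, for illustration only
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([4.1, 2.0, 1.1, 1.3, 2.3, 4.6])

# Degree-2 least squares fit; coefficients come back highest power first
b2, b1, b0 = np.polyfit(x, y, deg=2)
print(f"y-hat = {b0:.3f} + {b1:.3f} x + {b2:.3f} x^2")
```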
EXAMPLE 11 (Solution)
a) As can be seen from Figure 11.12, the overall pattern suggests fitting a
second-degree polynomial having one relative minimum.
b) Normal-equation method of least squares:
EXAMPLE 11 (Solution cont)
b) Alternatively, the summations required for substitution into the normal
equations are
Σx = 36, Σx² = 204, Σx³ = 1296, Σx⁴ = 8772
Thus we have to solve a system of three linear equations in the
unknowns b₀, b₁, and b₂.
11.4 Multiple Regression
As in the case of one independent variable, we shall first treat the problem
where the regression equation is linear, namely, where for any given set of
values x₁, x₂, …, x_r of the r independent variables, the mean of the
distribution of Y is given by
β₀ + β₁x₁ + β₂x₂ + ⋯ + β_r x_r
11.4 Multiple Regression (cont)
For r = 2 the least squares normal equations are
Σy = n·b₀ + b₁·Σx₁ + b₂·Σx₂
Σx₁y = b₀·Σx₁ + b₁·Σx₁² + b₂·Σx₁x₂
Σx₂y = b₀·Σx₂ + b₁·Σx₁x₂ + b₂·Σx₂²
As before, we write the least squares estimates of β₀, β₁, and β₂ as b₀, b₁,
and b₂. Note that in the abbreviated notation Σx₁ stands for Σᵢ xᵢ₁, Σx₁x₂
stands for Σᵢ xᵢ₁xᵢ₂, Σx₁y stands for Σᵢ xᵢ₁yᵢ, and so forth.
11.4 Multiple Regression (cont)
EXAMPLE 12 data and working columns:

y    x₁   x₂   x₁y   x₂y   x₁²   x₁x₂   x₂²
41 1 5 41 205 1 5 25
49 2 5 98 245 4 10 25
69 3 5 207 345 9 15 25
65 4 5 260 325 16 20 25
40 1 10 40 400 1 10 100
50 2 10 100 500 4 20 100
58 3 10 174 580 9 30 100
57 4 10 228 570 16 40 100
31 1 15 31 465 1 15 225
36 2 15 72 540 4 30 225
44 3 15 132 660 9 45 225
57 4 15 228 855 16 60 225
19 1 20 19 380 1 20 400
31 2 20 62 620 4 40 400
33 3 20 99 660 9 60 400
43 4 20 172 860 16 80 400
Σ    723  40   200   1963  8210  120   500    3000
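Substituting the column sums into the three normal equations above gives a 3 × 3 linear system; a sketch of solving it numerically (NumPy assumed):

```python
import numpy as np

# Normal equations built from the Example 12 column sums
XtX = np.array([[16.0,   40.0,  200.0],
                [40.0,  120.0,  500.0],
                [200.0, 500.0, 3000.0]])
Xty = np.array([723.0, 1963.0, 8210.0])

b0, b1, b2 = np.linalg.solve(XtX, Xty)
print(b0, b1, b2)  # ~46.44, 7.775, -1.655
```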
11.4 Multiple Regression (cont)
EXAMPLE 13 (Solution)
We use software to produce the statistical analysis.
The value β̂₁ = 2.719 tells us that if x₁ is increased by one unit while x₂ is held
constant, the estimated mean assessment time will increase by 2.719 hours.
This change in x₁ corresponds to changing from a simple to a difficult problem.
Similarly, β̂₂ = −0.3641 implies that if x₂ is increased by one unit while x₁ is held
constant, the estimated mean assessment time decreases by 0.3641 hours.
11.6 Correlation
There are problems where the x’s as well as the y’s are values assumed by
random variables.
This would be the case, for instance, if we studied the relationship between:
➢input and output of a wastewater treatment plant,
➢the tensile strength and the hardness of aluminum,
➢impurities in the air and the incidence of a certain disease.
Problems like these are referred to as problems of correlation analysis, where
it is assumed that the data points (𝑥𝑖 , 𝑦𝑖 ) for i = 1, 2, . . . , n are values of a
pair of random variables whose joint density is given by 𝑓(𝑥, 𝑦)
11.6 Correlation (cont)
The scatter plot provides a visual impression of the relation between the x
and y values in a bivariate data set. The best interpretation of the sample
correlation coefficient is in terms of the standardized observations
(xᵢ − x̄)/sx and (yᵢ − ȳ)/sy, namely
r = [1/(n − 1)] Σᵢ [(xᵢ − x̄)/sx] × [(yᵢ − ȳ)/sy]
If one component of a pair tends to be large when the other is large, the
products tend to be positive and r is positive.
Alternatively, if one component of the pair tends to be large when the other is small,
and vice versa, the correlation coefficient r is negative. This case corresponds to a
northwest to southeast pattern in the scatter plot. It can be shown that the value of r
is always between −1 and 1, inclusive.
1. The magnitude of r describes the strength of a linear relation and its sign
indicates the direction.
➢ r = +1 if all pairs (xᵢ, yᵢ) lie exactly on a straight line having a positive slope.
➢ r > 0 if the pattern in the scatter plot runs from lower left to upper right.
➢ r < 0 if the pattern in the scatter plot runs from upper left to lower right.
➢ r = −1 if all pairs (𝑥𝑖 , 𝑦𝑖 ) lie exactly on a straight line having a negative slope.
➢ A value of r near −1 or +1 describes a strong linear relation.
11.6 Correlation (cont)
2. A value of r close to zero implies that the linear association is weak. There may still
be a strong association along a curve.
From the definitions of 𝑆𝑥𝑥 , 𝑆𝑥𝑦 and 𝑆𝑦𝑦 , we obtain a simpler calculation formula for r.
r = Sxy / √(Sxx × Syy)
11.6 Correlation (cont)
EXAMPLE 14 Calculating the sample correlation coefficient
The data in the table are the numbers of minutes it took 10 mechanics to assemble a
piece of machinery in the morning, 𝑥, and in the late afternoon, 𝑦. Calculate 𝑟.
x: 11.1  10.3  12.0  15.1  13.7  18.5  17.3  14.2  14.8  15.3   (Σx = 142.3)
y: 10.9  14.2  13.8  21.5  13.2  21.1  16.4  19.3  17.4  19.0   (Σy = 166.8)
11.6 Correlation (cont)
EXAMPLE 14 (Solution)
Determine the summations needed for the formulas:

x      y      x²      xy       y²
11.1   10.9   123.2   120.99   118.8
10.3   14.2   106.1   146.26   201.6
12.0   13.8   144.0   165.6    190.4
15.1   21.5   228.0   324.65   462.3
13.7   13.2   187.7   180.84   174.2
18.5   21.1   342.3   390.35   445.2
17.3   16.4   299.3   283.72   269.0
14.2   19.3   201.6   274.06   372.5
14.8   17.4   219.0   257.52   302.8
15.3   19.0   234.1   290.7    361.0
Σ      142.3  166.8   2085.3   2434.7   2897.8

We get
Sxx = 2085.3 − 142.3²/10 = 60.381
Sxy = 2434.7 − (142.3 × 166.8)/10 = 61.126
Syy = 2897.8 − 166.8²/10 = 115.576
So
r = Sxy/√(Sxx × Syy) = 61.126/√(60.381 × 115.576) = 0.732
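np.corrcoef reproduces the hand computation; a sketch with the assembly-time data:

```python
import numpy as np

x = np.array([11.1, 10.3, 12.0, 15.1, 13.7, 18.5, 17.3, 14.2, 14.8, 15.3])
y = np.array([10.9, 14.2, 13.8, 21.5, 13.2, 21.1, 16.4, 19.3, 17.4, 19.0])

r = np.corrcoef(x, y)[0, 1]  # equivalent to Sxy / sqrt(Sxx * Syy)
print(round(r, 3))           # 0.732
```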
11.6 Correlation (cont)
EXAMPLE 16 Calculating the proportion of y variation attributed to the linear relation
Refer to Example 14 concerning the data on assembly times (the table and sums
of squares are repeated there). Find the proportion of variation in y, the afternoon
assembly times, that can be explained by a straight-line fit to x, the morning
assembly times.
11.6 Correlation (cont)
EXAMPLE 16 (Solution)
In the earlier example, we obtained r = 0.732. Consequently, the proportion
of variation in y attributed to x is r² = 0.732² = 0.536.
This implies that r² = 53.6% of the variation among the afternoon times is
explained by (is accounted for, or may be attributed to) the corresponding
differences among the morning times.
11.6 Correlation (cont)
EXAMPLE 17 (Solution)
1. Null hypothesis: ρ = 0; Alternative hypothesis: ρ ≠ 0
2. Level of significance: α = 0.05
3. Criterion: Reject the null hypothesis if Z < −1.96 or Z > 1.96, where Z = Ƶ√(n − 3)
and Ƶ is the Fisher transform of r.
4. Calculations: The value of Ƶ corresponding to r = 0.732 is
Ƶ = ½ ln[(1 + r)/(1 − r)] = ½ ln[(1 + 0.732)/(1 − 0.732)] = 0.933
so that
Z = Ƶ√(n − 3) = 0.933 × √(10 − 3) = 2.47
5. Decision: Since Z = 2.47 exceeds 1.96, the null hypothesis must be rejected; we
conclude that there is a relationship between the morning and late afternoon times it
takes a mechanic to assemble the given kind of machinery.
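The Fisher-transform arithmetic of Example 17 is short enough to script (a sketch, NumPy assumed):

```python
import numpy as np

n, r = 10, 0.732
z_fisher = 0.5 * np.log((1 + r) / (1 - r))  # Fisher's transform, ~0.933
Z = z_fisher * np.sqrt(n - 3)               # test statistic, ~2.47
print(round(z_fisher, 3), round(Z, 2))      # |Z| > 1.96, so reject H0: rho = 0
```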
11.7 Multiple Linear Regression (Matrix Notation)
Suppose we have triples (xᵢ₁, xᵢ₂, yᵢ) as in the table

xᵢ₁:  x₁₁  x₂₁  …  xₙ₁
xᵢ₂:  x₁₂  x₂₂  …  xₙ₂
yᵢ:   y₁   y₂   …  yₙ

We want to find values (b₀, b₁, b₂) such that, as nearly as possible,
b₀ + b₁x₁₁ + b₂x₁₂ = y₁
b₀ + b₁x₂₁ + b₂x₂₂ = y₂
…
b₀ + b₁xₙ₁ + b₂xₙ₂ = yₙ
In matrix notation, these n equations read y = Xb, where X is the n × 3 matrix
whose i-th row is (1, xᵢ₁, xᵢ₂).
11.7 Multiple Linear Regression (Matrix Notation) [cont]
Solution
Here k = 1 and, dropping the subscript 1, we have
Finally,
se² = [1/(n − k − 1)] (y − ŷ)′(y − ŷ) = [1/(5 − 1 − 1)] [(−1)² + 2² + (−1)² + 0² + 0²] = 6/3 = 2
11.7 Multiple Linear Regression (Matrix Notation) [cont]
EXAMPLE 19 Calculating the least squares estimates using (X′X)⁻¹X′y
With reference to Example 12, use the matrix expressions to determine the least
squares estimates of the multiple regression coefficients (the data and column
sums are those of the Example 12 table).
EXAMPLE 19 (Solution)
Substituting Σx₁ = 40, Σx₂ = 200, Σx₁² = 120, Σx₁x₂ = 500, Σx₂² = 3000,
and n = 16 into the expression for X′X above, we get

X′X =
[  16    40    200 ]
[  40   120    500 ]
[ 200   500   3000 ]

Then the inverse of this matrix can be obtained by any one of a number of
different techniques; using the one based on cofactors, we find that

(X′X)⁻¹ = (1/160000) ×
[ 110000   −20000   −4000 ]
[ −20000     8000       0 ]
[  −4000        0     320 ]

Finally, with X′y = (723, 1963, 8210)′,
b = (X′X)⁻¹X′y ≈ (46.44, 7.775, −1.655)′
so the fitted equation is ŷ = 46.44 + 7.775x₁ − 1.655x₂.
11.7 Multiple Linear Regression (Matrix Notation) [cont]
The residual sum of squares also has a convenient matrix expression. The predicted
values ŷᵢ = β̂₀ + β̂₁xᵢ₁ + β̂₂xᵢ₂ can be collected as a vector ŷ = Xβ̂.
Then β̂ = (X′X)⁻¹X′y
and se² = [1/(n − k − 1)] (y − Xβ̂)′(y − Xβ̂)
Generally, the error sum of squares, SSE, has degrees of freedom
ν = n − (number of β's in the model) = n − (k + 1)
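A minimal sketch of this matrix route, using the Example 12 data to form X and then computing β̂ and se² exactly as in the formulas above (NumPy assumed):

```python
import numpy as np

# Example 12 data: x1 cycles 1..4 within each block, x2 steps 5,10,15,20
x1 = np.array([1, 2, 3, 4] * 4, dtype=float)
x2 = np.repeat([5.0, 10.0, 15.0, 20.0], 4)
y = np.array([41, 49, 69, 65, 40, 50, 58, 57,
              31, 36, 44, 57, 19, 31, 33, 43], dtype=float)

# Design matrix: a column of ones, then the predictors
X = np.column_stack([np.ones_like(x1), x1, x2])

beta = np.linalg.solve(X.T @ X, X.T @ y)  # (X'X)^{-1} X'y
resid = y - X @ beta
k = 2                                     # number of predictors
se2 = resid @ resid / (len(y) - k - 1)    # s_e^2 on n - k - 1 dof
print(beta, se2)
```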
END