REGRESSION
1. Scatter Plot
A scatter plot is a graph commonly
used to look for a pattern in the
relationship between 2 variables. To use
a scatter plot, the data must be measured
on an interval or ratio scale.
A scatter plot only gives a visual impression
of the relationship (it does not by itself
measure the direction or strength of the relationship).
[Figure: various examples of scatter plots]
3. Correlation
• An analysis tool used to determine the direction and strength of the relationship between two variables
• Values range from -1 to +1
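As a minimal sketch of how such a coefficient can be computed, the Python snippet below implements the standard Pearson product-moment correlation. The function name `pearson_r` and the sample data are illustrative assumptions and do not come from the source.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]                 # rises in step with hours
r_pos = pearson_r(hours, score)          # perfect positive relationship
r_neg = pearson_r(hours, score[::-1])    # perfect negative relationship
print(r_pos, r_neg)
```

The sign of the result gives the direction of the relationship and its magnitude gives the strength, which is what the scatter plot alone cannot quantify.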
Simple Correlation Formula
Where:
R = value of the multiple correlation coefficient
k = number of independent variables
n = sample size
F = computed F, which is then compared with the F-table value
Significance test rule:
If computed F > table F: significant
If computed F < table F: not significant
Find the table F value from the F table at significance level α = 0.01 or α = 0.05
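The symbols defined above (R, k, n, F) match the standard F statistic for testing a multiple correlation, F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below assumes that this is the intended formula; the function name `f_from_r2` is hypothetical, and the input values are taken from the pie-sales regression output later in this document.

```python
def f_from_r2(r2, k, n):
    """F statistic for a multiple correlation: (R^2/k) / ((1-R^2)/(n-k-1))."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# R Square = 0.52148, k = 2 independent variables, n = 15 observations
f_stat = f_from_r2(r2=0.52148, k=2, n=15)
print(round(f_stat, 2))
```

The result agrees, up to rounding, with the F value reported in the ANOVA table of that output. The table F value would be looked up with k and n - k - 1 degrees of freedom.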
4. REGRESSION
Simple Regression
Ŷ = a + bX
Where:
Ŷ = Response (dependent variable)
a = Constant
b = Regression coefficient of the independent variable
X = Predictor (independent variable)
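One common way to obtain a and b is ordinary least squares. The sketch below assumes that method (the source does not state how the coefficients are estimated) and uses hypothetical data; `fit_simple_regression` is an illustrative name.

```python
def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line Y-hat = a + b*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # regression coefficient (slope)
    a = mean_y - b * mean_x    # constant (intercept)
    return a, b

# Hypothetical data lying exactly on the line y = 1 + 2x
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
a, b = fit_simple_regression(x, y)
print(a, b)
```

Because the illustrative points lie exactly on a line, the fit recovers the constant 1 and the slope 2.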
The Multiple Regression Model
Idea: Examine the linear relationship between
1 dependent (y) & 2 or more independent variables (xi)
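Population model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$
where $\beta_0$ is the Y-intercept, $\beta_1, \ldots, \beta_k$ are the population slopes, and $\varepsilon$ is the random error.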
Estimated multiple regression model:
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$$
where $\hat{y}$ is the estimated (or predicted) value of y, $b_0$ is the estimated intercept, and $b_1, \ldots, b_k$ are the estimated slope coefficients.
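Multiple Regression Model
Two variable model
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$
[Figure: regression plane over the $x_1$ and $x_2$ axes with y on the vertical axis; labels mark the slope for variable $x_1$ and the slope for variable $x_2$]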
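Multiple Regression Model
Two variable model
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$
[Figure: a sample observation $y_i$ at $(x_{1i}, x_{2i})$, its fitted value $\hat{y}_i$ on the regression plane, and the residual $e = (y - \hat{y})$]
The best fit equation, $\hat{y}$, is found by minimizing the sum of squared errors, $\sum e^2$.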
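Minimizing the sum of squared errors leads to the normal equations (X'X)b = X'y. The following is a minimal pure-Python sketch of that approach, assuming a small, well-conditioned data set; the function names and the data are hypothetical, and a numerical library would normally be used in practice.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so the inputs are not modified
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (perfectly collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_multiple_regression(xs, y):
    """Least-squares [b0, b1, ..., bk]; xs is a list of rows [x1, ..., xk]."""
    design = [[1.0] + list(row) for row in xs]   # prepend the intercept column
    p = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2
xs = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
b0, b1, b2 = fit_multiple_regression(xs, y)
print(b0, b1, b2)
```

With noise-free data the residuals are all zero, so the fit recovers the generating coefficients up to floating-point error.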
Interpretation of Estimated Coefficients
Slope (bi)
Estimates that the average value of y changes by bi units for
each 1 unit increase in Xi holding all other variables
constant
Example: if b1 = -20, then sales (y) is expected to decrease
by an estimated 20 pies per week for each $1 increase in
selling price (x1), net of the effects of changes due to
advertising (x2)
y-intercept (b0)
The estimated average value of y when all xi = 0 (assuming
all xi = 0 is within the range of observed values)
Multiple Regression Output
Sales = 306.526 - 24.975(Price) + 74.131(Advertising)

Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

ANOVA        df    SS           MS           F         Significance F
Regression   2     29460.027    14730.013    6.53861   0.01201
Residual     12    27033.306    2252.776
Total        14    56493.333
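To illustrate how the estimated equation is used, the sketch below evaluates it for one combination of inputs. The coefficients come from the output above; the input values (price 5.50, advertising 3.5) and the function name `predict_sales` are hypothetical, and the source does not state the units of either variable.

```python
def predict_sales(price, advertising):
    """Predicted sales from the estimated equation in the regression output."""
    return 306.526 - 24.975 * price + 74.131 * advertising

# Hypothetical inputs, for illustration only
predicted = predict_sales(price=5.50, advertising=3.5)
print(round(predicted, 3))

# Raising price by 1 unit with advertising held fixed changes the
# prediction by the price coefficient, -24.975
change = predict_sales(6.50, 3.5) - predict_sales(5.50, 3.5)
print(round(change, 3))
```

The second calculation mirrors the interpretation of a slope: the expected change in y for a 1 unit increase in one variable, holding the other constant.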
Multiple Coefficient of Determination (continued)
$$R^2 = \frac{SSR}{SST} = \frac{29460.0}{56493.3} = 0.52148$$
52.1% of the variation in pie sales is explained by the variation in price and advertising.
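The R Square and Multiple R figures can be checked directly from the sums of squares in the ANOVA table. This is a small arithmetic sketch; the variable names are illustrative.

```python
ssr = 29460.027   # regression sum of squares (ANOVA table)
sst = 56493.333   # total sum of squares (ANOVA table)

r_squared = ssr / sst           # share of variation explained
multiple_r = r_squared ** 0.5   # Multiple R is the square root of R Square

print(round(r_squared, 5), round(multiple_r, 5))
```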
Adjusted R² (continued)
Multiple Coefficient of Determination (continued)
$$R_A^2 = 0.44172$$
44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables.
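The adjusted value can be reproduced with the usual textbook formula, R²_A = 1 - (1 - R²)(n - 1)/(n - k - 1). The formula itself does not survive in the source, so treat it as an assumption; it does reproduce the reported figure from the reported R Square, n and k.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# R Square = 0.52148, n = 15 observations, k = 2 independent variables
adj = adjusted_r2(0.52148, n=15, k=2)
print(round(adj, 4))
```

The penalty factor (n - 1)/(n - k - 1) grows as more independent variables are added, which is why the adjusted value is lower than R Square here.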
F-Test for Overall Significance (continued)
Test statistic:
$$F = \frac{SSR / k}{SSE / (n - k - 1)} = \frac{MSR}{MSE}$$
where F has (numerator) $D_1 = k$ and (denominator) $D_2 = (n - k - 1)$ degrees of freedom
F-Test for Overall Significance (continued)
$$F = \frac{MSR}{MSE} = \frac{14730.0}{2252.8} = 6.5386$$
With 2 and 12 degrees of freedom, the P-value for the F-Test (Significance F) is 0.01201.
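The sketch below recomputes F from the ANOVA sums of squares and applies the decision rule from the correlation section (computed F versus table F). The critical values for 2 and 12 degrees of freedom are copied from a standard F table and are an assumption, since the source does not list them.

```python
ssr, sse = 29460.027, 27033.306   # sums of squares from the ANOVA table
n, k = 15, 2                      # observations, independent variables

msr = ssr / k               # mean square regression
mse = sse / (n - k - 1)     # mean square error
f_stat = msr / mse

# Critical values for (2, 12) degrees of freedom, from a standard F table
# (assumed here; not given in the source)
f_table = {0.05: 3.89, 0.01: 6.93}

for alpha, f_crit in f_table.items():
    verdict = "significant" if f_stat > f_crit else "not significant"
    print(f"alpha={alpha}: F={f_stat:.4f} vs table F={f_crit} -> {verdict}")
```

Under these table values the model is significant at α = 0.05 but not at α = 0.01, which is consistent with the reported Significance F of 0.01201.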
Test Statistic:
$$t = \frac{b_i - 0}{s_{b_i}} \qquad (df = n - k - 1)$$
Are Individual Variables Significant? (continued)
t-value for Price is t = -2.306, with p-value .0398
t-value for Advertising is t = 2.855, with p-value .0145
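A simple way to read these results is to compare each p-value with the chosen significance level. The sketch below does that for the two levels mentioned earlier in this document (α = 0.05 and α = 0.01); the helper name `significant_predictors` is hypothetical.

```python
# p-values reported for the individual t-tests
p_values = {"Price": 0.0398, "Advertising": 0.0145}

def significant_predictors(p_values, alpha):
    """Names of variables whose p-value is below the significance level."""
    return [name for name, p in p_values.items() if p < alpha]

at_05 = significant_predictors(p_values, alpha=0.05)
at_01 = significant_predictors(p_values, alpha=0.01)
print(at_05)
print(at_01)
```

Both variables are individually significant at α = 0.05, and neither is at α = 0.01.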
$$s = \sqrt{\frac{SSE}{n - k - 1}} = \sqrt{MSE}$$
Is this value large or small? It must be compared to the mean size of y.
Standard Deviation of the Regression Model
The standard deviation of the regression model is 47.46 (Standard Error 47.46341 in the regression output).
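The reported standard error can be verified as the square root of the mean square error, using the residual sum of squares from the ANOVA table. A short arithmetic sketch with illustrative variable names:

```python
import math

sse = 27033.306   # residual sum of squares (ANOVA table)
n, k = 15, 2      # observations, independent variables

s = math.sqrt(sse / (n - k - 1))   # square root of MSE
print(round(s, 2))
```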
Multicollinearity
(continued)
Detect Collinearity (Variance Inflationary Factor)
$$VIF_j = \frac{1}{1 - R_j^2}$$
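In the usual definition, R_j² is the coefficient of determination obtained when independent variable j is regressed on all the other independent variables. The sketch below evaluates the formula for a few hypothetical R_j² values; the function name `vif` and the inputs are illustrative only.

```python
def vif(r2_j):
    """Variance inflationary factor for a given R_j^2 (0 <= R_j^2 < 1)."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("R_j^2 must be in [0, 1)")
    return 1.0 / (1.0 - r2_j)

# Hypothetical R_j^2 values, for illustration only
for r2_j in (0.0, 0.5, 0.8, 0.95):
    print(r2_j, round(vif(r2_j), 2))
```

A VIF of 1 means variable j is uncorrelated with the other independent variables, and the factor grows without bound as R_j² approaches 1. Cut-offs such as 5 or 10 are common rules of thumb, not values stated in the source.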