
Correlation and Regression
Agenda

• What is Correlation
• What is Regression
• Meaning of Beta
• Least squares coefficient estimates
• Goodness of fit
• Output interpretation
Why correlation

• What happens to sweater sales as temperature increases?
• Is there an association between them?
• If so, what is the strength of the association?
• Ice-cream sales vs. temperature?
• Is there an association between them?
• If so, what is the strength of the association?

• Which of these two associations is stronger? How can the association be quantified?


Correlation…
• It is a measure of association (linear association only)
• Formula for the correlation coefficient between two random variables X and Y is:
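Assuming the coefficient meant here is the Pearson correlation coefficient (the quantity Excel's CORREL computes), a standard way to write it is sketched below. The first form is the definition for random variables; the second is the sample version. The symbols $\bar{x}$, $\bar{y}$ and $n$ are notation introduced for this sketch, not taken from the slide.

```latex
\rho_{X,Y} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X\,\sigma_Y},
\qquad
r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}
         {\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\;\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}
```

In both forms the value lies between -1 and +1.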
Type of relationship
Range of correlation
Strength of association
Correlation in Excel & R
X    Y
-31  900
-25  625
-24  576
-19  361
-13  169
-6   36
-1   1
3    9
10   100
11   121
14   196
15   225
24   576
24   576
29   841

= CORREL(A2:A16,B2:B16) = -0.11997
Correlation is not causation…
Number of ice creams sold in Chennai vs. the number of people drowning at Marina Beach?
Regression
Why Regression
• Last 20 days of sales data at the KFC shop in EA mall: number of visitors vs. burgers sold.

Day  # of mall visitors  Burgers sold
1    2728                566
2    2098                444
3    2111                454
4    2009                440
5    3635                760
6    4171                881
7    5244                1091
8    3695                783
9    3088                666
10   2674                564
11   3591                750
12   3013                650
13   5045                1054
14   6118                1245
15   2851                616
16   2698                564
17   3015                652
18   3409                704
19   3179                683
20   5510                1125

=CORREL(H2:H21,I2:I21) = 0.998966

[Scatter plot: burgers sold (y-axis, 0 to 1400) against # of mall visitors (x-axis, 1500 to 6500)]

Number of visitors is expected to be 6000 tomorrow. How many burgers will be sold?
If I want to impact burger sales, what should I do?
• Independent variable: # of mall visitors

• Dependent variable: burgers sold
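The "6000 visitors tomorrow" question can be answered by fitting a straight line to the 20 days of data. The following is a minimal sketch in Python (standard library only), assuming a simple linear model with visitors as the independent variable and burgers sold as the dependent variable. The function name `fit_line` and the variable names are illustrative, not from the slides.

```python
# Data from the KFC example: one entry per day, days 1 to 20.
visitors = [2728, 2098, 2111, 2009, 3635, 4171, 5244, 3695, 3088, 2674,
            3591, 3013, 5045, 6118, 2851, 2698, 3015, 3409, 3179, 5510]
burgers = [566, 444, 454, 440, 760, 881, 1091, 783, 666, 564,
           750, 650, 1054, 1245, 616, 564, 652, 704, 683, 1125]

def fit_line(xs, ys):
    """Ordinary least squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(visitors, burgers)

# Point prediction for the expected 6000 visitors.
predicted = intercept + slope * 6000
print(f"intercept = {intercept:.2f}, slope = {slope:.4f}")
print(f"predicted burgers for 6000 visitors = {predicted:.0f}")
```

The slope is the estimated change in burgers sold per additional visitor. 6000 lies inside the observed range of visitors (2009 to 6118), so this is interpolation rather than extrapolation.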


Regression
Regression line
Meaning of Beta (β)
Least squares method
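For reference, the standard least squares estimates for a straight line with one independent variable can be written as follows. This is a sketch under assumed notation: $\beta_0$ for the intercept, $\beta_1$ for the slope and $\varepsilon_i$ for the error term are symbols chosen here, since the slide's own notation is not available.

```latex
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad
(\hat{\beta}_0, \hat{\beta}_1) = \arg\min_{\beta_0,\,\beta_1} \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2

\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\,\bar{x}
```

The fitted line always passes through the point of means $(\bar{x}, \bar{y})$.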
How good is my regression line?
Explained and Unexplained variance
Goodness of fit
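One common measure of goodness of fit is R squared, the share of the total variance in the dependent variable that the line explains. The following is a minimal sketch in Python, assuming that this is the measure the slide refers to. It reuses the KFC data and splits the total sum of squares into explained and unexplained parts; the names `sst`, `ssr`, `sse` and `variance_split` are illustrative.

```python
visitors = [2728, 2098, 2111, 2009, 3635, 4171, 5244, 3695, 3088, 2674,
            3591, 3013, 5045, 6118, 2851, 2698, 3015, 3409, 3179, 5510]
burgers = [566, 444, 454, 440, 760, 881, 1091, 783, 666, 564,
           750, 650, 1054, 1245, 616, 564, 652, 704, 683, 1125]

def variance_split(xs, ys):
    """Return (sst, ssr, sse) for a least squares line fitted to xs, ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in xs]
    sst = sum((y - mean_y) ** 2 for y in ys)             # total
    ssr = sum((f - mean_y) ** 2 for f in fitted)         # explained
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained
    return sst, ssr, sse

sst, ssr, sse = variance_split(visitors, burgers)
r_squared = ssr / sst
print(f"R^2 = {r_squared:.4f}")
```

With a single independent variable, R squared equals the square of the correlation coefficient between the two variables.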
Types of relationship
Standard error of estimate
Standard Deviation of the Regression Slope
Comparing Standard Errors
Significance testing
When can I NOT fit a linear regression line
Assumptions of linear regression line
Linear vs. Non-linear relationships
Y distributed normally
Variance along the fitted line
Multicollinearity

R code…
