Part 6: Correlation
Correlated Variables
Correlation Agenda
* There are several types of color blindness and large variation in the incidence across different demographic
groups. These are broad averages that are roughly in the neighborhood of the true incidence for particular groups.
Dependent Events
Random variables X and Y are dependent if P_XY(X,Y) ≠ P_X(X)P_Y(Y).
                 Color Blind
Gender      No       Yes      Total
Male        .475     .025     0.50
Female      .4975    .0025    0.50
Total       .9725    .0275    1.00

P(Color blind, Male) = .0250
P(Male) = .5000
P(Color blind) = .0275
P(Male)P(Color blind) = .01375, which differs from .0250.
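The dependence check on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the dictionary layout and the helper name `marginal` are assumptions made for the example, and the probabilities are the four cell values from the table.

```python
import math

# Joint probabilities from the table: (gender, color_blind) -> probability
joint = {
    ("male", "no"): 0.475,
    ("male", "yes"): 0.025,
    ("female", "no"): 0.4975,
    ("female", "yes"): 0.0025,
}

def marginal(joint, position, value):
    """Sum the joint probabilities whose key has `value` at `position`."""
    return sum(p for key, p in joint.items() if key[position] == value)

p_male = marginal(joint, 0, "male")      # P(Male)
p_blind = marginal(joint, 1, "yes")      # P(Color blind)
p_joint = joint[("male", "yes")]         # P(Color blind, Male)
product = p_male * p_blind               # value the joint would take under independence

# One cell where the joint differs from the product is enough to show dependence.
dependent = not math.isclose(p_joint, product, abs_tol=1e-9)
print(f"P(Male)={p_male:.4f}  P(Color blind)={p_blind:.4f}")
print(f"joint={p_joint:.5f}  product={product:.5f}  dependent={dependent}")
```

Showing independence would require the product rule to hold in every cell; a single violation is enough to show dependence.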
Equivalent Definition of Independence
Conditional Probability
Prob(A | B) = P(A,B) / P(B)
Prob(Color Blind | Male) = Prob(Color Blind, Male) / P(Male) = .025 / .50 = .05
                 Color Blind
Gender      No       Yes      Total
Male        .475     .025     0.50
Female      .4975    .0025    0.50
Total       .9725    .0275    1.00
Conditional Distributions
Marginal Distribution of Color Blindness
  Color Blind          Not Color Blind
  .0275                .9725

Distribution Among Men (Conditioned on Male)
  Color Blind|Male     Not Color Blind|Male
  .05                  .95

Distribution Among Women (Conditioned on Female)
  Color Blind|Female   Not Color Blind|Female
  .005                 .995

The distributions for the two genders are different. The variables are dependent.
                 Ace
Heart       Yes=1    No=0     Total
Yes=1       1/52     12/52    13/52
No=0        3/52     36/52    39/52
Total       4/52     48/52    52/52

P(Ace|Heart)       = 1/13
P(Ace|Not-Heart)   = 3/39 = 1/13
P(Ace)             = 4/52 = 1/13
P(Heart|Ace)       = 1/4
P(Heart|Not-Ace)   = 12/48 = 1/4
P(Heart)           = 13/52 = 1/4
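The card probabilities can be checked by enumerating a standard 52-card deck. The sketch below is illustrative only: the helper `prob` and the rank and suit labels are names chosen for this example, and exact fractions are used so the comparisons involve no rounding.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

def prob(event, given=lambda card: True):
    """P(event | given), counting equally likely cards in the deck."""
    pool = [card for card in deck if given(card)]
    return Fraction(sum(1 for card in pool if event(card)), len(pool))

def is_ace(card):
    return card[0] == "A"

def is_heart(card):
    return card[1] == "hearts"

def not_heart(card):
    return not is_heart(card)

p_ace = prob(is_ace)
p_ace_given_heart = prob(is_ace, given=is_heart)
p_ace_given_not_heart = prob(is_ace, given=not_heart)
p_heart = prob(is_heart)
p_heart_given_ace = prob(is_heart, given=is_ace)

# Conditioning on the suit leaves the probability of an ace unchanged.
print(p_ace, p_ace_given_heart, p_ace_given_not_heart)
print(p_heart, p_heart_given_ace)
```

Because every conditional probability equals the corresponding marginal, Ace and Heart behave as independent events in this deck.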
Joint Distribution
R = Real estate cases
F = Financial cases
                   Finance
Real Estate    0       1       Total
0              .15     .30     .45
1              .10     .20     .30
2              .05     .20     .25
Total          .30     .70     1.00

Marginal distribution for Financial cases: the Total row (.30, .70).
The probability distribution of Real estate cases (R) given Financial cases (F)
varies with the number of Financial cases (0 or 1).
The probability that (R=2)|F goes up as F increases from 0 to 1.
This means that the variables are not independent.
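A short sketch of this comparison follows. It assumes the joint table is read with P(R=0,F=0)=.15, P(R=1,F=0)=.10, P(R=2,F=0)=.05 and P(R=0,F=1)=.30, P(R=1,F=1)=.20, P(R=2,F=1)=.20, a reading that reproduces the marginal totals .30 and .70 for F. The function name `r_given_f` is hypothetical.

```python
# Joint distribution keyed by (real_estate_cases, financial_cases).
# The cell layout is an assumption about how the slide's table reads.
joint = {
    (0, 0): 0.15, (1, 0): 0.10, (2, 0): 0.05,
    (0, 1): 0.30, (1, 1): 0.20, (2, 1): 0.20,
}

def r_given_f(joint, f):
    """Conditional distribution of R given F = f: joint cell divided by P(F = f)."""
    p_f = sum(p for (_, fj), p in joint.items() if fj == f)
    return {r: p / p_f for (r, fj), p in joint.items() if fj == f}

given_f0 = r_given_f(joint, 0)
given_f1 = r_given_f(joint, 1)

for r in (0, 1, 2):
    print(f"P(R={r}|F=0)={given_f0[r]:.3f}   P(R={r}|F=1)={given_f1[r]:.3f}")
```

If R and F were independent, the two conditional columns would be identical; here P(R=2|F) rises from .05/.30 to .20/.70.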
Conditional Distributions
Overall Distribution
  Color Blind          Not Color Blind
  .0275                .9725

Distribution Among Men (Conditioned on Male)
  Color Blind|Male     Not Color Blind|Male
  .05                  .95

Distribution Among Women (Conditioned on Female)
  Color Blind|Female   Not Color Blind|Female
  .005                 .995

The distribution changes given gender.
Covariation
[Figure: axis label "Financial Cases"]
A = Ace, H = Heart; Yes=1, No=0

                 A
H           Yes=1    No=0     Total
Yes=1       1/52     12/52    13/52
No=0        3/52     36/52    39/52
Total       4/52     48/52    52/52

SUM = 0 !!
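One way to read the zero sum is as the covariance of the indicators A and H, computed as the probability-weighted sum of (a - mean of A)(h - mean of H) over the four cells. That reading is an assumption about what the slide sums; the sketch below carries it out with exact fractions.

```python
from fractions import Fraction

# Joint distribution of the indicators, keyed by (a, h) with 1 = yes, 0 = no.
joint = {
    (1, 1): Fraction(1, 52),   # ace of hearts
    (0, 1): Fraction(12, 52),  # heart, not an ace
    (1, 0): Fraction(3, 52),   # ace, not a heart
    (0, 0): Fraction(36, 52),  # neither
}

mean_a = sum(a * p for (a, _), p in joint.items())   # E[A] = P(Ace)
mean_h = sum(h * p for (_, h), p in joint.items())   # E[H] = P(Heart)

# Covariance: weighted sum of products of deviations from the means.
cov_ah = sum((a - mean_a) * (h - mean_h) * p for (a, h), p in joint.items())

print(mean_a, mean_h, cov_ah)
```

The positive and negative terms cancel exactly, which is what independence of the two indicators implies.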
Covariance(X,Y)
Correlation
μ_R = .8, μ_F = .7
Var(F) = 0²(.3) + 1²(.7) − .7² = .21
Standard deviation = .46
Correlation = .04 / (.46 × .81) = 0.107
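The same figures can be recomputed from the joint distribution of R and F. The sketch assumes the joint table cells are P(R=0,F=0)=.15, P(R=1,F=0)=.10, P(R=2,F=0)=.05, P(R=0,F=1)=.30, P(R=1,F=1)=.20, P(R=2,F=1)=.20; the helper `expect` is a name invented for this example.

```python
import math

# Joint distribution keyed by (r, f); the cell layout is an assumed reading of the table.
joint = {
    (0, 0): 0.15, (1, 0): 0.10, (2, 0): 0.05,
    (0, 1): 0.30, (1, 1): 0.20, (2, 1): 0.20,
}

def expect(g):
    """Expected value of g(r, f) under the joint distribution."""
    return sum(g(r, f) * p for (r, f), p in joint.items())

mean_r = expect(lambda r, f: r)
mean_f = expect(lambda r, f: f)
var_r = expect(lambda r, f: r * r) - mean_r ** 2
var_f = expect(lambda r, f: f * f) - mean_f ** 2
cov_rf = expect(lambda r, f: r * f) - mean_r * mean_f

# Correlation = covariance divided by the product of the standard deviations.
corr_rf = cov_rf / (math.sqrt(var_r) * math.sqrt(var_f))

print(f"means: {mean_r:.2f}, {mean_f:.2f}")
print(f"variances: {var_r:.2f}, {var_f:.2f}")
print(f"covariance: {cov_rf:.2f}, correlation: {corr_rf:.3f}")
```

Using unrounded standard deviations gives a correlation that agrees with the slide's .107 to three decimals.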
Uncorrelated Variables
Independence implies zero correlation. If
the variables are independent, then the
numerator of the correlation coefficient is
zero.
Expected Value
Variance and Standard Deviation
Mean of a Sum
μ_R = .8, μ_F = .7
σ²_(x+y) = σ²_x + σ²_y + 2σ_xy
Variance of a Sum
μ_R = .8, σ²_R = .66, σ_R = .81
μ_F = .7, σ²_F = .21, σ_F = .46
σ_RF = 0.04
What is the variance of the total number of cases that occur each month?
This is the variance of F+R = .21 + .66 + 2(.04) = .95.
The standard deviation is .975.
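As a cross-check, the variance of the total can be computed two ways: from the sum formula, and directly from the distribution of R+F. The direct route assumes the joint table cells are P(R=0,F=0)=.15, P(R=1,F=0)=.10, P(R=2,F=0)=.05, P(R=0,F=1)=.30, P(R=1,F=1)=.20, P(R=2,F=1)=.20; the variable names are illustrative.

```python
import math

# Route 1: the sum formula, using the moments quoted on the slide.
var_r, var_f, cov_rf = 0.66, 0.21, 0.04
var_formula = var_r + var_f + 2 * cov_rf
sd_total = math.sqrt(var_formula)

# Route 2: build the distribution of T = R + F from the joint table
# (the cell layout is an assumed reading of the slide's table).
joint = {
    (0, 0): 0.15, (1, 0): 0.10, (2, 0): 0.05,
    (0, 1): 0.30, (1, 1): 0.20, (2, 1): 0.20,
}
totals = {}
for (r, f), p in joint.items():
    totals[r + f] = totals.get(r + f, 0.0) + p

mean_t = sum(t * p for t, p in totals.items())
var_direct = sum(t * t * p for t, p in totals.items()) - mean_t ** 2

print(f"formula: {var_formula:.2f}  direct: {var_direct:.2f}  sd: {sd_total:.3f}")
```

The two routes agree, and the mean of the total equals the sum of the two means, .8 + .7 = 1.5.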
σ²_(ax+by) = a²σ²_x + b²σ²_y + 2abσ_xy
μ_R = .8, σ²_R = .66, σ_R = .81
μ_F = .7, σ²_F = .21, σ_F = .46
σ_RF = 0.04, ρ_RF = .107
What is the variance of the total number of lawyers needed each month?
What is the standard deviation? This is the variance of 2R+3F.
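A worked sketch of this calculation applies the rule for the variance of a weighted sum, a²Var(x) + b²Var(y) + 2ab·Cov(x,y), with a = 2 and b = 3. The function name `linear_combination_variance` is hypothetical, and the inputs are the moments quoted on the slide.

```python
import math

def linear_combination_variance(a, b, var_x, var_y, cov_xy):
    """Variance of a*x + b*y: a^2 Var(x) + b^2 Var(y) + 2ab Cov(x, y)."""
    return a * a * var_x + b * b * var_y + 2 * a * b * cov_xy

# Moments from the slide: Var(R) = .66, Var(F) = .21, Cov(R, F) = .04.
var_lawyers = linear_combination_variance(2, 3, 0.66, 0.21, 0.04)
sd_lawyers = math.sqrt(var_lawyers)

print(f"Var(2R+3F) = {var_lawyers:.2f}, standard deviation = {sd_lawyers:.3f}")
```

The three terms are 4(.66) = 2.64, 9(.21) = 1.89, and 2(2)(3)(.04) = .48, for a variance of 5.01 and a standard deviation of about 2.24.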
Application - Portfolio
Portfolio
[Figure labels: W=1, W=0]
Summary