
2.7 Correlation Coefficient and Bivariate Normal Distribution
Meaning of correlation:

In a bivariate distribution we may be interested to find out if there is any correlation or covariance between the two variables under study. If the change in one variable affects a change in the other variable, the variables are said to be correlated. If the two variables deviate in the same direction, i.e., if an increase (or decrease) in one results in a corresponding increase (or decrease) in the other, the correlation is said to be positive. But if they constantly deviate in opposite directions, i.e., if an increase (or decrease) in one results in a corresponding decrease (or increase) in the other, the correlation is said to be negative. For example, the correlation between (i) the heights and weights of a group of persons and (ii) income and expenditure is positive; the correlation between (i) the price and demand of a commodity and (ii) the volume and pressure of a perfect gas is negative. Correlation is said to be perfect if the deviation in one variable is followed by a corresponding and proportional deviation in the other.

Karl Pearson’s Coefficient of Correlation:

As a measure of the intensity or degree of linear relationship between two variables, Karl Pearson, a British biometrician, developed a formula called the correlation coefficient. The correlation coefficient between two variables X and Y, usually denoted by r(X, Y) or r_XY, is a numerical measure of the linear relationship between them and is defined by

    r(X, Y) = Cov(X, Y) / (σ_X σ_Y)

where

    Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y),

    σ_X² = E(X²) − [E(X)]²  and  σ_Y² = E(Y²) − [E(Y)]².
Note:

1. r(X, Y) provides a measure of linear relationship between X and Y. For a non-linear relationship, however, it is not suitable.

2. Karl Pearson's correlation coefficient is also called the product-moment correlation coefficient.
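The moment formulas above translate directly into code. The following sketch computes r(X, Y) from E(XY), E(X), E(Y) and the two variances; the data values are invented purely for illustration.

```python
# Karl Pearson's product-moment correlation coefficient, computed from the
# moment formulas: r = Cov(X, Y) / (sigma_X * sigma_Y).
import math

def pearson_r(xs, ys):
    n = len(xs)
    ex = sum(xs) / n                                  # E(X)
    ey = sum(ys) / n                                  # E(Y)
    exy = sum(x * y for x, y in zip(xs, ys)) / n      # E(XY)
    cov = exy - ex * ey                               # Cov(X, Y) = E(XY) - E(X)E(Y)
    var_x = sum(x * x for x in xs) / n - ex ** 2      # sigma_X^2 = E(X^2) - [E(X)]^2
    var_y = sum(y * y for y in ys) / n - ey ** 2      # sigma_Y^2 = E(Y^2) - [E(Y)]^2
    return cov / math.sqrt(var_x * var_y)

xs = [1, 2, 3, 4, 5]          # made-up sample data
ys = [2, 4, 5, 4, 5]
print(round(pearson_r(xs, ys), 4))   # → 0.7746
```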

Properties:

1. −1 ≤ r(X, Y) ≤ 1. If r = −1, the correlation is perfect and negative. If r = 1, the correlation is perfect and positive.

2. The correlation coefficient is independent of change of origin and scale. That is, if U = (X − a)/h and V = (Y − b)/k with h > 0 and k > 0, then r(U, V) = r(X, Y).
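Property 2 can be checked numerically. This is a small sketch (the data and the constants a, b, h, k are arbitrary choices for the illustration) showing that a change of origin and scale leaves r unchanged.

```python
# Invariance of r under U = (X - a)/h, V = (Y - b)/k with h, k > 0.
import math

def pearson_r(xs, ys):
    n = len(xs)
    ex, ey = sum(xs) / n, sum(ys) / n
    cov = sum((x - ex) * (y - ey) for x, y in zip(xs, ys)) / n
    vx = sum((x - ex) ** 2 for x in xs) / n
    vy = sum((y - ey) ** 2 for y in ys) / n
    return cov / math.sqrt(vx * vy)

xs = [10.0, 12.0, 15.0, 19.0, 24.0]     # arbitrary data
ys = [3.0, 5.0, 4.0, 8.0, 9.0]
us = [(x - 10) / 2 for x in xs]         # change of origin a = 10, scale h = 2
vs = [(y - 3) / 5 for y in ys]          # change of origin b = 3,  scale k = 5
print(abs(pearson_r(xs, ys) - pearson_r(us, vs)) < 1e-12)   # → True
```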

Theorem: Two independent variables are uncorrelated.

Proof:

Consider Cov(X, Y) = E[(X − E(X))(Y − E(Y))]

⟹ Cov(X, Y) = E(XY) − E(X)E(Y) …… (1)

If X and Y are independent, then

E(XY) = E(X)E(Y) …… (2)

From (1) and (2), if X and Y are independent, then Cov(X, Y) = 0 and hence r(X, Y) = 0.

The converse need not be true. That is, uncorrelated variables need not be
independent.

Example 1: Let X ~ N(0, 1) and Y = X². Then E(X) = E(X³) = 0 and E(X²) = 1.

Solution: Consider Cov(X, Y) = E(XY) − E(X)E(Y) = E(X³) − E(X)E(X²) = 0 − 0 = 0

⟹ Cov(X, Y) = 0, but X and Y are related by Y = X².

Thus, uncorrelated variables need not be independent.

Note: The converse is true if the joint distribution of (X, Y) is bivariate normal.
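Example 1 can be illustrated by simulation: the sample correlation between X and X² is close to zero even though Y is completely determined by X. This is a Monte Carlo sketch, not part of the proof.

```python
# Monte Carlo check of Example 1: X ~ N(0, 1) and Y = X^2 are uncorrelated
# yet fully dependent, so the sample correlation should be near zero.
import math
import random

random.seed(1)
n = 200_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [x * x for x in xs]                 # Y = X^2, a deterministic function of X
ex, ey = sum(xs) / n, sum(ys) / n
cov = sum((x - ex) * (y - ey) for x, y in zip(xs, ys)) / n
vx = sum((x - ex) ** 2 for x in xs) / n
vy = sum((y - ey) ** 2 for y in ys) / n
r = cov / math.sqrt(vx * vy)
print(abs(r) < 0.03)                     # near zero despite total dependence
```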

Example 2: The joint p.m.f. of (X, Y) is given below:

    Y \ X    −1     1
      0      1/8   3/8
      1      2/8   2/8

Find the correlation coefficient between X and Y.

Solution: Computation of marginal p.m.f.s:

    Y \ X     −1     1     p_Y(y)
      0       1/8   3/8    4/8
      1       2/8   2/8    4/8
    p_X(x)    3/8   5/8    1

We have

E(X) = Σ x p_X(x) = −1 × 3/8 + 1 × 5/8 = −3/8 + 5/8 = 1/4,

E(X²) = Σ x² p_X(x) = (−1)² × 3/8 + 1² × 5/8 = 3/8 + 5/8 = 1, then

σ_X² = E(X²) − [E(X)]² = 1 − (1/4)² = 1 − 1/16 = 15/16.

Similarly,

E(Y) = Σ y p_Y(y) = 0 × 4/8 + 1 × 4/8 = 1/2,

E(Y²) = Σ y² p_Y(y) = 0² × 4/8 + 1² × 4/8 = 1/2, and

σ_Y² = E(Y²) − [E(Y)]² = 1/2 − (1/2)² = 1/2 − 1/4 = 1/4.

Further,

E(XY) = (−1)(0)(1/8) + (1)(0)(3/8) + (−1)(1)(2/8) + (1)(1)(2/8) = 0.

Thus, Cov(X, Y) = E(XY) − E(X)E(Y) = 0 − (1/4)(1/2) = −1/8.

∴ r(X, Y) = Cov(X, Y) / (σ_X σ_Y) = (−1/8) / (√(15/16) √(1/4)) = (−1/8) / (√15/8) = −1/√15 = −0.2582.
Example 3: Two random variables X and Y have the joint probability density function

    f(x, y) = 2 − x − y,  0 < x < 1, 0 < y < 1
    f(x, y) = 0,          otherwise.

Find the correlation coefficient between X and Y.

Solution: By symmetry in x and y we have f₁(x) = f₂(y), E(X) = E(Y) and σ_X² = σ_Y².

The m.p.d.f. of X is given by

f₁(x) = ∫₀¹ f(x, y) dy = ∫₀¹ (2 − x − y) dy = 3/2 − x.

Thus,

    f₁(x) = 3/2 − x,  0 < x < 1
    f₁(x) = 0,        otherwise.

Consider

E(X) = ∫₀¹ x f₁(x) dx = ∫₀¹ x(3/2 − x) dx = ∫₀¹ (3x/2 − x²) dx = 3/4 − 1/3 = 5/12,

E(X²) = ∫₀¹ x² f₁(x) dx = ∫₀¹ x²(3/2 − x) dx = ∫₀¹ (3x²/2 − x³) dx = 1/2 − 1/4 = 1/4.

Further,

E(XY) = ∫₀¹ ∫₀¹ xy f(x, y) dx dy = ∫₀¹ ∫₀¹ xy(2 − x − y) dx dy

= ∫₀¹ y [∫₀¹ (2x − x² − xy) dx] dy = ∫₀¹ y (1 − 1/3 − y/2) dy

= ∫₀¹ (2y/3 − y²/2) dy = 1/3 − 1/6 = 1/6.

∴ E(XY) = 1/6.

Thus, σ_X² = E(X²) − [E(X)]² = 1/4 − (5/12)² = 1/4 − 25/144 = (36 − 25)/144 = 11/144 = σ_Y²,

and Cov(X, Y) = E(XY) − E(X)E(Y) = 1/6 − (5/12)² = 24/144 − 25/144 = −1/144.

∴ The correlation coefficient is given by

r(X, Y) = Cov(X, Y) / (σ_X σ_Y) = (−1/144) / (11/144) = −1/11.
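The integrals of Example 3 can be checked numerically with a midpoint-rule double integral over the unit square; the grid size below is an arbitrary choice for the sketch.

```python
# Numerical check of Example 3: f(x, y) = 2 - x - y on (0,1)x(0,1).
import math

n = 400                       # midpoint-rule grid per axis
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]
mass = ex = ex2 = exy = 0.0
for x in pts:
    for y in pts:
        f = (2.0 - x - y) * h * h   # density times cell area
        mass += f                   # total probability, should be 1
        ex += x * f                 # E(X)   -> 5/12
        ex2 += x * x * f            # E(X^2) -> 1/4
        exy += x * y * f            # E(XY)  -> 1/6
ey, ey2 = ex, ex2                   # density is symmetric in x and y
var = ex2 - ex * ex                 # Var(X) = Var(Y) = 11/144
cov = exy - ex * ey                 # Cov(X, Y) = -1/144
print(round(cov / math.sqrt(var * var), 4))   # ≈ -1/11 ≈ -0.0909
```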
Bivariate Normal Distribution:

The bivariate normal distribution is a generalization of the normal distribution for a single variable. Let X and Y be two normally correlated variables with correlation coefficient ρ, and let E(X) = μ₁, V(X) = σ₁², E(Y) = μ₂ and V(Y) = σ₂².

Definition: The bivariate continuous random variable (X, Y) is said to follow the bivariate normal distribution with parameters μ₁, μ₂, σ₁², σ₂² and ρ if its p.d.f. is given by

f(x, y) = [1/(2π σ₁ σ₂ √(1 − ρ²))] exp{ −[1/(2(1 − ρ²))] [ (x − μ₁)²/σ₁² − 2ρ (x − μ₁)(y − μ₂)/(σ₁ σ₂) + (y − μ₂)²/σ₂² ] },

−∞ < x, y < ∞, −∞ < μ₁, μ₂ < ∞, σ₁ > 0, σ₂ > 0 and −1 < ρ < 1.

Notation: (X, Y) ~ BN(μ₁, μ₂, σ₁², σ₂², ρ), read as (X, Y) follows the bivariate normal distribution with parameters μ₁, μ₂, σ₁², σ₂² and ρ.

Note: The curve z = f(x, y), which is the equation of a surface in three dimensions, is called the normal correlation surface.

Marginal p.d.f.s of X and Y: The m.p.d.f. of X is given by

f₁(x) = ∫_{−∞}^{∞} f(x, y) dy.

Let t = (y − μ₂)/σ₂; then y = μ₂ + σ₂ t and dy = σ₂ dt.

Therefore,

f₁(x) = [σ₂/(2π σ₁ σ₂ √(1 − ρ²))] ∫_{−∞}^{∞} exp{ −[1/(2(1 − ρ²))] [ ((x − μ₁)/σ₁)² − 2ρ ((x − μ₁)/σ₁) t + t² ] } dt

= [1/(2π σ₁ √(1 − ρ²))] exp{ −(1/2) ((x − μ₁)/σ₁)² } ∫_{−∞}^{∞} exp{ −[1/(2(1 − ρ²))] [ t − ρ ((x − μ₁)/σ₁) ]² } dt,

on completing the square in t.

Let u = [t − ρ ((x − μ₁)/σ₁)] / √(1 − ρ²). Then dt = √(1 − ρ²) du.

∴ f₁(x) = [1/(2π σ₁)] exp{ −(1/2) ((x − μ₁)/σ₁)² } ∫_{−∞}^{∞} exp(−u²/2) du

= [1/(2π σ₁)] exp{ −(1/2) ((x − μ₁)/σ₁)² } · √(2π)

⟹ f₁(x) = [1/(σ₁ √(2π))] exp{ −(1/2) ((x − μ₁)/σ₁)² } for −∞ < x < ∞.

Similarly, it can be shown that

f₂(y) = [1/(σ₂ √(2π))] exp{ −(1/2) ((y − μ₂)/σ₂)² } for −∞ < y < ∞.

Hence X ~ N(μ₁, σ₁²) and Y ~ N(μ₂, σ₂²).

Note: If (X, Y) ~ BN(μ₁, μ₂, σ₁², σ₂², ρ), then X ~ N(μ₁, σ₁²) and Y ~ N(μ₂, σ₂²).
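The marginal result can be illustrated by simulation. A standard way to generate a bivariate normal pair (an assumption of this sketch, not part of the derivation above) is X = μ₁ + σ₁Z₁ and Y = μ₂ + σ₂(ρZ₁ + √(1 − ρ²)Z₂) with Z₁, Z₂ independent standard normals; the marginal of X should then match N(μ₁, σ₁²) regardless of ρ.

```python
# Simulation sketch: marginal of X in a bivariate normal is N(mu1, s1^2).
import math
import random

random.seed(7)
mu1, mu2, s1, s2, rho = 1.0, -2.0, 2.0, 3.0, 0.6   # arbitrary parameters
n = 100_000
xs = []
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x = mu1 + s1 * z1                                        # X ~ N(mu1, s1^2)
    y = mu2 + s2 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2) # corr(X, Y) = rho
    xs.append(x)
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
print(abs(mean - mu1) < 0.05, abs(var - s1 ** 2) < 0.1)   # → True True
```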

Conditional p.d.f.s of X and Y:

The conditional probability density function (c.p.d.f.) of X for given Y = y is given by

f₁|₂(x|y) = f(x, y) / f₂(y)

= [1/(σ₁ √(2π) √(1 − ρ²))] exp{ −[1/(2(1 − ρ²))] [ ((x − μ₁)/σ₁)² − 2ρ ((x − μ₁)/σ₁)((y − μ₂)/σ₂) + ((y − μ₂)/σ₂)² (1 − (1 − ρ²)) ] }

= [1/(σ₁ √(2π) √(1 − ρ²))] exp{ −[1/(2(1 − ρ²) σ₁²)] [ (x − μ₁)² − 2ρ (σ₁/σ₂)(x − μ₁)(y − μ₂) + ρ² (σ₁²/σ₂²)(y − μ₂)² ] }

Therefore, f₁|₂(x|y) = [1/(σ₁ √(2π) √(1 − ρ²))] exp{ −[1/(2(1 − ρ²) σ₁²)] [ (x − μ₁) − ρ (σ₁/σ₂)(y − μ₂) ]² },

which is a univariate normal distribution with mean

E(X | Y = y) = μ₁ + ρ (σ₁/σ₂)(y − μ₂) and

V(X | Y = y) = σ₁² (1 − ρ²).

Thus, the c.p.d.f. of X for fixed Y = y is given by

X | Y = y ~ N( μ₁ + ρ (σ₁/σ₂)(y − μ₂), σ₁² (1 − ρ²) ).

Similarly, the c.p.d.f. of Y for fixed X = x is given by

f₂|₁(y|x) = [1/(σ₂ √(2π) √(1 − ρ²))] exp{ −[1/(2(1 − ρ²) σ₂²)] [ (y − μ₂) − ρ (σ₂/σ₁)(x − μ₁) ]² }, −∞ < y < ∞.

Thus, Y | X = x ~ N( μ₂ + ρ (σ₂/σ₁)(x − μ₁), σ₂² (1 − ρ²) ).
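The conditional-density result can be checked pointwise: f(x, y)/f₂(y) should coincide with the univariate normal density having the conditional mean and variance derived above. The parameter values and evaluation points below are arbitrary choices for the sketch.

```python
# Pointwise check: f(x, y) / f2(y) equals the N(mu1 + rho*(s1/s2)*(y - mu2),
# s1^2*(1 - rho^2)) density, as derived for the bivariate normal.
import math

def bn_pdf(x, y, mu1, mu2, s1, s2, rho):
    """Bivariate normal p.d.f. in the form given in the notes."""
    q = ((x - mu1) ** 2 / s1 ** 2
         - 2 * rho * (x - mu1) * (y - mu2) / (s1 * s2)
         + (y - mu2) ** 2 / s2 ** 2)
    return math.exp(-q / (2 * (1 - rho ** 2))) / (
        2 * math.pi * s1 * s2 * math.sqrt(1 - rho ** 2))

def norm_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

mu1, mu2, s1, s2, rho = 0.0, 1.0, 1.0, 2.0, 0.5
ok = True
for x in (-1.0, 0.3, 2.0):
    for y in (-2.0, 1.0, 4.0):
        f2 = norm_pdf(y, mu2, s2 ** 2)                     # marginal of Y
        cond = bn_pdf(x, y, mu1, mu2, s1, s2, rho) / f2    # f(x | y)
        m = mu1 + rho * (s1 / s2) * (y - mu2)              # conditional mean
        v = s1 ** 2 * (1 - rho ** 2)                       # conditional variance
        ok = ok and abs(cond - norm_pdf(x, m, v)) < 1e-12
print(ok)   # → True
```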

Example 4: If (X, Y) ~ BN(5, 10, 1, 25, ρ), where ρ > 0, find ρ when P(4 < Y < 16 | X = 5) = 0.954.

Solution: Here μ₁ = 5, μ₂ = 10, σ₁² = 1, σ₂² = 25. We know that Y | X = x ~ N(μ, σ²),

where μ = μ₂ + ρ (σ₂/σ₁)(x − μ₁) and σ² = σ₂² (1 − ρ²).

Here μ = 10 + ρ (5/1)(5 − 5) = 10 and σ² = 25(1 − ρ²).

Thus Y | X = 5 ~ N(10, 25(1 − ρ²)). We want to find ρ so that

P(4 < Y < 16 | X = 5) = 0.954.

Let Z = (Y − μ)/σ ~ N(0, 1) ⟹ P( (4 − 10)/σ < Z < (16 − 10)/σ ) = 0.954

⟹ P(−6/σ < Z < 6/σ) = 0.954 ⟹ P(0 < Z < 6/σ) = 0.477.

From the standard normal table, we have 6/σ = 2 ⟹ σ = 3 ⟹ σ² = 9

⟹ 25(1 − ρ²) = 9 ⟹ 1 − ρ² = 9/25 ⟹ ρ² = 1 − 9/25 = 16/25 ⟹ ρ = 4/5 = 0.8.
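Example 4 can be checked with the standard normal CDF, expressed via `math.erf`: with ρ = 0.8, the conditional distribution is N(10, 9) and the required probability comes out to about 0.954.

```python
# Check of Example 4: with rho = 0.8, Y | X = 5 ~ N(10, 25*(1 - 0.64)) = N(10, 9).
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

rho = 0.8
var = 25 * (1 - rho ** 2)        # conditional variance = 9
sd = math.sqrt(var)              # conditional standard deviation = 3
p = phi((16 - 10) / sd) - phi((4 - 10) / sd)   # P(4 < Y < 16 | X = 5)
print(round(p, 3))               # → 0.954
```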
Example 5: Find r(X, Y) for the jointly normal distribution

f(x, y) = [1/(2π√3)] exp{ −(4x² + 2xy + y²)/6 }, −∞ < x, y < ∞.

Solution: Given (X, Y) ~ BN(μ₁, μ₂, σ₁², σ₂², ρ). Then its p.d.f. is given by

f(x, y) = [1/(2π σ₁ σ₂ √(1 − ρ²))] exp{ −[1/(2(1 − ρ²))] [ (x − μ₁)²/σ₁² − 2ρ (x − μ₁)(y − μ₂)/(σ₁ σ₂) + (y − μ₂)²/σ₂² ] } ---- (1)

Comparing the given density with (1), we get μ₁ = μ₂ = 0. Then (1) becomes

f(x, y) = [1/(2π σ₁ σ₂ √(1 − ρ²))] exp{ −[1/(2(1 − ρ²))] [ x²/σ₁² − 2ρ xy/(σ₁ σ₂) + y²/σ₂² ] } ---- (2)

Comparing the given density with (2), we find

σ₁ σ₂ √(1 − ρ²) = √3 , i.e. σ₁² σ₂² (1 − ρ²) = 3,

1/(2(1 − ρ²) σ₁²) = 4/6 , i.e. (1 − ρ²) σ₁² = 3/4,

1/(2(1 − ρ²) σ₂²) = 1/6 , i.e. (1 − ρ²) σ₂² = 3,

ρ/((1 − ρ²) σ₁ σ₂) = −1/3.

On solving we get σ₁² = 1, σ₂² = 4, ρ² = 1/4, and the sign of the cross term forces ρ < 0.

Thus r(X, Y) = ρ = −1/2.

Example 6: Determine the parameters of the bivariate normal distribution

f(x, y) = K exp{ −[16(x − 2)² − 12(x − 2)(y + 3) + 9(y + 3)²]/216 }.

Solution: If (X, Y) ~ BN(μ₁, μ₂, σ₁², σ₂², ρ), then

f(x, y) = [1/(2π σ₁ σ₂ √(1 − ρ²))] exp{ −[1/(2(1 − ρ²))] [ ((x − μ₁)/σ₁)² − 2ρ ((x − μ₁)/σ₁)((y − μ₂)/σ₂) + ((y − μ₂)/σ₂)² ] }.

Comparing these functions, we get

μ₁ = 2 , μ₂ = −3 ,

1/(2(1 − ρ²) σ₁²) = 16/216 , 1/(2(1 − ρ²) σ₂²) = 9/216 , 2ρ/(2(1 − ρ²) σ₁ σ₂) = 12/216.

∴ (1 − ρ²) σ₁² = 27/4 , (1 − ρ²) σ₂² = 12 , (1 − ρ²) σ₁ σ₂ = 18ρ

⟹ (1 − ρ²)² σ₁² σ₂² = (27/4)(12) = 81 = (18ρ)² ⟹ ρ² = 81/324 = 1/4 ⟹ ρ = 1/2.

Further, σ₁ = 3 and σ₂ = 4.

Thus, K = 1/(2π σ₁ σ₂ √(1 − ρ²)) = 1/(2π × 3 × 4 × √(1 − 1/4)) = 1/(12π√3).

∴ (X, Y) ~ BN(2, −3, 9, 16, 1/2).

Example 7: If X ~ N(μ, σ²) and Y | X = x ~ N(x, σ²), show that

(X, Y) ~ BN(μ, μ, σ², 2σ², 1/√2).

Solution: We are given that

f(x) = [1/(σ√(2π))] exp{ −(1/2)((x − μ)/σ)² }, −∞ < x < ∞,

g(y|x) = [1/(σ√(2π))] exp{ −(1/2)((y − x)/σ)² }, −∞ < y < ∞.

∴ h(x, y) = g(y|x) f(x) = [1/(2πσ²)] exp{ −(1/(2σ²)) [ (y − x)² + (x − μ)² ] }.

Consider (y − x)² = ((y − μ) − (x − μ))² = (y − μ)² + (x − μ)² − 2(x − μ)(y − μ).

Thus, h(x, y) = [1/(2πσ²)] exp{ −(1/(2σ²)) [ 2(x − μ)² + (y − μ)² − 2(x − μ)(y − μ) ] }.

The bivariate normal p.d.f. is given by

f(x, y) = [1/(2π σ₁ σ₂ √(1 − ρ²))] exp{ −[1/(2(1 − ρ²))] [ ((x − μ₁)/σ₁)² − 2ρ ((x − μ₁)/σ₁)((y − μ₂)/σ₂) + ((y − μ₂)/σ₂)² ] }.

On comparing h(x, y) with f(x, y), we get

μ₁ = μ₂ = μ ,

2(1 − ρ²) σ₁² = σ² , 2(1 − ρ²) σ₂² = 2σ² , (1 − ρ²) σ₁ σ₂ = ρ σ².

On solving, we get ρ² = 1/2, σ₁² = σ² and σ₂² = 2σ².

Thus, (X, Y) ~ BN( μ, μ, σ², 2σ², 1/√2 ).
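Example 7 lends itself to a direct simulation: draw X from N(μ, σ²), then draw Y from N(x, σ²) given X = x, and the sample correlation should approach 1/√2 ≈ 0.7071. The parameter values here are arbitrary.

```python
# Monte Carlo sketch of Example 7: X ~ N(mu, s^2), then Y | X = x ~ N(x, s^2);
# the resulting correlation should be close to 1/sqrt(2).
import math
import random

random.seed(3)
mu, s, n = 5.0, 2.0, 200_000     # arbitrary mean/sd for the illustration
xs, ys = [], []
for _ in range(n):
    x = random.gauss(mu, s)      # X ~ N(mu, s^2)
    y = random.gauss(x, s)       # conditional draw Y | X = x ~ N(x, s^2)
    xs.append(x)
    ys.append(y)
ex, ey = sum(xs) / n, sum(ys) / n
cov = sum((a - ex) * (b - ey) for a, b in zip(xs, ys)) / n
vx = sum((a - ex) ** 2 for a in xs) / n
vy = sum((b - ey) ** 2 for b in ys) / n
r = cov / math.sqrt(vx * vy)
print(abs(r - 1 / math.sqrt(2)) < 0.01)   # → True
```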

Example 8: The variables X and Y are connected by the equation aX + bY + c = 0. Show that the correlation between them is −1 if the signs of a and b are the same, and +1 if they are different.

Solution: Given aX + bY + c = 0 ⟹ aE(X) + bE(Y) + c = 0.

∴ a(X − E(X)) + b(Y − E(Y)) = 0 ⟹ (X − E(X)) = −(b/a)(Y − E(Y)).

∴ Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = −(b/a) E[(Y − E(Y))²] = −(b/a) σ_Y² ,

and σ_X² = E[(X − E(X))²] = (b²/a²) E[(Y − E(Y))²] = (b²/a²) σ_Y² , so σ_X = |b/a| σ_Y.

∴ r(X, Y) = Cov(X, Y)/(σ_X σ_Y) = [−(b/a) σ_Y²] / [ |b/a| σ_Y · σ_Y ] = −(b/a)/|b/a|

= −1 if a and b have the same sign, and +1 if a and b have opposite signs.