
Variance, Covariance and Covariance Matrix

The variance of a set of numbers measures how the data points deviate (or vary) from their mean. For a set of numbers x_1, x_2, ..., x_n, the variance is

    Var(x) = (1/n) Σ_i (x_i - x̄)²,    ... (1)

the mean of the square of the deviations from the mean, where the mean is x̄ = (1/n) Σ_i x_i.

Note that the square root of the variance is the standard deviation, σ_x = √Var(x).

The above formula can be reduced to the following:

    Var(x) = ⟨x²⟩ - ⟨x⟩²,

the mean of the square minus the square of the mean. By definition, variance is always non-negative and has no upper limit. We can likewise calculate the variance for another set of numbers y_1, y_2, ..., y_n:

    Var(y) = (1/n) Σ_i (y_i - ȳ)².    ... (2)
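As a quick check, here is a minimal Python sketch (the function names are my own, not from the note) computing the variance both ways, as the mean of squared deviations and as the mean of squares minus the square of the mean:

```python
def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    # mean of the squared deviations from the mean
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def variance_alt(xs):
    # mean of the squares minus the square of the mean
    return mean([x * x for x in xs]) - mean(xs) ** 2

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(variance(data))      # 4.0
print(variance_alt(data))  # 4.0 -- both formulas agree
```

Both functions return the same value, confirming the algebraic reduction in the text.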

Now we may want to know if there is any connection between the two sets of numbers! For that we devise a formula which tells us how the variation in one set is related to the variation in the other. Let us rewrite expression (2) in the following way:

    Var(y) = (1/n) Σ_i (y_i - ȳ)(y_i - ȳ).

Following the above, we can construct a formula by replacing one of the two factors with the deviation of the other variable:

    Cov(x, y) = (1/n) Σ_i (x_i - x̄)(y_i - ȳ),

or, in another notation,

    Cov(x, y) = ⟨xy⟩ - ⟨x⟩⟨y⟩.

This is COVARIANCE!
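The covariance formula above can be sketched directly in Python (a minimal illustration; the data values are my own):

```python
def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    # (1/n) * sum of products of deviations from the respective means
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]   # ys rises exactly with xs

print(covariance(xs, ys))  # 4.0 -- positive: the sets vary together
print(covariance(xs, xs))  # 2.0 -- covariance of a set with itself = variance
```

Note that the covariance of a set with itself reduces to expression (2), the variance, which is the point used later when building the covariance matrix.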

If we consider a probability distribution, that is, when the numbers appear according to some probability, the formula for the variance is written in the following way:

    Var(X) = E[(X - E[X])²] = E[X²] - (E[X])²,

where E stands for the expectation value; the mean is now replaced by the expectation value of the variable X. The covariance formula would then be written as

    Cov(X, Y) = E[(X - μ_X)(Y - μ_Y)] = E[XY] - E[X]E[Y],

where μ_X = E[X] and μ_Y = E[Y].
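The expectation-value form can be illustrated with a small discrete joint distribution (a sketch under my own choice of distribution, not taken from the note). For two independent fair coin-like variables, the covariance comes out to zero:

```python
# joint distribution over (x, y) pairs: {(x, y): probability}
joint = {
    (0, 0): 0.25, (0, 1): 0.25,
    (1, 0): 0.25, (1, 1): 0.25,
}  # two independent fair 0/1 variables

def expect(f, dist):
    # expectation value E[f(X, Y)] over the joint distribution
    return sum(p * f(x, y) for (x, y), p in dist.items())

mu_x = expect(lambda x, y: x, joint)          # E[X] = 0.5
mu_y = expect(lambda x, y: y, joint)          # E[Y] = 0.5
cov = expect(lambda x, y: (x - mu_x) * (y - mu_y), joint)
print(cov)  # 0.0 -- independent variables have zero covariance
```

The same number falls out of E[XY] - E[X]E[Y]: here E[XY] = 0.25 and E[X]E[Y] = 0.25.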

Abhijit Kar Gupta.

Email: kg.abhi@gmail.com

By definition, covariance can take any positive or negative value without limit, including zero. This is easy to understand if we consider that the covariance is defined in terms of the product of two factors, (x_i - x̄)(y_i - ȳ); each of the factors can be positive or negative. To have a normalized measure, one considers the correlation formula:

    Corr(x, y) = Cov(x, y) / (σ_x σ_y),

where σ_x and σ_y are the standard deviations. This is a normalized definition: the correlation always lies between -1 and +1.
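A short Python sketch (function names are my own) shows how the normalization pins the correlation to the extremes for exact linear relations:

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    # Cov(x, y) divided by the product of the standard deviations
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]   # exact linear increase with xs
zs = [9.0, 7.0, 5.0, 3.0]   # exact linear decrease with xs

print(correlation(xs, ys))  # 1.0
print(correlation(xs, zs))  # -1.0
```

The raw covariances here are far from 1 in magnitude; dividing by σ_x σ_y is what brings the measure into the [-1, +1] range.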

What is the meaning of covariance? If the covariance is positive, the two factors tend to have the same sign: both positive or both negative. If the covariance is negative, one factor tends to be positive while the other is negative. A zero value means there is no such systematic tendency. Now, if we consider two sets of numbers (or two probability distributions for two different events), their covariance tells us how, or whether, they are related. If an increase in one variable is in some way accompanied by an increase in the other, we say they are positively correlated; their covariance is positive. If an increase in one makes the other decrease (or the other way around), the covariance is negative. If the variation in one does not affect the other, the covariance is zero. Now we can readily see that for two variables (or sets of numbers, or probability distributions), we can construct a 2x2 matrix:

    ( Cov(x, x)  Cov(x, y) )
    ( Cov(y, x)  Cov(y, y) )

It is easy to check that this matrix, which we call the COVARIANCE MATRIX, is a symmetric matrix, with Cov(x, y) = Cov(y, x), and that each diagonal element, Cov(x, x) or Cov(y, y), is essentially the covariance of a variable with itself: simply the variance! We can generalize the idea to more than two variables. For 3 variables, we obtain a covariance matrix of order 3x3, and so on. When an off-diagonal element is positive, the corresponding variables are positively correlated; the larger its value, the more strongly they are correlated. The above is the basis for understanding a covariance matrix.
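The construction generalizes mechanically: the (i, j) entry of the matrix is the covariance of variable i with variable j. A minimal Python sketch (my own helper names, illustrative data) for three variables:

```python
def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / len(xs)

def covariance_matrix(variables):
    # variables: a list of equal-length lists of numbers;
    # entry [i][j] is Cov(variables[i], variables[j])
    return [[covariance(u, v) for v in variables] for u in variables]

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
z = [4.0, 3.0, 2.0, 1.0]   # decreases as x increases

C = covariance_matrix([x, y, z])
for row in C:
    print(row)
# Diagonal entries are the variances; C[i][j] == C[j][i] (symmetry);
# C[0][2] is negative because z moves opposite to x.
```

For n variables this yields an n x n symmetric matrix, exactly as the text describes for the 2x2 and 3x3 cases.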

