FinQuiz Notes 2018
2. CORRELATION ANALYSIS
Scatter plots and correlation analysis are used to examine how two sets of data are related.

2.1 Scatter Plots

A scatter plot graphically shows the relationship between two variables. If the points on the scatter plot cluster together in a straight line, the two variables have a strong linear relation. Each observation in the scatter plot is represented by a point, and the points are not connected.

2.2 & 2.3 Correlation Analysis & Calculating and Interpreting the Correlation Coefficient

The sample covariance is calculated as:

$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$

where,
n = sample size

NOTE:
The covariance of a random variable with itself is its own variance.

... portfolio could be diversified or decreased. If there is zero covariance between two assets, it means that there is no linear relationship between the rates of return of the two assets and the assets can be included in the same portfolio.

The correlation coefficient measures the direction and strength of the linear association between two variables. The correlation coefficient between two assets X and Y can be calculated using the following formula:

$$r = \frac{\text{cov}(X, Y)}{s_X \, s_Y}$$

Example:
$\text{cov}(X, Y) = 47.78$, $s_X^2 = 40$, $s_Y^2 = 250$
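The covariance and correlation formulas can be sketched in a few lines of Python. This is a minimal illustration, not the notes' own method: the function names are hypothetical, the toy data are made up, and the example figures are read as cov(X, Y) = 47.78 with variances 40 and 250, which is an assumption about labels that did not survive in the text.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation_from_moments(cov_xy, var_x, var_y):
    """r = cov(X, Y) / (s_X * s_Y), where each s is the square root of a variance."""
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

def sample_correlation(xs, ys):
    """Correlation computed directly from two samples."""
    return correlation_from_moments(
        sample_covariance(xs, ys),
        sample_covariance(xs, xs),  # covariance of a variable with itself = its variance
        sample_covariance(ys, ys),
    )

# Assumed reading of the example: cov = 47.78, var(X) = 40, var(Y) = 250.
r_example = correlation_from_moments(47.78, 40.0, 250.0)
print(round(r_example, 4))

# Hypothetical toy data lying exactly on a line, so the correlation is 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
r_toy = sample_correlation(xs, ys)
print(round(r_toy, 4))
```

Under that reading the example gives a correlation of roughly 0.48, a moderate positive linear association.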
Difference b/w Covariance & Correlation: The covariance primarily tells the investor whether the relationship between asset returns is positive, negative or zero, whereas the correlation coefficient tells the degree (strength) of the relationship between asset returns.

NOTE:
Correlation coefficients are valid only if the means, variances & covariances of X and Y are finite and constant. When these assumptions do not hold, the correlation between two variables depends largely on the sample selected.

2.4 Limitations of Correlation Analysis

NOTE:
Spurious correlation may suggest investment strategies that appear profitable but actually would not be so, if implemented.

2.6 Testing the Significance of the Correlation Coefficient

A t-test is used to determine whether the sample correlation coefficient, r, is statistically significant.

Two-Tailed Test:
Null Hypothesis H0: the correlation in the population is 0 (ρ = 0);
Alternative Hypothesis H1: the correlation in the population is different from 0 (ρ ≠ 0).

NOTE:
The null hypothesis is the hypothesis to be tested. The alternative hypothesis is the hypothesis that is accepted if the null is rejected.

Test statistic:

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} \sim t(n - 2)$$

where,
r = sample coefficient of correlation, calculated as $r = \text{cov}(X, Y) / (s_X \, s_Y)$
t = t-statistic (or calculated t)
n - 2 = degrees of freedom

Decision Rule:
If the test statistic is < -t-critical or > +t-critical with n - 2 degrees of freedom (i.e. if the absolute value of t > tc), reject H0; otherwise do not reject H0.

The magnitude of r needed to reject the null hypothesis (H0: ρ = 0) decreases as the sample size n increases, because as n increases:
o the number of degrees of freedom increases
o the absolute value of tc decreases
o the t-value increases

In other words, type II error decreases when sample size (n) increases, all else equal.
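The significance test for a correlation coefficient can be sketched as follows. The function names, the sample size of 30 and the correlation of 0.4778 are illustrative assumptions; the critical values are ordinary two-tailed 5% t-table entries typed in by hand rather than computed. The last helper rearranges the t-statistic to give the smallest |r| that would reject H0, which shows why a larger sample needs a smaller correlation.

```python
import math

def correlation_t_stat(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need n > 2")
    if abs(r) >= 1.0:
        raise ValueError("|r| must be below 1 for a finite t-statistic")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def reject_null(t_stat, t_critical):
    """Two-tailed decision rule: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical

def min_abs_r_to_reject(t_critical, n):
    """Solve t = r*sqrt(n-2)/sqrt(1-r^2) for r at t = t_critical."""
    return t_critical / math.sqrt(t_critical ** 2 + n - 2)

# Hypothetical inputs: r = 0.4778 from n = 30 observations.
# 2.048 is the two-tailed 5% critical value for 28 degrees of freedom.
t_stat = correlation_t_stat(0.4778, 30)
decision = reject_null(t_stat, 2.048)
print(round(t_stat, 2), decision)

# Table critical values (two-tailed, 5%) for 10, 28 and 98 degrees of freedom.
thresholds = [
    min_abs_r_to_reject(tc, n)
    for n, tc in [(12, 2.228), (30, 2.048), (100, 1.984)]
]
print([round(v, 3) for v in thresholds])
```

The thresholds fall as n grows, matching the statement that the magnitude of r needed to reject H0 decreases with sample size.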
Reading 9 Correlation and Regression FinQuiz.com
NOTE:
Type I error = rejecting the null hypothesis although it is true.
Type II error = not rejecting the null hypothesis although it is false.

Practice: Example 7, 8, 9 & 10, Volume 1, Reading 9.
3. LINEAR REGRESSION
Regression analysis is used to:

o Predict the value of a dependent variable based on the value of at least one independent variable
o Explain the impact of changes in an independent variable on the dependent variable.

Linear regression assumes a linear relationship between the dependent and the independent variables. Linear regression is also known as linear least squares since it selects values for the intercept b0 and slope b1 that minimize the sum of the squared vertical distances between the observations and the regression line.

Estimated Regression Model: The sample regression line provides an estimate of the population regression line. Note that the population parameter values b0 and b1 are not observable; only estimates of b0 and b1 are observable.

Independent variable: The variable used to explain the dependent variable. Also called exogenous or predicting variable.

Intercept (b0): The predicted value of the dependent variable when the independent variable is set to zero.

$$b_0 = \bar{y} - b_1 \bar{x}$$

Slope Coefficient or regression coefficient (b1): The change in the dependent variable for a unit change in the independent variable.

$$b_1 = \frac{\text{cov}(X, Y)}{s_x^2}$$

or

$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Error Term: It represents the portion of the dependent variable that cannot be explained by the independent variable.
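A small sketch of how the least-squares slope and intercept follow from the formulas for b1 and b0. The function name and both data sets are hypothetical, chosen only so the result is easy to check by hand.

```python
def fit_simple_regression(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("x values must not all be identical")
    b1 = s_xy / s_xx            # same as cov(X, Y) / s_x^2; the (n - 1) terms cancel
    b0 = mean_y - b1 * mean_x   # the fitted line passes through (mean_x, mean_y)
    return b0, b1

# Hypothetical data lying exactly on y = 3 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_exact = [3.0 + 2.0 * x for x in xs]
b0, b1 = fit_simple_regression(xs, ys_exact)
print(b0, b1)

# Hypothetical noisy data: the residuals (error-term estimates) sum to zero.
ys_noisy = [5.1, 6.9, 9.2, 10.8, 13.0]
b0_n, b1_n = fit_simple_regression(xs, ys_noisy)
residuals = [y - (b0_n + b1_n * x) for x, y in zip(xs, ys_noisy)]
print(round(sum(residuals), 10))
```

The residuals in the second case are the sample counterpart of the error term: the part of each y value the fitted line does not explain.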
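Example:
n = 100
$\bar{x} = 36,009.45$; $\bar{y} = 5,411.41$; $s_x^2 = 43,528,688$

$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = -1,356,256$$

$$b_1 = \frac{\text{cov}(X, Y)}{s_x^2} = \frac{-1,356,256}{43,528,688} = -0.0312$$

$$b_0 = \bar{y} - b_1 \bar{x} = 5,411.41 - (-0.0312)(36,009.45) = 6,535$$

$$\hat{y} = b_0 + b_1 x = 6,535 - 0.0312x$$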
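The arithmetic of this example can be re-traced in Python. The variable names are illustrative, and the negative sign on the covariance and slope is an assumption inferred from the figures, since it is the only sign consistent with an intercept of 6,535. The slope is rounded to four decimals before the intercept is computed, as the worked figures appear to do.

```python
# Inputs taken from the worked example (sign of the covariance is assumed negative).
n = 100
mean_x = 36_009.45
mean_y = 5_411.41
var_x = 43_528_688.0
cov_xy = -1_356_256.0

# Slope: covariance divided by the variance of x, rounded as in the example.
b1 = round(cov_xy / var_x, 4)

# Intercept: the fitted line passes through the point of means.
b0 = mean_y - b1 * mean_x

def predict(x):
    """Fitted value from the estimated regression line."""
    return b0 + b1 * x

print(b1, round(b0))
```

Without the intermediate rounding of b1 the intercept comes out a little lower, so small differences from 6,535 are a rounding effect rather than an error in the method.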
3.2 Assumptions of the Linear Regression Model

1. The regression model is linear in its parameters b0 and b1, i.e. b0 and b1 are raised to power 1 only and neither b0 nor b1 is multiplied or divided by another regression parameter, e.g. b0 / b1.
   o When the regression model is nonlinear in parameters, regression results are invalid.
   o Even if the dependent variable is nonlinear but parameters are linear, linear regression can be used.
4. The variance of the error term is the same for all observations. (This is known as the Homoskedasticity assumption.)
5. Error values (ε) are statistically independent, i.e. the error for one observation is not correlated with that of any other observation.
6. Error values are normally distributed for any given value of x.

3.3 The Standard Error of Estimate

Standard Error of Estimate (SEE) measures the degree of variability of the actual y-values relative to the estimated (predicted) y-values from a regression equation. The smaller the SEE, the better the fit.

$$SEE = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}$$

or

$$SEE = \sqrt{\frac{SSE}{n - 2}}$$

For example, with SSE = 2,252,363 and n - 2 = 98:

$$s = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{2,252,363}{98}} = 151.60$$

3.4 The Coefficient of Determination

The coefficient of determination is the portion of the total variation in the dependent variable that is explained by the independent variable. The coefficient of determination is also called R-squared and is denoted as R2.

$$R^2 = \frac{\text{Explained variation}}{\text{Total variation}} = \frac{\text{Total variation} - \text{Unexplained variation}}{\text{Total variation}} = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where,
$\bar{y}$ = Average value of the dependent variable

In case of a single independent variable, the coefficient of determination is: R2 = r2

where,
R2 = Coefficient of determination
r = Simple correlation coefficient

Example:
Suppose the correlation coefficient between the returns of two assets is +0.80; then the coefficient of determination will be 0.64. The interpretation of this number is that approximately 64 percent of the variability in the returns of one asset (or dependent variable) can be explained by the returns of the other asset (or independent variable). If the returns on two assets are perfectly correlated (r = +/- 1), the coefficient of determination will be equal to 100%, and this means that if changes in returns of one asset are known, then we can exactly predict the returns of the other asset.
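The two goodness-of-fit measures can be checked with a short sketch. The function names are hypothetical and the four-point data set is made up; the SSE of 2,252,363 with 98 degrees of freedom and the correlation of 0.80 are the figures quoted in the notes.

```python
import math

def standard_error_of_estimate(sse, n):
    """SEE = sqrt(SSE / (n - 2)) for a regression with one independent variable."""
    if n <= 2:
        raise ValueError("need n > 2")
    return math.sqrt(sse / (n - 2))

def r_squared(ys, fitted):
    """R^2 = 1 - unexplained variation / total variation."""
    mean_y = sum(ys) / len(ys)
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sst = sum((y - mean_y) ** 2 for y in ys)
    if sst == 0:
        raise ValueError("total variation is zero")
    return 1.0 - sse / sst

# Figures from the notes: SSE = 2,252,363 and n - 2 = 98, so n = 100.
see = standard_error_of_estimate(2_252_363, 100)
print(round(see, 2))

# With one independent variable, R^2 is the squared correlation.
r2_from_r = 0.80 ** 2
print(round(r2_from_r, 2))

# Hypothetical sanity checks: a perfect fit explains everything,
# and predicting the mean everywhere explains nothing.
ys = [1.0, 2.0, 3.0, 4.0]
r2_perfect = r_squared(ys, ys)
r2_mean_only = r_squared(ys, [2.5, 2.5, 2.5, 2.5])
```

The two sanity checks bracket the range of R2 for a fitted least-squares line: 1 when nothing is left unexplained and 0 when the regression adds nothing beyond the mean.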
3.5 Hypothesis Testing

In order to determine whether there is a linear relationship between x and y, a significance test (i.e. t-test) is used instead of just relying on the value of b1. The t-statistic is used to test the significance of the individual coefficients (e.g. slope) in a regression.

Null and Alternative hypotheses
H0: b1 = 0 (no linear relationship)
H1: b1 ≠ 0 (linear relationship does exist)

Test statistic:

$$t = \frac{\hat{b}_1 - b_1}{s_{\hat{b}_1}}$$

where,
$\hat{b}_1$ = Sample regression slope coefficient
$b_1$ = Hypothesized slope
$s_{\hat{b}_1}$ = Standard error of the slope

NOTE:
A higher level of confidence or lower level of significance results in higher values of critical t, i.e. tc. This implies that:

o Confidence intervals will be larger.
o Probability of rejecting the H0 decreases, i.e. type II error increases.
o The probability of Type-I error decreases.

Stronger regression results lead to smaller standard errors of an estimated parameter and result in a tighter confidence interval. As a result the probability of rejecting H0 increases (or the probability of Type-II error decreases).

p-value: The p-value is the smallest level of significance at which the null hypothesis can be rejected.

Decision Rule: If p < significance level, H0 can be rejected. If p > significance level, H0 cannot be rejected.
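The slope test and the p-value decision rule can be sketched as below. All names and inputs are hypothetical: the slope of -0.0312 echoes the earlier regression example, but the standard error of 0.0100 and the p-values are made up for illustration, and 1.984 is a hand-entered two-tailed 5% table value for 98 degrees of freedom.

```python
def slope_t_stat(b1_hat, b1_hypothesized, se_b1):
    """t = (estimated slope - hypothesized slope) / standard error of the slope."""
    if se_b1 <= 0:
        raise ValueError("standard error must be positive")
    return (b1_hat - b1_hypothesized) / se_b1

def reject_by_critical_value(t_stat, t_critical):
    """Two-tailed rule: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical

def reject_by_p_value(p_value, significance_level):
    """Reject H0 when the p-value is below the chosen significance level."""
    return p_value < significance_level

# Hypothetical inputs: estimated slope -0.0312, standard error 0.0100, H0: b1 = 0.
t_stat = slope_t_stat(-0.0312, 0.0, 0.0100)
reject_t = reject_by_critical_value(t_stat, 1.984)
print(round(t_stat, 2), reject_t)

# The same decision expressed through (made-up) p-values at the 5% level.
print(reject_by_p_value(0.002, 0.05), reject_by_p_value(0.20, 0.05))
```

Raising the critical value (a lower significance level) makes `reject_by_critical_value` harder to satisfy, which is the trade-off the NOTE describes.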
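Degrees of freedom in the analysis of variance (ANOVA) table:

Source of variation    Degrees of freedom
Regression             k
Error                  n - k - 1
Total                  n - 1

3.7 Prediction Intervals

$$s_f^2 = s^2 \left[ 1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{(n - 1) s_x^2} \right]$$

and

$$s_f = \sqrt{s_f^2}$$

where,
s2 = squared SEE
n = number of observations
$\bar{X}$ = estimated mean of X
X = value of independent variable
$s_x^2$ = variance of the independent variable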
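A prediction interval built from the standard error of the forecast can be sketched as follows. This assumes the usual form s_f^2 = s^2 [1 + 1/n + (X - mean of X)^2 / ((n - 1) s_x^2)] with the interval given by the forecast plus or minus tc times s_f. The function names are hypothetical, the forecast point X = 40,000 is made up, and combining the SEE of 151.60 with the earlier regression figures assumes they describe the same data set; 1.984 is a hand-entered two-tailed 5% table value for 98 degrees of freedom.

```python
import math

def forecast_std_error(see, n, x, mean_x, var_x):
    """Standard error of the forecast for a one-variable regression."""
    s_f_squared = see ** 2 * (
        1.0 + 1.0 / n + (x - mean_x) ** 2 / ((n - 1) * var_x)
    )
    return math.sqrt(s_f_squared)

def prediction_interval(y_hat, s_f, t_critical):
    """Interval y_hat +/- t_critical * s_f."""
    half_width = t_critical * s_f
    return y_hat - half_width, y_hat + half_width

# Assumed inputs, reusing figures from the earlier examples.
see = 151.60
n = 100
mean_x = 36_009.45
var_x = 43_528_688.0
b0, b1 = 6_535.0, -0.0312

x_new = 40_000.0     # hypothetical value of the independent variable
t_critical = 1.984   # two-tailed 5% table value, 98 degrees of freedom

y_hat = b0 + b1 * x_new
s_f = forecast_std_error(see, n, x_new, mean_x, var_x)
lower, upper = prediction_interval(y_hat, s_f, t_critical)
print(round(y_hat, 2), round(lower, 2), round(upper, 2))

# The forecast error is smallest when X equals its sample mean.
s_f_at_mean = forecast_std_error(see, n, mean_x, mean_x, var_x)
```

The forecast standard error always exceeds the SEE, and it widens as the forecast point moves away from the mean of X.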