
Welcome to the course

Financial Econometrics I

Professor Ruben Enikolopov

What is Econometrics?
• "Econometrics is what econometricians do" (Kennedy, 1996)
• Econometrics is the statistical analysis of economic (and related) data (Stock)
• Not really different from statistics

Quantitative questions
Econometrics is used to give answers to quantitative questions.
Since we use data to answer these questions, our answers will always carry some uncertainty.
Therefore, we need not only a numerical answer to the question, but also a measure of how precise that answer is.

What is econometrics used for?
1. Descriptive analysis

2. Prediction

3. Causal inference

Descriptive analysis
Simply describe the relationship between different variables X and Y.

Sometimes you want to know the correlation between two variables holding some other variables constant.

Example: How do mutual fund returns correlate with market returns? What if you control for other risk factors?

Prediction
You know current and past values of a number of variables and you want to know what the value of a particular variable will be in the future.

Always based on extrapolating past trends.

Example: Predict the probability that a person with given characteristics will default on a loan.

Causal Inference
You want to know what will happen to variable Y if you actively change variable X.

Example: Do laws that provide better protection for investors lead to more developed capital markets?

Difference between approaches

• Causal inference is by far the hardest question.
• Both prediction and causal inference require thinking about counterfactuals – something that we don't really observe.
• The difference between the three approaches usually lies not in the econometric methods used, but in the questions asked and the interpretation of the results.

Simple example
Suppose we want to see how the returns of a mutual fund are correlated with market returns.
We know the monthly excess returns of the portfolio, $R_{pt}$.
We know the monthly market returns, $R_{mt}$.

[Scatter plot: portfolio excess returns $R_p$ (vertical axis) against market returns $R_m$ (horizontal axis).]

[The same scatter plot with a fitted line $\alpha + \beta R_m$ drawn through the points.]

How do we choose α and β?

[The scatter plot again, now marking one observation $(R_{mt}, R_{pt})$ and its error $u_t$, the vertical distance between the point and the line $\alpha + \beta R_m$.]

How do we choose α and β?
We want to minimize the errors $u_t$.
The standard way is to minimize the sum of squared errors:

$$\min_{\alpha,\beta} \sum_{t=1}^{T} u_t^2 = \min_{\alpha,\beta} \sum_{t=1}^{T} \left(R_{pt} - \alpha - \beta R_{mt}\right)^2$$

FOC:

$$(1)\quad -2\sum_{t=1}^{T} \left(R_{pt} - \hat{\alpha} - \hat{\beta} R_{mt}\right) = 0$$

$$(2)\quad -2\sum_{t=1}^{T} R_{mt}\left(R_{pt} - \hat{\alpha} - \hat{\beta} R_{mt}\right) = 0$$

How do we choose α and β?

From (1) we find that

$$\hat{\alpha} = \frac{\sum_{t=1}^{T} R_{pt}}{T} - \hat{\beta}\,\frac{\sum_{t=1}^{T} R_{mt}}{T} = \bar{R}_p - \hat{\beta}\bar{R}_m$$

Substituting this into (2) we get

$$\sum_{t=1}^{T} R_{mt}\left(R_{pt} - \bar{R}_p + \hat{\beta}\bar{R}_m - \hat{\beta}R_{mt}\right) = \sum_{t=1}^{T} R_{mt}R_{pt} - T\bar{R}_m\bar{R}_p - \hat{\beta}\left(\sum_{t=1}^{T} R_{mt}^2 - T\bar{R}_m^2\right) = 0$$

So that

$$\hat{\beta} = \frac{\sum_{t=1}^{T}\left(R_{mt} - \bar{R}_m\right)\left(R_{pt} - \bar{R}_p\right)}{\sum_{t=1}^{T}\left(R_{mt} - \bar{R}_m\right)^2}$$

Welcome to OLS
What we have just derived are the Ordinary Least Squares (OLS) estimates for α and β.

This is by far the most common technique in econometrics.

This estimation approach is not only intuitive, but also has some nice statistical properties.
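
As a minimal sketch, the closed-form estimates derived above can be computed directly in Python. The data here are simulated and all numbers are made up for illustration; this is not the course's data:

```python
# A minimal sketch of the closed-form OLS estimates derived above.
# The data are simulated; all numbers are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
T = 120                                       # e.g. 10 years of monthly data
Rm = rng.normal(0.5, 4.0, size=T)             # market excess returns (%)
Rp = 0.2 + 1.1 * Rm + rng.normal(0.0, 2.0, size=T)  # fund excess returns (%)

# beta_hat = sum (Rm_t - mean(Rm))(Rp_t - mean(Rp)) / sum (Rm_t - mean(Rm))^2
beta_hat = ((Rm - Rm.mean()) * (Rp - Rp.mean())).sum() / ((Rm - Rm.mean()) ** 2).sum()
# alpha_hat = mean(Rp) - beta_hat * mean(Rm)
alpha_hat = Rp.mean() - beta_hat * Rm.mean()

print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```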

Predicted values & residuals
OLS divides the observation $Y_t$ into two parts: a part that is "explained" by $X_t$ (the predicted value) and a part that is not (the residual):

$$Y_t = \hat{Y}_t + \hat{u}_t = \text{OLS prediction} + \text{OLS residual}$$

The predicted value of $Y_t$ is $\hat{Y}_t = \hat{\alpha} + \hat{\beta} X_t$

The residual for observation t is $\hat{u}_t = Y_t - \hat{Y}_t$
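
A short continuation of the sketch above (same simulated data), verifying the decomposition numerically:

```python
# Sketch: split each observation into OLS prediction + OLS residual.
# Same simulated data as in the previous sketch.
import numpy as np

rng = np.random.default_rng(0)
T = 120
Rm = rng.normal(0.5, 4.0, size=T)
Rp = 0.2 + 1.1 * Rm + rng.normal(0.0, 2.0, size=T)
beta_hat = ((Rm - Rm.mean()) * (Rp - Rp.mean())).sum() / ((Rm - Rm.mean()) ** 2).sum()
alpha_hat = Rp.mean() - beta_hat * Rm.mean()

Rp_hat = alpha_hat + beta_hat * Rm            # predicted values
u_hat = Rp - Rp_hat                           # residuals
assert np.allclose(Rp, Rp_hat + u_hat)        # Y_t = Y_hat_t + u_hat_t exactly
print(f"mean residual = {u_hat.mean():.2e}")  # ~0 by the first FOC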

Goodness of fit. Example.

We draw a line to approximate the relationship between $R_p$ and $R_m$.
But how well does our estimated regression line fit the observations?

[Two scatter plots of Y against X illustrating different degrees of fit.]

R²
The most common measure of the goodness of fit.
It measures how well the predicted values of Y correlate with the observed values:

$$R^2 = \left[\mathrm{corr}\left(Y_t, \hat{Y}_t\right)\right]^2 \in [0,1]$$

This is the same as the fraction of the sample variation of Y explained by the regression:

$$R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS}$$

where $ESS = \sum_{t=1}^{T}(\hat{Y}_t - \bar{\hat{Y}})^2$, $TSS = \sum_{t=1}^{T}(Y_t - \bar{Y})^2$, $SSR = \sum_{t=1}^{T}\hat{u}_t^2$

6
Standard Error of the Regression (SER)
Measures the spread of the distribution of u:

$$SER = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(\hat{u}_t - \bar{\hat{u}}\right)^2} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\hat{u}_t^2}$$

since by construction for OLS

$$\bar{\hat{u}} = \frac{1}{T}\sum_{t=1}^{T}\hat{u}_t = 0$$
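
Both goodness-of-fit measures can be computed from the residuals; here is a sketch on the same simulated data. The 1/T convention follows the slide (some textbooks divide by T − 2 instead):

```python
# Sketch: R^2 and SER for the fitted line, on the same simulated data.
import numpy as np

rng = np.random.default_rng(0)
T = 120
Rm = rng.normal(0.5, 4.0, size=T)
Rp = 0.2 + 1.1 * Rm + rng.normal(0.0, 2.0, size=T)
beta_hat = ((Rm - Rm.mean()) * (Rp - Rp.mean())).sum() / ((Rm - Rm.mean()) ** 2).sum()
alpha_hat = Rp.mean() - beta_hat * Rm.mean()
Rp_hat = alpha_hat + beta_hat * Rm
u_hat = Rp - Rp_hat

ESS = ((Rp_hat - Rp_hat.mean()) ** 2).sum()   # explained sum of squares
TSS = ((Rp - Rp.mean()) ** 2).sum()           # total sum of squares
SSR = (u_hat ** 2).sum()                      # sum of squared residuals
R2 = ESS / TSS                                # equals 1 - SSR/TSS for OLS
SER = np.sqrt(SSR / T)                        # 1/T convention as on the slide
print(f"R^2 = {R2:.3f} (check: {1 - SSR/TSS:.3f}), SER = {SER:.3f}")
```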

Generalization
In the simple example that we have just considered, we have provided an approximation for the data at hand.

This is fine for describing the data, but it is not enough for prediction or causal inference.

We need to make additional assumptions about where the observations are coming from.

Statistical model
Assume that there is a general relationship that holds for all possible observations in some population (e.g. all mutual funds at all dates).
In the case of a linear relationship we specify a statistical model (a simulated draw from such a model is sketched below):

$$Y_t = \alpha + \beta X_t + u_t, \quad t = 1,\dots,T$$

• X is the independent variable or regressor
• Y is the dependent variable
• α = intercept
• β = slope
• $u_t$ = the regression error
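
A minimal sketch of drawing one sample of size T from such a population model; the parameter values below are made up:

```python
# Sketch: one realization of a sample of size T from the statistical model
# Y_t = alpha + beta * X_t + u_t. Parameter values are made up.
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, T = 1.0, 2.0, 50
X = rng.normal(0.0, 1.0, size=T)   # the regressor
u = rng.normal(0.0, 1.0, size=T)   # the regression error, E[u|X] = 0 here
Y = alpha + beta * X + u           # the dependent variable
# Another run with a different seed would give a different sample,
# i.e. another realization of the random variables (Y_t, X_t, u_t).
```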

7
Statistical model
The regression error consists of omitted factors, or possibly measurement error in Y. In general, these omitted factors are factors other than the variable X that influence Y.

This statistical model is assumed to hold for all possible observations, but we only observe T of them.

We think of this sample as just one realization of all potential samples of size T that could have been drawn. In this sense we think of $Y_t$, $X_t$ and $u_t$ as random variables.

Estimators and estimates

Our goal is to estimate β using the information from the sample $(Y_t, X_t)$, t = 1,…,T that we have.

The rule that translates the sample into an approximate value of β is called an estimator. Basically, this is a formula into which you plug Y and X to get $\hat{\beta}$.
OLS is the most common estimator.

The resulting value $\hat{\beta}$ is called an estimate.

The Least Squares Assumptions

1. The conditional distribution of u given X has mean zero, that is, $E(u_t|X = x_t) = 0$.

2. $(X_t, Y_t)$, t = 1,…,T are independently and identically distributed (i.i.d.)

3. Large outliers in X and Y are rare; technically, $E[X^4] < \infty$ and $E[Y^4] < \infty$.

Assumption 1

• Assumption 1, $E(u|X = x) = 0$, is equivalent to $E(Y|X = x) = \alpha + \beta x$
• This is a crucial assumption

Assumption 2: $(X_t, Y_t)$, t = 1,…,T are i.i.d.

• implication of random sampling
• often doesn't hold, e.g. in time-series data like our example

Assumption 3: $E[X^4] < \infty$ and $E[Y^4] < \infty$

• important for finding the distribution of estimates
• plausible, unless the data is really badly miscoded

Statistical properties of estimates

The estimate $\hat{\beta}$ is a random variable: it will be different for different samples. We are usually interested in the following statistical properties:
1. Consistency – does $\hat{\beta}$ approach the true β as T grows?
This means that for large enough samples we will get the right result.
2. Unbiasedness – $E[\hat{\beta}] = \beta$
This means that even for a fixed sample size we are right on average.
3. Distribution of $\hat{\beta}$

Statistical properties of OLS
The OLS estimator is
1. Consistent: $\hat{\beta} \xrightarrow{p} \beta$
2. Unbiased: $E[\hat{\beta}] = \beta$

$$\mathrm{Var}\left(\hat{\beta}\right) = \sigma^2_{\hat{\beta}} = \frac{\sigma^2_{\nu}}{T\,\sigma^4_X}, \quad \text{where } \nu_t = (X_t - E[X])\,u_t$$

3. For large T: $\hat{\beta} \sim N(\beta, \sigma^2_{\hat{\beta}})$, and $\dfrac{\hat{\beta} - \beta}{\sigma_{\hat{\beta}}} \sim N(0,1)$
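
A small Monte Carlo sketch of these properties (made-up model, simulated data): draw many samples and look at the distribution of $\hat{\beta}$ across them.

```python
# Sketch: Monte Carlo check of unbiasedness, consistency and approximate
# normality of the OLS slope. Model and parameters are made up.
import numpy as np

rng = np.random.default_rng(2)
alpha, beta = 1.0, 2.0

def ols_slope(T):
    X = rng.normal(0.0, 1.0, size=T)
    u = rng.normal(0.0, 1.0, size=T)
    Y = alpha + beta * X + u
    return ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()

for T in (25, 100, 400):
    betas = np.array([ols_slope(T) for _ in range(5000)])
    # mean ~ beta (unbiasedness); std shrinks as T grows (consistency);
    # a histogram of betas would look approximately normal (property 3)
    print(f"T = {T:4d}: mean = {betas.mean():.3f}, std = {betas.std():.3f}")
```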

Variance of estimate
• The larger T is, the smaller the variance of $\hat{\beta}$.
• The larger the variance of X is, the smaller the variance of $\hat{\beta}$.

Distribution of estimate
• Why should we care about the distribution of $\hat{\beta}$?
• Inference is as important as estimation, since it tells you how much you can trust your estimates. Often it tells you that you shouldn't trust them at all.
• $SE(\hat{\beta})$, the Standard Error of $\hat{\beta}$, is an estimator of the square root of the variance of the sampling distribution of that statistic:

$$SE\left(\hat{\beta}\right) = \sqrt{\frac{1}{T} \cdot \frac{\frac{1}{T-2}\sum_{t=1}^{T}\left(X_t - \bar{X}\right)^2 \hat{u}_t^2}{\left[\frac{1}{T}\sum_{t=1}^{T}\left(X_t - \bar{X}\right)^2\right]^2}}$$
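
A sketch of computing this standard error directly, on simulated data with deliberately heteroskedastic errors (all names and numbers are illustrative):

```python
# Sketch: the heteroskedasticity-robust SE formula from the slide,
# computed on simulated data where Var(u|X) depends on X.
import numpy as np

rng = np.random.default_rng(3)
T = 200
X = rng.normal(0.0, 1.0, size=T)
u = rng.normal(0.0, 1.0 + np.abs(X))          # heteroskedastic errors
Y = 1.0 + 2.0 * X + u
beta_hat = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
alpha_hat = Y.mean() - beta_hat * X.mean()
u_hat = Y - alpha_hat - beta_hat * X

num = ((X - X.mean()) ** 2 * u_hat ** 2).sum() / (T - 2)
den = (((X - X.mean()) ** 2).sum() / T) ** 2
se_beta = np.sqrt(num / den / T)
print(f"beta_hat = {beta_hat:.3f}, SE = {se_beta:.3f}")
```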

Statistical inference
Knowing the distribution of $\hat{\beta}$ we can test statistical hypotheses, i.e. ask questions like:
• What is the probability that β = 0?

Whenever we test some hypothesis, we assume that this hypothesis is true and compute the probability of getting the estimates we observe. If this probability is too low, we reject the hypothesis.

Simple t-test
Suppose we want to test the hypothesis

$$H_0: \beta = b \quad \text{vs.} \quad H_a: \beta \neq b$$

We choose a certain level of significance α (usually 5%) and select the critical value $q_{\alpha/2}$ such that the probability that the absolute value of a standard normal variable is greater than $q_{\alpha/2}$ equals α.

We reject the hypothesis if

$$|t| = \left|\frac{\hat{\beta} - b}{SE\left(\hat{\beta}\right)}\right| > q_{\alpha/2}$$

For a 5% level of significance the critical value is 1.96, so we reject the hypothesis if |t| > 1.96.

The significance level is the probability that we will reject H₀ when H₀ is actually true.
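
A sketch of the test with placeholder numbers; the estimate, standard error, and hypothesized value b below are made up:

```python
# Sketch: simple t-test for H0: beta = b. All numbers are placeholders.
beta_hat, se_beta = 1.08, 0.05   # made-up estimate and standard error
b = 1.0                          # hypothesized value, e.g. a market beta of 1

t = (beta_hat - b) / se_beta
reject = abs(t) > 1.96           # 5% significance, large-T normal critical value
print(f"t = {t:.2f}, reject H0 at 5%: {reject}")
```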

Confidence intervals & p-value

• The 95% confidence interval for β is $\{\hat{\beta} \pm 1.96\, SE(\hat{\beta})\}$

• The 95% CI contains the true β in 95% of all samples.

• The 95% CI is the set of hypothesized values b that are not rejected at the 5% level.

• p-value = $\Pr\{|N(0,1)| > |t|\}$. Reject the null hypothesis if the p-value is less than the required level of significance.
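
A sketch with the same placeholder numbers; scipy is used only for the standard normal CDF:

```python
# Sketch: 95% confidence interval and p-value. Placeholder numbers again.
from scipy.stats import norm

beta_hat, se_beta, b = 1.08, 0.05, 1.0
ci = (beta_hat - 1.96 * se_beta, beta_hat + 1.96 * se_beta)
t = (beta_hat - b) / se_beta
p_value = 2 * (1 - norm.cdf(abs(t)))   # Pr{ |N(0,1)| > |t| }
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), p-value = {p_value:.3f}")
```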

Additional assumptions
• You might see, in addition:
Assumption 4. $u_t$ is i.i.d. $N(0, \sigma^2)$ (where Var(u) doesn't depend on X)
Assumption 5. $X_t$, t = 1,…,T is fixed (nonrandom) over repeated samples

• Assumptions 4 & 5:
• justify the use of the Student t distribution (if the SE is computed using a special "homoskedasticity-only" formula)
• are poor descriptions of reality
• aren't used in this course: we won't use the Student t distribution, just the large-T normal approximation.

Real Example
How do the returns of a mutual fund correlate with the market?
Data from the ING Russia mutual fund.
Time period: January 1997 – January 2008

ING Russia mutual fund.

ING Russia mutual fund returns

[Scatter plot "ING Russia mutual fund and US market return": INGRUS_RETURN (vertical axis, −100 to 100) against MARKET (horizontal axis, −40 to 40).]

Heteroskedasticity & Homoskedasticity
Often people make the additional assumption that Var(u|X = x) is constant.
In this case u is said to be homoskedastic.
Otherwise, u is heteroskedastic.
So far we have implicitly allowed u to be heteroskedastic (although it could be homoskedastic).
Why assume homoskedasticity?
1. OLS in this case is the Best Linear Unbiased Estimator (BLUE).
2. The formula for $SE(\hat{\beta})$ becomes simpler.

Homoskedasticity.
Practical implications.
• The homoskedasticity-only formula for the standard error of $\hat{\beta}$ and the "heteroskedasticity-robust" formula differ.
• The two standard errors coincide (when T is large) in the special case of homoskedasticity.
• In general, you get different standard errors using the different formulas.
• If you use homoskedasticity-only SEs and there is in fact heteroskedasticity, your SEs (and t-statistics and confidence intervals) will be wrong – frequently, homoskedasticity-only SEs are too small.
• Homoskedasticity is a strong assumption which is usually wrong.

Heteroskedasticity-robust vs. homoskedasticity-only standard errors
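
A sketch comparing the two in practice, assuming the statsmodels package; the data are simulated with heteroskedastic errors, so the two sets of SEs should visibly differ:

```python
# Sketch: homoskedasticity-only vs. heteroskedasticity-robust SEs,
# using statsmodels on simulated heteroskedastic data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
X = rng.normal(0.0, 1.0, size=500)
u = rng.normal(0.0, 1.0 + np.abs(X))          # heteroskedastic errors
Y = 1.0 + 2.0 * X + u

model = sm.OLS(Y, sm.add_constant(X))
print(model.fit().bse)                # homoskedasticity-only SEs
print(model.fit(cov_type="HC1").bse)  # heteroskedasticity-robust SEs
```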
