Financial Econometrics I
What is Econometrics?
“Econometrics is what econometricians do” (Kennedy, 1996)
Econometrics is the statistical analysis of
economic (and related) data (Stock)
Not really different from statistics
Quantitative questions
Econometrics is used to give answers to
quantitative questions.
Since we use data to answer these
questions, our answers will always have
some uncertainty.
Therefore, we need not only a numerical
answer to the question, but also a measure of
how precise that answer is.
What is econometrics used for?
1. Descriptive analysis
2. Prediction
3. Causal inference
Descriptive analysis
Simply describe the relationship between
variables X and Y.
Prediction
You know the current and past values of a number of
variables, and you want to know the future value
of a particular variable.
Causal Inference
You want to know what will happen to variable Y
if you actively change variable X.
Simple example
Suppose we want to see how the returns of a
mutual fund are correlated with market returns.
We know the monthly excess returns of the
portfolio, $R_{pt}$, and of the market, $R_{mt}$.
[Figure: scatter plots of portfolio excess returns $R_{pt}$ against market excess returns $R_{mt}$, with the fitted line $\alpha + \beta R_{mt}$ drawn through the points and the error $u_t$ for one observation marked.]
How do we choose α and β ?
We want to minimize the errors $\Delta_t$.
The standard way is to minimize the sum of
squared errors:
$$\min_{\alpha,\beta}\ \sum_{t=1}^{T} \Delta_t^2 \;=\; \min_{\alpha,\beta}\ \sum_{t=1}^{T}\left(R_{pt} - \alpha - \beta R_{mt}\right)^2$$
FOC:
$$(1)\quad -2\sum_{t=1}^{T}\left(R_{pt} - \hat{\alpha} - \hat{\beta} R_{mt}\right) = 0$$
$$(2)\quad -2\sum_{t=1}^{T} R_{mt}\left(R_{pt} - \hat{\alpha} - \hat{\beta} R_{mt}\right) = 0$$
From (1) we find that
$$\hat{\alpha} = \frac{\sum_{t=1}^{T} R_{pt}}{T} - \hat{\beta}\,\frac{\sum_{t=1}^{T} R_{mt}}{T} = \bar{R}_p - \hat{\beta}\,\bar{R}_m$$
so that
$$\hat{\beta} = \frac{\sum_{t=1}^{T}\left(R_{mt} - \bar{R}_m\right)\left(R_{pt} - \bar{R}_p\right)}{\sum_{t=1}^{T}\left(R_{mt} - \bar{R}_m\right)^2}$$
Welcome to OLS
What we just obtained are the Ordinary Least
Squares (OLS) estimates of α and β.
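To make the formulas concrete, here is a minimal Python sketch (NumPy assumed; the simulated returns are purely illustrative stand-ins for real fund data):

```python
import numpy as np

# Simulated monthly excess returns -- illustrative only, not real data.
rng = np.random.default_rng(0)
T = 120                                          # ten years of monthly data
Rm = rng.normal(0.5, 4.0, T)                     # market excess returns
Rp = 0.3 + 1.2 * Rm + rng.normal(0.0, 2.0, T)    # portfolio excess returns

# OLS slope: cross-deviations of Rm and Rp over squared deviations of Rm.
beta_hat = (np.sum((Rm - Rm.mean()) * (Rp - Rp.mean()))
            / np.sum((Rm - Rm.mean()) ** 2))
# OLS intercept from FOC (1): mean(Rp) - beta_hat * mean(Rm).
alpha_hat = Rp.mean() - beta_hat * Rm.mean()

print(alpha_hat, beta_hat)    # close to the true values 0.3 and 1.2
```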
Predicted values & residuals
OLS divides the observation Yt into two parts: a part that is
“explained” by Xt (the predicted value) and a part that is not
(the residual):
$$Y_t = \hat{Y}_t + \hat{u}_t = \text{OLS prediction} + \text{OLS residual}$$
[Figure: scatter of observations Y against X with the fitted regression line, which splits each Y into a predicted value on the line and a residual above or below it.]
R2
The most common measure of the goodness of fit.
It measures how well predicted values of Y
correlate with the observed values
$$R^2 = \left[\operatorname{corr}\left(Y_t, \hat{Y}_t\right)\right]^2 \in [0,1]$$
It equals the fraction of the sample variation of Y
explained by the regression:
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS}$$
where
$$ESS = \sum_{t=1}^{T}\left(\hat{Y}_t - \bar{\hat{Y}}\right)^2,\quad TSS = \sum_{t=1}^{T}\left(Y_t - \bar{Y}\right)^2,\quad SSR = \sum_{t=1}^{T}\hat{u}_t^2$$
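As a quick numerical check, the sketch below (NumPy assumed, simulated data) computes $R^2$ both as $ESS/TSS$ and as the squared correlation of $Y_t$ with $\hat{Y}_t$; the two agree:

```python
import numpy as np

# Simulated data -- illustrative only.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 4.0, 120)
y = 0.3 + 1.2 * x + rng.normal(0.0, 2.0, 120)

# Fit OLS and form predicted values and residuals.
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
y_hat = (y.mean() - b * x.mean()) + b * x
u_hat = y - y_hat

ess = np.sum((y_hat - y_hat.mean()) ** 2)    # explained sum of squares
tss = np.sum((y - y.mean()) ** 2)            # total sum of squares
ssr = np.sum(u_hat ** 2)                     # sum of squared residuals

print(ess / tss, 1 - ssr / tss)              # two ANOVA versions agree
print(np.corrcoef(y, y_hat)[0, 1] ** 2)      # equals the squared correlation
```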
Standard Error of the Regression
(SER)
Measures the spread of the distribution of u.
$$SER = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(\hat{u}_t - \bar{\hat{u}}\right)^2} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\hat{u}_t^2}$$
since by construction for OLS
$$\bar{\hat{u}} = \frac{1}{T}\sum_{t=1}^{T}\hat{u}_t = 0$$
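A short sketch (NumPy assumed, simulated data) verifying that the OLS residuals average to zero and computing the SER as the root mean squared residual:

```python
import numpy as np

# Simulated data and OLS fit -- illustrative only.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 4.0, 120)
y = 0.3 + 1.2 * x + rng.normal(0.0, 2.0, 120)
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
u_hat = y - (y.mean() - b * x.mean()) - b * x

print(u_hat.mean())                   # ~0: OLS residuals average to zero
print(np.sqrt(np.mean(u_hat ** 2)))   # SER: root mean squared residual
```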
Generalization
In the simple example just considered, we merely
provided an approximation to the data at hand.
Statistical model
Assume that there is a general relationship that holds for all
possible observations in some population (e.g. all mutual
funds at all dates).
In the case of linear relationship we specify a statistical
model
$$Y_t = \alpha + \beta X_t + u_t, \quad t = 1, \ldots, T$$
Statistical model
The regression error consists of omitted factors, or
possibly measurement error in Y.
In general, these omitted factors are variables other
than X that influence Y.
Assumption 1
$E(u_t \mid X_t) = 0$: the conditional mean of the error $u_t$ given $X_t$ is zero.
Statistical properties of OLS
The OLS estimator is:
1. Consistent: $\hat{\beta} \xrightarrow{p} \beta$.
2. Unbiased: $E[\hat{\beta}] = \beta$, with variance
$$\operatorname{Var}\left(\hat{\beta}\right) = \sigma_{\hat{\beta}}^2 = \frac{\sigma_{\nu}^2}{T\,\sigma_X^4}, \quad \text{where } \nu_t = \left(X_t - E[X]\right)u_t$$
3. Approximately normal: for large $T$, $\hat{\beta} \sim N\left(\beta, \sigma_{\hat{\beta}}^2\right)$, and
$$\frac{\hat{\beta} - \beta}{\sigma_{\hat{\beta}}} \sim N(0,1)$$
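These properties are easy to see in a small Monte Carlo experiment. The sketch below (NumPy assumed; the data-generating process is illustrative) draws many samples from the statistical model and shows that the OLS slope estimates center on the true β and that their spread shrinks as T grows:

```python
import numpy as np

# Monte Carlo sketch: the OLS slope centers on the true beta (unbiasedness)
# and its sampling spread shrinks as T grows (consistency).
rng = np.random.default_rng(0)
beta = 1.2

for T in (30, 300, 3000):
    estimates = []
    for _ in range(2000):                 # 2000 simulated samples
        x = rng.normal(0.0, 4.0, T)
        y = 0.3 + beta * x + rng.normal(0.0, 2.0, T)
        b = (np.sum((x - x.mean()) * (y - y.mean()))
             / np.sum((x - x.mean()) ** 2))
        estimates.append(b)
    estimates = np.asarray(estimates)
    print(T, estimates.mean(), estimates.std())  # mean ~1.2, std falls with T
```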
Variance of estimate
The larger $T$ is, the smaller the variance of $\hat{\beta}$.
The larger the variance of $X$, the smaller the
variance of $\hat{\beta}$.
Distribution of estimate
Why should we care about the distribution of $\hat{\beta}$?
Inference is as important as estimation,
since it tells you how much you can trust your
estimates. Often it tells you that you shouldn’t
trust them at all.
$SE(\hat{\beta})$: the standard error of $\hat{\beta}$,
an estimator of the square root of the variance
of the sampling distribution of that statistic.
$$SE\left(\hat{\beta}\right) = \sqrt{\frac{1}{T} \times \frac{\frac{1}{T-2}\sum_{t=1}^{T}\left(X_t - \bar{X}\right)^2 \hat{u}_t^2}{\left[\frac{1}{T}\sum_{t=1}^{T}\left(X_t - \bar{X}\right)^2\right]^2}}$$
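A sketch of this heteroskedasticity-robust formula in code (NumPy assumed, simulated data):

```python
import numpy as np

# Simulated data and OLS fit -- illustrative only.
rng = np.random.default_rng(0)
T = 120
x = rng.normal(0.0, 4.0, T)
y = 0.3 + 1.2 * x + rng.normal(0.0, 2.0, T)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
u_hat = y - a - b * x

# Heteroskedasticity-robust SE of the slope, term by term.
num = np.sum((x - x.mean()) ** 2 * u_hat ** 2) / (T - 2)
den = (np.sum((x - x.mean()) ** 2) / T) ** 2
se_b = np.sqrt(num / den / T)
print(b, se_b)
```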
Statistical inference
Knowing the distribution of $\hat{\beta}$, we can test
statistical hypotheses, i.e. ask questions like:
What is the probability that $\beta = 0$?
Simple t-test
Suppose we want to test the hypothesis
$$H_0: \beta = b \quad \text{vs.} \quad H_a: \beta \neq b$$
We choose a certain level of significance $\alpha$
(usually 5%) and select the critical value $q_{\alpha/2}$ such that
the probability that the absolute value of a standard
normal variable exceeds $q_{\alpha/2}$ equals $\alpha$.
We reject the hypothesis if
$$|t| = \left|\frac{\hat{\beta} - b}{SE(\hat{\beta})}\right| > q_{\alpha/2}$$
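A minimal sketch of the test (SciPy assumed for the normal quantiles; the estimate, standard error, and hypothesized value are made-up illustrative numbers):

```python
from scipy.stats import norm

# Illustrative numbers only: an estimate, its SE, and the hypothesized value.
beta_hat, se, b = 1.15, 0.08, 1.0

t = (beta_hat - b) / se                  # t-statistic
q = norm.ppf(1 - 0.05 / 2)               # 5% two-sided critical value, ~1.96
p_value = 2 * (1 - norm.cdf(abs(t)))     # large-sample normal p-value

print(t, q, abs(t) > q, p_value)         # reject H0 when |t| > q
```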
Additional assumptions
You might see, in addition,
Assumption 4. ut is i.i.d. N(0,σ2) (where Var(u) doesn’t
depend on X)
Assumption 5. Xt, t =1,…, T is fixed (nonrandom) over
repeated samples
Assumption 4 & 5:
• justify the use of the Student t distribution (if the SE is
computed using a special “homoskedasticity-only”
formula)
• are poor descriptions of reality
• aren’t used in this course: we won’t use the Student t
distribution – just the large-n normal approximation.
Real Example
How do the returns of a mutual fund
correlate with the market?
Data from the ING Russia mutual fund.
Time period: January 1997 to January 2008
ING Russia mutual fund returns
[Figure: scatter plot of monthly ING Russia fund returns (INGRUS_RETURN, roughly −100 to 100) against market returns (MARKET, roughly −40 to 40).]
Heteroskedasticity & Homoskedasticity
Often people make the additional assumption that
Var(u|X=x) is constant.
In this case u is said to be homoskedastic.
Otherwise, u is heteroskedastic.
So far we have implicitly assumed that u might be
heteroskedastic (although it could be homoskedastic).
Why assume homoskedasticity:
1. OLS in this case is the Best Linear Unbiased
Estimator
2. The formula for $SE(\hat{\beta})$ becomes simpler.
Homoskedasticity.
Practical implications.
The homoskedasticity-only formula for the standard error
of $\hat{\beta}$ and the “heteroskedasticity-robust” formula differ.
The two standard errors coincide (when n is large) in the
special case of homoskedasticity
In general, you get different standard errors using the
different formulas.
If you use homoskedasticity-only SEs and there is in fact
heteroskedasticity, your SEs (and t-statistics and
confidence intervals) will be wrong – frequently,
homoskedasticity-only SEs are too small.
Homoskedasticity is a strong assumption which is
usually wrong.
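The point is easy to demonstrate by simulation. In the sketch below (NumPy assumed; the design is illustrative) the error variance grows with |X|, and the homoskedasticity-only standard error comes out noticeably smaller than the robust one:

```python
import numpy as np

# Illustrative design: error variance grows with |x|, so u is heteroskedastic.
rng = np.random.default_rng(0)
T = 2000
x = rng.normal(0.0, 2.0, T)
u = rng.normal(0.0, 1.0, T) * (0.5 + np.abs(x))
y = 0.3 + 1.2 * x + u

# OLS fit and residuals.
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
u_hat = y - (y.mean() - b * x.mean()) - b * x
sxx = np.sum((x - x.mean()) ** 2)

# Homoskedasticity-only SE: s^2 over the sum of squared deviations of x.
se_homo = np.sqrt(np.sum(u_hat ** 2) / (T - 2) / sxx)
# Heteroskedasticity-robust SE (the formula given earlier).
se_robust = np.sqrt(
    np.sum((x - x.mean()) ** 2 * u_hat ** 2) / (T - 2) / (sxx / T) ** 2 / T
)
print(se_homo, se_robust)   # the robust SE is noticeably larger here
```

In this design the large errors occur exactly where $(X_t - \bar{X})^2$ is large, which is what the robust formula weights for and the homoskedasticity-only formula misses.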