
ADVANCED ECONOMETRICS

ECO 318
Instructor: Kanika Mahajan
Textbooks
Jeffrey M. Wooldridge [JMW], Introductory Econometrics: A Modern Approach, 4th or 5th edition

Cameron, A.C. and Trivedi, P.K. [CT], Microeconometrics Using Stata, 2nd ed., Stata Press, 2010

Angrist, J.D. and Pischke, J.-S., Mostly Harmless Econometrics, Princeton University Press, 2008 (Optional)
Assessment
Homework Assignments (3): 20%
Class Discussions: 10%
Mid-term: 30%
Final: 40%
Plagiarism
Econometrics?
Using statistics to understand economic phenomena
Hypothesis
Inference

Why a specific name, then?
Grounded in economic theory
Econometric applications?
Descriptive analyses
Summary statistics, graphical patterns

Causality
Regression based, understanding a particular effect

Forecasting
Explanatory power of the model
Descriptive analysis
Consider the simple single-variable model:
$$Y = \alpha + \beta X + u$$

If X is a categorical variable, then we can estimate the conditional expectation $E(Y \mid X)$ using a sample of observations.
The estimate is called the conditional sample mean.

If X is continuous?
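A minimal Python sketch of the categorical case, assuming hypothetical toy data (the category labels and outcome values are invented for illustration): the conditional sample mean is the average of Y within each value of X.

```python
from collections import defaultdict

def conditional_sample_means(x, y):
    """Return {category: average of y over observations in that category}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for xi, yi in zip(x, y):
        totals[xi] += yi
        counts[xi] += 1
    return {cat: totals[cat] / counts[cat] for cat in totals}

# Hypothetical data: X is a category label, Y is an outcome.
x = ["urban", "rural", "urban", "rural", "urban"]
y = [10.0, 4.0, 14.0, 6.0, 12.0]

means = conditional_sample_means(x, y)
print(means)
```

Each entry of `means` estimates E(Y|X = category) from the sample.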
How important are descriptive analyses?
Read the articles below; we will discuss them in the next class. The links will be emailed.

Beyond triple talaq by Kapil Sibal, Indian Express (May 26, 2017)
http://indianexpress.com/article/opinion/columns/triple-talaq-supreme-court-aimplb-muslim-personal-law-shariah-4673882/

Hindus and Muslims: The true picture of divorce by Yugank Goyal, Livemint (July 25, 2017)
http://www.livemint.com/Opinion/ydCWT2mGxmg4d9NXrMaxEI/Hindus-and-Muslims-The-true-picture-of-divorce.html
Causality?
Estimate the parameters of the previous model. Assumptions.

Suppose X in the previous model is not the only variable that affects the outcome variable. Another variable, with which X is correlated, also has an effect on the outcome:
$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + u$$

Interpret $\alpha$, $\beta_1$, $\beta_2$?
Assumptions for estimation of the above using OLS?
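RECAP: MULTIPLE LINEAR REGRESSION (OLS)
$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + u$$

Estimation of the parameters of the above equation: minimize the sum of squared residuals
$$\min_{\hat\alpha, \hat\beta_1, \dots, \hat\beta_k} \sum_{i=1}^{n} (Y_i - \hat\alpha - \hat\beta_1 X_{i1} - \hat\beta_2 X_{i2} - \dots - \hat\beta_k X_{ik})^2$$

First order conditions:
$$\sum_{i=1}^{n} (Y_i - \hat\alpha - \hat\beta_1 X_{i1} - \hat\beta_2 X_{i2} - \dots - \hat\beta_k X_{ik}) = 0$$
$$\sum_{i=1}^{n} X_{i1} (Y_i - \hat\alpha - \hat\beta_1 X_{i1} - \hat\beta_2 X_{i2} - \dots - \hat\beta_k X_{ik}) = 0$$
$$\vdots$$
$$\sum_{i=1}^{n} X_{ik} (Y_i - \hat\alpha - \hat\beta_1 X_{i1} - \hat\beta_2 X_{i2} - \dots - \hat\beta_k X_{ik}) = 0$$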
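The first order conditions can be checked numerically. The Python sketch below assumes the one-regressor case (k = 1) and hypothetical data, solves the two conditions in closed form, and then evaluates both sums at the solution. The function name `ols_simple` is illustrative only.

```python
def ols_simple(x, y):
    """Closed-form solution of the two first order conditions when k = 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha_hat, beta_hat = ols_simple(x, y)
resid = [yi - alpha_hat - beta_hat * xi for xi, yi in zip(x, y)]

# Both conditions should hold up to floating-point error.
foc_intercept = sum(resid)
foc_slope = sum(xi * ui for xi, ui in zip(x, resid))
print(alpha_hat, beta_hat, foc_intercept, foc_slope)
```

The residuals sum to zero and are orthogonal to the regressor, which is exactly what the conditions state.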
Goodness of fit
$$SST = \sum_{i=1}^{n} (Y_i - \bar Y)^2, \quad SSE = \sum_{i=1}^{n} (\hat Y_i - \bar Y)^2, \quad SSR = \sum_{i=1}^{n} \hat u_i^2$$
$$SST = SSE + SSR$$

Goodness of fit is defined as the proportion of sample variation in the dependent variable that is explained by the independent variables fitted using OLS:
$$R^2 = 1 - \frac{SSR}{SST}$$
Alternatively, it is the squared correlation coefficient between $Y_i$ and the predicted $\hat Y_i$.
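As an illustration, the decomposition and both definitions of goodness of fit can be computed directly. This Python sketch assumes a one-regressor OLS fit on hypothetical data. All names are illustrative.

```python
def fit_simple_ols(x, y):
    """Intercept and slope of a one-regressor OLS fit (closed form)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    beta = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - beta * x_bar, beta

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 2.9, 3.1, 4.8, 5.1, 7.0]

alpha_hat, beta_hat = fit_simple_ols(x, y)
y_hat = [alpha_hat + beta_hat * xi for xi in x]
y_bar = sum(y) / len(y)

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
sse = sum((fi - y_bar) ** 2 for fi in y_hat)             # explained
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))    # residual
r_squared = 1.0 - ssr / sst

# Squared sample correlation between Y and fitted Y.
f_bar = sum(y_hat) / len(y_hat)
cov_yf = sum((yi - y_bar) * (fi - f_bar) for yi, fi in zip(y, y_hat))
var_f = sum((fi - f_bar) ** 2 for fi in y_hat)
corr_sq = cov_yf ** 2 / (sst * var_f)

print(sst, sse + ssr, r_squared, corr_sq)
```

With an intercept in the regression, `sst` equals `sse + ssr` and `r_squared` equals `corr_sq` up to rounding.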
Small Sample Properties of the OLS estimator:
1) Unbiased
2) Efficient
Under what assumptions?
MLR.1 Linear in parameters
MLR.2 Random sampling
MLR.3 No perfect collinearity
MLR.4 Zero conditional mean (unbiasedness): $E(u_i \mid X) = 0$, $i = 1, \dots, n$

Unbiased: MLR.1-MLR.4
$$E(\hat\beta) = \beta$$
An omitted variable that is correlated with the included regressors violates MLR.4, and the estimator is then biased.
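MLR.5
(a) Homoskedasticity: $\mathrm{Var}(u_i \mid X) = E(u_i^2 \mid X) = \sigma^2$
(b) No serial correlation: $\mathrm{Cov}(u_i, u_j) = 0$ for $i \neq j$ (usually violated with time series data; we will see this later)

Small Sample Properties of the OLS estimator:
Efficient (MLR.1-MLR.5): minimum variance among all the linear unbiased estimators
$$\mathrm{Var}(\hat\beta_k) = \frac{\sigma^2}{SST_k (1 - R_k^2)}$$
where $SST_k$ is the total sample variation in $X_k$ and $R_k^2$ is the R-squared from regressing $X_k$ on the other independent variables.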
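MLR.6
The population error is independent of the explanatory variables and is normally distributed with zero mean and constant variance:
$$u \sim N(0, \sigma^2)$$

Under MLR.1-MLR.6
$$\hat\beta_j \sim N(\beta_j, \mathrm{Var}(\hat\beta_j))$$
Proof: $\hat\beta_j = \beta_j + \sum_{i=1}^{n} w_{ij} u_i$

The w's depend only on the x's, so they are non-random conditional on X. Thus $\hat\beta_j$ is a linear combination of the normally distributed errors and is itself normally distributed.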
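For the single-regressor case the weights can be written out explicitly. As a sketch, assuming the simple model $Y_i = \alpha + \beta X_i + u_i$ (the intercept symbol is an assumption) and conditioning on the X's:

$$\hat\beta = \beta + \sum_{i=1}^{n} w_i u_i, \qquad w_i = \frac{X_i - \bar X}{\sum_{l=1}^{n} (X_l - \bar X)^2}$$

Under MLR.5 this gives $\mathrm{Var}(\hat\beta \mid X) = \sigma^2 \sum_{i=1}^{n} w_i^2 = \sigma^2 / \sum_{i=1}^{n} (X_i - \bar X)^2$, which is the single-regressor version of the variance formula.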
MLR.6 is required for inference
1) z-stat/t-stat: testing individual coefficients
2) t-stat: linear combination of parameters
3) F-stat/$\chi^2$: multiple linear restrictions
MLR.6
Under MLR.1-MLR.6
$$Y \mid X \sim N(\alpha + \beta_1 X_1 + \dots + \beta_k X_k, \sigma^2)$$

This may not always be true, e.g. wages? prices? Another clearly non-normal example is the number of visits to a doctor last month.
Normality of dependent variable?
Usually, even if Y is not normally distributed, some transformation of it is.
Try to find a transformation of Y which is normal and also makes intuitive sense.

In STATA: use the command ladder

ladder Y

For each transformation it performs 'sktest', a test for normality based on skewness and kurtosis that combines the two tests into an overall test statistic.
$H_0$: Variable is normally distributed
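A rough illustration of the idea behind searching over transformations, written in Python. This is not a reproduction of Stata's ladder or sktest: it only compares the moment-based sample skewness of a hypothetical right-skewed variable with that of its log. The data and function name are assumptions made for the example.

```python
import math

def sample_skewness(values):
    """Moment-based skewness m3 / m2**1.5 (zero for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical variable: exp() of a symmetric grid, so its log is symmetric.
z = [-2.0 + 0.5 * i for i in range(9)]
y = [math.exp(zi) for zi in z]
log_y = [math.log(yi) for yi in y]

skew_identity = sample_skewness(y)
skew_log = sample_skewness(log_y)
print(skew_identity, skew_log)
```

The untransformed variable is clearly right-skewed while its log is symmetric, so the log would be the natural candidate here.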
STATA Example
[Figure: Histograms by transformation of Price. Panels: cubic, square, identity, sqrt, log, 1/sqrt, inverse, 1/square, 1/cubic. Vertical axis: Density.]
OLS ASYMPTOTICS
Asymptotics
Assumptions for the OLS estimator of the above to be BLUE:

1) Linear in parameters
2) Random sample
3) No perfect multicollinearity
4) Zero conditional mean (unbiasedness): $E(u_i \mid X) = 0$, $i = 1, \dots, n$
5) Homoskedasticity (efficiency, inference): $\mathrm{Var}(u_i \mid X) = E(u_i^2 \mid X) = \sigma^2$

Unbiasedness and efficiency are finite/small sample properties of the OLS estimator, i.e. they hold for any sample size n.

Asymptotic properties: large sample properties of the OLS estimator, i.e. as $n \to \infty$
Important asymptotic property of any estimator
Consistency:
In general, let $\hat\theta$ be the estimator of $\theta$ based on a sample $Y_1, Y_2, \dots, Y_n$. Then $\hat\theta$ is a consistent estimator of $\theta$ if for every $\varepsilon > 0$
$$\mathrm{Prob}(|\hat\theta - \theta| > \varepsilon) \to 0 \quad \text{as } n \to \infty$$
Another notation: $\mathrm{plim}(\hat\theta) = \theta$
The Law of Large Numbers plays an important role in this property:
Let $Y_1, Y_2, \dots, Y_n$ be independent, identically distributed random variables with mean $\mu$, then
$$\mathrm{plim}(\bar Y) = \mu$$
Suppose the random variable is the number which appears on the roll of a die. The mean of these random variables, as the number of rolls becomes large, should be 3.5.
Consistency essentially implies that as the sample size increases, the sampling distribution of the estimator concentrates around the parameter: the probability that the estimator is close to the parameter increases.

Show using example in STATA.
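The die example can also be sketched outside Stata. The Python snippet below is an illustrative simulation, not the class's Stata example. The function name and the seed are assumptions.

```python
import random

def mean_of_die_rolls(n_rolls, seed=0):
    """Average face value over n_rolls simulated rolls of a fair die."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_rolls):
        total += rng.randint(1, 6)   # faces 1..6, each equally likely
    return total / n_rolls

# The sample mean should settle near the population mean of 3.5.
for n in (10, 1_000, 100_000):
    print(n, mean_of_die_rolls(n))
```

The average can be far from 3.5 with few rolls and moves closer to it as the number of rolls grows.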


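Consistency: Under MLR.1-MLR.4
Mathematical proof of consistency in simple regression.
$$Y_i = \alpha + \beta X_i + u_i$$
$$\hat\beta = \frac{\sum_{i=1}^{n} (X_i - \bar X) Y_i}{\sum_{i=1}^{n} (X_i - \bar X)^2}$$
$$= \beta \, \frac{\sum_{i=1}^{n} (X_i - \bar X) X_i}{\sum_{i=1}^{n} (X_i - \bar X)^2} + \frac{\sum_{i=1}^{n} (X_i - \bar X) u_i}{\sum_{i=1}^{n} (X_i - \bar X)^2}$$
$$= \beta + \frac{n^{-1} \sum_{i=1}^{n} (X_i - \bar X) u_i}{n^{-1} \sum_{i=1}^{n} (X_i - \bar X)^2}$$
(The $\alpha$ term drops out because $\sum_{i=1}^{n} (X_i - \bar X) = 0$, and $\sum_{i=1}^{n} (X_i - \bar X) X_i = \sum_{i=1}^{n} (X_i - \bar X)^2$.)
Apply the Law of Large Numbers. The numerator converges to the population covariance and the denominator to the population variance.
$$\mathrm{plim}\, \hat\beta = \beta + \frac{\mathrm{Cov}(X_i, u_i)}{\mathrm{Var}(X_i)} = \beta \quad \text{[Assumption MLR.4]}$$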

Consistency: Under MLR.1-MLR.4

Therefore the assumption required for consistency is actually weaker,
MLR.4': $\mathrm{Cov}(X_i, u_i) = 0$
But what if there is correlation between $X_i^2$ and $u_i$? MLR.4 is needed to rule this out.
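Magnitude of the Asymptotic Bias
$$\mathrm{plim}\, \hat\beta - \beta = \frac{\mathrm{Cov}(X_i, u_i)}{\mathrm{Var}(X_i)}$$
Sign?

Suppose the true population model is: $Y_i = \alpha + \beta X_{i1} + \gamma X_{i2} + v$
Estimated: $Y_i = \alpha + \beta X_{i1} + u$
$$\mathrm{plim}\, \hat\beta - \beta = \gamma \, \frac{\mathrm{Cov}(X_{i1}, X_{i2})}{\mathrm{Var}(X_{i1})}$$

Can we sign the above?

Can we solve the problem of inconsistency by getting more data?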
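One way to explore the last question is a small simulation. The Python sketch below assumes a hypothetical data-generating process (intercept 1, coefficient 2 on X1, coefficient 3 on the omitted X2, Cov(X1, X2) = 0.5, Var(X1) = 1). None of these values come from the course material.

```python
import random

def short_regression_slope(n, seed=0):
    """Slope from regressing Y on X1 only, with X2 wrongly omitted."""
    rng = random.Random(seed)
    x1, y = [], []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)               # X1, Var(X1) = 1
        b = 0.5 * a + rng.gauss(0.0, 1.0)     # X2, Cov(X1, X2) = 0.5
        v = rng.gauss(0.0, 1.0)               # error of the true model
        x1.append(a)
        y.append(1.0 + 2.0 * a + 3.0 * b + v)
    x_bar = sum(x1) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x1, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x1)
    return sxy / sxx

# Probability limit implied by the bias formula: 2 + 3 * 0.5 / 1.
plim_short = 2.0 + 3.0 * 0.5 / 1.0

for n in (1_000, 100_000):
    print(n, short_regression_slope(n), plim_short)
```

In this setup the estimate settles near `plim_short` rather than near the true coefficient of 2, so a larger sample makes the estimate more precise around the wrong value.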
When multiple regressors:
Suppose the true population model is: $Y_i = \alpha + \beta_1 X_{i1} + \beta_2 X_{i2} + \gamma X_{i3} + v$
$X_3$ is omitted, $X_1$ and $X_3$ are correlated, $X_2$ and $X_3$ are uncorrelated. Then, as long as $X_1$ and $X_2$ are correlated, both $\hat\beta_1$ and $\hat\beta_2$ will be inconsistent.

Only if the omitted $X_3$, and hence the error, is uncorrelated with every included independent variable will their coefficients be consistent.
Asymptotic normality
MLR.6: required for inference on the estimated parameters
What if the Y's are not normal?
Can anything be said about the distribution of the estimated parameters when Y is not conditionally normally distributed?

Here the Central Limit Theorem can be invoked to show that the estimated parameters are asymptotically normal.
CLT: In general, let $\bar Y$ be the estimator of the population mean ($\mu$) based on a sample $Y_1, Y_2, \dots, Y_n$ with variance $\sigma^2$; then
$$\sqrt{n}\,(\bar Y - \mu) \xrightarrow{d} N(0, \sigma^2) \quad \text{as } n \to \infty$$
We have already seen in STATA, using simulation, that as n increases the sampling distribution gets tighter around the true population parameter. But notice that the distribution also changes with sample size. We do not know this distribution for each n.

What the CLT says is that the distribution of the recentered and rescaled estimator gets close to a normal distribution.
Show using example in STATA.
[Figure: kernel density estimates (kdensity) of mean1000n and mean100000n. Horizontal axis: x.]
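The recentering and rescaling can be illustrated with a short simulation. The Python sketch below is an assumption-laden stand-in for the Stata example: it draws from an Exponential(1) distribution (mean 1, variance 1, clearly non-normal), and the sample size, number of replications and seed are arbitrary choices.

```python
import math
import random

def standardized_means(n, reps, seed=0):
    """Return sqrt(n) * (sample mean - 1) for `reps` Exponential(1) samples."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        y_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        out.append(math.sqrt(n) * (y_bar - 1.0))
    return out

z = standardized_means(n=200, reps=2000)

# If the CLT approximation is good, z should look like N(0, 1).
z_mean = sum(z) / len(z)
z_var = sum((zi - z_mean) ** 2 for zi in z) / (len(z) - 1)
share_within = sum(abs(zi) < 1.96 for zi in z) / len(z)
print(z_mean, z_var, share_within)
```

A mean near 0, a variance near 1 and a share near 0.95 inside plus or minus 1.96 are what a standard normal would give, even though the underlying draws are skewed.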


What about the OLS estimator?
It is also like a mean, so we can invoke the CLT:
$$\hat\beta = \beta + \frac{n^{-1} \sum_{i=1}^{n} (X_i - \bar X) u_i}{n^{-1} \sum_{i=1}^{n} (X_i - \bar X)^2}$$
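Under the Gauss-Markov assumptions MLR.1-MLR.5
(i) $\sqrt{n}\,(\hat\beta_j - \beta_j) \overset{a}{\sim} \mathrm{Normal}(0, \sigma^2 / a_j^2)$, where $\sigma^2 / a_j^2 > 0$ is the asymptotic variance of $\sqrt{n}\,(\hat\beta_j - \beta_j)$. For the slope coefficients, $a_j^2 = \mathrm{plim}\left( n^{-1} \sum_{i=1}^{n} \hat r_{ij}^2 \right)$, where the $\hat r_{ij}$ are the residuals from regressing $X_j$ on the other independent variables.

(ii) $\hat\sigma^2$ is a consistent estimator of $\sigma^2 = \mathrm{Var}(u)$

(iii) For each j: $(\hat\beta_j - \beta_j) / \mathrm{se}(\hat\beta_j) \overset{a}{\sim} \mathrm{Normal}(0, 1)$, where $\mathrm{se}(\hat\beta_j)$ is the OLS standard error.

We still need homoskedasticity. The CLT doesn't get us away from that. Only normally distributed errors are no longer a strict requirement.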
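Intuitively: large sample sizes are better
Estimated variance of $\hat\beta_j$:
$$\widehat{\mathrm{Var}}(\hat\beta_j) = \frac{\hat\sigma^2}{SST_j (1 - R_j^2)}$$

As n increases,
$$\hat\sigma^2 \xrightarrow{p} \sigma^2$$
$$(1 - R_j^2) \to \text{a constant in } (0, 1)$$
$$SST_j \approx n \sigma_j^2$$
where $\sigma_j^2$ is the population variance of $X_j$.

Therefore, $\widehat{\mathrm{Var}}(\hat\beta_j)$ goes to zero at the rate of 1/n as n increases.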
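The 1/n rate can be checked by Monte Carlo. The Python sketch below assumes a hypothetical one-regressor model (Y = 1 + 2X + u with standard normal X and u) and compares the sampling variance of the slope at two sample sizes. The sample sizes, replication count and seed are arbitrary.

```python
import random

def slope_estimate(n, rng):
    """OLS slope from one simulated sample of size n (hypothetical model)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def sampling_variance(n, reps=1000, seed=0):
    """Variance of the slope estimate across `reps` simulated samples."""
    rng = random.Random(seed)
    draws = [slope_estimate(n, rng) for _ in range(reps)]
    mean = sum(draws) / reps
    return sum((d - mean) ** 2 for d in draws) / (reps - 1)

var_small = sampling_variance(50)
var_large = sampling_variance(200)

# Quadrupling n should cut the variance by a factor in the region of 4.
ratio = var_small / var_large
print(var_small, var_large, ratio)
```

The ratio is only approximate in a finite simulation, but it should be far from 1 and in the neighbourhood of the factor by which n was increased.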