
Notes on Econometrics Applications

Jorge Rojas, Freedom Fighter
Seattle, WA, USA
January 25, 2012
Abstract: This is a summary containing the main ideas of the subject. It is not a summary of the lecture notes; it is a summary of ideas and basic concepts. The mathematical machinery is necessary, but the principles are much more important.

Linear Algebra

Properties of the Transpose
1. (A^T)^T = A
2. (A + B)^T = A^T + B^T
3. (AB)^T = B^T A^T
4. (cA)^T = c A^T, for all c ∈ R
5. det(A^T) = det(A)
6. a · b = a^T b = <a, b> (inner product)
7. This is important: if A has only real entries, then A^T A is a positive semidefinite matrix.
8. (A^T)^{-1} = (A^{-1})^T
9. If A is a square matrix, then its eigenvalues are equal to the eigenvalues of its transpose. Notice that if A ∈ M_{n×m}, then AA^T is always symmetric.

Properties of the Inverse
1. (A^{-1})^{-1} = A
2. (kA)^{-1} = (1/k) A^{-1}, for all k ∈ R \ {0}
3. (A^T)^{-1} = (A^{-1})^T
4. (AB)^{-1} = B^{-1} A^{-1}
5. det(A^{-1}) = [det(A)]^{-1}

Properties of the Trace
1. Definition: tr(A) = Σ_{i=1}^n a_{ii}
2. tr(A + B) = tr(A) + tr(B)
3. tr(cA) = c tr(A), for all c ∈ R
4. tr(AB) = tr(BA)
5. Similarity invariant: tr(P^{-1} A P) = tr(A)
6. Invariant under cyclic permutations: tr(ABCD) = tr(BCDA) = tr(CDAB) = tr(DABC)
7. tr(X ⊗ Y) = tr(X) tr(Y), where ⊗ is the tensor product, also known as the Kronecker product.
8. tr(XY) = Σ_{i,j} X_{ij} Y_{ji}

The Kronecker product is defined for matrices A ∈ M_{m×n} and B ∈ M_{p×q} as follows:

    A ⊗ B = [ a_{11}B  ...  a_{1n}B ]
            [   ...    ...    ...   ]   ∈ M_{mp×nq}
            [ a_{m1}B  ...  a_{mn}B ]

Properties of the Kronecker Product
1. (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1}
2. If A ∈ M_{m×m} and B ∈ M_{n×n}, then:
   |A ⊗ B| = |A|^n |B|^m
   (A ⊗ B)^T = A^T ⊗ B^T
   tr(A ⊗ B) = tr(A) tr(B)
3. (A ⊗ B)(C ⊗ D) = AC ⊗ BD
Careful! The Kronecker product does not distribute with respect to the usual matrix multiplication. (A numerical check of several of these identities appears below.)
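All of the identities above are easy to spot-check numerically. The following NumPy sketch (added for illustration, not part of the original notes; the matrix sizes and seed are arbitrary) verifies a few of the transpose and Kronecker product properties on random matrices:

import numpy as np

rng = np.random.default_rng(0)
A, B = rng.normal(size=(3, 3)), rng.normal(size=(2, 2))
C, D = rng.normal(size=(3, 3)), rng.normal(size=(2, 2))

# (AB)^T = B^T A^T, and A^T A is positive semidefinite for real A
assert np.allclose((A @ C).T, C.T @ A.T)
assert np.all(np.linalg.eigvalsh(A.T @ A) >= -1e-10)

# Kronecker product identities
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))
assert np.allclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))
assert np.allclose(np.linalg.inv(np.kron(A, B)),
                   np.kron(np.linalg.inv(A), np.linalg.inv(B)))
# mixed-product property: (A kron B)(C kron D) = AC kron BD
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
# determinant: |A kron B| = |A|^n |B|^m with A (m x m) = 3x3, B (n x n) = 2x2
assert np.allclose(np.linalg.det(np.kron(A, B)),
                   np.linalg.det(A) ** 2 * np.linalg.det(B) ** 3)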

Properties of Determinants (only defined for A ∈ M_{n×n})

1. det(aA) = a^n det(A), for all a ∈ R
2. det(−A) = (−1)^n det(A)
3. det(AB) = det(A) det(B)
4. det(I_n) = 1
5. det(A) = 1 / det(A^{-1})
6. det(BAB^{-1}) = det(A) (similarity transformation)
7. det(A) = det(A^T)
8. det(Ā) = conj(det(A)), where the bar represents the complex conjugate.

Differentiation of Linear Transformations (vectors and matrices)
1. ∂(a^T x)/∂x = ∂(x^T a)/∂x = a
2. ∂(Ax)/∂x = ∂(x^T A^T)/∂x = A^T
3. ∂(x^T A)/∂x = A
4. ∂(x^T A x)/∂x = (A + A^T) x
5. ∂(a^T x x^T b)/∂x = a b^T x + b a^T x
(A finite-difference check of these gradients appears below.)
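These gradient identities can be confirmed against a central-difference approximation. A minimal sketch (illustrative only; num_grad is an ad hoc helper defined here, not a library routine):

import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.normal(size=(n, n))
a, b, x = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)

def num_grad(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# d(a'x)/dx = a
assert np.allclose(num_grad(lambda v: a @ v, x), a, atol=1e-5)
# d(x'Ax)/dx = (A + A')x
assert np.allclose(num_grad(lambda v: v @ A @ v, x), (A + A.T) @ x, atol=1e-5)
# d(a'x x'b)/dx = a b'x + b a'x
assert np.allclose(num_grad(lambda v: (a @ v) * (v @ b), x),
                   a * (b @ x) + b * (a @ x), atol=1e-5)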

Differentiation of Traces
1. ∂tr(AX)/∂X = ∂tr(XA)/∂X = A^T
2. ∂tr(AXB)/∂X = ∂tr(XBA)/∂X = (BA)^T
3. ∂tr(AXBX^T C)/∂X = ∂tr(XBX^T CA)/∂X = (BX^T CA)^T + CAXB
4. ∂|X|/∂X = cofactor(X) = det(X) (X^{-1})^T
(A finite-difference check appears below this list.)
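The same finite-difference idea verifies the trace and determinant derivatives, now differentiating with respect to every entry of the matrix X. Again a sketch with an ad hoc helper, not part of the original notes:

import numpy as np

rng = np.random.default_rng(2)
n = 3
A, B, C = (rng.normal(size=(n, n)) for _ in range(3))
X = rng.normal(size=(n, n)) + 3 * np.eye(n)  # keep X well away from singular

def num_grad_mat(f, X, h=1e-6):
    """Central-difference gradient of a scalar function of a matrix."""
    G = np.zeros_like(X)
    for i in range(n):
        for j in range(n):
            E = np.zeros_like(X); E[i, j] = h
            G[i, j] = (f(X + E) - f(X - E)) / (2 * h)
    return G

# d tr(AX)/dX = A'
assert np.allclose(num_grad_mat(lambda M: np.trace(A @ M), X), A.T, atol=1e-4)
# d tr(AXBX'C)/dX = (BX'CA)' + CAXB
lhs = num_grad_mat(lambda M: np.trace(A @ M @ B @ M.T @ C), X)
assert np.allclose(lhs, (B @ X.T @ C @ A).T + C @ A @ X @ B, atol=1e-4)
# d|X|/dX = det(X) (X^{-1})'
assert np.allclose(num_grad_mat(np.linalg.det, X),
                   np.linalg.det(X) * np.linalg.inv(X).T, atol=1e-4)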

Probability Distributions

Here begins the summary for Econometrics ECON581.

Definition 2.1 Normal distribution:

    f(x) = (1 / (σ√(2π))) e^{−(x−μ)² / (2σ²)},   x ∈ R

where μ is the mean and σ² is the variance. If the mean is zero and the variance is one, then we have the standard normal distribution N(0, 1). The normal distribution has no closed-form expression for its cumulative distribution function (CDF).

Definition 2.2 Chi-square distribution: We say that χ²(r) has r degrees of freedom.

    Z_i ~ iid N(0, 1), i = 1, ..., r   ⟹   A = Σ_{i=1}^r Z_i² ~ χ²(r)

E(A) = r and V(A) = 2r. Thus, the χ²(r) is just a sum of squared standard normal random variables. We use this distribution to test the value of the variance of a population. For instance, H0: σ² = 5 against H1: σ² > 5.

Definition 2.3 Student's t distribution: We say that t(r) has r degrees of freedom. The t distribution has fatter tails than the standard normal distribution.

    Z ~ N(0, 1), A ~ χ²(r), Z and A independent   ⟹   T = Z / √(A/r) ~ t(r)

E(T) = 0 and V(T) = r/(r−2). The t distribution is the appropriate ratio of a standard normal and a χ²(r) random variable.

Definition 2.4 F distribution: We say that F(r1, r2) has r1 degrees of freedom in the numerator and r2 degrees of freedom in the denominator.

    A1 ~ χ²(r1), A2 ~ χ²(r2), A1 and A2 independent   ⟹   F = (A1/r1) / (A2/r2) ~ F(r1, r2)

We use the F distribution to test whether two variances are the same or not, for instance after a structural break: H0: σ0² = σ1² against H1: σ0² > σ1².
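A short simulation makes these moments concrete. The sketch below (illustrative only, not part of the original notes; seed and sample sizes are arbitrary) draws from the constructions in Definitions 2.2 to 2.4 and compares sample moments with the stated values:

import numpy as np

rng = np.random.default_rng(3)
r, n_sims = 5, 200_000

# A = sum of r squared standard normals should behave like chi-square(r)
A = (rng.standard_normal((n_sims, r)) ** 2).sum(axis=1)
print(A.mean(), A.var())          # close to E(A) = r = 5 and V(A) = 2r = 10

# T = Z / sqrt(A/r) should behave like t(r): mean 0, variance r/(r-2)
Z = rng.standard_normal(n_sims)
T = Z / np.sqrt(A / r)
print(T.mean(), T.var())          # close to 0 and 5/3

# F = (A1/r1)/(A2/r2) with independent chi-squares
A1, A2 = rng.chisquare(3, n_sims), rng.chisquare(20, n_sims)
F = (A1 / 3) / (A2 / 20)
print(F.mean())                   # close to r2/(r2-2) = 20/18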

Probability Definitions

Definition 3.1 The expected value of a continuous random variable is given by:

    E[X] = ∫_Ω x f(x) dx    (1)

The notation Ω just means that Ω is the domain of the relevant random variable.

Definition 3.2 The variance of a continuous random variable is given by:

    V[X] = Var[X] = E[(x − μ)²] = ∫_Ω (x − μ)² f(x) dx    (2)

Definition 3.3 The covariance of two continuous random variables is given by:

    C[X, Y] = Cov[X, Y] = E[XY] − E[X]E[Y]    (3)

Notice that the covariance of a random variable X with itself is its variance. In addition, if two random variables are independent, then their covariance is zero. The converse is not necessarily true.

Some useful properties:
1. E(a + bX + cY) = a + b E(X) + c E(Y)
2. V(a + bX + cY) = b² V(X) + c² V(Y) + 2bc Cov(X, Y)
3. Cov(a1 + b1 X + c1 Y, a2 + b2 X + c2 Y) = b1 b2 V(X) + c1 c2 V(Y) + (b1 c2 + c1 b2) Cov(X, Y)
4. If Z = h(X, Y), then E(Z) = E_X[E_{Y|X}(Z|X)] (law of iterated expectations)
A numerical check of property 2 appears after this list.
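A minimal simulation check of property 2 (illustrative; the coefficients and the dependence between X and Y are arbitrary choices):

import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000
X = rng.normal(1.0, 2.0, n)
Y = 0.5 * X + rng.normal(0.0, 1.0, n)   # correlated with X by construction

a, b, c = 2.0, 3.0, -1.0
lhs = np.var(a + b * X + c * Y)
rhs = b**2 * np.var(X) + c**2 * np.var(Y) + 2 * b * c * np.cov(X, Y)[0, 1]
print(lhs, rhs)   # the two agree up to simulation noise (both near 26)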

Econometrics

A random variable is a real-valued function defined over a sample space. The sample space (Ω) is the set of all possible outcomes. Before collecting the data (ex ante), all our estimators are random variables. Once we have realized the data (ex post), we get a specific number for each estimator. These numbers are what we call estimates.

Remark 4.1 A simple econometric model: y_i = μ + e_i, i = 1, ..., n. This is not a regression model, but it is an econometric one.

In order to estimate μ we make the following assumptions:
1. E(e_i) = 0 for all i
2. Var(e_i) = E(e_i²) = σ² for all i
3. Cov(e_i, e_j) = E(e_i e_j) = 0 for all i ≠ j

In the near future, we will further assume that the residual term follows a normal distribution with mean zero and variance σ². This is not necessary for the estimation process, but we need it to run hypothesis tests. What we are looking for is a line that fits the data, minimising the distance between the fitted line and the data. In other words, Ordinary Least Squares (OLS).
    Min_μ Σ_{i=1}^n (y_i − μ)²   ⟺   Min SSR = Σ_i ê_i²

The estimator is then given by:

    μ̂ = (1/n) Σ_{i=1}^n y_i = ȳ

Definition 4.2 We say that an estimator is unbiased if E(μ̂) = μ. In other words, if after sampling infinitely many times we are able to recover the true population value. For this particular estimator (μ̂) it is easy to see that it is indeed unbiased, and its variance is Var(μ̂) = σ²/n, given the assumption that the draws are iid.

Note 4.3 A linear combination of normal random variables is itself normally distributed.

Proposition 4.4 If ȳ ~ N(μ, σ²/n), then Z = (ȳ − μ) / (σ/√n) ~ N(0, 1).

Standard normal values: P(Z ≥ 1.96) = 0.025 and P(Z ≥ 1.64) = 0.05.

Note:
1. Σ e_i² / σ² ~ χ²(n)
2. Σ ê_i² / σ² = (n−1) σ̂² / σ² ~ χ²(n−1), where σ̂² = Σ ê_i² / (n−1). We lose one degree of freedom here because we need to use one datum to estimate μ̂.
3. When we do not know σ², our standardised variable is Z = (ȳ − μ) / (σ̂/√n) ~ t(n−1).

A Monte Carlo sketch of the unbiasedness and variance claims appears after this list.
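The following small Monte Carlo experiment illustrates Definition 4.2 and the variance claim for μ̂ (a sketch only; the parameter values are arbitrary):

import numpy as np

rng = np.random.default_rng(5)
mu, sigma, n, reps = 3.0, 2.0, 50, 20_000

# Each row is one sample of size n from the model y_i = mu + e_i
y = mu + rng.normal(0.0, sigma, size=(reps, n))
mu_hat = y.mean(axis=1)           # the estimator mu_hat = y-bar for each sample

print(mu_hat.mean())              # close to mu = 3.0  (unbiasedness)
print(mu_hat.var(), sigma**2 / n) # both close to 0.08 (Var = sigma^2/n)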

Hypothesis Testing

                        H0 is true      H1 is true
    Reject H0           Type I error    OK
    Fail to reject H0   OK              Type II error

Thus, we define the following probabilities: P(Type I error) = P(Reject H0 | H0 is true) = α and P(Type II error) = P(Fail to reject H0 | H0 is false) = β, where 1 − β is the so-called power of the test.

Remark 4.5 Multiple Linear Regression (Population): y_i = x_i'β + e_i, i = 1, ..., n (vector notation).

ASSUMPTIONS:

    E(e_i) = 0 for all i
    E(e_i²) = σ² for all i
    E(e_i e_j) = 0 for all i ≠ j
    e_i ~ N(0, σ²)    (4)

X variables are non-stochastic, and there is NO exact linear relationship among the X variables. If e_i is not normal, we may apply the Central Limit Theorem (CLT). However, for this we need a large sample size. How large is large enough? (n − K) ≥ 30 is one rule of thumb, but it will depend on the problem. The OLS estimator results from minimising the SSE (sum of squared errors):

    β̂ = (Σ_{i=1}^n x_i x_i')^{-1} Σ_{i=1}^n x_i y_i    (5)

The above estimator is useful if we are in Asymptopia. In matrix notation we have:

    y = Xβ + e,    e ~ iid N(0, σ² I_n),    X is non-stochastic    (6)

The OLS estimator from the sample is:

    β̂ = (X'X)^{-1} X'Y = β + (X'X)^{-1} X'e    (7)

This mathematical form is useful for analysis in the finite-sample world. The OLS estimator is unbiased and its variance-covariance matrix is given by:

    Cov(β̂) = E[(β̂ − E(β̂))(β̂ − E(β̂))']    (8)
            = E[(X'X)^{-1} X' e e' X (X'X)^{-1}]
            = σ² (X'X)^{-1}

Thus, β̂ ~ N(β, σ² (X'X)^{-1})
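Equation (7) translates directly into code. A minimal sketch with simulated data (illustrative, not part of the original notes; np.linalg.solve is used rather than forming the inverse explicitly, which is the standard numerical practice):

import numpy as np

rng = np.random.default_rng(6)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(0.0, 1.0, n)

# beta_hat = (X'X)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)                            # close to (1, 2, -0.5)

# matches the library least-squares routine
assert np.allclose(beta_hat, np.linalg.lstsq(X, y, rcond=None)[0])

# estimated covariance: sigma_hat^2 (X'X)^{-1}
e_hat = y - X @ beta_hat
sigma2_hat = e_hat @ e_hat / (n - K)
cov_hat = sigma2_hat * np.linalg.inv(X.T @ X)
print(np.sqrt(np.diag(cov_hat)))           # standard errors of beta_hat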

Definition 4.6 The matrix M_X = I_n − X(X'X)^{-1}X' is symmetric and idempotent, i.e., M_X^T = M_X and M_X M_X = M_X. In general, we can define M_i = I_n − X_i(X_i'X_i)^{-1}X_i'. Thus, M_i X_j is interpreted as the residuals from regressing X_j on X_i.

Note 4.7 The following properties are important for proofs:
1. If A is a square (diagonalizable) matrix, then A = CΛC^{-1}, where Λ is a diagonal matrix with the eigenvalues of A, and C is the matrix of the eigenvectors in column form.
2. If A is symmetric, then C'C = CC' = I_n and hence A = CΛC'.
3. If A is symmetric and idempotent, then Λ is a diagonal matrix whose entries are either 1 or 0.
4. If A = CΛC', then rank(A) = r, where r = Σ_{i=1}^n λ_i.

Using this definition we get that ê'ê = e' M_X e and hence E(ê'ê) = σ²(n − K).

Theorem 4.8 Gauss-Markov Theorem: In a linear regression model in which the errors have expectation zero, are uncorrelated, and have equal variances, the best linear unbiased estimator (BLUE) of the coefficients is given by the OLS estimator. "Best" means giving the lowest possible mean squared error of the estimate. Notice that the errors need not be normal, nor independent and identically distributed (only uncorrelated and homoscedastic). The proof of this theorem is based on supposing an estimator β̃ = CY that is better than β̂ and finding the related contradiction.
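The properties of M_X in Definition 4.6 and Note 4.7 can also be checked numerically. A sketch with simulated data (illustrative only):

import numpy as np

rng = np.random.default_rng(7)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

assert np.allclose(M, M.T)               # symmetric
assert np.allclose(M @ M, M)             # idempotent
assert np.allclose(np.trace(M), n - K)   # eigenvalues are 0 or 1; rank = n - K

# M_X y gives exactly the OLS residuals
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(M @ y, y - X @ beta_hat)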

¡Viva la Revolución Libertaria! (Long live the Libertarian Revolution!)



One-page Summary

    Y = Xβ + e
    β̂ = (X'X)^{-1} X'Y
    E(β̂) = β
    Cov(β̂) = σ² (X'X)^{-1}
    β̂ ~ N(β, σ² (X'X)^{-1})
    σ̂² = (1/(n−K)) ê'ê = (1/(n−K)) Σ_{i=1}^n ê_i²
    E(σ̂²) = σ²
    Var(σ̂²) = 2σ⁴ / (n − K)
    e'e / σ² ~ χ²(n)
    ê'ê / σ² = (n−K) σ̂² / σ² ~ χ²(n−K)
    e/σ ~ N(0, I_n),    e ~ N(0, σ² I_n)
    M_X = I_n − X(X'X)^{-1}X',    ê = M_X Y

Theorem 5.1 Gauss-Markov Theorem: In a linear regression model in which the errors have expectation zero, are uncorrelated, and have equal variances, the best linear unbiased estimator (BLUE) of the coefficients is given by the OLS estimator. "Best" means giving the lowest possible mean squared error of the estimate. Notice that the errors need not be normal, nor independent and identically distributed (only uncorrelated and homoscedastic).
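As an illustrative check of the distributional result ê'ê/σ² ~ χ²(n−K) in the summary (a sketch only; parameter values are arbitrary, and X is held fixed across replications to match the non-stochastic assumption):

import numpy as np

rng = np.random.default_rng(8)
n, K, sigma2, reps = 30, 3, 4.0, 20_000
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # fixed X
beta = np.array([1.0, 2.0, -0.5])

stats = np.empty(reps)
for r in range(reps):
    y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), n)
    e_hat = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
    stats[r] = e_hat @ e_hat / sigma2      # (n-K) * sigma2_hat / sigma2

# chi-square(n-K) has mean n-K and variance 2(n-K)
print(stats.mean(), stats.var())           # close to 27 and 54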

