and Statistics

Moshe Buchinsky

UCLA

Course Logistics

I Lectures: 12:30-1:45PM, at Broad Art Center, 2160E.

I Lab lectures (Econ 103L): Fridays 11:00-11:50AM, Fowler

Museum, A103B.

I You must register for BOTH Econ 103 AND Econ 103L. You cannot take one without the other.

I Office hours: Fridays 1:00-2:00PM, Bunche Hall 8373.

The Teaching Assistants

I Changsu Ko - Head TA

I Yun Feng

I Dong Ook Eun

I Kun Hu

I Jeonghwan Kim

I Lu Liu

I Pavel Andreyanov

Textbooks and Other Material

I Principles of Econometrics. R. Carter Hill, William E. Griffiths & Guay C. Lim, 4th Edition, 2011. Wiley and Sons. (Required!)

I Using Stata for Principles of Econometrics. Lee C. Adkins &

R. Carter Hill, 4th Edition. Wiley and Sons. (Strongly

Recommended)

I Other:

I Basic Econometrics. Gujarati, D. N., 5th Revised Edition,

2010, McGraw-Hill.

I Introductory Econometrics. Wooldridge, J., 6th Edition, 2013,

Cengage Learning.

I Probability and Statistical Inference. Hogg, R. V. and E. A.

Tanis, 8th Edition, 2009, MacMillan.

I Lecture notes: Slides will be posted on the class website before

class. I will not bring copies!

I Course outline - see Syllabus

Course Requirements - Problem Sets

I Mix of multiple choice questions, analytic exercises, and

STATA work.

I Grading of problem sets:

I 5 points for completing homework.

I 5 points for correct answer to question chosen randomly by

TAs.

I Schedule: See tentative schedule in the syllabus.

I Due at end of class on due date. No exceptions!

I Must hand in hardcopy.

I Note! You are encouraged to work together, but everyone

must hand in their own hardcopy, and do the computer work

independently.

Course Requirements - Midterm (20% or 0%)

I In class on Thursday, November 9, 2017.

I Very much like the problem sets.

I Optional: I will compute the final grade both with and without the midterm and use the better grade.

I No makeup exam!

Questions, Problems, etc.

I Post questions to the class website discussion board.

I The TAs and I will monitor the discussion board daily to answer such questions.

I Emails to me or TAs on such questions will not be answered.

Regrading

I The entire exam or homework will be regraded.

Please Remember

Course Overview

I Econometrics...

I Uses statistical methods to analyze economic data.

I Aims to answer quantitative questions.

I Main tool: regression analysis

I We want to determine the causal effect of one variable (X) on another variable (Y).

I Econ 41: statistical analysis of one variable.

I Econ 103: analysis of the relationship between two (or more)

variables.

Examples of questions of interest:

I What is the effect of a new marketing campaign on sales?

I How does class size affect education outcomes (e.g. test

scores)?

I How much does an additional year of schooling increase wages?

I What is the relationship between credit scores and loan default

rates?

I How much does output grow if the Fed cuts interest rates by

1%?

I Do more policemen reduce crime?

What you will learn:

I Theory:

I How to interpret results in articles, or the results your

computer gives.

I To understand the caveats of regression analysis.

I Why it is essential to do good empirical work.

I How to do empirical analysis yourself:

I You will work with real datasets.

I You will use statistical software, namely STATA.

Review

(Econ 41)

1. Definitions

probabilities.

I Outcomes ($x_i$): mutually exclusive potential results of a random process.

I Probability of an outcome: proportion of times that the

outcome occurs in the long run.

I Types of Random Variables:

I Discrete Random Variable: takes on a finite or countable number of values.

I Example: coin toss; the number of times a computer crashes.

I Continuous Random Variable: takes on any value in a real interval; each specific value has zero probability.

I Example: height of an individual; time it takes to commute to

school.

1. Definitions

I The list of all possible outcomes of a random variable X and the corresponding probabilities is called the PDF, which stands for:

I the Probability Density Function for continuous random variables (the same abbreviation is used here in the discrete case).

I PDFs must satisfy:

I $f(x_i) \geq 0$; and

I $\sum_i f(x_i) = 1$ (if discrete) or $\int f(x)\,dx = 1$ (if continuous).

1. Definitions

I Height of adult men in inches.

1. Definitions

I The Cumulative Distribution Function (CDF) gives the probability that the random variable is less than or equal to a particular value: $F(x) = \Pr(X \leq x)$.

1. Definitions

I Expected Value or Expectation or Mean

$$\mu_X = E(X) = \begin{cases} \sum_i x_i f(x_i) & \text{if } X \text{ is discrete} \\ \int x f(x)\,dx & \text{if } X \text{ is continuous} \end{cases}$$

I Weighted average of the possible values of X, with the probabilities f(x) serving as weights.
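
The discrete case of the expectation formula can be sketched in a few lines of Python (standard library only, used here instead of Stata). The fair six-sided die is an illustrative assumption, not an example from the slides.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, so f(x_i) = 1/6 for each face.
pdf = {x: Fraction(1, 6) for x in range(1, 7)}

# A valid discrete PDF is non-negative and sums to one.
assert all(p >= 0 for p in pdf.values())
assert sum(pdf.values()) == 1

# E(X) = sum_i x_i * f(x_i): the probability-weighted average of the outcomes.
mean = sum(x * p for x, p in pdf.items())
print(mean)  # 7/2
```

Exact fractions are used so that the sum-to-one check is not affected by floating-point rounding.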

1. Definitions

I Moments (continued):

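I Variance

$$\sigma_X^2 = \operatorname{Var}(X) = E\left[(X - \mu_X)^2\right] = \begin{cases} \sum_i (x_i - \mu_X)^2 f(x_i) & \text{if } X \text{ is discrete} \\ \int (x - \mu_X)^2 f(x)\,dx & \text{if } X \text{ is continuous} \end{cases}$$

I Measures the spread of X around its mean.

I Standard deviation: $\sigma_X = \sqrt{\operatorname{Var}(X)}$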
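
As a rough illustration of the variance and standard deviation of a discrete random variable, the Python sketch below again assumes a fair six-sided die (a hypothetical example, not taken from the slides).

```python
from fractions import Fraction
from math import sqrt

# Hypothetical example: a fair six-sided die, f(x_i) = 1/6.
pdf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in pdf.items())
# Var(X) = sum_i (x_i - mu_X)^2 * f(x_i)
variance = sum((x - mean) ** 2 * p for x, p in pdf.items())
# The standard deviation is the square root of the variance.
std_dev = sqrt(variance)

print(variance)           # 35/12
print(round(std_dev, 4))  # 1.7078
```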

2. Definitions - Two Random Variables

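I The joint distribution of X and Y gives the probability that the random variables simultaneously take on certain values, x and y.

I It is the function

$$f_{X,Y}(x, y) = \Pr(X = x, Y = y).$$

I The marginal distribution of Y sums the joint probabilities over all values of X:

$$f_Y(y) = \sum_{x_i} \Pr(X = x_i, Y = y) = \sum_{x} f_{X,Y}(x, y)$$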

2. Definitions - Two Random Variables

I The conditional distribution of Y given X = x gives the probability that Y takes on the value y when X takes on the value x.

$$f_{Y|X}(y \mid x) = \Pr(Y = y \mid X = x) = \frac{\Pr(X = x, Y = y)}{\Pr(X = x)} = \frac{f_{X,Y}(x, y)}{f_X(x)}$$

2. Definitions - Two Random Variables

I Example: Men say they will vote for the Republican candidate

rather than the Democratic candidate in their districts by a margin

of 45 percent to 32 percent. The numbers are nearly reversed for

women, with 36 percent saying they will vote Republican and 43

percent saying they will vote Democratic. New York Times,

September 20, 2010

I Assume that there are 50% men and 50% women.

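$$X = \begin{cases} 1 & \text{male} \\ 2 & \text{female} \end{cases} \qquad Y = \begin{cases} 1 & \text{Democrat} \\ 2 & \text{Republican} \\ 3 & \text{other} \end{cases}$$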

2. Definitions - Two Random Variables

             x1 = 1   x2 = 2
y1 = 1        .16      .215
y2 = 2        .225     .18
y3 = 3        .115     .105

2. Definitions - Two Random Variables

               x1 = 1   x2 = 2   Sum over x_i
y1 = 1          .16      .215     .375
y2 = 2          .225     .18      .405
y3 = 3          .115     .105     .22
Sum over y_j    .5       .5       1
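
The row and column sums in the table are the marginal distributions. A minimal Python sketch of that computation follows; the dictionary layout and variable names are illustrative choices, while the probabilities are the ones in the table.

```python
# Joint probabilities f(x, y) from the table, keyed by (x, y):
# x: 1 = male, 2 = female; y: 1 = Democrat, 2 = Republican, 3 = other.
joint = {
    (1, 1): 0.160, (2, 1): 0.215,
    (1, 2): 0.225, (2, 2): 0.180,
    (1, 3): 0.115, (2, 3): 0.105,
}

marginal_x = {}  # f_X(x): sum the joint probabilities over y
marginal_y = {}  # f_Y(y): sum the joint probabilities over x
for (x, y), p in joint.items():
    marginal_x[x] = marginal_x.get(x, 0.0) + p
    marginal_y[y] = marginal_y.get(y, 0.0) + p

print({y: round(p, 3) for y, p in marginal_y.items()})  # {1: 0.375, 2: 0.405, 3: 0.22}
print({x: round(p, 3) for x, p in marginal_x.items()})  # {1: 0.5, 2: 0.5}
```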

2. Definitions - Two Random Variables

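I The conditional probability of being a Democrat, given being female:

$$\Pr(Y = 1 \mid X = 2) = \frac{\Pr(Y = 1, X = 2)}{\Pr(X = 2)} = \frac{.215}{.5} = .43$$

I The conditional probability of being a Republican, given being female:

$$\Pr(Y = 2 \mid X = 2) = \frac{\Pr(Y = 2, X = 2)}{\Pr(X = 2)} = \frac{.18}{.5} = .36$$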
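
The same division can be applied to every value of Y at once. The sketch below is one possible way to do this in Python; the helper name `conditional_y_given_x` is hypothetical, and the joint probabilities are those of the voting example.

```python
# Joint probabilities f(x, y), keyed by (x, y):
# x: 1 = male, 2 = female; y: 1 = Democrat, 2 = Republican, 3 = other.
joint = {
    (1, 1): 0.160, (2, 1): 0.215,
    (1, 2): 0.225, (2, 2): 0.180,
    (1, 3): 0.115, (2, 3): 0.105,
}

def conditional_y_given_x(joint, x):
    """Return f(y | x) = f(x, y) / f_X(x) for every y."""
    f_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / f_x for (xi, y), p in joint.items() if xi == x}

given_female = conditional_y_given_x(joint, x=2)
print({y: round(p, 2) for y, p in given_female.items()})  # {1: 0.43, 2: 0.36, 3: 0.21}
```

The conditional probabilities sum to one, as any distribution must.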

2. Definitions - Two Random Variables

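I The covariance measures the linear association between two random variables, and is defined as

$$\sigma_{XY} = \operatorname{Cov}(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right]$$

I The correlation is a unit-free measure of the linear association between two random variables, and is defined as

$$\rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}$$

I It can be shown that $-1 \leq \rho_{XY} \leq 1$.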

2. Definitions - Two Random Variables

I Independence: X and Y are independent if, for all x and y,

$$f_{X,Y}(x, y) = f_X(x) f_Y(y)$$

I If two random variables are independent, then the conditional distribution of each variable coincides with its marginal distribution, that is

$$f_{Y|X}(y \mid x) = f_Y(y), \quad \text{and} \quad f_{X|Y}(x \mid y) = f_X(x).$$
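
One way to make the definition concrete is to test the product rule cell by cell. The Python sketch below does this for the gender and vote example from the earlier slides; the tolerance in `isclose` is an arbitrary choice to absorb floating-point rounding.

```python
from math import isclose

# Joint probabilities f(x, y) from the voting example, keyed by (x, y).
joint = {
    (1, 1): 0.160, (2, 1): 0.215,
    (1, 2): 0.225, (2, 2): 0.180,
    (1, 3): 0.115, (2, 3): 0.105,
}

f_x, f_y = {}, {}
for (x, y), p in joint.items():
    f_x[x] = f_x.get(x, 0.0) + p
    f_y[y] = f_y.get(y, 0.0) + p

# Independence requires f(x, y) = f_X(x) * f_Y(y) in every cell.
independent = all(
    isclose(p, f_x[x] * f_y[y], abs_tol=1e-9) for (x, y), p in joint.items()
)
print(independent)  # False: e.g. f(1, 1) = 0.16 but f_X(1) * f_Y(1) = 0.1875
```

In this example gender and vote are therefore not independent.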

Properties of Expectations, Variance and Covariance

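I $E(a) = a$

I $E(aX + b) = aE(X) + b$

I $E(X + Y) = E(X) + E(Y)$

I $\operatorname{Var}(a) = 0$

I $\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$

I $\operatorname{Cov}(X, X) = \operatorname{Var}(X)$

I $\operatorname{Cov}(X, Y) = E(XY) - E(X)E(Y)$

I $\operatorname{Var}(aX + bY) = a^2 \operatorname{Var}(X) + b^2 \operatorname{Var}(Y) + 2ab \operatorname{Cov}(X, Y)$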
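
These rules can be checked numerically on any small joint distribution. The Python sketch below verifies the last one; the three-point distribution and the constants a and b are made up for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y); the probabilities sum to one.
joint = {(0, 0): Fraction(1, 2), (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4)}

def expect(g):
    """E[g(X, Y)] under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

a, b = 2, -3  # arbitrary constants
mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mean_x) ** 2)
var_y = expect(lambda x, y: (y - mean_y) ** 2)
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y

# Variance of aX + bY computed directly from its definition.
mean_lin = expect(lambda x, y: a * x + b * y)
var_lin = expect(lambda x, y: (a * x + b * y - mean_lin) ** 2)

# Both sides of Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
print(var_lin, a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy)  # 19/16 19/16
```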

Properties of Expectations, Variance and Covariance

I If X and Y are independent, then:

I $E(XY) = E(X)E(Y)$

I $\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$

I $\operatorname{Cov}(X, Y) = 0$, but the reverse is not true! Zero covariance does not imply independence.
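
A standard counterexample makes the last point concrete. The Python sketch below assumes X is uniform on {-1, 0, 1} and Y = X^2 (a textbook-style illustration that is not part of the slides): the covariance is exactly zero even though Y is a function of X.

```python
from fractions import Fraction

# Hypothetical example: X is uniform on {-1, 0, 1} and Y = X**2.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

mean_x = sum(x * p for (x, y), p in joint.items())
mean_y = sum(y * p for (x, y), p in joint.items())
cov_xy = sum(x * y * p for (x, y), p in joint.items()) - mean_x * mean_y
print(cov_xy)  # 0

# Independence would require f(x, y) = f_X(x) * f_Y(y) in every cell.
f_x0 = sum(p for (x, y), p in joint.items() if x == 0)  # Pr(X = 0) = 1/3
f_y1 = sum(p for (x, y), p in joint.items() if y == 1)  # Pr(Y = 1) = 2/3
f_x0_y1 = joint.get((0, 1), Fraction(0))                # Pr(X = 0, Y = 1) = 0
print(f_x0_y1 == f_x0 * f_y1)  # False, so X and Y are not independent
```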

Special Probability Distributions - Normal

I Standard normal distribution: N(0, 1)

I A variable that follows a standard normal distribution is often

denoted by Z .

I Its CDF is often denoted by $\Phi$, that is, $\Pr(Z \leq c) = \Phi(c)$ for any c.

I Note that if $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$.

I Note also that if X and Y are jointly normally distributed (for example, independent normals), so is X + Y (and any other linear combination).
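
Both notes can be illustrated with `statistics.NormalDist` from Python's standard library. The parameter values below are hypothetical, and the `+` operator on two `NormalDist` objects assumes that the two variables are independent.

```python
from statistics import NormalDist

x = NormalDist(mu=1.0, sigma=3.0)  # hypothetical X ~ N(1, 9); sigma is the standard deviation
y = NormalDist(mu=2.0, sigma=4.0)  # hypothetical Y ~ N(2, 16), independent of X

# Sum of independent normals: means add, variances add (9 + 16 = 25).
s = x + y
print(s.mean, s.stdev)  # 3.0 5.0

# Standardizing: Pr(S <= 8) equals Phi((8 - 3) / 5) = Phi(1).
z = (8.0 - s.mean) / s.stdev
print(round(s.cdf(8.0), 4), round(NormalDist().cdf(z), 4))  # 0.8413 0.8413
```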

Special Probability Distributions - Chi-Squared

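I Let $Z_1, \ldots, Z_m$ be independent standard normal random variables. Then

$$W = \sum_{i=1}^{m} Z_i^2 \sim \chi^2_m,$$

I the $\chi^2$ distribution with m degrees of freedom.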
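
The construction can be mimicked by simulation. The Python sketch below draws sums of squared standard normals and compares the sample mean with m, using the fact that E(W) = m because each squared term has expectation one. The seed, the choice m = 3, and the number of draws are arbitrary.

```python
import random

random.seed(0)    # fixed seed so the run is reproducible
m = 3             # degrees of freedom (arbitrary choice)
n_draws = 20_000  # number of simulated values of W

def chi2_draw(m):
    """One draw of W = Z_1^2 + ... + Z_m^2 with independent N(0, 1) terms."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(m))

draws = [chi2_draw(m) for _ in range(n_draws)]
sample_mean = sum(draws) / n_draws
print(round(sample_mean, 2))  # should be close to m
```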

Special Probability Distributions - Student-t

I Let $Z \sim N(0, 1)$ and $W \sim \chi^2_m$, and let Z and W be independent. Then

$$\frac{Z}{\sqrt{W/m}} \sim t_m$$

I Note: $t_\infty = N(0, 1)$

Special Probability Distributions

I Standard Normal vs. Student t distributions

Special Probability Distributions - F distribution

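I Let $W \sim \chi^2_m$ and $V \sim \chi^2_n$, and let W and V be independent.

I Then

$$\frac{W/m}{V/n} \sim F_{m,n}$$

I the F distribution with m and n degrees of freedom.

I Note: $m F_{m,\infty} = \chi^2_m$ and $F_{1,n} = t_n^2$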
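
The second identity in the note follows from the definitions of the t and F distributions. A short sketch, where T is a label introduced here for a $t_n$ variable built from $Z \sim N(0, 1)$ and an independent $V \sim \chi^2_n$:

```latex
T = \frac{Z}{\sqrt{V/n}} \sim t_n
\quad\Longrightarrow\quad
T^2 = \frac{Z^2}{V/n} = \frac{Z^2/1}{V/n} \sim F_{1,n},
\qquad \text{since } Z^2 \sim \chi^2_1 \text{ is independent of } V.
```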

Computing Probabilities

I Suppose the height of men is normally distributed with mean 70 inches (178 cm) and standard deviation 4 inches (10.2 cm).

I If you are 65 inches tall (165 cm), what percentage of men are shorter than you?

I Let X denote the height of men: $X \sim N(70, 16)$.

I Let us standardize X:

$$Z = \frac{X - 70}{4} \sim N(0, 1)$$

I We have to do the same manipulation for 65:

$$\frac{65 - 70}{4} = -1.25$$

Computing Probabilities

I $\Pr(X \leq 65) = \Pr(Z \leq -1.25)$, so we need the standard normal CDF!

$$\Phi(-1.25) = 10.56\%$$

I You can compute this value in Stata using the command di normal(-1.25).

I Alternatively, see Appendix Table 1 in the textbook.
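
For readers without Stata at hand, the same number can be cross-checked with Python's standard library. This is only a sketch: `statistics.NormalDist` takes the standard deviation (4), not the variance (16), as its second argument.

```python
from statistics import NormalDist

height = NormalDist(mu=70, sigma=4)  # X ~ N(70, 16): sigma is the standard deviation
z = (65 - 70) / 4                    # standardized value: -1.25

p_direct = height.cdf(65)            # Pr(X <= 65)
p_standard = NormalDist().cdf(z)     # Phi(-1.25)
print(f"{p_direct:.4f} {p_standard:.4f}")  # 0.1056 0.1056
```

Both routes give about 10.56%, so standardizing first or working with N(70, 16) directly is a matter of convenience.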

Computing Probabilities

I By symmetry, $\Phi(-1.25) = 1 - \Phi(1.25) = 1 - 0.8944 = 0.1056$!

