
Chapter 0: Introduction

How can we estimate the unknown parameters of the model?

Variables: categorical (gender, marital status) or numerical. Numerical variables are either discrete ("countable", e.g. years of education, number of children) or continuous ("measurable", e.g. wages, prices, returns).

Estimation methods:
- OLS (ordinary least squares) - the most popular
- Maximum likelihood (ML)
- Method of moments (MM)
- Generalized method of moments (GMM)

Data types
Cross-sectional data consist of measurements for observations (e.g. individuals, firms) at a given point in time. Variables such as gender or years of education are arranged in columns and observations in rows.
Time series data consist of measurements on one or more variables (stock market indices, interest rates, etc.) over time. Variables are aligned in columns with chronological measurements (frequency: daily, weekly, ...).
Panel or longitudinal data are a combination of time series and cross-sectional data: measurements for the same observations over a period of time.
Pooled cross-sectional data contain measurements for observations which are randomly selected in each time period.

Three main desirable properties of estimators:
* Unbiasedness
* Efficiency
* Consistency
Chapter 1: Linear regression model

Generic regression equation
Y = B0 + B1X
Y = dependent variable
X = independent variable
B0 = Y intercept (average value of Y if X = 0)
B1 = how much, on average, Y changes if X increases by one unit

Simple linear regression model
yi = β0 + β1 xi + εi

Multiple linear regression model
yi = β0 + β1 x1i + ... + βK xKi + εi
εi: random error term (expected value 0)

Linear regression function (conditional mean)
E(yi | x1i, ..., xKi) = β0 + β1 x1i + ... + βK xKi

Error term: εi = yi − E(yi | x1i, ..., xKi)

Chapter 2: Estimation and properties of estimators

Least squares coefficient estimators

Estimation for multiple linear regression (two regressors):
ŷi = b0 + b1 x1i + b2 x2i
The coefficient estimators are calculated as follows:
b1 = [(rx1y − rx1x2 · rx2y) / (1 − rx1x2²)] · (sy / sx1)
b2 = [(rx2y − rx1x2 · rx1y) / (1 − rx1x2²)] · (sy / sx2)
b0 = ȳ − b1 x̄1 − b2 x̄2
where
rx1y is the sample correlation between X1 and Y
rx2y is the sample correlation between X2 and Y
rx1x2 is the sample correlation between X1 and X2
sx1 is the sample standard deviation of X1
sx2 is the sample standard deviation of X2
sy is the sample standard deviation of Y

For the simple regression (one regressor), the least squares estimators minimize
SSE = Σ(yi − ŷi)² = Σ(yi − b0 − b1 xi)²
and are given by
b1 = Sxy / Sx²
b0 = ȳ − b1 x̄   (ȳ and x̄ are the sample means of Y and X)

In general, ŷi = b0 + b1 x1i + ... + bK xKi are the estimated average values of yi for given values of x1i, ..., xKi.
b0, b1, ..., bK are the estimators of the parameters B0, B1, ..., BK.
The ŷi are called fitted values for the observations yi.
Residual: the difference between yi and the fitted value ŷi is called the residual ei (ei = yi − ŷi). The sum of the residuals is always 0. The residuals ei are estimates of the unknown (unobserved) error terms εi.

To note: yi = b0 + b1 x1i + ... + bK xKi + ei can be written as yi = ŷi + ei, where ŷi = b0 + b1 x1i + ... + bK xKi is the predicted value of the dependent variable and the residual ei is the difference between the observed and the predicted value.

Sums of squares: SST = SSR + SSE
where
SST = Σ(yi − ȳ)²  (total sum of squares; ȳ is the mean, yi the observed value)
SSR = Σ(ŷi − ȳ)²  (regression sum of squares; ŷi is the predicted value)
SSE = Σ(yi − ŷi)²  (error sum of squares)
The fit of the regression equation to the data improves as SSR increases and SSE decreases.

Coefficient of determination: R² = SSR/SST = 1 − SSE/SST
Interpretation: R² = X% means that X% of the variability in the dependent variable can be explained by the variation in the independent variables.
> A high R² value indicates a better fit. It can result from a small SSE, a large SST, or both.
When there is one independent variable, R² = ρ² (ρ is the correlation between x and y).

Coefficient of multiple correlation: R = √R² (measures the strength of the relationship between the dependent and the independent variables).

Adjusted coefficient of determination (adjusted R²) = 1 − [SSE/(n − K − 1)] / [SST/(n − 1)]

Standard error of the estimate: se² = SSE/(n − K − 1), se = √se²

Coefficient variance estimators: for the simple regression, s_b1 = √[ se² / Σ(xi − x̄)² ]; the general formula for s_bk in a multiple regression is given in Chapter 7 (it involves the variance inflation factor).
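As a quick illustration of the estimator formulas above, here is a minimal numpy sketch (made-up data, not from the course materials) that computes b1 = Sxy/Sx², b0 = ȳ − b1·x̄, the fitted values, R² and the standard error of the estimate.

```python
# Minimal sketch: simple-regression OLS "by hand" on made-up illustrative data.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical regressor
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])   # hypothetical dependent variable

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # b1 = Sxy / Sx^2
b0 = y.mean() - b1 * x.mean()                          # b0 = ybar - b1 * xbar

y_hat = b0 + b1 * x                 # fitted values
e = y - y_hat                       # residuals (their sum is ~0)
SSE = np.sum(e**2)
SST = np.sum((y - y.mean())**2)
SSR = SST - SSE
R2 = SSR / SST                      # = 1 - SSE/SST
n, K = len(y), 1
se2 = SSE / (n - K - 1)             # squared standard error of the estimate

print(b0, b1, R2, np.sqrt(se2), e.sum())
```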

Chapter 3: Confidence intervals for parameters

Two-sided x% confidence interval for a coefficient βj:
bj ± t(n − K − 1, α/2) · s_bj

Tests of hypothesis for a regression coefficient
The t-value is calculated as:
t = (bj − βj*) / s_bj   (βj* is the value under H0, usually 0)
Decision rule: reject H0: βj = βj* if |t| > t(n − K − 1, α/2).

Hypothesis test for all parameters together
F-statistic: F = (SSR/K) / (SSE/(n − K − 1)) = MSR/MSE
Decision rule: reject H0: β1 = ... = βK = 0 if F exceeds the critical value F(K, n − K − 1) at significance level α.
Conclusion when rejected: the null hypothesis that all parameters are zero is rejected; all parameters together are significant.
To note: when the linear regression model contains one independent variable, t² = F (equivalence of the t-test and the F-test).

Hypothesis test for a subset of parameters
F = [(SSE_r − SSE_u)/q] / [SSE_u/(n − K − 1)] = [(Ru² − Rr²)/q] / [(1 − Ru²)/(n − K − 1)]
The test statistic is F-distributed with (q, n − K − 1) degrees of freedom, where q is the number of tested restrictions and the subscripts r and u denote the restricted and the unrestricted model.
> Decision rule: same as the previous one (reject H0 if F exceeds the critical value).
Chapter 4: Dummy variables and prediction

Dummy variables
The concept is to convert categorical variables into numerical variables so that they can be used in the linear regression. A dummy variable has only two values (0, 1), where
1 = characteristic observed
0 = characteristic not observed

Dummy variables for differences in slope: interacting the dummy with an independent variable (x1 · x2) allows the slope of x1 to differ between the two groups, in addition to the difference in the intercept captured by the dummy itself.

Prediction
ŷ0 = b0 + b1 x01 + ... + bK x0K is the point prediction for given values x01, ..., x0K of the independent variables.
A prediction interval is obtained by:
ŷ0 ± t(n − K − 1, α/2) · s_pred
where s_pred accounts for both the uncertainty in the estimated coefficients and the variance of the error term.
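A small statsmodels sketch (simulated data, hypothetical coefficients) showing how a dummy and a dummy·x interaction let both the intercept and the slope differ between the two groups.

```python
# Minimal sketch (made-up data): dummy plus dummy*x interaction.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
d = (rng.random(n) > 0.5).astype(float)      # dummy: 1 = characteristic observed
y = 1.0 + 2.0 * x + 0.5 * d + 1.5 * d * x + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x, d, d * x]))   # [const, x, dummy, interaction]
res = sm.OLS(y, X).fit()
print(res.params)  # group d=0: intercept b0, slope b1; group d=1: b0+b2, slope b1+b3
```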
Chapter 5: Functional form

Up to now the regression models were always "linear" in the variables; this is not always the case, and the variables may have to be transformed.

Linear-Linear: yi = β0 + β1 xi + εi
"If x increases by 1 unit, y changes on average by β1 units."

Log-Log: ln(yi) = β0 + β1 ln(xi) + εi
"If x increases by 1%, y changes on average by β1 %."

Log-Linear: ln(yi) = β0 + β1 xi + εi
"If x increases by 1 unit, y changes on average by 100·β1 %."

Linear-Log: yi = β0 + β1 ln(xi) + εi
"If x increases by 1%, y changes on average by 0.01·β1 units."
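A short sketch (simulated data) of the log-log case: after transforming both sides with logarithms, the estimated slope is read as an elasticity.

```python
# Minimal sketch (made-up data): log-log specification, slope ~ elasticity.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(1, 10, size=200)
y = 3.0 * x**0.7 * np.exp(rng.normal(scale=0.1, size=200))   # true elasticity 0.7

X = sm.add_constant(np.log(x))
res = sm.OLS(np.log(y), X).fit()
print(res.params[1])   # estimated elasticity, should be close to 0.7
```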

Chapter 6: Specification bias and IV estimation

Misspecification: selecting the independent variables is one of the most important issues when specifying a regression model. Problems can occur if 1) too few independent variables are included or 2) too many independent variables are included. Including too few is worse than having too many in the end.

Omitted variables: a concern when a relevant independent variable has not been included in the model. This can happen due to lack of data (which cannot easily be solved) or ignorance. Consider the model:
yi = β0 + β1 x1i + β2 x2i + εi
If x2 is omitted, it goes into the error term, u = εi + β2 x2i, and the model becomes
yi = β0 + β1 x1i + ui
There is no issue if X2 is uncorrelated with X1, since the error term u is then not correlated with X1. However, if X2 is correlated with X1, the error term u is correlated with X1 and Assumption 2 of OLS is violated (omitted variable bias)! The bias makes b1 over- or understate the true effect; if the correlations are small, the bias is small too.

Too many independent variables: why shouldn't as many independent variables as possible be included? 1) It can have the undesirable effect of increasing the standard errors of the parameter estimators; 2) multiple testing problem / data dredging. Regression model building should therefore be based on economic theory and common sense.
For a model with one independent variable, yi = β0 + β1 xi + εi, the standard error of b1 is
s_b1 = √[ se² / Σ(xi − x̄)² ]
Adding an irrelevant variable x2 (β2 = 0),
yi = β0 + β1 x1i + β2 x2i + εi = β0 + β1 x1i + 0 + εi,
adds a correlation effect: s_b1 becomes
s_b1 = √[ se² / ( Σ(x1i − x̄1)² · (1 − rx1x2²) ) ]
so the standard error of the coefficient we are interested in becomes larger.

IV estimation (instrumental variables): a tool that allows solving the problem when the unobservable error is correlated with an explanatory variable (if ui is correlated with X1), as seen above. For the estimation we need an observable variable, an "instrument" z1, that satisfies 2 conditions:
1) z1 is uncorrelated with ui: Cov(zi1, ui) = 0
2) z1 is correlated with x1: Cov(zi1, xi1) ≠ 0
This reduces the bias, but the variance of the estimated coefficients gets larger.

RESET test (Ramsey's regression specification error test): can be used to detect issues related to omitted variables; it basically tells whether something (a variable) is missing in the model.
1) Add quadratic and cubic functions of the fitted values, ŷ² and ŷ³, to the original regression model (they serve as proxies for nonlinear functions):
2) yi = β0 + β1 xi + δ1 ŷi² + δ2 ŷi³ + εi
3) Test the joint significance of the added terms with an F-test: H0: δ1 = δ2 = 0 (all deltas are zero).
4) If H0 is not rejected: misspecification / functional form is not a problem. If H0 is rejected: misspecification / functional form seems to be a problem (but the source is not identified).
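A minimal sketch of the RESET steps above (simulated data with a deliberately non-linear relationship); compare_f_test performs the joint F-test of the added ŷ² and ŷ³ terms.

```python
# Minimal sketch (made-up data): Ramsey RESET via an augmented regression.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(0, 5, size=150)
y = 1.0 + 0.5 * x**2 + rng.normal(size=150)      # true relation is non-linear in x

X = sm.add_constant(x)
restricted = sm.OLS(y, X).fit()                  # original (restricted) model

yhat = restricted.fittedvalues                   # fitted values
X_aug = np.column_stack([X, yhat**2, yhat**3])   # steps 1-2: add yhat^2, yhat^3
unrestricted = sm.OLS(y, X_aug).fit()

f_stat, p_val, df_diff = unrestricted.compare_f_test(restricted)  # step 3: joint F-test
print(f_stat, p_val)  # small p-value -> reject H0 -> functional form is a problem
```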


Chapter 7: Multicollinearity

Multicollinearity: two or more correlated independent x variables.

Perfect multicollinearity: perfectly correlated independent variables. This violates Assumption 3 of OLS. It can be avoided with careful attention to the model specification and can be solved by leaving one of the two variables out.
Example (female and male dummy): if male = 1, then female = 0 and vice versa, hence one dummy should be dropped (use J − 1 dummies anyway).
The yield curve is also an issue (3-month rate, 10-year rate, etc.): all rates are multicollinear, since the 10-year rate is the 3-month rate plus a spread, etc.

High multicollinearity: highly correlated independent variables (common in practice). It is okay - no OLS assumption is violated - but it can lead to problems such as increased standard errors that widen confidence intervals: it becomes difficult to isolate the separate effects of each independent variable, the standard errors of the OLS estimators get larger, and insignificant t-statistics indicate large uncertainty about the parameters.

The standard error of a coefficient bk is
s_bk = √[ se² / Σ(xik − x̄k)² · 1/(1 − Rk²) ]
where Rk² is the R² from regressing xk on the other independent variables. The factor VIF = 1/(1 − Rk²) >= 1 is the variance inflation factor, so if Rk² is big, the VIF is big. A big s_bk makes t = (bk − βk0)/s_bk small, so we can't find significance on single parameters. BUT if the F-test then shows that all parameters together are significant, we have to be alert about multicollinearity.

Results of high multicollinearity: estimation results are not robust; adding/removing variables from the model can drastically change the parameters; nonsensical signs and magnitudes (higher standard errors increase the chance of estimates with extreme values or wrong signs).

Examples of multicollinearity: lagged independent variables (income_t, income_t-1); independent variables with a common time trend (GDP and housing tend to move in the same direction); independent variables that capture similar phenomena (crime rate, unemployment rate, poverty rate).

Detecting multicollinearity:
1) Correlation matrix: correlations of 0.8 and higher may signal multicollinearity (but it does not capture linear combinations of several independent variables).
2) Variance inflation factor (VIF): if the VIF is > 10, then multicollinearity is high. It accounts for combinations of several independent variables (see the sketch below).
3) The estimated coefficients bk of two highly positively correlated variables often have inverse signs and can be insignificant although the F-test shows joint significance.

Example (rent): rooms and area are highly correlated (0.877) but have different signs. The VIF says there is a problem with the two, but not a big enough one (the VIF is the stronger criterion). This is okay because we know that additional rooms make no sense after a certain amount; the best solution seems to be to leave both variables in the model (even if they are correlated).

Solutions for multicollinearity:
- leave one of the two correlated variables out (see above)
- test the correlated variables jointly with an F-test
- more data would be the best solution, but is often not possible
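A small sketch of detection method 2 (simulated data): each VIF is computed as 1/(1 − Rk²) from an auxiliary regression of xk on the other regressors.

```python
# Minimal sketch (made-up data): VIF_k = 1 / (1 - Rk^2); VIF > 10 would signal trouble.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)        # highly correlated with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

for k in range(X.shape[1]):
    others = np.delete(X, k, axis=1)            # regress x_k on the other regressors
    r2_k = sm.OLS(X[:, k], sm.add_constant(others)).fit().rsquared
    print(f"VIF for x{k + 1}: {1.0 / (1.0 - r2_k):.2f}")
```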
Chapter 8: Heteroscedasticity

Deals with Assumption 4 of OLS, which says that the error terms need to have a constant variance (homoscedasticity). If there is no constant variance, the error variance is not fixed and depends on the value of the independent variables or on the observation number i. This is quite a common problem in linear regression analysis, especially with cross-sectional data.

Consequences of heteroscedasticity: the OLS estimated standard errors s_b0, s_b1, ..., s_bk are incorrect (too low or too high). This can make confidence intervals and hypothesis tests invalid. Note: the OLS estimators are still unbiased: E(bk) = βk.

Detecting heteroskedasticity: visually (residual plots) or formally (Breusch-Pagan test, White test).

Visually - residual plots (2 ways):
- Plot the residuals ei versus the fitted values ŷi (Tukey-Anscombe plot).
- Plot the residuals ei versus the independent variables xi (or versus t in time series), and look at the plot of each x variable against the residuals.
> Look for a change in variance (rule of thumb: if you move one point and it changes the whole picture, there is probably heteroskedasticity).

Formally, both tests use the hypotheses
H0: constant error variance (everything is fine)
H1: the error variance depends on the predicted regression function (on the fitted values)

Breusch-Pagan test: a statistical test to detect heteroskedasticity. The heteroskedasticity is assumed to be a linear function of the expected (fitted) values, otherwise the test does not work - the linear auxiliary regression only works if the residuals have a linear variance. It effectively puts a regression line into the Tukey-Anscombe plot (or into the plot against any X). An issue with this test is that it is "included" in the White test anyway, so on its own it is not really meaningful, and it gives no indication about any potential model correction.
1) Estimate the linear model: yi = β0 + β1 x1i + ... + βk xki + εi
2) Obtain the fitted values ŷi and the residuals ei (as usual)
3) Estimate the "auxiliary regression" on the squared residuals: ei² = a0 + a1 ŷi + ui
4) Get Re² from the auxiliary regression
5) Calculate the chi-squared test statistic n·Re²
Decision rule: reject H0 if n·Re² > invX2(area, df) with area = 1 − α, or if X2CDF(lower, upper, df) < α with lower = test statistic and upper = +inf. For the Breusch-Pagan auxiliary regression, df = 1 always.
Calculator: Menu > 6 > 5 > 8 for X2CDF(lower, upper, df) (p-value) and Menu > 6 > 5 > 9 for invX2(area, df) (critical value).
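A minimal sketch of the five Breusch-Pagan steps on simulated data whose error variance grows with x; the variable names and the simulated model are illustrative only.

```python
# Minimal sketch (made-up data): Breusch-Pagan via the auxiliary regression on yhat.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(4)
n = 300
x = rng.uniform(1, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=x, size=n)   # error variance grows with x

res = sm.OLS(y, sm.add_constant(x)).fit()                    # 1) estimate the model
e2 = res.resid**2                                            # 2) squared residuals
aux = sm.OLS(e2, sm.add_constant(res.fittedvalues)).fit()    # 3) e^2 = a0 + a1*yhat + u
lm = n * aux.rsquared                                        # 4)-5) statistic n*R2_e
p_val = stats.chi2.sf(lm, df=1)                              # df = 1 for this auxiliary regr.
print(lm, p_val, lm > stats.chi2.ppf(0.95, df=1))            # True -> heteroscedasticity
```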

White test: also works when there are non-linear effects on the error variance. Again it can be applied to the Tukey-Anscombe plot or to each X. It is more flexible than Breusch-Pagan for identifying many patterns of heteroskedasticity, but it also does not provide a solution for model correction.
1) Estimate the linear model: yi = β0 + β1 x1i + ... + βk xki + εi
2) Obtain the fitted values ŷi and the residuals ei (as before)
3) Estimate the "auxiliary regression": ei² = a0 + a1 ŷi + a2 ŷi² + ui
4) Get Re² from the auxiliary regression
5) Calculate the chi-squared test statistic n·Re²
Test: H0: constant error variance (everything is fine); H1: the error variance depends on the predicted regression function (on the fitted values).
Decision rule: reject H0 if n·Re² > invX2(1 − α, df) or if X2CDF(test statistic, +inf, df) < α. For the White auxiliary regression, df = 2 always (same calculator menus as above).

Example: n = 1091, Re² = 0.168.
Test statistic: 1091 · 0.168 = 183.288
Critical value: invX2(0.99, 2) = 9.21
p-value: X2CDF(183.288, +inf, 2) = 0
> Reject H0, since the test statistic > critical value and/or the p-value < 0.01.
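The same simulated data, now with the White-style auxiliary regression on ŷ and ŷ² (df = 2); the 0.99 quantile used here mirrors the invX2(0.99, 2) = 9.21 critical value from the example.

```python
# Minimal sketch (made-up data): simplified White test on yhat and yhat^2.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(4)
n = 300
x = rng.uniform(1, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=x, size=n)

res = sm.OLS(y, sm.add_constant(x)).fit()
yhat = res.fittedvalues
aux_X = sm.add_constant(np.column_stack([yhat, yhat**2]))  # e^2 = a0 + a1*yhat + a2*yhat^2 + u
aux = sm.OLS(res.resid**2, aux_X).fit()

lm = n * aux.rsquared                      # chi-squared test statistic
crit = stats.chi2.ppf(0.99, df=2)          # like invX2(0.99, 2) = 9.21
p_val = stats.chi2.sf(lm, df=2)            # like X2CDF(lm, +inf, 2)
print(lm, crit, p_val, lm > crit)
```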
Remedies for heteroskedasticity
> Transform the data, e.g. use log(yi) instead of yi (log-linear model), then estimate the model again and redo the White test. If the residual plots and the White test show that this is still not enough, go to the White correction (see next).
> Modify the estimation procedure to obtain correct standard errors. 2 ways:
a. White-corrected (robust) standard errors
b. Weighted least squares (WLS) (not important for the exam)
White-corrected: the most popular remedy for heteroscedasticity. It adjusts the standard errors for heteroskedasticity and does not rely on any assumptions about the functional form of the heteroscedasticity. Note: the OLS estimators are unchanged (so the coefficients are still okay), but the standard errors can increase due to the White correction, which changes confidence intervals and test results.
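A short statsmodels sketch of remedy (a): requesting heteroscedasticity-robust ("HC1") standard errors leaves the coefficients untouched and only changes the standard errors; the simulated data are illustrative.

```python
# Minimal sketch: White-corrected (robust) standard errors in statsmodels.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.uniform(1, 10, size=300)
y = 2.0 + 0.5 * x + rng.normal(scale=x, size=300)
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()                    # usual (possibly invalid) standard errors
robust = sm.OLS(y, X).fit(cov_type="HC1")   # White/heteroscedasticity-robust errors
print(ols.bse, robust.bse)                  # different standard errors
print(ols.params - robust.params)           # -> zeros: coefficients are unchanged
```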

Chapter 9: Autocorrelation

Under Assumption 5 of OLS: Cov(εt, εs) = Corr(εt, εs) = 0 for t ≠ s → the error terms should not be correlated.

Autocorrelation happens in regression models with time series data (the order of the observations is relevant). It means that the error in period t is related to the error in another period s. The OLS estimators are still unbiased, but the standard errors are biased!

Positive autocorrelation: errors tend to be followed by errors of the same sign. → Standard errors too small = confidence intervals too narrow and test statistics too large (some variables may appear significant even if they are not).
Negative autocorrelation: errors tend to be followed by errors of the opposite sign. → Standard errors too large = confidence intervals too wide and test statistics too small (some variables may not appear significant even when they are).

AR errors: the error term εt can be interpreted as a random shock at time t. Often, εt follows an autoregressive (AR) process. An AR(1) process is the most common choice, but one can also use p lags in an AR(p) process. Assuming a first-order AR(1) process means "the error today is nothing else than the error yesterday plus a shock ut":
εt = ρ εt−1 + ut, where ρ = linear dependence between t and t−1 (assumed to be 0 in the classical regression).
If ρ = 0, then there is no autocorrelation.
If ρ < 0, then negative autocorrelation (the error tends to flip sign from one period to the next: positive, then negative, etc.).
If ρ > 0, then positive autocorrelation (errors tend to keep the same sign).
Stationarity condition: −1 < ρ < 1 (the error variance is then constant over time).

Detecting autocorrelation

Plotting: run the regression, calculate the residuals and plot them chronologically on a time series chart.

Runs test: the issue is that it has low power and may not detect autocorrelation.
H0: no autocorrelation
H1: positive OR negative autocorrelation
T1: number of positive residuals
T2: number of negative residuals
r: number of runs
Interpretation: if there is no autocorrelation, the number of runs is approximately normally distributed with mean μr and standard deviation σr:
μr = 2·T1·T2/(T1 + T2) + 1
σr = √[ (μr − 1)(μr − 2)/(T1 + T2 − 1) ]
Test statistic: s = (r − μr)/σr. Critical value: Z = invnorm(1 − α/2, 0, 1).
Decision: reject H0 if |s| > Z and, if so, decide on positive or negative autocorrelation: too few runs vs μr → positive autocorrelation; too many runs vs μr → negative autocorrelation.

Example: T1 = 14, T2 = 18, r = 5 runs.
μr = 2(14 · 18)/(14 + 18) + 1 = 16.75
σr = √[(16.75 − 1)(16.75 − 2)/(14 + 18 − 1)] = 2.7375
s = (5 − 16.75)/2.7375 = −4.2922
Decision: |−4.2922| > 1.96 → reject H0, there is autocorrelation; with 5 runs vs about 17 expected, it is positive autocorrelation.
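A few lines reproducing the runs-test example above (T1 = 14, T2 = 18, r = 5) with scipy for the normal critical value.

```python
# Minimal sketch: runs test with the example numbers from the notes.
import math
from scipy import stats

T1, T2, r = 14, 18, 5
mu_r = 2 * T1 * T2 / (T1 + T2) + 1                             # 16.75
sigma_r = math.sqrt((mu_r - 1) * (mu_r - 2) / (T1 + T2 - 1))   # ~2.7375
s = (r - mu_r) / sigma_r                                       # ~ -4.29

z_crit = stats.norm.ppf(1 - 0.05 / 2)                          # invnorm(0.975) = 1.96
print(mu_r, round(sigma_r, 4), round(s, 4), abs(s) > z_crit)
# too few runs (5 vs ~17 expected) -> positive autocorrelation
```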


Durbin-Watson test: a simple and popular test for autocorrelation. It is a good initial test, but it is limited to identifying AR(1) error terms. Given:
yt = β0 + β1 xt1 + ... + βk xtk + εt with εt = ρ εt−1 + ut (ρ = autocorrelation coefficient, unknown)
Test: H0: ρ = 0 (no positive or negative autocorrelation) → d should be close to 2 if H0 holds
H1: ρ ≠ 0 (positive or negative autocorrelation)
d = Σ(t=2..T) (et − et−1)² / Σ(t=1..T) et² ≈ 2(1 − Cov(εt−1, εt)/Var(εt)) = 2(1 − ρ), with 0 ≤ d ≤ 4
If ρ = 0 then d = 2; if ρ = 1 then d ≈ 0; if ρ = −1 then d ≈ 4.
d > 2 = negative autocorrelation, d < 2 = positive autocorrelation.

Remedies
Be careful when drawing conclusions about autocorrelation: data that looks like it has autocorrelation may simply have other misspecification issues (non-linear relationships) rather than autocorrelation.
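A minimal sketch of the Durbin-Watson statistic on simulated AR(1) errors (ρ = 0.8 is an arbitrary illustrative value), using the durbin_watson helper from statsmodels.

```python
# Minimal sketch (made-up AR(1) errors): Durbin-Watson statistic, d ~ 2(1 - rho).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(6)
n, rho = 200, 0.8
u = rng.normal(size=n)
eps = np.zeros(n)
for t in range(1, n):                      # AR(1) errors: eps_t = rho*eps_{t-1} + u_t
    eps[t] = rho * eps[t - 1] + u[t]

x = np.arange(n, dtype=float)
y = 1.0 + 0.05 * x + eps
res = sm.OLS(y, sm.add_constant(x)).fit()

d = durbin_watson(res.resid)
print(d)   # expect d well below 2 here (positive autocorrelation)
```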
Key points of PE

Assignment 1
v) The number of included variables is too high; what do we therefore run the risk of?
> Autocorrelation

Assignment 2
v) At the 5% significance level, the upper limit of the prediction interval for Y14 = e^Z14 is given by?
> We apply the prediction-interval formula.

Assignment 3
The auxiliary regression is ê² = 25046821 + 1731·ŷ, with n = 51 and Re² = 0.015. n·Re² = 0.765 < 2.71 = invX2(0.90, 1); therefore, do not reject H0 that the error terms have constant variance at the 10% level.

Exercises

German Bundesliga

Use your calculations by hand to answer the following questions.

i. Find the 95% confidence interval for β1. Interpret your result.
Answer> b1 ± t(n − K − 1, α/2) · s_b1
95% confidence interval for β1: (0.28, 0.65). The 95% confidence interval defines a range of values that you can be 95% certain contains the population parameter β1.

ii. 99% confidence interval for β1: (0.21, 0.72). Now you can be 99% certain that the interval contains the population parameter β1.

iii. H0: β1 = 0 can be rejected at significance level α = 0.01 because the value under the null hypothesis is not part of the 99% confidence interval for β1.

Test the null hypothesis that the combined effect of net transfers and mean age is insignificant (α = 0.05):
H0: β2 = β3 = 0
H1: β2 ≠ 0 and/or β3 ≠ 0
Answer>
Restricted model: pointsi = β0 + β1 payrolli + εi with Rr² = 0.64177
Test statistic: F = [(Ru² − Rr²)/q] / [(1 − Ru²)/(n − K − 1)] = [(0.66006 − 0.64177)/2] / [(1 − 0.66006)/(18 − 3 − 1)] = 0.38
Critical value: F(2, 14) at α = 0.05 is 3.739
H0: β2 = β3 = 0 cannot be rejected at the α = 0.05 significance level (the test statistic 0.38 is smaller than the critical value 3.739).
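The F-test numbers from this answer can be double-checked with scipy (the values below are taken from the answer itself).

```python
# Quick check of the Bundesliga subset F-test with scipy.
from scipy import stats

Ru2, Rr2, q, n, K = 0.66006, 0.64177, 2, 18, 3
F = ((Ru2 - Rr2) / q) / ((1 - Ru2) / (n - K - 1))   # ~0.38
F_crit = stats.f.ppf(0.95, q, n - K - 1)            # ~3.74
p_val = stats.f.sf(F, q, n - K - 1)
print(round(F, 2), round(F_crit, 3), round(p_val, 3), F > F_crit)
```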
12.63 Dummy variables

What are the model constant and the slope coefficient of x1 when the dummy variable equals 1 in the following equations, where x1 is a continuous variable and x2 is a dummy variable with a value of 0 or 1?

a. ŷ = 4 + 9x1 + 1.78x2 + 3.09x1x2

Answer: setting x2 = 1 gives ŷ = (4 + 1.78) + (9 + 3.09)x1 = 5.78 + 12.09x1, so b0 = 5.78 and the slope coefficient of x1 is 12.09.

13.28 White test
