
Econometrics Example Sheet

Problem 1:

From the data for 46 states in the US for 1992, Baltagi obtained the following regression results:

log(Ĉ) = 4.3 - 1.34 log(P) + 0.17 log(Y)

se = (0.91) (0.32) (0.2)    Adjusted R2 = 0.27

where C is cigarette consumption (packs per year),
P is the real price per pack,
and Y is real disposable income per capita.
1. Interpret each estimated coefficient.
2. Do the signs of the coefficients match your expectations?
3. How much of the variation in C is explained by P and Y?
4. What is the price elasticity of demand for cigarettes? Is it statistically significant? If
so, is it statistically different from 1?
5. What is the income elasticity of demand for cigarettes? Is it statistically significant? If not,
what might be the reasons for that?
6. Test the overall significance of the model.
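
For question 6, the overall F statistic is not reported, but it can be recovered from the adjusted R2. The following is a minimal Python sketch of that arithmetic, assuming k = 3 estimated parameters (intercept plus two slopes) and n = 46; the variable names are illustrative only, and the resulting statistic still has to be compared with a tabulated F(2, 43) critical value.

```python
# Overall-significance F statistic recovered from the adjusted R-squared.
n = 46          # number of observations (states), from the problem statement
k = 3           # estimated parameters: intercept, log(P), log(Y)
adj_r2 = 0.27   # adjusted R-squared reported with the regression

# Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k), solved here for R2.
r2 = 1 - (1 - adj_r2) * (n - k) / (n - 1)

# F = [R2 / (k - 1)] / [(1 - R2) / (n - k)] with (k - 1, n - k) degrees of freedom.
f_stat = (r2 / (k - 1)) / ((1 - r2) / (n - k))

print(f"R2 = {r2:.4f}, F({k - 1}, {n - k}) = {f_stat:.3f}")
```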

Problem 2:

You want to study the dependence of beer expenditures of employees in a company on their
incomes, ages and sexes. You have collected a random sample of observations on 40 office
employees, 20 of whom are females and 20 of whom are males. Here is the description of variables
in the data set:

BEi : the annual beer expenditures of employee i, measured in dollars per year;
INCi : the annual income of employee i, in thousands of dollars per year;
AGEi : the age of employee i, in years;
SEXi : a dummy variable, SEXi = 1 if employee i is female and SEXi = 0 if employee i is male.

You propose the following model (model (1)):

BEi = β1 + β2 INCi + β3 AGEi + β4 SEXi + β5 SEXi*INCi + β6 SEXi*AGEi + ui

Using the OLS method in EViews, you obtain the following results:

Result (1)
Dependent variable: BE
Included observations: 40
Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            489.8631      73.85524     6.632747      0.0000
INC          0.002893      0.000775     3.734180      0.0007
AGE          -10.07924     2.229676     -4.520493     0.0001
SEX          -265.8574     113.3658     -2.345129     0.0250
SEX*INC      -0.001029     0.000971     -1.059491     0.2968
SEX*AGE      4.231494      3.648383     1.159827      0.2542
R-squared    0.6470

Result (2)
Ramsey RESET Test:
F-statistic 2.110154 Probability 0.119102
Log likelihood ratio 7.432899 Probability 0.059308

Result (3)
Breusch-Godfrey Serial Correlation LM Test:
F-statistic 0.545784 Probability 0.584685
Obs*R-squared 1.319452 Probability 0.516993

Result (4)
White Heteroskedasticity Test:
F-statistic 0.768684 Probability 0.645556
Obs*R-squared 7.495667 Probability 0.585656

Result (5)
BEi = 459.21 + 0.0023 INCi - 8.42 AGEi - 169.87 SEXi    R2 = 0.6294

Result (6)
BEi = 342.88 + 0.00238 INCi - 7.575 AGEi    R2 = 0.3292

1. Write down the sample regression model of model (1) based on result (1). Write down the
population regression model and the sample regression model for male and for female employees, and
explain the meaning of the estimated regression coefficients.

2. Use results (2), (3) and (4) to test for possible problems in the estimated model of model (1). In
each test, specify clearly the type of test, the type of problem, the statistic used, the null and
alternative hypotheses, and the conclusion about the problem.

3. Using result (1), how do the beer expenditures of male employees change if their income
increases by 1,000 USD/year? Answer the same question for female employees, given that
cov(β̂2, β̂5) = 0.

4. In model (1), state the null and alternative hypotheses for testing that the beer expenditure
models for male and female employees do not differ in the slope coefficients of both INC
and AGE. In other words, you want to conduct a joint test of equal slope coefficients
for males and females on INC and equal slope coefficients for males and females on AGE.
Perform this test using the appropriate information given above.

5. Use the results above to test the hypothesis that the variable SEX does not affect annual
beer expenditures.
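
For questions 4 and 5, one possible route is the F test that compares the R2 of an unrestricted model with the R2 of a restricted model having the same dependent variable. The Python sketch below assumes that result (5) is the restricted model for question 4 (two restrictions, β5 = β6 = 0) and that result (6) is the restricted model for question 5 (three restrictions, β4 = β5 = β6 = 0); the helper name `f_from_r2` is hypothetical. The statistics still need to be compared with tabulated F critical values.

```python
def f_from_r2(r2_unrestricted, r2_restricted, n_restrictions, n_obs, n_params):
    """F statistic for linear restrictions, computed from two R-squared values.

    n_params is the number of coefficients in the unrestricted model.
    Valid only when both models share the same dependent variable and sample.
    """
    numerator = (r2_unrestricted - r2_restricted) / n_restrictions
    denominator = (1 - r2_unrestricted) / (n_obs - n_params)
    return numerator / denominator


# Question 4: result (1) against result (5), F(2, 34).
f_equal_slopes = f_from_r2(0.6470, 0.6294, n_restrictions=2, n_obs=40, n_params=6)

# Question 5: result (1) against result (6), F(3, 34).
f_no_sex_effect = f_from_r2(0.6470, 0.3292, n_restrictions=3, n_obs=40, n_params=6)

print(f"F(2, 34) = {f_equal_slopes:.3f}")
print(f"F(3, 34) = {f_no_sex_effect:.3f}")
```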

Problem 3:

A researcher is using data for a sample of 526 paid workers to investigate the relationship between
hourly wage rates Yi (measured in dollars per hour) and years of formal
education Xi (measured in years). Preliminary analysis of the sample data produces the following
sample information (lowercase xi and yi denote deviations from the sample means, and ûi denotes the OLS residuals):
N = 526; ∑Yi = 3101.35; ∑Xi = 6608; ∑Yi² = 25446.29; ∑Xi² = 87040
∑XiYi = 41140.65; ∑xiyi = 2179.204; ∑yi² = 7160.414; ∑xi² = 4025.43; ∑ûi² = 5980.682
1. Use the above information to compute OLS estimates of the intercept and slope coefficients.
2. Interpret the slope coefficient estimate you calculated in part 1.
3. Calculate an estimate of σ2, the error variance.
4. Compute the value of r2, the coefficient of determination for the estimated OLS sample
regression equation. Briefly explain what the calculated value of r2 means.
5. Test the claim that years of formal education do not affect hourly wage rates.
6. Compute the two-sided 95% CI for the slope coefficient. Would the two-sided 99% CI be wider or
narrower than the two-sided 95% CI, and why?
7. It is claimed that when formal education increases by 1 year, the average wage rate
increases by at least 0.5 USD/h. Test this claim.
8. Predict the average Y when X is 10 years.
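
The quantities requested in parts 1, 3, 4, 5, 7 and 8 follow from the sample sums alone. The Python sketch below shows one way to organise the arithmetic, assuming the lowercase sums are deviations from the sample means (this is consistent with the raw sums given); the variable names are illustrative. Critical values for the tests and the confidence interval must still be taken from a t table with n - 2 degrees of freedom.

```python
import math

# Summary statistics copied from the problem statement.
n = 526
sum_y, sum_x = 3101.35, 6608.0
sxy = 2179.204   # sum of x_i * y_i (deviations from the sample means)
syy = 7160.414   # sum of y_i^2 (deviations)
sxx = 4025.43    # sum of x_i^2 (deviations)
rss = 5980.682   # sum of squared OLS residuals

x_bar, y_bar = sum_x / n, sum_y / n

slope = sxy / sxx                         # OLS slope estimate
intercept = y_bar - slope * x_bar         # OLS intercept estimate
sigma2_hat = rss / (n - 2)                # unbiased estimate of the error variance
r2 = 1 - rss / syy                        # coefficient of determination
se_slope = math.sqrt(sigma2_hat / sxx)    # standard error of the slope

t_zero = slope / se_slope                 # t ratio for H0: slope = 0
t_half = (slope - 0.5) / se_slope         # t ratio against the value 0.5
y_hat_10 = intercept + slope * 10         # predicted mean wage at X = 10

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"sigma2_hat = {sigma2_hat:.4f}, r2 = {r2:.4f}, se(slope) = {se_slope:.5f}")
print(f"t(slope = 0) = {t_zero:.3f}, t(slope = 0.5) = {t_half:.3f}")
print(f"predicted mean Y at X = 10: {y_hat_10:.4f}")
```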

Problem 4:

Dependent Variable: GNP

Included observations: 15
Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -529.6074     18.04051     -29.35656     0.0000
CPI          14.18311      0.382173     37.11174      0.0000
I            1.325414      0.123177     10.76025      0.0000
R            6.541530      3.058294     2.138947      0.0557
R-squared 0.999646 Mean dependent var 1748.647
Adjusted R-squared 0.999550 S.D. dependent var 738.1458
S.E. of regression 15.65737 Akaike info criterion 8.562939
Sum squared resid 2696.686 Schwarz criterion 8.751752
Log likelihood -60.22204 F-statistic 10368.12
Durbin-Watson stat 1.764303 Prob(F-statistic) 0.000000

1. Read the information from the above report.
2. Write down the SRF and explain the meaning of each estimated coefficient.
3. Test the significance of each variable.
4. Test the significance of all independent variables simultaneously.
5. How much of the variation in GNP do the independent variables explain?
6. When investment increases by 1 billion, within what range can GNP be expected to increase?
7. Test the hypothesis that GNP increases by at least 15 billion if CPI increases by 1%.
8. Test the hypothesis that CPI and I have the same effect on GNP, given that the
covariance between the two corresponding estimated coefficients is 0.04228.
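
For questions 6 and 8, the Python sketch below illustrates a confidence interval for the coefficient of I and a t ratio for the equality of the CPI and I coefficients, using var(b1 - b2) = var(b1) + var(b2) - 2 cov(b1, b2). The critical value 2.201 is the tabulated two-sided 5% t value for 11 degrees of freedom (15 observations minus 4 parameters); the variable names are illustrative.

```python
import math

# Estimates and standard errors read from the regression report.
b_cpi, se_cpi = 14.18311, 0.382173
b_inv, se_inv = 1.325414, 0.123177
cov_cpi_inv = 0.04228          # covariance given in question 8
T_CRIT_975_DF11 = 2.201        # t table value, 11 degrees of freedom

# Question 6: 95% confidence interval for the coefficient of I.
half_width = T_CRIT_975_DF11 * se_inv
ci_inv = (b_inv - half_width, b_inv + half_width)

# Question 8: t ratio for H0: coefficient of CPI = coefficient of I.
se_diff = math.sqrt(se_cpi**2 + se_inv**2 - 2 * cov_cpi_inv)
t_equal = (b_cpi - b_inv) / se_diff

print(f"95% CI for the coefficient of I: ({ci_inv[0]:.4f}, {ci_inv[1]:.4f})")
print(f"se(difference) = {se_diff:.4f}, t = {t_equal:.2f}")
```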

Problem 5:

The demand for roses was estimated using quarterly figures for the period 1971 (3rd quarter) to 1975
(2nd quarter). Two models were estimated and the following results were obtained:

Y = Quantity of roses sold (dozens)

X2 = Average wholesale price of roses ($ per dozen)
X3 = Average wholesale price of carnations ($ per dozen)
X4 = Average weekly family disposable income ($ per week)
X5 = Time (1971.3 = 1 and 1975.2 = 16)
ln = natural logarithm
The standard errors are given in parentheses.
A. ln Ŷt = 0.627 - 1.273 ln X2t + 0.937 ln X3t + 1.713 ln X4t - 0.182 ln X5t
(0.327) (0.659) (1.201) (0.128)
R2 = 77.8% D.W. = 1.78 N = 16

B. ln Ŷt = 10.462 - 1.39 ln X2t

R2 = 59.5% D.W. = 1.495 N = 16

Correlation matrix:
          ln X2      ln X3      ln X4      ln X5
ln X2     1.0000    -0.7219     0.3160    -0.7792
ln X3    -0.7219     1.0000    -0.1716     0.5521
ln X4     0.3160    -0.1716     1.0000    -0.6765
ln X5    -0.7792     0.5521    -0.6765     1.0000

a) How would you interpret the coefficients of ln X2, ln X3 and ln X4 in model A?

What sign would you expect these coefficients to have? Do the results concur with your expectation?

b) Are these coefficients statistically significant?

c) Use the results of Model A to test the following hypotheses:
i) The demand for roses is price elastic
ii) Carnations are substitute goods for roses
iii) Roses are a luxury good (demand increases more than proportionally as income rises)
d) Are the results of (b) and (c) in accordance with your expectations? If any of the tests are
statistically insignificant, give a suggestion as to what may be the reason.
e) Do you detect the presence of multicollinearity in the data? Explain.
f) Do you detect the presence of serial correlation? Explain
g) Do the variables X3, X4 and X5 contribute significantly to the analysis? Test the joint significance
of these variables.
h) Starting from model B, suppose that in January 1973 there was a disaster that
heavily affected the quantity of roses produced. Suggest a model, using a dummy variable, to check
whether two different models are needed for the data before and after the disaster.
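
For part (c), each hypothesis reduces to a one-sided t ratio of a Model A coefficient against a hypothesized value. The Python sketch below assumes that the four standard errors in parentheses belong, in order, to the coefficients of ln X2, ln X3, ln X4 and ln X5; this pairing is an assumption, since the layout of the original is not preserved. The helper name `t_ratio` is hypothetical, and the ratios must be compared with one-sided t critical values for 16 - 5 = 11 degrees of freedom.

```python
def t_ratio(estimate, hypothesized, std_error):
    """t ratio of an estimate against a hypothesized coefficient value."""
    return (estimate - hypothesized) / std_error


# (i) Price elastic: H0: b2 = -1 against H1: b2 < -1.
t_price_elastic = t_ratio(-1.273, -1.0, 0.327)

# (ii) Carnations are substitutes: H0: b3 = 0 against H1: b3 > 0.
t_substitute = t_ratio(0.937, 0.0, 0.659)

# (iii) Roses are a luxury good: H0: b4 = 1 against H1: b4 > 1.
t_luxury = t_ratio(1.713, 1.0, 1.201)

degrees_of_freedom = 16 - 5   # N minus the number of estimated parameters

print(f"t (price elastic) = {t_price_elastic:.3f}")
print(f"t (substitute)    = {t_substitute:.3f}")
print(f"t (luxury)        = {t_luxury:.3f}")
```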

Problem 6:

Two large US corporations, General Electric and Westinghouse, compete with each other and
produce many similar products. In order to investigate whether they have similar investment
strategies, we estimate the following model using pooled time series data for the period 1935 to
1954 for the two firms:
INVt = β1 + β2DVt + β3Vt + β4DV*Vt + β5Kt + β6DV*Kt + ut (1)

where INV = gross investment in plant and equipment

V = value of the firm = value of common and preferred stock
K = stock of capital
DV = 0 if General Electric (observations 1 to 20)
= 1 if Westinghouse (observations 21 to 40)

All three continuous variables are measured in millions of 1947 dollars. Pooling the data yields 40
observations with which to estimate the parameters of the investment function. However, pooling
is valid only if the regression parameters are the same for both firms. In order to test this
hypothesis, intercept and slope dummy variables are included in the model.

Dependent Variable: INV

Method: Least Squares

Sample: 1 40
Included observations: 40
Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -9.956306     23.62636     -0.421407     0.6761
DV           9.446916      28.80535     0.327957      0.7450
V            0.026551      0.011722     2.265064      0.0300
DV*V         0.026343      0.034353     0.766838      0.4485
K            0.151694      0.019356     7.836865      0.0000
DV*K         -0.059287     0.116946     -0.506962     0.6155
R-squared 0.827840 Mean dependent var 72.59075
Adjusted R-squared 0.802523 S.D. dependent var 47.24981
S.E. of regression 20.99707 Akaike info criterion 9.064124
Sum squared resid 14989.82 Schwarz criterion 9.317456
Log likelihood -175.2825 F-statistic 32.69818
Durbin-Watson stat 1.121571 Prob(F-statistic) 0.000000

(a) Interpret all the coefficient estimates, stating whether the signs are as you would expect, and
comment on the statistical significance of the individual coefficients.
(b) Comment on the overall fit and statistical significance of the model.
(c) The Jarque-Bera statistic is 7.77 and its p-value is 0.02. What can you conclude about the
distribution of the disturbance term? Why is this test important?
(d) On the basis of the above results, is pooling the data from the two firms appropriate? Explain.
(e) An alternative way of testing whether pooling the data is appropriate, without using dummy
variables, is to use the Chow breakpoint test. Referring to table below, briefly discuss how the
test works and whether the results are consistent with the earlier model (which includes
dummy variables).
Chow Breakpoint Test: 21
F-statistic 1.189433 Probability 0.328351
Log likelihood ratio 3.992003 Probability 0.262329

(f) Explain the results and implications of the following Ramsey RESET test. (Note that the
dummy variables have been omitted from the original model).

Ramsey RESET Test:

F-statistic 0.000200 Probability 0.988806
Log likelihood ratio 0.000219 Probability 0.988189

Test Equation:
Dependent Variable: INV
Method: Least Squares
Date: 05/15/02 Time: 13:07
Sample: 1 40
Included observations: 40
Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            17.81458      8.199161     2.172732      0.0365
V            0.015226      0.006706     2.270632      0.0293
K            0.144467      0.065596     2.202383      0.0341
FITTED^2     -2.87E-05     0.002028     -0.014128     0.9888
R-squared 0.809773 Mean dependent var 72.59075
Adjusted R-squared 0.793921 S.D. dependent var 47.24981
S.E. of regression 21.44950 Akaike info criterion 9.063919
Sum squared resid 16562.91 Schwarz criterion 9.232807
Log likelihood -177.2784 F-statistic 51.08255
Durbin-Watson stat 1.106556 Prob(F-statistic) 0.000000

Note: Similar questions can be asked using EViews results that check for autocorrelation and
heteroscedasticity (the Breusch-Godfrey test and the White test).
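
Two of the reported figures can be cross-checked by hand. Under the null hypothesis of normal disturbances, the Jarque-Bera statistic in part (c) is asymptotically chi-square with 2 degrees of freedom, whose upper-tail probability is exp(-x/2). With a single added term (FITTED^2), the RESET F statistic in part (f) equals the square of that term's t statistic. The Python sketch below reproduces both numbers; the variable names are illustrative.

```python
import math

# Part (c): Jarque-Bera p-value from the chi-square(2) upper tail, exp(-x / 2).
jb_stat = 7.77
jb_p_value = math.exp(-jb_stat / 2)

# Part (f): with one added regressor, the RESET F statistic is the squared
# t statistic of FITTED^2 in the test equation.
t_fitted_sq = -0.014128
reset_f = t_fitted_sq ** 2

print(f"Jarque-Bera p-value = {jb_p_value:.4f}")
print(f"RESET F from t^2    = {reset_f:.6f}")
```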