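$$t\text{-stat} = \frac{\hat{\beta}_1 - \beta_1}{\mathrm{sd}(\hat{\beta}_1)} = \frac{\hat{\beta}_1 - 1}{\mathrm{sd}(\hat{\beta}_1)}$$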
Answer:
$H_0: \beta_{lenroll} = 1$
$H_1: \beta_{lenroll} > 1$
$t = (1.26976 - 1) / 0.109776 = 2.45736$
The one-sided (right-tail) critical value at $\alpha = 5\%$ is 1.66. This value can be obtained with the
command disp invttail(95,0.05), which yields 1.6610518.
Conclusion: Since the t-stat (2.46) is greater than the critical value (1.66), $H_0$ is rejected. We would
argue that the elasticity of crime with respect to enroll is greater than unity.
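The arithmetic in this answer can be reproduced with a few display commands. This is only a sketch: the coefficient, standard error, and degrees of freedom (95) are copied from the answer above, and the scalar names b_lenroll, se_lenroll, and tstat are made up for illustration.

```stata
* Values copied from the answer above; scalar names are hypothetical
scalar b_lenroll  = 1.26976
scalar se_lenroll = 0.109776

* t-statistic for H0: beta = 1 against H1: beta > 1
scalar tstat = (b_lenroll - 1)/se_lenroll
display "t-stat = " tstat

* right-tail 5% critical value and one-sided p-value with 95 degrees of freedom
display "critical value = " invttail(95, 0.05)
display "one-sided p-value = " ttail(95, tstat)
```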
Example 4.5 Housing prices and air pollution
Data: hprice2s8.dta
For a sample of 506 communities in the Boston area, we estimate a model relating median housing price
(price) in the community to various community characteristics: nox is the amount of nitrogen oxide in the
air, in parts per million; dist is a weighted distance of the community from five employment centers, in
miles; rooms is the average number of rooms in houses in the community; and stratio is the average
student-teacher ratio of schools in the community.
1. Estimate the regression:
$$\log(price_i) = \beta_0 + \beta_1 \log(nox_i) + \beta_2 \log(dist_i) + \beta_3 rooms_i + \beta_4 stratio_i + u_i$$
2. Test the hypothesis that the elasticity of price with respect to nox is negative one against the
alternative that it does not equal negative one. Write the hypothesis. Determine the t-statistic and the
critical value at $\alpha = 5\%$. What is your conclusion?
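Note: The t-stat should be computed as
$$t\text{-stat} = \frac{\hat{\beta}_1 - \beta_1}{\mathrm{sd}(\hat{\beta}_1)} = \frac{\hat{\beta}_1 - (-1)}{\mathrm{sd}(\hat{\beta}_1)}$$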
3. Calculate the p-value for testing the hypothesis that the elasticity of price with respect to nox is
negative one against the alternative that it does not equal negative one.
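One possible way to work through the three tasks above in Stata is sketched below. It assumes the data set contains the variables price, nox, dist, rooms, and stratio as described in the text; the names ln_price, ln_nox, ln_dist, and tstat are hypothetical and chosen only for this illustration.

```stata
* Sketch only: ln_price, ln_nox, ln_dist are hypothetical names for the logged variables
use hprice2s8.dta, clear
generate ln_price = log(price)
generate ln_nox   = log(nox)
generate ln_dist  = log(dist)
regress ln_price ln_nox ln_dist rooms stratio

* t-statistic for H0: beta_1 = -1 against H1: beta_1 != -1
scalar tstat = (_b[ln_nox] - (-1))/_se[ln_nox]
display "t-stat = " tstat

* two-sided 5% critical value and p-value, using the residual degrees of freedom
display "critical value = " invttail(e(df_r), 0.025)
display "p-value = " 2*ttail(e(df_r), abs(tstat))
```

As a cross-check, test ln_nox = -1 run after the regression should report an F statistic equal to the square of this t-statistic, with the same two-sided p-value.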
Example 4.7 Effect of job training on firm scrap rates
Data: jtrains8.dta
The scrap rate for a manufacturing firm is the number of defective items out of every 100 produced. Thus,
for a given number of items produced, a decrease in the scrap rate reflects higher worker productivity. We
want to use the scrap rate to measure the effect of worker training on productivity.
1. Check the data. For the year 1987 and for non-unionized firms, how many observations does the data set have?
2. Estimate the following regression only for the year 1987 and for non-unionized firms:
$$\log(scrap_i) = \beta_0 + \beta_1 hrsemp_i + \beta_2 \log(sales_i) + \beta_3 \log(employ_i) + u_i$$
where hrsemp is annual hours of training per employee, sales is annual firm sales (in dollars), and
employ is the number of firm employees.
3. What is the average scrap rate and average hrsemp in 1987?
4. What is your comment on the economic significance of the training variable?
5. What about the statistical significance of the training variable?
5a. Test the hypothesis that the effect of training on scrap is zero in the population against the
alternative that it is negative. Write the hypothesis. Determine the t-statistic and the critical value
at $\alpha = 1\%$, $\alpha = 5\%$, and $\alpha = 10\%$. What is your conclusion?
5b. Calculate the p-value to test the hypothesis that the effect of training on scrap is zero in the
population against the alternative that it is negative.
Note: The one-sided p-value is obtained as one-half of the p-value for the two-tailed test, provided the
estimated coefficient has the sign stated in the alternative.
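The steps of this example could be sketched in Stata as follows. The variable names year and union (equal to 1 for unionized firms) are assumptions about the data set, as are the generated names ln_scrap, ln_sales, and ln_employ; adjust them to whatever desc reports.

```stata
* Sketch only: year, union and the ln_* names are assumptions, not confirmed by the handout
use jtrains8.dta, clear
count if year == 1987 & union == 0
summarize scrap hrsemp if year == 1987

generate ln_scrap  = log(scrap)
generate ln_sales  = log(sales)
generate ln_employ = log(employ)
* observations with missing values are dropped, so e(N) can be smaller than the count above
regress ln_scrap hrsemp ln_sales ln_employ if year == 1987 & union == 0

* left-tail test of H0: beta_1 = 0 against H1: beta_1 < 0
scalar tstat = _b[hrsemp]/_se[hrsemp]
display "t-stat = " tstat
display "critical value at  1% = " invttail(e(df_r), 0.99)
display "critical value at  5% = " invttail(e(df_r), 0.95)
display "critical value at 10% = " invttail(e(df_r), 0.90)
display "one-sided p-value = " 1 - ttail(e(df_r), tstat)
```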
Section 4.4 Testing hypotheses about a single linear combination of the parameters
In this section we show how to test a single hypothesis involving more than one of the $\beta_j$. Consider a
simple model to compare the returns to education at junior colleges and four-year colleges (universities).
The model is
$$\log(wage) = \beta_0 + \beta_1 jc + \beta_2 univ + \beta_3 exper + u$$
The hypothesis of interest is whether one year at a junior college is worth one year at a university, against the
one-sided alternative that a year at a junior college is worth less than a year at a university. Thus:
$H_0: \beta_1 = \beta_2$,
$H_1: \beta_1 < \beta_2$.
The t-statistic for these hypotheses is
$$t = \frac{\hat{\beta}_1 - \hat{\beta}_2}{\mathrm{se}(\hat{\beta}_1 - \hat{\beta}_2)}$$
1. Estimate with OLS the regression
$$\log(wage_i) = \beta_0 + \beta_1 jc_i + \beta_2 univ_i + \beta_3 exper_i + u_i$$
Answer:
use twoyears8.dta, clear
desc
Contains data from C:\STORAGE\Z_Copy\Dept IE\Ekonometri 1_S2\BKF\Data\Data stata
    8\twoyears8.dta
  obs:         6,763
 vars:            22                          27 Feb 2012 08:36
 size:       311,098 (97.3% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
female          byte   %8.0g                  =1 if female
phsrank         byte   %8.0g                  % high school rank; 100 = best
ba              byte   %8.0g                  =1 if bachelor's degree
aa              byte   %8.0g                  =1 if associate's degree
black           byte   %8.0g                  =1 if african-american
hispanic        byte   %8.0g                  =1 if hispanic
id              long   %12.0g                 id number
exper           int    %8.0g                  total (actual) work experience
jc              float  %9.0g                  total 2-year credits
univ            float  %9.0g                  total 4-year credits
lwage           float  %9.0g                  log hourly wage
stotal          float  %9.0g                  total standardized test score
smcity          byte   %8.0g                  =1 if small city, 1972
medcity         byte   %8.0g                  =1 if med. city, 1972
submed          byte   %8.0g                  =1 if suburb med. city, 1972
lgcity          byte   %8.0g                  =1 if large city, 1972
sublg           byte   %8.0g                  =1 if suburb large city, 1972
vlgcity         byte   %8.0g                  =1 if very large city, 1972
subvlg          byte   %8.0g                  =1 if sub. very lge. city, 1972
variabl0        byte   %8.0g                  =1 if northeast
nc              byte   %8.0g                  =1 if north central
south           byte   %8.0g                  =1 if south
-------------------------------------------------------------------------------
sum lwage jc univ exper
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       lwage |      6763    2.248096    .4876918   .5555456   3.911953
          jc |      6763    .3388946    .7721268          0   3.833333
        univ |      6763    1.926274    2.297001          0        7.5
       exper |      6763    122.3816    33.42799          3        166
reg lwage jc univ exper
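      Source |       SS       df       MS              Number of obs =    6763
-------------+------------------------------           F(  3,  6759) =  644.53
       Model |  357.752575     3  119.250858           Prob > F      =  0.0000
    Residual |  1250.54352  6759  .185019014           R-squared     =  0.2224
-------------+------------------------------           Adj R-squared =  0.2221
       Total |  1608.29609  6762  .237843255           Root MSE      =  .43014

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          jc |   .0666967   .0068288     9.77   0.000     .0533101    .0800833
        univ |   .0768762   .0023087    33.30   0.000     .0723504    .0814021
       exper |   .0049442   .0001575    31.40   0.000     .0046355    .0052529
       _cons |   1.472326   .0210602    69.91   0.000     1.431041     1.51361
------------------------------------------------------------------------------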
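2. Find
$$\mathrm{se}(\hat{\beta}_1 - \hat{\beta}_2) = \left\{[\mathrm{se}(\hat{\beta}_1)]^2 + [\mathrm{se}(\hat{\beta}_2)]^2 - 2 s_{12}\right\}^{1/2}$$
where $s_{12}$ is the estimated covariance between $\hat{\beta}_1$ and $\hat{\beta}_2$.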
Answer:
qui reg lwage jc univ exper
matrix v=e(V)
matrix list v
symmetric v[4,4]
               jc        univ       exper       _cons
   jc   .00004663
 univ   1.928e-06   5.330e-06
exper  -1.718e-08   3.933e-08   2.480e-08
_cons  -.00001741  -.00001573  -3.105e-06   .00044353
scalar s12 = v[2,1]
scalar se_jc_min_univ=(_se[jc]^2 + _se[univ]^2 - 2*s12)^(1/2)
disp se_jc_min_univ
.00693591
From the last command we get $\mathrm{se}(\hat{\beta}_1 - \hat{\beta}_2) = 0.00693591$.
3. Find the p-value for the test.
Answer:
The t-statistic for these hypotheses is
$$t = \frac{\hat{\beta}_1 - \hat{\beta}_2}{\mathrm{se}(\hat{\beta}_1 - \hat{\beta}_2)}$$
gen t = (_b[jc]-_b[univ])/se_jc_min_univ
disp t
-1.4676566
disp 1-ttail(6759, -1.4676566)
.07112203
The p-value suitable for the test is 0.07112203.
Since the one-sided p-value is 7.11%, which is greater than 5%, H0 cannot be rejected at the 5%
significance level. The data do not provide sufficient evidence that one year of junior college is worth
less than one year at a university.
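The same standard error and t-statistic can also be obtained without extracting the covariance by hand. The sketch below uses lincom, which reports a two-sided p-value, so the left-tail p-value is recomputed from the stored results; the scalar name tstat is hypothetical.

```stata
* Sketch only: lincom reports a two-sided test, so the left-tail p-value is computed by hand
quietly regress lwage jc univ exper
lincom jc - univ

* lincom leaves the estimate and its standard error in r(); e(df_r) comes from the regression
scalar tstat = r(estimate)/r(se)
display "t-stat = " tstat
display "left-tail p-value = " 1 - ttail(e(df_r), tstat)
```

The estimate and standard error reported by lincom should agree with the difference in coefficients and with se_jc_min_univ computed above.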
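Define a new parameter $\theta_1 = \beta_1 - \beta_2$ and rearrange the model to give:
$$\begin{aligned}
\log(wage) &= \beta_0 + \beta_1 jc - \beta_2 jc + \beta_2 jc + \beta_2 univ + \beta_3 exper + u \\
&= \beta_0 + (\beta_1 - \beta_2) jc + \beta_2 (jc + univ) + \beta_3 exper + u \\
&= \beta_0 + \theta_1 jc + \beta_2 totcoll + \beta_3 exper + u
\end{aligned}$$
where totcoll = jc + univ.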
4. Estimate the modified model with OLS.
Answer:
gen totcoll=jc+univ
reg lwage jc totcoll exper
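      Source |       SS       df       MS              Number of obs =    6763
-------------+------------------------------           F(  3,  6759) =  644.53
       Model |  357.752575     3  119.250858           Prob > F      =  0.0000
    Residual |  1250.54352  6759  .185019014           R-squared     =  0.2224
-------------+------------------------------           Adj R-squared =  0.2221
       Total |  1608.29609  6762  .237843255           Root MSE      =  .43014

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          jc |  -.0101795   .0069359    -1.47   0.142    -.0237761     .003417
     totcoll |   .0768762   .0023087    33.30   0.000     .0723504    .0814021
       exper |   .0049442   .0001575    31.40   0.000     .0046355    .0052529
       _cons |   1.472326   .0210602    69.91   0.000     1.431041     1.51361
------------------------------------------------------------------------------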
5. Test $H_0: \theta_1 = 0$ against the one-sided alternative $H_1: \theta_1 < 0$.
Answer:
$t = -0.0101795 / 0.0069359 = -1.4676$
The one-sided (left-tail) critical value at $\alpha = 5\%$ is -1.65. This value can be obtained with the command
disp invttail(6759,0.95), which yields -1.6450791.
Since the absolute t-stat is less than the absolute critical value, $H_0$ cannot be rejected. We
cannot reject $\theta_1 = 0$, that is, $\beta_1 = \beta_2$.
6. What is the p-value for the test?
Answer:
The one-sided (left-tail) p-value is 0.071. This value can be obtained with the command disp
1-ttail(6759,-1.47) or the command disp ttail(6759,1.47), which yields 0.07080415, or by
taking the P>|t| value from the Stata output (0.142) and dividing it by two.
Section 4.5 Testing multiple linear restrictions: The F test
Data: mlb1s8.dta
1. Check the data.
2. Estimate the regression model that explains major league baseball players' salaries:
$$\log(salary_i) = \beta_0 + \beta_1 years_i + \beta_2 gamesyr_i + \beta_3 bavg_i + \beta_4 hrunsyr_i + \beta_5 rbisyr_i + u_i$$
where salary is the 1993 total salary, years is years in the league, gamesyr is average games played
per year, bavg is career batting average, hrunsyr is home runs per year, and rbisyr is runs batted in per
year.
3. Test whether bavg, hrunsyr, and rbisyr are jointly statistically insignificant, using the SSR form of
the F-test.
4. Test whether bavg, hrunsyr, and rbisyr are jointly statistically insignificant, using the R-squared form of
the F-test.
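A minimal sketch of both forms of the F-test is given below. It assumes the variables are named as in the description above and that the restricted and unrestricted regressions use the same observations; ln_salary and the scalar names are hypothetical.

```stata
* Sketch only: ln_salary and the scalar names are hypothetical
use mlb1s8.dta, clear
generate ln_salary = log(salary)

* unrestricted model
regress ln_salary years gamesyr bavg hrunsyr rbisyr
scalar ssr_ur = e(rss)
scalar r2_ur  = e(r2)
scalar df_ur  = e(df_r)

* restricted model: bavg, hrunsyr and rbisyr dropped (q = 3 restrictions)
regress ln_salary years gamesyr
scalar ssr_r = e(rss)
scalar r2_r  = e(r2)

* SSR form and R-squared form of the F statistic
scalar F_ssr = ((ssr_r - ssr_ur)/3)/(ssr_ur/df_ur)
scalar F_r2  = ((r2_ur - r2_r)/3)/((1 - r2_ur)/df_ur)
display "F (SSR form) = " F_ssr "   F (R-squared form) = " F_r2
display "p-value = " Ftail(3, df_ur, F_ssr)
display "5% critical value = " invFtail(3, df_ur, 0.05)
```

The two forms should give the same value up to rounding. Running test bavg hrunsyr rbisyr directly after the unrestricted regression is a quick way to confirm the result.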
Example 5.3 Testing multiple linear restrictions: The LM test
Data: crime1.dta
1. Check the data (number of observations and number of variables).
Answer: the number of observations is 2,725; the number of variables is 16.
2. We will conduct the LM test using the crime model below
$$narr86 = \beta_0 + \beta_1 pcnv + \beta_2 avgsen + \beta_3 tottime + \beta_4 ptime86 + \beta_5 qemp86 + u$$
Estimate the model using OLS.
Answer:
reg narr86 pcnv avgsen tottime ptime86 qemp86
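      Source |       SS       df       MS              Number of obs =    2725
-------------+------------------------------           F(  5,  2719) =   24.29
       Model |  85.9532425     5  17.1906485           Prob > F      =  0.0000
    Residual |  1924.39391  2719  .707757967           R-squared     =  0.0428
-------------+------------------------------           Adj R-squared =  0.0410
       Total |  2010.34716  2724  .738012906           Root MSE      =  .84128

------------------------------------------------------------------------------
      narr86 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        pcnv |  -.1512246    .040855    -3.70   0.000    -.2313346   -.0711145
      avgsen |  -.0070487   .0124122    -0.57   0.570     -.031387    .0172897
     tottime |   .0120953   .0095768     1.26   0.207    -.0066833     .030874
     ptime86 |  -.0392585   .0089166    -4.40   0.000    -.0567425   -.0217745
      qemp86 |  -.1030909   .0103972    -9.92   0.000    -.1234782   -.0827037
       _cons |   .7060607   .0331524    21.30   0.000     .6410542    .7710671
------------------------------------------------------------------------------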
3. Use the LM statistic to test the null hypothesis that avgsen and tottime have no effect on narr86
once the other factors have been controlled for.
Step 1. Estimate the restricted model
reg narr86 pcnv ptime86 qemp86
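      Source |       SS       df       MS              Number of obs =    2725
-------------+------------------------------           F(  3,  2721) =   39.10
       Model |  83.0741941     3   27.691398           Prob > F      =  0.0000
    Residual |  1927.27296  2721  .708295833           R-squared     =  0.0413
-------------+------------------------------           Adj R-squared =  0.0403
       Total |  2010.34716  2724  .738012906           Root MSE      =   .8416

------------------------------------------------------------------------------
      narr86 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        pcnv |  -.1499274   .0408653    -3.67   0.000    -.2300576   -.0697973
     ptime86 |  -.0344199    .008591    -4.01   0.000    -.0512655   -.0175744
      qemp86 |   -.104113   .0103877   -10.02   0.000    -.1244816   -.0837445
       _cons |   .7117715   .0330066    21.56   0.000      .647051     .776492
------------------------------------------------------------------------------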
Step 2. Obtain the residuals $\tilde{u}$ from the regression.
predict ures, resid
Step 3. Run the regression of $\tilde{u}$ on pcnv, ptime86, qemp86, avgsen, and tottime
reg ures pcnv avgsen tottime ptime86 qemp86
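      Source |       SS       df       MS              Number of obs =    2725
-------------+------------------------------           F(  5,  2719) =    0.81
       Model |  2.87904835     5  .575809669           Prob > F      =  0.5398
    Residual |  1924.39392  2719  .707757969           R-squared     =  0.0015
-------------+------------------------------           Adj R-squared = -0.0003
       Total |  1927.27297  2724  .707515773           Root MSE      =  .84128

------------------------------------------------------------------------------
        ures |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        pcnv |  -.0012971    .040855    -0.03   0.975    -.0814072    .0788129
      avgsen |  -.0070487   .0124122    -0.57   0.570     -.031387    .0172897
     tottime |   .0120953   .0095768     1.26   0.207    -.0066833     .030874
     ptime86 |  -.0048386   .0089166    -0.54   0.587    -.0223226    .0126454
      qemp86 |   .0010221   .0103972     0.10   0.922    -.0193652    .0214093
       _cons |  -.0057108   .0331524    -0.17   0.863    -.0707173    .0592956
------------------------------------------------------------------------------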
Step 4. Calculate the LM statistic
. scalar lm = e(N)*e(r2)
. disp lm
4.0707294
Step 5a. Calculate the p-value
. disp chi2tail(2,lm)
.13063283
Step 5b. Calculate the critical value at $\alpha = 10\%$
. disp invchi2tail(2,0.1)
4.6051702
Step 6. Conclude
Since the critical value (4.605) is greater than the LM statistic (4.0707), we fail to reject the null
hypothesis that $\beta_2 = 0$ and $\beta_3 = 0$ at the 10% level.
The p-value is $P(\chi^2_2 > 4.0707294) = 0.1306$, so we would reject the null at the 15% level.
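As a quick plausibility check on Steps 5a to 6, note that with 2 degrees of freedom the chi-squared upper-tail probability has the closed form $\exp(-x/2)$. The sketch below copies the LM value from Step 4; it is only a cross-check, not part of the LM procedure itself.

```stata
* Sketch only: the LM value is copied from Step 4
scalar lm = 4.0707294

* both lines should give the same p-value when the test has 2 degrees of freedom
display "p-value via chi2tail   = " chi2tail(2, lm)
display "p-value via exp(-lm/2) = " exp(-lm/2)

* 1 if the null is rejected at the 10% level, 0 otherwise
display "reject at 10%: " (lm > invchi2tail(2, 0.10))
```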
Problems 3.4 Heteroskedasticity
Data: sleep75.dta
1. Check the data
2. The model below
$$sleep = \beta_0 + \beta_1 totwrk + \beta_2 educ + \beta_3 age + u$$
can be used to study the tradeoff between time spent sleeping and working and to look at other
factors affecting sleep. If adults trade off sleep for work, what is the sign of $\beta_1$? What signs do
you think $\beta_2$ and $\beta_3$ will have?
3. Estimate the model.
4. If someone works five more hours per week, by how many minutes is sleep predicted to fall? Is
this a large tradeoff?
5. Discuss the sign and magnitude of the estimated coefficient on educ.
6. Would you say totwrk, educ, and age explain much of the variation in sleep? What other factors
might affect the time spent sleeping? Are these likely to be correlated with totwrk?
7. Explain intuitively the Breusch-Pagan and White procedures for testing for the presence of
heteroskedastic errors. Compare the two approaches.
8. Conduct the Breusch-Pagan test for heteroskedasticity for the error term in the equation above
and explain whether you think the error u is heteroskedastic.
9. Conduct the White test for heteroskedasticity for the error term in the equation above and explain
whether you think the error u is heteroskedastic.
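Problems 8 and 9 can be approached with the same LM logic used in Example 5.3: regress the squared OLS residuals on a set of explanatory variables and compare $n \cdot R^2$ with a chi-squared distribution. The sketch below is one way to do this by hand. The variable names sleep, totwrk, educ, and age follow the model above; uhat, uhat2, and the names of the squares and cross products are hypothetical.

```stata
* Sketch only: uhat, uhat2 and the *_sq / *_x_* names are hypothetical
use sleep75.dta, clear
regress sleep totwrk educ age
predict uhat, resid
generate uhat2 = uhat^2

* Breusch-Pagan (LM form): squared residuals on the original regressors
regress uhat2 totwrk educ age
scalar lm_bp = e(N)*e(r2)
display "BP LM = " lm_bp "   p-value = " chi2tail(e(df_m), lm_bp)

* White: add the squares and cross products of the regressors
generate totwrk_sq     = totwrk^2
generate educ_sq       = educ^2
generate age_sq        = age^2
generate totwrk_x_educ = totwrk*educ
generate totwrk_x_age  = totwrk*age
generate educ_x_age    = educ*age
regress uhat2 totwrk educ age totwrk_sq educ_sq age_sq totwrk_x_educ totwrk_x_age educ_x_age
scalar lm_white = e(N)*e(r2)
display "White LM = " lm_white "   p-value = " chi2tail(e(df_m), lm_white)
```

The degrees of freedom equal the number of regressors in each auxiliary regression, which is why the White test uses more of them than the Breusch-Pagan test. A small p-value is evidence against homoskedasticity.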