
ECON 8899 - CAUSAL INFERENCE AND EVIDENCE-BASED POLICY

HOMEWORK 1
Prithvijit Mukherjee
September 24, 2014
1. The authors have three treatment groups and one control group, hence a corresponding set of four causal exposure dummies, $\{D_j\}_{j=0}^{3}$, and a corresponding set of four potential outcome random variables, $\{Y^{j}\}_{j=0}^{3}$. The authors estimate the ATE of receiving each of the three treatments compared with receiving no treatment at all.
The following ATEs are estimated in their trend analysis to elicit the specific effect of one treatment relative to another:
a) ATE of receiving treatment 2 instead of treatment 1.
$\mathrm{ATE} = E[Y^{D_2} \mid D=2] + E[Y^{D_2} \mid D=1] - E[Y^{D_1} \mid D=1] - E[Y^{D_1} \mid D=2]$
b) ATE of receiving treatment 3 instead of treatment 2.
$\mathrm{ATE} = E[Y^{D_3} \mid D=3] + E[Y^{D_3} \mid D=2] - E[Y^{D_2} \mid D=3] - E[Y^{D_2} \mid D=2]$
c) ATE of receiving treatment 3 instead of treatment 1.
$\mathrm{ATE} = E[Y^{D_3} \mid D=3] + E[Y^{D_3} \mid D=1] - E[Y^{D_1} \mid D=3] - E[Y^{D_1} \mid D=1]$
2. If the authors added a fourth treatment to the paper, Table 1 below gives all possible combinations of observable and unobservable potential outcomes. With five assignment states and five potential outcomes there are 25 cells, of which 5 are observable, so there would be 20 counterfactual states in the current study after the inclusion of the fourth treatment variable.
Table 1: Observable and Counterfactual
Treatment  $Y^0$            $Y^1$            $Y^2$            $Y^3$            $Y^4$
$D_0$      Observable as Y  Counterfactual   Counterfactual   Counterfactual   Counterfactual
$D_1$      Counterfactual   Observable as Y  Counterfactual   Counterfactual   Counterfactual
$D_2$      Counterfactual   Counterfactual   Observable as Y  Counterfactual   Counterfactual
$D_3$      Counterfactual   Counterfactual   Counterfactual   Observable as Y  Counterfactual
$D_4$      Counterfactual   Counterfactual   Counterfactual   Counterfactual   Observable as Y
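The count of 20 counterfactual states can be checked with a small sketch. The function name `count_potential_outcome_cells` is hypothetical and introduced only for illustration; it assumes one control state plus the given number of treatments, with exactly one observable potential outcome per assignment state.

```python
def count_potential_outcome_cells(n_treatments: int) -> dict:
    """Count observable and counterfactual cells for n_treatments plus one control."""
    n_states = n_treatments + 1      # the treatments plus the control state
    total_cells = n_states ** 2      # assignment states (rows) x potential outcomes (columns)
    observable = n_states            # one observed outcome per assignment state (the diagonal)
    return {
        "states": n_states,
        "observable": observable,
        "counterfactual": total_cells - observable,
    }


# Four treatments plus a control, as in Table 1.
print(count_potential_outcome_cells(4))
# Three treatments plus a control, as in the original design.
print(count_potential_outcome_cells(3))
```

With four treatments this gives 20 counterfactual cells, and with the original three treatments it gives 12.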
The authors do not need to worry about estimating the unobservable potential outcomes, because treatment assignment is randomized within every stratum (i.e., meter route) and is therefore independent of the potential outcomes, or $\{Y^{j}\}_{j=0}^{4} \perp\!\!\!\perp \{D_j\}_{j=0}^{4}$. Therefore, in expectation, say for treatment 4, the following equality holds:
$E[Y \mid D=4] = E[Y^{D_4} \mid D=0] = E[Y^{D_4} \mid D=1] = E[Y^{D_4} \mid D=2] = E[Y^{D_4} \mid D=3]$
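The equality above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the outcome distribution (mean 50, standard deviation 10), the sample size, and the function name `simulate_randomized_arms` are all hypothetical and not taken from the study. The simulation has access to the treatment-4 potential outcome for every unit, which a real experiment would observe only for units with D = 4.

```python
import random


def simulate_randomized_arms(n_units=50_000, n_arms=5, seed=0):
    """Mean of the treatment-4 potential outcome within each randomly assigned arm."""
    rng = random.Random(seed)
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    for _ in range(n_units):
        y4 = rng.gauss(50.0, 10.0)   # hypothetical potential outcome under treatment 4
        d = rng.randrange(n_arms)    # assignment drawn independently of y4 (randomization)
        sums[d] += y4
        counts[d] += 1
    return [s / c for s, c in zip(sums, counts)]


means = simulate_randomized_arms()
for arm, mean_y4 in enumerate(means):
    print(f"E[Y^D4 | D={arm}] is approximately {mean_y4:.2f}")
```

Because assignment is drawn independently of the potential outcome, the five conditional means differ only by sampling noise.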
3. To estimate the Average Treatment Effect on the Treated (ATT), the authors would have to assume the following:
a) No selection bias, i.e.,
$E[Y^{0} \mid D=0] = E[Y^{0} \mid D=1] = E[Y^{0} \mid D=2] = E[Y^{0} \mid D=3]$
b) SUTVA, in order to ensure that each unit's potential outcomes do not depend on the treatment assignment of other units.
c) Excludability, in order to ensure that it is only the assignment of treatment which impacts the outcome and no other factor, since the authors are trying to elicit the causal impact of a particular treatment.
d) $0 < \Pr(D_i = j) < 1$: the probability of assignment of treatment is strictly between 0 and 1.
4. The treatment is defined as "to initiate a targeted, mail-based residential customer conservation education program through a randomized experimental design". Therefore one can clearly define the counterfactual as the people in the different groups "receiving no targeted mail-based materials" or "not receiving any technical advice on water conservation" or "not receiving weak social norm based targeted letters" and "not receiving strong social norm based targeted letters". Note that the control group kept receiving their water bill with a tip sheet.
5. The central idea behind randomization is to eliminate selection bias and other potential rival explanations, so that differences in outcomes are attributable solely to the treatment and chance variation. Randomization ensures that the assignment of treatment is independent of individual covariates, thus ensuring $\Pr[X \mid D=1] = \Pr[X \mid D=0]$, which implies that covariates are balanced between the control and treatment groups. Testing that there is no systematic correlation between the assignment of treatment and pre-treatment results is critical in ascertaining a causal relationship.
Eyeballing the summary statistics in Table 2 and Table 3 below, one does not observe a striking difference in the 2006 and Spring 2007 pre-treatment results. The covariates seem balanced across the treatment groups. The regression results in Table 4 show no evidence of a systematic relationship between treatment status and pre-treatment results.
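A balance check of this kind can be sketched as follows. With only treatment dummies on the right-hand side, each regression coefficient equals the difference between that group's mean and the control mean, so the sketch computes those differences directly. Everything here is an assumption for illustration: the data are simulated (the household-level data are not reproduced), the function name `balance_check` is hypothetical, and the t statistic uses an unpooled standard error, which is close to but not identical to the OLS t statistic.

```python
import math
import random
import statistics


def balance_check(pre_outcome_by_group, control):
    """Difference in pre-treatment means versus control, with an unpooled t statistic."""
    ctrl = pre_outcome_by_group[control]
    ctrl_mean = statistics.fmean(ctrl)
    ctrl_var_of_mean = statistics.variance(ctrl) / len(ctrl)
    results = {}
    for group, values in pre_outcome_by_group.items():
        if group == control:
            continue
        diff = statistics.fmean(values) - ctrl_mean
        se = math.sqrt(statistics.variance(values) / len(values) + ctrl_var_of_mean)
        results[group] = (diff, diff / se)
    return results


# Simulated pre-treatment water use: all four groups share one distribution,
# as they should under successful randomization (group 4 plays the control).
rng = random.Random(1)
groups = {g: [rng.gauss(58.0, 40.0) for _ in range(5000)] for g in (1, 2, 3, 4)}
results = balance_check(groups, control=4)
for group, (diff, t_stat) in sorted(results.items()):
    print(f"treat{group}: diff = {diff:.3f}, t = {t_stat:.2f}")
```

Small t statistics for every treatment group are what one expects to see when assignment is unrelated to pre-treatment outcomes.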
6. This does not imply that the randomization has failed. All it implies is that one particular covariate is imbalanced (see treat2 in Table 5). But this evidence is weak: the summary statistics in Table 6 show similar means across the various treatment groups. Thus the evidence against balanced randomization is
Table 2: 2006 - Summary Statistics
(June-Nov 2006; groups 1-3 = treatments, group 4 = control)
Group   mean       sd         min   max
1       58.42647   39.95928   20    676
2       58.17559   41.24874   20    801
3       58.43559   40.66821   20    1000
4       58.2982    41.38286   20    2441
Total   58.31386   41.13629   20    2441
Table 3: Spring 2007 - Summary Statistics
(April-May 2007; groups 1-3 = treatments, group 4 = control)
Group   mean       sd         min   max
1       15.97807   11.74354   0     188
2       15.8782    11.6914    0     189
3       15.98219   11.52707   0     186
4       15.89007   12.02063   0     586
Total   15.90848   11.90151   0     586
Table 4: Regression Results
          (1)         (2)
          2006        Spring 2007
treat1    0.128       0.0880
          (0.31)      (0.74)
treat2    -0.123      -0.0119
          (-0.30)     (-0.10)
treat3    0.137       0.0921
          (0.33)      (0.78)
_cons     58.30***    15.89***
          (379.33)    (357.36)
N         106669      106669
t statistics in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
weak. If the experimenter finds imbalance across many covariates, one should consider re-randomization, but that is not warranted based on the evidence for house size alone.
Table 5: House Size Regression
          (1)
          acres_1
treat1    -0.0156
          (-1.41)
treat2    -0.0310**
          (-2.79)
treat3    -0.00297
          (-0.27)
_cons     0.583***
          (140.22)
N         103747
t statistics in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
Table 6: House Size Summary Statistics
(groups 1-3 = treatments, group 4 = control)
Group   mean       sd         min        max
1       .567446    1.089673   .0146871   46.328
2       .5520407   .8484981   .000023    50.293
3       .5801156   1.021446   .0116046   43.0625
4       .5830885   1.14611    0          90.7664
Total   .5776593   1.097905   0          90.7664
7. The within-year variability for Summer 2007 is 28.963. With a two-sided test, the power calculation requires a sample size of 38928. But since the analysis looks at reduced water use, a one-sided test would suffice; with a one-sided test, the required sample size is 28386.
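The gap between the two-sided and one-sided requirements can be illustrated with the standard normal-approximation formula for comparing two means, $n = 2\sigma^2 (z_{\alpha} + z_{\text{power}})^2 / \delta^2$ per group. This is a sketch, not a reproduction of the calculation above: the effect size `effect=1.0`, the significance level, and the power are hypothetical placeholders because they are not stated here, and treating 28.963 as a standard deviation is likewise an assumption. The function name `n_per_group` is introduced only for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(sd, effect, alpha=0.05, power=0.80, one_sided=False):
    """Per-group sample size to detect a difference in means (normal approximation)."""
    z = NormalDist()
    # A one-sided test puts all of alpha in one tail, so its critical value is smaller.
    z_alpha = z.inv_cdf(1 - alpha) if one_sided else z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (sd * (z_alpha + z_power) / effect) ** 2)


# Hypothetical inputs: sd taken as 28.963, effect size of 1.0 chosen arbitrarily.
two_sided = n_per_group(sd=28.963, effect=1.0)
one_sided = n_per_group(sd=28.963, effect=1.0, one_sided=True)
print(two_sided, one_sided)
```

Whatever effect size is assumed, the one-sided requirement is smaller because only the critical value changes between the two calculations.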
8. Table 7 below reports the results from estimating Model B. The advantage of adding the data from 2006 and 2007 is that it increases the precision of the estimates. As observed earlier, the 2006 and 2007 data are uncorrelated with treatment assignment, so including these variables does not bias the treatment estimates; they absorb variation in the outcome due to otherwise unobserved household characteristics, thus lowering the standard errors of the estimates.
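The precision gain from adding a pre-treatment covariate can be illustrated with a small Monte Carlo sketch. All values are hypothetical (a true effect of -1.0, a covariate slope of 0.8, the noise levels, and the sample sizes), and the names `one_experiment` and `compare_precision` are introduced only for illustration. The adjusted estimator is the OLS coefficient on the treatment dummy from a regression of the outcome on the dummy and the covariate, computed here in closed form.

```python
import random
import statistics


def one_experiment(rng, n=400, tau=-1.0):
    """Return (unadjusted, covariate-adjusted) treatment effect estimates."""
    x = [rng.gauss(0.0, 10.0) for _ in range(n)]   # pre-treatment covariate
    d = [rng.random() < 0.5 for _ in range(n)]     # randomized assignment
    y = [0.8 * xi + tau * di + rng.gauss(0.0, 5.0) for xi, di in zip(x, d)]
    x1 = [v for v, di in zip(x, d) if di]
    x0 = [v for v, di in zip(x, d) if not di]
    y1 = [v for v, di in zip(y, d) if di]
    y0 = [v for v, di in zip(y, d) if not di]
    mx1, mx0 = statistics.fmean(x1), statistics.fmean(x0)
    my1, my0 = statistics.fmean(y1), statistics.fmean(y0)
    # Pooled within-group slope of y on x (the OLS coefficient on x).
    sxy = sum((a - mx1) * (b - my1) for a, b in zip(x1, y1))
    sxy += sum((a - mx0) * (b - my0) for a, b in zip(x0, y0))
    sxx = sum((a - mx1) ** 2 for a in x1) + sum((a - mx0) ** 2 for a in x0)
    unadjusted = my1 - my0
    adjusted = unadjusted - (sxy / sxx) * (mx1 - mx0)
    return unadjusted, adjusted


def compare_precision(n_reps=200, seed=2):
    """Spread of each estimator across repeated simulated experiments."""
    rng = random.Random(seed)
    draws = [one_experiment(rng) for _ in range(n_reps)]
    unadjusted = [u for u, _ in draws]
    adjusted = [a for _, a in draws]
    return {
        "sd_unadjusted": statistics.stdev(unadjusted),
        "sd_adjusted": statistics.stdev(adjusted),
        "mean_adjusted": statistics.fmean(adjusted),
    }


summary = compare_precision()
print(summary)
```

Both estimators are centered on the true effect under randomization, but the adjusted one varies less across experiments because the covariate absorbs outcome variation.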
9. The result discussed in the article is based on a pilot of seven children which I doubt
Table 7: Model B Estimation
                 (1)
                 Model B
treat1           -0.24
                 (0.19)
treat2           -0.99***
                 (0.17)
treat3           -1.74***
                 (0.17)
water_2006       0.36***
                 (0.01)
apr_may_07       0.82***
                 (0.04)
Constant         6.21***
                 (0.45)
Observations     106669
Adjusted $R^2$   0.62
Standard errors in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
will have any statistical power to draw a causal inference about the effectiveness of the treatment. Parents who participated in this pilot program are subject to selection bias (since they were either part of a program or referred by the community). There is no medical test for autism, so the cases which saw progress might be due to a wrong diagnosis; false positives could be driving the results.
The outcome was measured months after the treatment, so there could be other rival explanations driving the positive results of the pilot; for example, some parents may have acted of their own accord after the initial treatment, which led to the positive result.
The purpose of the treatment was to ascertain the positive impact of early treatment on autism; however, the control groups which were selected, apart from "Children who also had early autism symptoms but chose to receive treatment at an older age", might not have been ideal comparison groups.