Date: 24.08.2006
Version: tirc_80
University of Essex
Department of Government
Wivenhoe Park
Colchester CO4 3SQ
UK
contact: tpluem@essex.ac.uk, vtroe@essex.ac.uk
Abstract:
Earlier versions of this paper have been presented at the 21st Polmeth conference at Stanford University, Palo Alto, 29.-31. July 2004, the 2005 MPSA conference in Chicago, 7.-10. April and the APSA annual conference 2005 in Washington, 1.-4. September 2005. We thank the editor and the referees of Political
Analysis and Neal Beck, Greg Wawro, Donald Green, Jay Goodliffe, Rodrigo
Alfaro, Rob Franzese, Jörg Breitung and Patrick Brandt for helpful comments
on previous drafts. The usual disclaimer applies.
1. Introduction
The analysis of panel data has important advantages over pure time-series or
cross-sectional estimates, advantages that may easily justify the extra costs of
collecting information in both the cross-sectional and the longitudinal dimension. Many applied researchers rank the ability to deal with unobserved heterogeneity across units most prominently. They pool data just for the purpose of
controlling for the potentially large number of unmeasured explanatory variables by estimating a fixed effects (FE) model.
Yet, these clear advantages of the fixed effects model come at a certain price.
One of its drawbacks, the problem of estimating time-invariant variables in
panel data analyses with unit effects, has widely been recognized: Since the FE
model uses only the within variance for the estimation and disregards the
between variance, it does not allow the estimation of time-invariant variables
(Baltagi 2001, Hsiao 2003, Wooldridge 2002). A second drawback of the FE
model (and by far the less recognized one) is its inefficiency in estimating the
effect of variables that have very little within variance. Typical examples in
political science include institutions, but political scientists have used numerous
variables that show much more variation across units than over time. An
inefficient estimation is not merely a nuisance leading to somewhat higher
standard errors. Inefficiency leads to highly unreliable point estimates and may
thus cause wrong inferences in the same way a biased estimator could. Therefore, the inefficiency of the FE model in estimating variables with low within
variance needs to be taken seriously.
This article discusses a remedy to the related problems of estimating time-invariant and rarely changing variables in fixed effects models with unit effects.
We suggest an alternative estimator that allows estimating time-invariant
variables and that is more efficient than the FE model in estimating variables
that have very little longitudinal variance. We call this superior alternative the
fixed effects vector decomposition (fevd) model, because the estimator decomposes the unit fixed effects into an unexplained part and a part explained by the
time-invariant or the rarely changing variables. The fixed effects vector
decomposition technique involves the following three steps: First, estimation of
the unit fixed effects by the baseline panel fixed effects model excluding the
time-invariant but not the rarely changing right hand side variables. Second,
regression of the fixed effects vector on the time invariant and/or rarely
changing explanatory variables of the original model (by OLS) to decompose
the unit specific effects into a part explained by the time invariant variables and
an unexplained part. And third, estimation of a pooled OLS model by including
all explanatory time-variant variables, the time-invariant variables, the rarely
changing variables and the unexplained part of the fixed effects vector. This
stage is required to control for multicollinearity and to adjust the degrees of
freedom in estimating the standard errors of the coefficients.1
Based on Monte Carlo simulations we demonstrate that the vector decomposition model has better finite sample properties in estimating models that include
either time-invariant or almost time-invariant variables correlated with unit
effects than competing estimators. In the analyses dealing with the estimation of
time-invariant variables, we compare the vector decomposition model to the
fixed effects model, the random effects model, pooled OLS and the Hausman-Taylor model. We find that while the fixed effects model does not compute
coefficients for the time-invariant variables, the vector decomposition model
performs far better than pooled OLS, random effects and the Hausman-Taylor
procedure if both time-invariant and time-varying variables are correlated with
the unit effects.
The analysis of the rarely changing variables takes these results one step
further. Again based on Monte Carlo simulations, we show that the vector
decomposition method is more efficient than the fixed effects model2 and thus
gives more reliable estimates than the fixed effects model under a wide variety
of constellations. Specifically, we find that the vector decomposition model is
superior to the fixed effects model when the ratio between the between variance
and the within variance is large, when the overall R² is low, and when the
correlation between the rarely changing / time-invariant variable and the unit
effects (i.e. the higher the effectively used between variance) is low. These
advantages of the fevd model equally apply to both cross-sectional and time-series dominant panel data. What matters for the estimation problem posed
by time-invariant and rarely changing variables is not so much whether the
data set at hand includes more cases or periods, but whether the between variation exceeds the within variation by a certain threshold.
1
The procedure we suggest is superficially similar to that suggested by Hsiao (2003: 52).
However, Hsiao only claims that his estimate for time-invariant variables ($\gamma$) is consistent
as N approaches infinity. We are interested in the small sample properties of our estimator
and thus explore time-series cross-sectional (TSCS) data. Hsiao (correctly) notes that his
$\hat{\gamma}$ is inconsistent for TSCS. Moreover, he does not provide standard errors for his estimate
of $\gamma$, nor does he compare his estimator to others. Since we fully develop our estimator, we
do not further consider Hsiao's brief discussion.
2
We also ran all simulations on rarely changing variables for the random effects model and
pooled OLS. Unless the time-varying variables are uncorrelated with the unit effects, the
vector decomposition model performs strictly better than both competitors. For the sake of
clarity and simplicity, we do not report simulation output for pooled OLS and the RE model
in the section dealing with rarely changing variables.
From a substantive perspective, this article contributes to an ongoing debate about
the pros and cons of fixed effects models (Green et al. 2001; Beck/ Katz 2001;
Plümper et al. 2005; Wilson/ Butler 2003; Beck 2001). While the various parties in the debate put forward many reasons for and against fixed effects
models, this paper analyzes the conditions under which the fixed effects model is
inferior to alternative estimation procedures. Most importantly, it suggests a
superior alternative for the cases in which the FE model's inefficiency impedes
reliable point estimates.
We proceed as follows: In section 2 we illustrate the estimation problem and
discuss how applied researchers dealt with it. In section 3, we describe the
econometrics of the fixed effects vector decomposition procedure in detail.
Section 4 explains the setup of the Monte Carlo experiments. Section 5 analyzes
the finite sample properties of the proposed fevd procedure relative to the fixed
effects and the random effects model, the pooled OLS estimator, and the
Hausman-Taylor procedure in estimating time-invariant variables and section 6
presents MC analyses for rarely changing variables in which we, without loss of
generality, compare only the fixed effects model to the vector decomposition
model. Section 7 concludes.
2.
invariant or rarely changing variable. The level of democracy, the status of the
president, electoral rules, central bank autonomy, or federalism, to mention
just a few, do not change often even in relatively long pooled time-series
datasets. Other politically relevant variables, such as the size of the minimum
winning coalition, and the number of veto-players change more frequently, but
the within variance, the variance over time, typically falls short of the between
variance, the variance across units. The same may hold true for some macroeconomic aggregates. Indeed, government spending, social welfare, tax rates,
pollution levels, or per capita income change from year to year, but panels of
these variables can still be dominantly cross-sectional.
Unfortunately, the problem of rarely changing variables in panel data with unit
effects remained by-and-large unobserved.3 Since the fixed effects model can
compute a coefficient if regressors are almost time-invariant, it seems fair to say
that most applied researchers have accepted the resulting inefficiency of the
estimate without paying too much attention. Yet, as Nathaniel Beck has
unmistakably formulated: "(...) although we can estimate (...) with slowly
changing independent variables, the fixed effect will soak up most of the
explanatory power of these slowly changing variables. Thus, if a variable (...)
changes over time, but slowly, the fixed effects will make it hard for such
variables to appear either substantively or statistically significant." (Beck 2001:
285) Perhaps even more importantly, inefficiency does not just imply low levels
of significance; point estimates are also unreliable since the influence of the error
on the estimated coefficients becomes larger as the inefficiency of the estimator
increases.
In comparison, far more attention has been devoted to the problem of time-invariant variables. With the fixed effects model not computing coefficients for
time-invariant variables, most applied researchers apparently estimated empirical models that include time-invariant variables by random effects models or by
pooled-OLS (see for example Elbadawi/ Sambanis 2002; Acemoglu et al. 2002;
Knack 1993; Huber/ Stephens 2001). Daron Acemoglu et al. (2002) justify not
controlling for unit effects by stating the following: "Recall that our interest is
in the historically-determined component of institutions (that is more clearly
exogenous), hence not in the variations in institutions from year-to-year. As a
result, this regression does not (cannot) control for a full set of country
dummies." (Acemoglu et al. 2002: 27)
3
None of the three main textbooks on panel data analysis (Baltagi 2001, Hsiao 2003,
Wooldridge 2002) refers explicitly to the inefficiency of estimating rarely changing variables
in a fixed effects approach.
Clearly, both the random effects model and pooled-OLS are inconsistent and
biased when regressors are correlated with the unit effects. Employing these
models trades the ability to compute estimates of time-invariant variables for
the unbiased estimation of time-varying variables. Thus, they may be a second-best solution if researchers are solely interested in the coefficients of the time-invariant variables.
In contrast, econometric textbooks typically recommend the Hausman-Taylor
procedure for panel data with time-invariant variables and correlated unit
effects (Hausman/ Taylor 1981; see Wooldridge 2002: 325-328; Hsiao 2003: 53).
The idea of the estimator is to overcome the bias of the random effects model in
the presence of correlated unit effects and the solution is standard: if a variable
is endogenous, use appropriate instruments. In brief, this procedure estimates a
random effects model and uses exogenous time-varying variables as instruments
for the endogenous time-varying variables and exogenous time-invariant
variables plus the unit means of the exogenous time varying variables as
instruments for the endogenous time-invariant variables (textbook characterizations of the Hausman-Taylor model can be found in Wooldridge 2002, pp. 225-28 and Hsiao 2003, pp. 53ff). From an econometric perspective, the procedure is a
consistent solution to the potentially severe problem of correlation between unit
effects and time-invariant variables. Unfortunately, the procedure can only work
well if the instruments are uncorrelated with the errors and the unit effects and
highly correlated with the endogenous regressors. Identifying those instruments
is a formidable task especially since the unit effects are unobserved (and often
unobservable). Nevertheless, the Hausman-Taylor estimator has recently gained
in popularity at least among economists (Egger/ Pfaffermayr 2004).
3.
Recall the data-generating process of a fixed effects model with time invariant
variables:
$$y_{it} = \alpha + \sum_{k=1}^{K} \beta_k x_{kit} + \sum_{m=1}^{M} \gamma_m z_{mi} + u_i + \varepsilon_{it} \tag{1}$$
where the x-variables are time-varying and the z-variables are assumed to be
time-invariant.4 $u_i$ denotes the unit specific effects (fixed effects) of the data
generating process and $\varepsilon_{it}$ is the iid error term; $\beta$ and $\gamma$ are the parameters to
estimate.
In the first stage, the fixed effects vector decomposition procedure estimates a
standard fixed effects model. The fixed effects transformation can be obtained
by first averaging equation (1) over T:
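$$\bar{y}_i = \alpha + \sum_{k=1}^{K} \beta_k \bar{x}_{ki} + \sum_{m=1}^{M} \gamma_m z_{mi} + \bar{e}_i + u_i \tag{2}$$
where
$$\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}, \quad \bar{x}_i = \frac{1}{T}\sum_{t=1}^{T} x_{it}, \quad \bar{e}_i = \frac{1}{T}\sum_{t=1}^{T} e_{it}$$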
and e stands for the residual of the estimated model. Then equation 2 is
subtracted from equation 1. As is well known, this transformation removes the
individual effects $u_i$ and the time-invariant variables z. We get
$$y_{it} - \bar{y}_i = \sum_{k=1}^{K} \beta_k (x_{kit} - \bar{x}_{ki}) + \sum_{m=1}^{M} \gamma_m (z_{mi} - \bar{z}_{mi}) + (e_{it} - \bar{e}_i) + (u_i - \bar{u}_i) \tag{3}$$
variables of the fixed effects transformation. We run this fixed effects model
with the sole intention to obtain estimates of the unit effects $\hat{u}_i$. At this point,
it is important to note that the estimated unit effects $\hat{u}_i$ do not equal the unit
effects $u_i$ in the data generating process, since the estimated unit effects include all
time-invariant variables, the overall constant term and the mean effects of the
time-varying variables x. Equation 4 explains how the unit effects are computed
and what explanatory variables account for these unit effects:
$$\hat{u}_i = \bar{y}_i - \sum_{k=1}^{K} \hat{\beta}_k^{FE} \bar{x}_{ki} - \bar{e}_i \tag{4}$$
where $\hat{\beta}_k^{FE}$ is the pooled OLS estimate of the demeaned model in equation 3.
The $\hat{u}_i$ include the unobserved unit specific effects as well as the observed unit
specific effects z, the unit means of the residuals $\bar{e}_i$ and the time-varying
variables $\bar{x}_{ki}$, whereas $u_i$ only accounts for unobservable unit specific effects. In
4
In section 5 we assume that one z-variable is rarely changing and thus only almost time-invariant.
stage 2 we regress the unit effects $\hat{u}_i$ from stage 1 on the observed time-invariant and rarely changing variables, the z-variables (see equation 5), to
obtain the unexplained part $h_i$ (which is the residual from regressing the unit
specific effect on the z-variables). In other words, we decompose the estimated
unit effects into two parts, an explained and an unexplained part that we dub
$h_i$:
$$\hat{u}_i = \sum_{m=1}^{M} \gamma_m z_{mi} + h_i \tag{5}$$
$$h_i = \hat{u}_i - \sum_{m=1}^{M} \gamma_m z_{mi} \tag{6}$$
As we said above, this crucial stage decomposes the unit effects into an
unexplained part and a part explained by the time-invariant variables. We are
solely interested in the unexplained part $h_i$.
In stage 3 we re-run the full model without the unit effects but including the
unexplained part $h_i$ of the decomposed unit fixed effect vector obtained in stage
2. This stage is estimated by pooled OLS.
$$y_{it} = \alpha + \sum_{k=1}^{K} \beta_k x_{kit} + \sum_{m=1}^{M} \gamma_m z_{mi} + h_i + \varepsilon_{it}. \tag{7}$$
of units (N-1) to account for the number of estimated unit effects in stage 1.
The deviation of fevd standard errors from pooled OLS standard errors of the
same model increases in N and decreases in T. Not correcting the degrees of
freedom leads to a potentially serious underestimation of standard errors and
overconfidence in the results. In adjusting the standard errors we explicitly
control for the specific characteristics of the three step approach.
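The three stages can be sketched in a few lines of Python. This is an illustrative sketch only and not the authors' implementation: the function name `fevd`, the NumPy layout and the simulated data are our assumptions. It covers time-invariant z-variables only, adds a constant in stage 2 because the estimated unit effects absorb the overall constant, and implements the degrees-of-freedom correction by subtracting N-1 from the pooled OLS residual degrees of freedom, which is one reading of the adjustment described above.

```python
import numpy as np

def fevd(y, X, Z, unit):
    """Three-stage sketch for time-invariant Z (illustrative, hypothetical helper).

    y: (NT,) outcome; X: (NT, K) time-varying regressors;
    Z: (NT, M) regressors that are constant within each unit;
    unit: (NT,) integer unit labels 0..N-1.
    Returns stage 3 coefficients ordered [constant, X, Z, h] and standard errors.
    """
    n_units = unit.max() + 1
    counts = np.bincount(unit, minlength=n_units)

    def unit_means(a):
        # Column-wise mean of `a` within each unit.
        a = a.reshape(len(unit), -1)
        sums = np.zeros((n_units, a.shape[1]))
        np.add.at(sums, unit, a)
        return sums / counts[:, None]

    # Stage 1: within (fixed effects) regression of demeaned y on demeaned X.
    y_bar, X_bar = unit_means(y)[:, 0], unit_means(X)
    beta_fe, *_ = np.linalg.lstsq(X - X_bar[unit], y - y_bar[unit], rcond=None)
    # Estimated unit effects as in equation 4 (within residuals have zero unit means).
    u_hat = y_bar - X_bar @ beta_fe

    # Stage 2: regress the unit effects on the z-variables; keep the residual h.
    W = np.column_stack([np.ones(n_units), unit_means(Z)])
    gamma_2, *_ = np.linalg.lstsq(W, u_hat, rcond=None)
    h = u_hat - W @ gamma_2

    # Stage 3: pooled OLS of y on a constant, X, Z and h.
    D = np.column_stack([np.ones(len(y)), X, Z, h[unit]])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ coef
    # Assumed form of the correction: N-1 fewer degrees of freedom for stage 1.
    dof = len(y) - D.shape[1] - (n_units - 1)
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(D.T @ D)))
    return coef, se

# Simulated example (all values hypothetical): x is correlated with the unit effects.
rng = np.random.default_rng(0)
N, T = 30, 20
unit = np.repeat(np.arange(N), T)
u = rng.normal(size=N)
z = rng.normal(size=N)
x = rng.normal(size=N * T) + 0.5 * u[unit]
y = 1.0 + 0.5 * x + 2.0 * z[unit] + u[unit] + rng.normal(size=N * T)
coef, se = fevd(y, x[:, None], z[unit][:, None], unit)
print(coef)  # order: constant, x, z, h
```

In this sketch the stage 3 coefficients on the x-variables reproduce the stage 1 fixed effects estimates and the coefficient on h equals one by construction, so what stage 3 contributes is the set of standard errors.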
Estimating the model requires that heteroscedasticity and serial correlation
be dealt with. If the data at hand exhibit either, we suggest
running a robust sandwich estimator or a model with panel-corrected standard
errors (in stage 3) and including the lagged dependent variable (Beck and
Katz 1995) and/or modeling the dynamics by an AR(1) process (Prais-Winsten
transformation of the original data in stages 1 and 3).5 The coefficients of the
time-invariant variables are estimated in a procedure similar to cross-sectional
OLS. Accordingly, the estimation of time invariant variables shares the pooled
OLS properties. However, because the estimation deals with an omitted variable (the
unobserved unit effects), the estimator remains inconsistent even if N
approaches infinity. A potential solution is to use instruments for the time-invariant and rarely changing variables correlated with the unit effects.
However, such instruments are notoriously difficult to find, especially since unit
effects are unobservable.
In the absence of appropriate instruments, all existing estimators give biased
results. In the case of the fixed effects vector decomposition model, this is the
case because in order to compute coefficients for the time invariant variables,
we need to make a stark assumption: All variance is attributed to the rarely
changing or time-invariant variables and the covariance between the z-variables
and the fixed effects is assumed to be zero.
The bias of $\hat{\gamma}_m$ estimated by OLS in the second stage depends on the covariance
between the z-variables and the fixed effects and the cross-sectional variance of
the z-variables. The bias of $\hat{\gamma}_m$ is positive, that is, the coefficient of the z-variables tends
to be larger than the true value, if the rarely changing and time-invariant
5
Since pcse and robust options only manipulate the VC matrix and therefore the standard
errors of the coefficients, it is sensible to do these corrections only in stage 3, because stage 1
is solely used to obtain the fixed effects (which are not altered by either pcse or robust VC
matrix). A correction for serial correlation by a Prais-Winsten transformation also affects
the estimates and therefore the estimated fixed effects in stage 1 and is accordingly
implemented in both stage 1 and stage 3 of the procedure.
variables co-vary positively with the unit fixed effects and vice versa. The larger
the between variance of the z-variables, the smaller the actual bias of $\hat{\gamma}_m$.
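For a single time-invariant variable, this description matches the familiar omitted-variable result. As a sketch in simplified notation (one z-variable, stage 2 treated as a cross-sectional OLS of the estimated unit effects on $z_i$, and the stage 1 coefficients taken as consistently estimated; these simplifications are our assumptions):

```latex
\operatorname{plim}_{N \to \infty} \hat{\gamma} = \gamma + \frac{\operatorname{Cov}(z_i, u_i)}{\operatorname{Var}(z_i)}
```

The sign of the bias follows the sign of the covariance between $z_i$ and $u_i$, and for a given covariance the bias shrinks as the cross-sectional (between) variance of $z_i$ grows.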
4.
For two reasons we are not interested in analyzing the infinite sample properties: First,
econometric textbook wisdom suggests that pooled OLS is the best estimator if N equals
infinity while the FE model has the best properties for the problem at hand if N is finite and
T infinite. And second, data sets used by applied researchers have typically fairly limited
sizes. Adolph, Butler, and Wilson (2005, pp 4-5) show that most data sets analyzed by
political scientists consist of between 20 and 100 cases typically observed over between 20
and 50 periods. Unfortunately, an estimator with optimal asymptotic properties does not
need to perform best with finite samples.
$\varepsilon_{it}$ is white noise and is, for each run, repeatedly drawn from a standard normal
possible permutations of these settings is 2000, which would have led to 2000
times the aggregated number of estimators used in both experiments times 1000
single estimations in the Monte Carlo analyses. In total, this would have given
18 million regressions. However, without loss of generality, we simplified the
Monte Carlos and estimated only 980,000 single regression models. We report
only representative examples of these Monte Carlo analyses here, but the output
of the simulations is available upon request.
N~(0,1); z3 in chapter 5 is rarely changing, the between and within standard deviation for
this variable are changed according to the specifications in figures 5-7.
5.
We report the RMSE and the bias of the five estimators, averaged over 10
experiments with varying correlation between z3 and $u_i$. The Monte Carlo
analysis underlying Table 1 holds the sample size and the correlation between
x3 and $u_i$ constant. In other words, we vary only the correlation between the
correlated time-invariant variable z3 and the unit effects corr(u,z3).
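Bias and RMSE can be computed from the Monte Carlo draws of a coefficient as sketched below. The function name `bias_and_rmse` and the toy estimates are hypothetical; the definitions are the standard ones (mean deviation from the true value, and the square root of the mean squared deviation), which we assume match those used for the tables.

```python
import numpy as np

def bias_and_rmse(estimates, true_value):
    """Bias and root mean squared error of repeated estimates of one coefficient."""
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - true_value
    rmse = np.sqrt(np.mean((estimates - true_value) ** 2))
    return bias, rmse

# Toy example with three hypothetical draws around a true coefficient of 1.
bias, rmse = bias_and_rmse([0.9, 1.1, 1.3], true_value=1.0)
print(bias, rmse)
```

Because the RMSE combines bias and sampling variance, an unbiased but inefficient estimator can score worse than a slightly biased but precise one, which is the comparison the tables are designed to make.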
Table 1 about here
Observe first that (in this and all following tables) we highlight all estimation
results in which the estimator performs best or within a narrow range of 10%
(of the RMSE) of the best estimator. Table 1 reveals that estimators vary
widely in respect to the correlated explanatory variables x3 and z3. While the
vector decomposition model, Hausman-Taylor, and the fixed effects model
estimate the coefficient of the correlated time-varying variable (x3) with almost
identical accuracy, pooled OLS, the vector decomposition model and the
random effects model perform more or less equally well in estimating the effects
of the correlated time-invariant variable (z3). In other words, only the fixed
effects vector decomposition model performs best with respect to both variables
correlated with the unit effects, x3 and z3.
The poor performance of Hausman-Taylor results from the inefficiency of
instrumental variable models. While it holds true that one can reduce the
inefficiency of the Hausman-Taylor procedure by improving the quality of the
instruments (Breusch/ Mizon/ Schmidt 1989; Amemiya/ MaCurdy 1986, Baltagi/ Khanti-Akom 1990, Baltagi / Bresson/ Pirotte 2003, Oaxaca/ Geisler 2003),
all carefully selected instruments have to satisfy two conditions simultaneously:
they have to be uncorrelated with the unit effects and correlated with the
endogenous variables. Needless to say, finding instruments which simultaneously satisfy these two conditions is a difficult task, especially since the unit
effects cannot be observed but only estimated.
Pooled OLS and the random effects model fail to adequately account for the
correlation between the unit effects and both the time-invariant and the time-varying variables. Hence, parameter estimates for all variables correlated with
the unit effects are biased. When applied researchers are theoretically interested
in both time-varying and time-invariant variables, the fixed effect vector
decomposition technique is superior to its alternatives.
Figures 1a-d allow an equally easy comparison of the five competing estimators.
Note that in the simulations underlying these figures, we held all parameters
constant and varied only the correlation between the time-invariant variable z3
and $u_i$ (Figures 1a and 1b) and the time-varying variable x3 and $u_i$ (Figures 1c
and 1d), respectively. Figures 1a and 1c display the effect of this variation on
the RMSE of the estimates for the time-varying variable x3, Figures 1b and 1d
the effect on the coefficient of the time-invariant variable z3.
[Figure 1, panels a-d. Vertical axes: RMSE (x3) and RMSE (z3). Horizontal axes: corr(z3, u) and corr(x3, u), each running from 0.0 to 1.0. Fixed settings shown in the panels: rho($u_i$, x3) = 0.3 and rho($u_i$, z3) = 0.3. Estimators shown: pooled OLS, random effects, fixed effects, Hausman-Taylor, xtfevd.]
Figures 1 a-d: Change in the RMSE over variation in the correlation between
the unit effects and z3, x3, respectively
Figures 1a to 1d re-establish the results of Table 1. We find that fevd, random
effects and pooled OLS perform equally well in estimating the coefficient of the
correlated time-invariant variable z3, while fixed effects, Hausman-Taylor and
fevd are superior in estimating the coefficient of time-varying variable x3. We
find that the advantages of the vector decomposition procedure over its
alternatives do not depend on the size of the correlation between the regressors
and the unit effects but rather hold over the entire bandwidth of correlations.
The fixed effects vector decomposition model is the sole model which gives
reliable finite sample estimates if the dataset to be estimated includes time-varying and time-invariant variables correlated with the unit effects.8 This
seems to suggest that there is no reason to use the fevd estimator in the absence
of time-invariant variables. In the following section, we demonstrate that this
conclusion is not correct. The fevd estimator also gives more reliable estimates
of the coefficients of variables which vary over time but which are almost time-invariant. We call those variables rarely changing variables.
6.
One advantage of the fixed effects vector decomposition procedure over the
Hausman-Taylor procedure and the Hsiao suggestions is that it extends nicely
to almost time-invariant variables. Estimation of these variables by fixed effects
gives a coefficient, but the estimation is extremely inefficient and hence the
estimated coefficients are unreliable (Green et al. 2001; Beck/ Katz 2001).
However, if we do not estimate the model by fixed effects, then the estimated
coefficients are biased if the regressor is correlated with the unit effects. Since it
seems not unreasonable to assume that the unit effects are made up primarily of
geographical and various institutional variables, it is not unreasonable to
perform an orthogonal decomposition of the unit effects into an explained part and an unexplained
part as described above. Clearly, the orthogonality assumption is often
incorrect and this will inevitably bias the estimated coefficients of the almost
time-invariant variables. As we will demonstrate in this section, this bias is,
under identifiable conditions, less harmful than the inefficiency caused by fixed
effects estimation. At the same time, our procedure is also superior to
8
Appendix A demonstrates that this result also holds true when we vary the sample size.
Even with a comparably large T and N the fixed effects vector decomposition model
performs best.
estimation by random effects or pooled OLS as we leave the time-varying
variables unbiased whereas the latter two procedures do not.
Obviously the performance of fevd will depend on what exactly the data
generating process is. In our simulations we show that unless the DGP is highly
unfavorable for fevd, our procedure performs reasonably well and is generally
better than its alternatives.
Before we report the results of the Monte Carlo simulations, let us briefly
explain why the estimation of almost time-invariant variables by the standard
fixed effects model is problematic due to inefficiency and what that inefficiency
does to the estimate. The inefficiency of the FE model results from the fact that
it disregards the between variation. Thus, the FE model does not take all the
available information into account. In technical terms, the estimation problem
stems from the asymptotic variance of the fixed effects estimator that is shown
in equation 8:
( )
(8)
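For orientation, the textbook variance of the within estimator in the single-regressor case, under homoskedastic and serially uncorrelated errors, can be written as follows. The notation is ours and need not coincide with that of equation 8:

```latex
\operatorname{Var}\left(\hat{\beta}^{FE}\right) = \frac{\sigma_{\varepsilon}^{2}}{\sum_{i=1}^{N} \sum_{t=1}^{T} \left(x_{it} - \bar{x}_i\right)^{2}}
```

Only within variation enters the denominator, so a regressor that barely changes over time has a small denominator and hence a large sampling variance, however large its between variation.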
We have also compared the vector decomposition and the fixed effects model to pooled OLS
and the random effects model. Since all findings for time-invariant variables carry over to
rarely changing variables, indicating that the vector decomposition model dominates pooled
OLS and random effects models, we report the results of the RE and pooled OLS Monte
Carlos only in the online appendix.
Results displayed in Table 2 mirror those reported in Table 1. As before, we
find that only the fevd procedure gives sufficiently reliable estimates for both
the correlated time-varying x3 and the rarely changing variable z3. As expected,
the fixed effects model provides far less reliable estimates of the coefficients of
rarely changing variables. There can thus be no doubt that the fixed effects
vector decomposition model can improve the reliability of the estimation in the
presence of variables with low within and relatively high between variance. We
also find that pooled OLS and the RE model estimate rarely changing variables
with more or less the same degree of reliability as the fevd model but are far
worse in estimating the coefficients of time-varying variables. Note that these
results are robust regardless of sample size.10
Since any further discussion of these issues would be redundant, we do not
further consider the RE and the pooled OLS model in this section. Rather, this
section provides answers to two interrelated questions: First, can the vector
decomposition model give more reliable estimates (a lower RMSE) than the FE
model? And second, in case we can answer the first question positively, what
are the conditions that determine the relative performance of both estimators?
To answer these questions, we assess the finite sample properties of the
competing models in estimating rarely changing variables by a second series of
Monte Carlo experiments. With one notable exception, the data generating
process in this section is identical to the one used in section 5. The exception is
that now z3 is not time-invariant but a rarely changing variable with a low
within variation and a defined ratio of between to within variance.
The easiest way to explore the relative performance of the fixed effects model
and the vector decomposition model is to change the ratio between the between
variance and the within variance across experiments. We call this ratio the b/w-ratio and compute it by dividing the between standard deviation by the within
standard deviation of a variable. There are two ways to vary this ratio
systematically: we can hold the between variation constant and vary the within
variation or we can hold the within variation constant and vary the between
variation. We use both techniques. In Figure 2, we hold the between standard
10
We re-ran all Monte Carlo experiments on rarely changing variables for different sample
sizes. Specifically, we analyzed all permutations of N={15, 30, 50, 70, 100} and T={20, 40,
70, 100}. The results are shown in Table A2 of Appendix A (see the Political Analysis
webpage). All findings for rarely changing variables remain valid for larger and smaller
samples, as well as for N exceeding T and T exceeding N.
deviation constant at 1.2 and change the within standard deviation successively
from 0.15 to 1.73, so that the ratio of between to within variation varies
between 8 and 0.7. In Figure 3, we hold the within variance constant and
change the between variance.
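The b/w-ratio can be computed directly from a panel variable. The following is a minimal sketch, assuming integer unit labels starting at zero; the function name `bw_ratio` and the choice of sample standard deviations (dividing by N-1 for the unit means and by NT-1 for the within deviations) are our assumptions and are not taken from the simulations reported here.

```python
import numpy as np

def bw_ratio(x, unit):
    """Between SD divided by within SD of a panel variable (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    unit = np.asarray(unit)
    n_units = unit.max() + 1
    # Unit means: sum of x per unit divided by the number of observations per unit.
    means = np.bincount(unit, weights=x, minlength=n_units) / np.bincount(unit, minlength=n_units)
    between_sd = means.std(ddof=1)            # variation of unit means across units
    within_sd = (x - means[unit]).std(ddof=1)  # variation around each unit's own mean
    return between_sd / within_sd

# Toy panel: two units observed twice each.
ratio = bw_ratio([0, 2, 10, 12], [0, 0, 1, 1])
print(ratio)
```

With the settings described above, a between standard deviation of 1.2 and a within standard deviation of 0.15 give a ratio of 8, while a within standard deviation of 1.73 gives a ratio of roughly 0.7.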
[Figure 2. Vertical axis: RMSE (z3), shown for the fixed effects and fevd estimators. Parameter settings: N=30, T=20, rho(u,x3)=0, rho(u,z3)=0.3, within SD (z3) from 0.15 to 1.73.]
[Figure 3. Vertical axis: RMSE (z3), shown for the fixed effects and fevd estimators. Parameter settings: N=30, T=20, rho(u,x3)=0, rho(u,z3)=0.3, within SD (z3): 1.]
[Figure 4 plot. Vertical axis: b/w-ratio (0.0 to 5.0). Horizontal axis: correlation of z3 and $u_i$ (0.0 to 0.9). Parameter settings: N=30, T=20, R² = 0.5, rho(u,x3)=0, within SD (z3): 1.]
Figure 4: The correlation between z3 and $u_i$ and the minimum ratio between
the between and within standard deviation that renders fevd superior to the
fixed effects model
Note that, as expected, the threshold b/w-ratio is strictly increasing in the
correlation between the rarely changing variable and the unobserved unit
effects. In the case where the rarely changing variable is uncorrelated with $u_i$,
the threshold of b/w-ratio is as small as 0.2. At a correlation of 0.3, fevd is
superior to the FE model if the b/w-ratio is larger than approximately 1.7; at a
correlation of 0.5 the threshold increases to about 2.8 and at a correlation of 0.8
the threshold gets close to 3.8. Therefore, we cannot offer a simple rule of
thumb which informs applied researchers of when a particular variable is better
estimated as an invariant variable by fevd or as a time-varying variable. Even worse,
the correlation between the unit effects and the rarely changing variable is not
directly observable, because the unit effects are unobservable. However, the
odds are that at a b/w-ration of at least 2.8, the variable is better included into
the stage 2 estimation of fevd than estimated by a standard FE model.
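As an illustration only, the b/w-ratio of a single panel variable could be computed along the following lines. This is a minimal sketch, not the procedure used for the Monte Carlo experiments: the function name `between_within_ratio`, the long-format input (one unit identifier and one value per observation) and the use of population standard deviations are assumptions made for the example. Statistical packages may use different denominators, so values close to a threshold should be read with care.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def between_within_ratio(unit_ids, values):
    """Return (between_sd, within_sd, ratio) for one panel variable.

    unit_ids and values are parallel sequences in long format.
    Population standard deviations are an assumption of this sketch.
    """
    by_unit = defaultdict(list)
    for unit, value in zip(unit_ids, values):
        by_unit[unit].append(value)

    # Between variation: spread of the unit means.
    unit_means = {unit: mean(obs) for unit, obs in by_unit.items()}
    between_sd = pstdev(list(unit_means.values()))

    # Within variation: spread of the deviations from each unit's own mean.
    deviations = [v - unit_means[u] for u, v in zip(unit_ids, values)]
    within_sd = pstdev(deviations)

    # A time-invariant variable has no within variation at all.
    ratio = math.inf if within_sd == 0 else between_sd / within_sd
    return between_sd, within_sd, ratio


# Two units observed twice: unit means 1 and 11, deviations of -1 and +1.
units = ["A", "A", "B", "B"]
z = [0.0, 2.0, 10.0, 12.0]
between_sd, within_sd, ratio = between_within_ratio(units, z)
```

In this toy example the between SD is 5, the within SD is 1 and the ratio is 5, which lies above every threshold reported for Figure 4. A variable that never changes within units returns an infinite ratio.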
Applied researchers can improve estimates created by the vector decomposition
model by reducing the potential for correlation. To do so, stage 2 of the fevd
model needs to be studied carefully. We can reduce the potential for bias of the
estimation by including additional time-invariant or rarely-changing variables
into stage 2. This may reduce bias but is likely to also reduce efficiency.
Alternatively, applied researchers can use variables which are uncorrelated with
the unit effects as instruments for potentially correlated time-invariant or rarely
changing variables, a strategy which resembles the Hausman-Taylor model.
Yet, as we have repeatedly pointed out, it is impossible to tell good from bad
instruments since the unit effects cannot be observed.
The decision whether to treat a variable as time invariant or varying depends
on the ratio of between to within variation of this variable and on the
correlation between the unit effects and the rarely changing variables. In this
respect, the estimation of time-invariant variables is just a special case of the
estimation of rarely changing variables: a special case in which the
between-to-within variance ratio equals infinity and fevd is consequently better.
These findings suggest that, strictly speaking, the level of within variation
does not influence the relative performance of fevd and FE models. However,
with a relatively large within variance, the problem of inefficiency does not
matter much: the RMSE of the FE estimator will be low. Still, if the within
variance is large but the between variance is much larger, the vector
decomposition model will perform better on average. With a large within
variance, the actual absolute advantage in reliability of the fevd estimator will
be tiny.
From a more general perspective, the main result of this section is that the
choice between the fixed effects model and the fevd estimator depends on the
relative efficiency of the estimators and on the bias. As King, Keohane and
Verba have argued (1994: p. 74), applied researchers are not well advised if they
base their choice of the estimator solely on unbiasedness. At times, point
predictions become more reliable (the RMSE is smaller) when researchers use
the more efficient estimator. The fixed effects vector decomposition model is
more efficient than the fixed effects model since it uses more information.
Rather than just relying on the within variance, our estimator also uses the
between variance to compute coefficients.
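The trade-off between bias and efficiency can be made concrete with a small numerical sketch. The helper `rmse_bias_sd` and the two sets of hypothetical estimates below are assumptions introduced for illustration and are not taken from the Monte Carlo experiments. The sketch relies on the standard identity that the squared RMSE equals the squared bias plus the sampling variance.

```python
import math
from statistics import mean, pstdev


def rmse_bias_sd(estimates, true_value):
    """Summarise repeated estimates of one coefficient.

    Returns (rmse, bias, sd); rmse**2 == bias**2 + sd**2 when sd is the
    population standard deviation of the estimates.
    """
    errors = [b - true_value for b in estimates]
    rmse = math.sqrt(mean([e * e for e in errors]))
    bias = mean(errors)
    sd = pstdev(estimates)  # sampling spread, i.e. the (in)efficiency
    return rmse, bias, sd


TRUE_BETA = 1.0  # hypothetical true coefficient

# Unbiased but inefficient: centred on the truth, widely scattered.
unbiased_noisy = [0.0, 2.0]
# Biased but efficient: off target by 0.5, tightly clustered.
biased_precise = [1.4, 1.6]

rmse_unbiased, bias_unbiased, sd_unbiased = rmse_bias_sd(unbiased_noisy, TRUE_BETA)
rmse_biased, bias_biased, sd_biased = rmse_bias_sd(biased_precise, TRUE_BETA)
```

In this constructed case the biased estimator has the smaller RMSE (about 0.51 against 1.0), so its point estimates are on average closer to the true value even though it is not unbiased.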
6.
Conclusion
Under identifiable conditions, the vector decomposition model produces more reliable estimates for time-invariant and rarely changing variables in panel data
with unit effects than any alternative estimator of which we are aware. The case
for the vector decomposition model is clear when researchers are interested in
time-invariant variables. While the fixed effects model does not compute
coefficients of time-invariant variables, the vector decomposition model performs
better than the Hausman-Taylor model, pooled OLS and the random effects
model.
The case for the vector decomposition model is less straightforward when at
least one regressor is not strictly time-invariant but shows some variation across
time. Nevertheless, under many conditions the vector decomposition technique
produces more reliable estimates. These conditions are: first and most
importantly, the between variation needs to be larger than the within variation;
and second, the higher the correlation between the rarely changing variable and
the unit effects, the worse the vector decomposition model performs relative to
the fixed effects model and the higher the b/w-ratio needs to be to render fevd
more reliable.
From our Monte Carlo results, we can derive the following rules that may
inform the applied researcher's selection of an estimator on a more general level:
Estimation by Pooled-OLS or random effects models is only appropriate if unit
effects do not exist or if the Hausman-test suggests that existing unit effects are
uncorrelated with the regressors. If either of these conditions is not satisfied, the
fixed effects model and the vector decomposition model compute more reliable
estimates for time-varying variables. Among these models, the fixed effects
model performs best if the within variance of all regressors of interest is
sufficiently large in comparison to their between variance. We suggest
estimating a fixed effects model unless the between-to-within variance ratio
exceeds 2.8 for at least one variable of interest. Above that threshold, the
efficiency of the fixed effects vector decomposition model becomes more
important than the unbiasedness of the fixed effects model. Therefore, the
vector decomposition procedure is the model of choice if at least one regressor is
time-invariant or if the between variation of at least one regressor exceeds its
within variation by at least a factor of 2.8, and if the Hausman-test suggests
that regressors are correlated with the unit effects.
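The rules above can be summarised as a short decision function. This is a sketch of the stated rules of thumb only: the function name `suggest_estimator`, its arguments and the returned labels are hypothetical, and the default threshold of 2.8 is the value given in the text, not a universal constant.

```python
def suggest_estimator(unit_effects_exist, hausman_rejects,
                      any_time_invariant, max_bw_ratio, threshold=2.8):
    """Encode the rules of thumb stated in the conclusion.

    unit_effects_exist: True if unit effects are present.
    hausman_rejects:    True if the Hausman test suggests that regressors
                        are correlated with the unit effects.
    any_time_invariant: True if at least one regressor of interest is
                        time-invariant.
    max_bw_ratio:       largest between-to-within ratio among the
                        regressors of interest.
    """
    # Pooled OLS or RE: no unit effects, or unit effects that appear
    # uncorrelated with the regressors.
    if not unit_effects_exist or not hausman_rejects:
        return "pooled OLS or random effects"
    # fevd: a time-invariant regressor, or a ratio at or above the threshold.
    if any_time_invariant or max_bw_ratio >= threshold:
        return "fevd"
    # Otherwise the within variation is large enough for fixed effects.
    return "fixed effects"


choice = suggest_estimator(unit_effects_exist=True, hausman_rejects=True,
                           any_time_invariant=False, max_bw_ratio=3.5)
```

Because the correlation between a rarely changing variable and the unit effects is unobservable, such a function can only inform the choice of estimator; it cannot replace a careful look at stage 2 of the fevd model.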
References
Acemoglu, Daron/ Johnson, Simon/ Robinson, James/ Thaicharoen, Yunyong
(2002): Institutional Causes, Macroeconomic Symptoms: Volatility, Crises
and Growth, NBER working paper 9124.
Adolph, Christopher/ Butler, Daniel M./ Wilson, Sven E. (2005): Like Shoes
and Shirt, One Size Does Not Fit All: Evidence on Time Series Cross-Section
Estimators and Specifications from Monte Carlo Experiments,
unp. manuscript.
Alfaro, Rodrigo A. (2005): Application of the Symmetrically Normalized IV
Estimator, unp. manuscript, Boston College.
Amemiya, Takeshi/ MaCurdy, Thomas E. (1986): Instrumental-Variable
Estimation of an Error-Components Model, Econometrica 54: 869-881.
Baltagi, Badi H. (2001): Econometric Analysis of Panel Data, Wiley and Sons
Ltd.
Baltagi, Badi H./ Khanti-Akom, Sophon (1990): On Efficient Estimation with
Panel Data: An Empirical Comparison of Instrumental Variable
Estimators, Journal of Applied Econometrics 5, 401-406.
Baltagi, Badi H./ Bresson, Georges/ Pirotte, Alain (2003): Fixed Effects,
Random Effects or Hausman-Taylor? A Pretest Estimator, Economics
Letters 79, 361-369.
Beck, Nathaniel (2001): Time-Series-Cross-Section Data: What Have We
Learned in the Past Few Years? Annual Review of Political Science 4,
271-293.
Beck, Nathaniel/ Katz, Jonathan (1995): What to do (and not to do) with
Time-Series Cross-Section Data, American Political Science Review 89:
634-647.
Beck, Nathaniel/ Katz, Jonathan N. (2001): Throwing Out the Baby with the
Bath Water: A Comment on Green, Kim, and Yoon, International
Organization 55:2, 487-495.
Breusch, Trevor S./ Mizon, Grayham E./ Schmidt, Peter (1989): Efficient
Estimation using Panel Data, Econometrica 57, 695-700.
Cornwell, Christopher/ Rupert, Peter (1988): Efficient Estimation with Panel
Data: An Empirical Comparison of Instrumental Variables Estimators,
Journal of Applied Econometrics 3, 149-155.
Egger, Peter/ Pfaffermayr, Michael (2004): Distance, Trade and FDI: A
Hausman-Taylor SUR Approach, Journal of Applied Econometrics 19,
227-246.
Elbadawi, Ibrahim/ Sambanis, Nicholas (2002): How Much War Will We See?
Explaining the Prevalence of Civil War. Journal of Conflict Resolution
46:3, 307-334.
Green Donald P./ Kim, Soo Yeon/ Yoon, David H. (2001): Dirty Pool,
International Organization 55, 441-468.
Greenhalgh, C./ Longland, M./ Bosworth, D. (2001): Technological Activity
and Employment in a Panel of UK Firms, Scottish Journal of Political
Economy 48, 260-282.
Hausman, Jerry A. (1978): Specification Tests in Econometrics, Econometrica
46, 1251-1271.
Hausman, Jerry A./ Taylor, William E. (1981): Panel Data and Unobservable
Individual Effects, Econometrica 49: 6, 1377-1398.
Hsiao, Cheng (1987): Identification, in: John Eatwell, Murray Milgate, and
Peter Newman (ed.): Econometrics, W.W. Norton: London, 95-100.
Hsiao, Cheng (2003): Analysis of Panel Data, Cambridge University Press,
Cambridge.
Huber, Evelyne/ Stephens, John D. (2001): Development and Crisis of the Welfare State. Parties and Policies in Global Markets, University of Chicago
Press, Chicago.
Iversen, Torben/ Cusack, Thomas (2000): The Causes of Welfare State
Expansion. Deindustrialization or Globalization?, World Politics 52, 313-349.
Knack, Stephen (1993): The Voter Participation Effects of Selecting Jurors from
Registration Lists. Journal of Law and Economics 36, 99-114.
King, Gary / Keohane, Robert O. / Verba, Sidney (1994): Designing Social
Inquiry: Scientific Inference in Qualitative Research, Princeton University
Press, Princeton, New Jersey.
Oaxaca, Ronald L./ Geisler, Iris (2003): Fixed Effects Models with
Time-Invariant Variables. A Theoretical Note, Economics Letters 80, 373-377.
Plümper, Thomas/ Troeger, Vera E./ Manow, Philip (2005): Panel Data
Analysis in Comparative Politics. Linking Method to Theory, European
Journal of Political Research 44, 327-354.
Wilson, Sven E./ Butler, Daniel M. (2003): Too Good to be True? The Promise
and Peril of Panel Data in Political Science, unp. Manuscript, Brigham
Young University, 2003.
Wooldridge, Jeffrey M. (2002): Econometric Analysis of Cross Section and Panel
Data, MIT Press, Cambridge.
Estimator         x3 (time-varying variable)    z3 (time-invariant variable)
                  RMSE     average bias         RMSE     average bias
P-OLS             0.187    -0.167               0.494    -0.470
fevd              0.103     0.001               0.523    -0.548
Hausman-Taylor    0.105    -0.003               1.485    -1.128
RE                0.173    -0.149               0.506    -0.481
FE                0.103    -0.001

Settings of the parameters held constant:
N=30, T=20
Corr(u,x1)=corr(u,x2)=corr(u,z1)=corr(u,z2)=0
Corr(u,x3)=0.5
Settings of the varying parameter:
Corr(u,z3)={0.1, 0.2, ..., 0.9, 0.99}
Estimator    x3 (time-varying variable)    z3 (rarely changing variable)
             RMSE     average bias         RMSE     average bias
fevd         0.069    0.001                0.131    0.001
FE           0.069    0.000                0.858    0.008

Parameters held constant:
N=30, T=20
corr(u,x1)=corr(u,x2)=corr(u,z1)=corr(u,z2)=0
corr(u,z3)=0.3
corr(u,x3)=0.5
Between SD (z3)=1.2
Varied parameters:
Within SD (z3)={0.04, ..., 0.94}
Online Appendix
                    x3                                z3
Estimator   N       T=20    T=40    T=70    T=100     T=20    T=40    T=70    T=100
P-OLS       15      0.164   0.160   0.167   0.162     0.316   0.275   0.261   0.251
            30      0.158   0.169   0.175   0.173     0.247   0.257   0.242   0.273
            50      0.168   0.172   0.172   0.168     0.281   0.251   0.241   0.248
            70      0.160   0.173   0.170   0.169     0.258   0.253   0.243   0.243
            100     0.172   0.172   0.169   0.170     0.248   0.256   0.247   0.252
fevd        15      0.113   0.079   0.058   0.050     0.348   0.316   0.302   0.304
            30      0.079   0.054   0.041   0.035     0.298   0.299   0.301   0.297
            50      0.062   0.042   0.032   0.027     0.305   0.300   0.302   0.300
            70      0.051   0.037   0.028   0.022     0.302   0.299   0.300   0.299
            100     0.043   0.030   0.023   0.019     0.305   0.296   0.302   0.300
h-taylor    15      0.114   0.080   0.060   0.048     1.085   0.437   0.738   0.721
            30      0.079   0.056   0.038   0.034     1.720   0.704   0.721   2.645
            50      0.060   0.044   0.032   0.027     0.539   2.169   0.399   1.386
            70      0.051   0.037   0.026   0.022     0.434   2.889   1.541   0.377
            100     0.042   0.030   0.022   0.018     0.543   1.085   0.406   0.749
RE          15      0.146   0.146   0.144   0.139     0.338   0.282   0.264   0.255
            30      0.138   0.153   0.152   0.152     0.264   0.253   0.261   0.275
            50      0.154   0.150   0.151   0.146     0.290   0.251   0.249   0.252
            70      0.145   0.152   0.149   0.149     0.268   0.260   0.256   0.251
            100     0.151   0.156   0.147   0.146     0.254   0.254   0.257   0.261
FE          15      0.110   0.078   0.057   0.049     .       .       .       .
            30      0.080   0.058   0.041   0.036     .       .       .       .
            50      0.061   0.043   0.032   0.028     .       .       .       .
            70      0.052   0.037   0.027   0.024     .       .       .       .
            100     0.044   0.031   0.023   0.020     .       .       .       .
Table A2 displays the RMSE of the procedures for all permutations of
N={15,30,50,70,100} and T={20,40,70,100} in the estimation of time-invariant
variables.
                    x3                                z3
Estimator   N       T=20    T=40    T=70    T=100     T=20    T=40    T=70    T=100
P-OLS       15      0.087   0.104   0.111   0.103     0.193   0.140   0.104   0.088
            30      0.100   0.111   0.115   0.107     0.140   0.100   0.072   0.065
            50      0.105   0.113   0.114   0.108     0.106   0.073   0.058   0.047
            70      0.104   0.114   0.109   0.110     0.088   0.066   0.047   0.041
            100     0.112   0.113   0.108   0.109     0.076   0.054   0.041   0.034
fevd        15      0.053   0.038   0.029   0.024     0.188   0.143   0.105   0.086
            30      0.039   0.028   0.020   0.017     0.137   0.100   0.071   0.059
            50      0.031   0.021   0.016   0.013     0.102   0.077   0.057   0.048
            70      0.026   0.019   0.013   0.011     0.088   0.063   0.047   0.040
            100     0.021   0.015   0.011   0.009     0.072   0.053   0.041   0.034
RE          15      0.077   0.072   0.078   0.075     0.192   0.148   0.109   0.090
            30      0.074   0.079   0.081   0.076     0.136   0.097   0.073   0.063
            50      0.073   0.075   0.073   0.069     0.111   0.073   0.058   0.050
            70      0.070   0.076   0.069   0.068     0.090   0.062   0.047   0.041
            100     0.076   0.076   0.068   0.065     0.074   0.054   0.040   0.033
FE          15      0.056   0.037   0.029   0.023     0.702   0.733   0.734   0.701
            30      0.038   0.027   0.021   0.017     0.738   0.732   0.753   0.698
            50      0.030   0.021   0.016   0.014     0.741   0.733   0.718   0.726
            70      0.026   0.018   0.013   0.011     0.776   0.738   0.728   0.723
            100     0.022   0.014   0.011   0.009     0.746   0.720   0.707   0.738
Table A3
Estimator    x3 (time-varying variable)    z3 (rarely changing variable)
             RMSE     average bias         RMSE     average bias
p-ols        0.265    -0.265               0.133    0.028
fevd         0.069     0.001               0.131    0.001
RE           0.230    -0.230               0.133    0.027
FE           0.069     0.000               0.858    0.008

Parameters held constant:
N=30, T=20
corr(u,x1)=corr(u,x2)=corr(u,z1)=corr(u,z2)=0
corr(u,z3)=0.3
corr(u,x3)=0.5
Between SD (z3)=1.2
Varied parameters:
Within SD (z3)={0.04, ..., 0.94}