North-Holland
In a multivariate regression model relating individual returns to the market return, CAPM implies
non-linear restrictions on the parameters. Several asymptotically valid tests of these restrictions
have been suggested. The existing Monte Carlo evidence shows that some of these tests are
unreliable for reasonable sample sizes, but does not indicate well which tests are reliable. This
paper reports the results of an extensive Monte Carlo experiment. Shanken's CSR test and Jobson
and Korkie's corrected likelihood ratio test are quite accurate in all cases we consider.
1. Introduction
Empirical testing of the capital asset pricing model (CAPM) has a long
history. The most widely known studies are tests of the Sharpe (1964)-Lintner
(1965) CAPM by Black, Jensen and Scholes (1972), Fama and MacBeth
(1973), Blume and Friend (1973) and Friend, Westerfield and Granito (1978),
and a test of the Merton (1973) continuous time CAPM by Jensen (1972).
These studies employ cross-sectional regressions of mean returns on estimated
betas, and suffer from a measurement error problem because of their reliance
on estimated (rather than actual) betas. To avoid this measurement error
problem, recent tests of CAPM have been based on estimation of the multi-
variate regression model. As noted by Gibbons (1980,1982), the linearity of the
relationship between expected return and risk implies a set of non-linear
restrictions on the parameters of the multivariate regression model relating
individual returns to the market return. These non-linear restrictions can be
tested in a variety of standard ways. Gibbons considers the Wald (W) test and
the likelihood ratio (LR) test, while Stambaugh (1981,1982) suggests the
Lagrange Multiplier (LM) test. All of these tests are valid asymptotically (as
the number of time-series observations increases), but they may differ substan-
tially from each other, and from their asymptotic distributions, even when the
sample size is moderately large. The existing Monte Carlo evidence, reported
by Gibbons (1982), Stambaugh (1982), Jobson and Korkie (1982), and
*The authors thank Rex Thompson, the referee, for helpful comments and suggestions
and $\beta_i$, and let $\mu$ be the K x 1 vector whose ith element is $E(r_i)$. The CAPM
implies linearity between $\mu$ and $\beta$: $\mu = \gamma_1 i_K + \gamma_2 \beta$ for some (scalar) $\gamma_1$ and $\gamma_2$.
This in turn implies that
$$\hat\gamma = (\hat\gamma_1, \hat\gamma_2)' = (X'\hat\Sigma^{-1}X)^{-1}X'\hat\Sigma^{-1}\bar r. \qquad (4)$$
Let $e = \bar r - X\hat\gamma$ denote the K x 1 vector of residuals from this regression, and
define $s^2$ as the sample variance of $r_m$. Then we have the statistic
Shanken shows that $Q^* \to \chi^2_{K-2}$ as $T \to \infty$ [under the null hypothesis that (2)
holds], and we will refer to the $Q^*$ test as the test which compares (5) to a
critical value of $\chi^2_{K-2}$.
362 C. E. Amsler and P. Schmidt, Monte Carlo analysis of CA PM tests
LR = T ln(1 + Q*/T),
while
LM = Q*/(1 + Q*/T).
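Because both statistics are exact functions of Q* and T, they can be computed directly once Q* is available. The following Python sketch illustrates the two identities above. The function names and the numerical inputs are illustrative assumptions and are not taken from the experiments.

```python
import math


def lr_from_q(q_star: float, T: int) -> float:
    """Likelihood ratio statistic: LR = T * ln(1 + Q*/T)."""
    return T * math.log1p(q_star / T)


def lm_from_q(q_star: float, T: int) -> float:
    """Lagrange multiplier statistic: LM = Q* / (1 + Q*/T)."""
    return q_star / (1.0 + q_star / T)


# Made-up inputs for illustration only: Q* = 9.0 with T = 72 observations.
q_star, T = 9.0, 72
lr = lr_from_q(q_star, T)
lm = lm_from_q(q_star, T)
print(f"Q* = {q_star:.3f}, LR = {lr:.3f}, LM = {lm:.3f}")
```

For any Q* > 0 the two identities imply LM < LR < Q*, so tests that share an asymptotic distribution can still disagree when compared to a common critical value in a finite sample.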
Such an explicit statement is necessary because the Wald test is not invariant
(in finite samples) to the way in which the restrictions are written; only the
Wald test, among the tests we consider, has this defect. The Wald test statistic
is calculated as a quadratic form involving the amounts by which the restrictions
(7) fail in the unrestricted $\hat\alpha$, $\hat\beta$. (It is not known to be related in any exact
functional way to $Q^*$.) Its asymptotic distribution is $\chi^2_{K-2}$.
We now briefly discuss the tests for the BJS restrictions as given in (3). We
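begin again with Shanken's tests. Let $\hat\alpha$, $\hat\beta$, and $\hat\Sigma$ be as before. Let $\hat\gamma$ be the
estimate in a GLS second-pass regression of $\hat\alpha_i$ on $(1 - \hat\beta_i)$:
$$\hat\gamma = \hat\alpha'\hat\Sigma^{-1}(i_K - \hat\beta) \,/\, (i_K - \hat\beta)'\hat\Sigma^{-1}(i_K - \hat\beta), \qquad (8)$$
the GLS version of the BJS estimate. Define $e = \hat\alpha - \hat\gamma(i_K - \hat\beta)$ and $s^2$ as
above. Then the $Q^*$ test compares
$$Q^* = T e'\hat\Sigma^{-1}e \,/\, [1 + (\bar r_m - \hat\gamma)^2/s^2]$$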
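The GLS second-pass estimate (8) and the Q* statistic for the BJS restrictions can be sketched in Python as follows. This is one reading of the expressions above, not the authors' code: it assumes that the inverse of the estimated error covariance matrix is already available, that the adjustment term uses the sample mean and sample variance of the market return, and all names and input values are hypothetical.

```python
def bjs_q_star(alpha_hat, beta_hat, sigma_inv, T, rbar_m, s2):
    """Return (gamma_hat, Q*) for the restriction alpha = gamma * (i_K - beta).

    alpha_hat, beta_hat: length-K lists of unrestricted estimates.
    sigma_inv: K x K inverse of the estimated error covariance (list of rows).
    rbar_m, s2: sample mean and sample variance of the market return.
    """
    K = len(alpha_hat)
    x = [1.0 - b for b in beta_hat]  # the regressor i_K - beta_hat

    def quad(u, v):
        # Bilinear form u' * sigma_inv * v.
        return sum(u[i] * sigma_inv[i][j] * v[j]
                   for i in range(K) for j in range(K))

    gamma_hat = quad(alpha_hat, x) / quad(x, x)               # GLS estimate (8)
    e = [a - gamma_hat * xi for a, xi in zip(alpha_hat, x)]   # second-pass residuals
    q_star = T * quad(e, e) / (1.0 + (rbar_m - gamma_hat) ** 2 / s2)
    return gamma_hat, q_star


# Hypothetical inputs: five assets, identity weighting matrix for simplicity.
betas_demo = [0.5, 0.8, 1.0, 1.2, 1.5]
alphas_demo = [0.0012 * (1.0 - b) + d
               for b, d in zip(betas_demo, [1e-4, -1e-4, 0.0, 2e-4, -2e-4])]
identity = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
gamma_demo, q_demo = bjs_q_star(alphas_demo, betas_demo, identity,
                                T=72, rbar_m=0.0142, s2=0.0328 ** 2)
print(gamma_demo, q_demo)
```

When the unrestricted intercepts satisfy the restriction exactly, the residual vector is zero and the statistic collapses to zero; departures from the restriction inflate the quadratic form in the numerator.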
Roughly speaking, the design of our Monte Carlo experiment follows that of
Stambaugh (1981), with two major exceptions. First, we use more replications:
generally 4000, though only 2000 for the more expensive parametric configurations
(when $K \ge 15$, or when $T \ge 200$). Second, because we consider tests of
both the BJS restrictions (3) and the FM restrictions (2), and since (3) implies
(2), we choose the parameters so that (3) is satisfied.
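The replication rule stated above can be written as a small lookup. In this Python sketch the (T, K) pairs are those the text gives for the sixteen experiments in the results section; the function and variable names are illustrative assumptions.

```python
def replications(T: int, K: int) -> int:
    """2000 replications for the expensive configurations, 4000 otherwise."""
    return 2000 if (K >= 15 or T >= 200) else 4000


# (T, K) for each experiment number, as described in the results section.
EXPERIMENTS = {
    1: (72, 5), 2: (72, 5), 3: (72, 10), 4: (72, 15), 5: (72, 15),
    6: (30, 10), 7: (200, 10), 8: (400, 10),
    9: (72, 5), 10: (72, 10), 11: (72, 15), 12: (72, 20), 13: (72, 30),
    14: (72, 5), 15: (72, 5), 16: (72, 5),
}

reduced = sorted(n for n, (T, K) in EXPERIMENTS.items()
                 if replications(T, K) == 2000)
print(reduced)
```

The experiments selected by the rule are 4, 5, 7, 8, 11, 12 and 13, which agrees with the list of experiments that the results section reports as using only 2000 replications.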
We performed a total of sixteen experiments. Each experiment requires the
specification of the following parameters: T, K, $\mu_m$, $\sigma_m$, $\gamma_1$, $\beta$, $\Sigma$. We then create
a set of T realizations for $r_m$, as T independent drawings from $N(\mu_m, \sigma_m^2)$,
using a pseudo-random number generator. We calculate $\alpha_i$ according to (2),
where except in one case $\gamma_2 - \mu_m = -\gamma_1$ so that (3) holds as well. We then calculate
T values of $E(r_i) = \alpha_i + \beta_i r_m$, $i = 1, \ldots, K$. These calculations are only done
once per experiment. Then, for each of the 4000 (or 2000) replications, we do
the following. We generate $(\varepsilon_{1t}, \varepsilon_{2t}, \ldots, \varepsilon_{Kt})$, $t = 1, \ldots, T$, as independent
drawings from $N(0, \Sigma)$, using the random number generator. Adding $\varepsilon_i$ to $E(r_i)$, we
calculate the T realizations of $r_i$ for $i = 1, \ldots, K$. Using the data on the
individual returns ($r_i$) and the market return ($r_m$), we calculate the test
statistics given in the last section. Finally, cumulating the results of the 4000
(or 2000) replications, we calculate the mean and variance of each test statistic;
the number of rejections of the null hypothesis [either (2) or (3)] at the 1%, 5%
and 10% level; and the $\chi^2$ goodness of fit statistic (based on ten equiprobable
cells) which tests whether the actual distribution of the test statistic under the
null hypothesis conforms to its hypothesized distribution (either $\chi^2$ or F).
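The data-generating steps described above can be sketched in Python using only the standard library. This is a simplified illustration, not the authors' program: it assumes a diagonal error covariance matrix (so the errors can be drawn asset by asset), uses hypothetical betas rather than those of table 1, and imposes the intercepts $\alpha_i = \gamma_1(1 - \beta_i)$ so that the BJS restriction holds. All function names are assumptions.

```python
import random


def draw_market(T, mu_m, sigma_m, rng):
    """T independent draws of the market return; done once per experiment."""
    return [rng.gauss(mu_m, sigma_m) for _ in range(T)]


def draw_replication(alphas, betas, r_m, resid_sd, rng):
    """One replication: r_it = alpha_i + beta_i * r_mt + eps_it.

    Simplifying assumption: Sigma is diagonal, so the errors are
    independent across assets with standard deviations resid_sd.
    """
    return [[a + b * rm + rng.gauss(0.0, sd) for rm in r_m]
            for a, b, sd in zip(alphas, betas, resid_sd)]


def ols(y, x):
    """Unrestricted OLS intercept and slope of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


rng = random.Random(12345)                      # fixed seed for reproducibility
T, mu_m, sigma_m, gamma1 = 72, 0.0142, 0.0328, 0.0012
betas = [0.6, 0.8, 1.0, 1.2, 1.4]               # hypothetical, not from table 1
alphas = [gamma1 * (1.0 - b) for b in betas]    # imposes the BJS restriction
resid_sd = [0.0003 ** 0.5] * len(betas)         # diagonal Sigma, variance 0.0003

r_m = draw_market(T, mu_m, sigma_m, rng)        # once per experiment
returns = draw_replication(alphas, betas, r_m, resid_sd, rng)  # once per replication
estimates = [ols(r_i, r_m) for r_i in returns]  # (alpha_hat, beta_hat) per asset
print(estimates[0])
```

In a full experiment the call to `draw_replication` and the computation of the test statistics would sit inside a loop over the 4000 (or 2000) replications, with the market return held fixed across replications.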
Most of our experiments use T = 72, but we test the effect of changing T by
also considering T = 30, 200 and 400. Similarly, most of our experiments use
K = 5, but we test the effect of changing K by also considering K = 10, 15, 20
and 30. All of our experiments use $\sigma_m$ = 0.0328 (Stambaugh's value). All use
$\gamma_1$ = 0.0012 (Stambaugh's value) except for one experiment which uses
$\gamma_1$ = 0.006. We vary $\beta$ and $\Sigma$ widely. Specific parametric configurations will be
given in the next section.
The main results of our experiments are given in tables 2-5. (Table 1 lists
some of the parameter values used in the experiments, to which we will refer
shortly.) These results are for the tests based on the Newton-Raphson esti-
mates. We have not reported separately the results for the tests based on the
actual MLEs, because they are so similar to the results in tables 2-5.
For each test statistic, and for each experiment, we report the proportion of
rejections of the null hypothesis (that the restrictions are correct) at the 1%, 5%
and 10% levels, the $\chi^2$ goodness of fit statistic, and the mean and variance of
Table 1
Values of $\beta$ (betas of individual assets) and $\Sigma$ (error covariance matrix) used in experiments.
Table 1 (continued)
0.26
0.12 0.32
0.09 0.24 0.45
0.08 0.17 0.18 0.23
0.06 0.24 0.24 0.19 0.40
0.09 0.25 0.29 0.20 0.33 0.58
-0.01 0.15 0.18 0.13 0.22 0.30 0.36
H, = 0.01 0.07 0.15 0.08 0.08 0.00 0.08 0.40
-0.04 0.25 0.31 0.17 0.34 0.41 0.31 0.10 0.94
-0.10 0.13 0.16 0.10 0.25 0.31 0.31 0.05 0.40 0.56
0.10 0.00 0.00 0.05 -0.03 -0.02 -0.06 0.09 -0.06 -0.13 0.23
0.12 0.02 0.02 0.04 -0.02 -0.02 -0.07 0.07 -0.06 -0.12 0.17 0.20
0.20 0.07 0.06 0.05 0.04 0.00 -0.01 0.08 -0.02 -0.11 0.18 0.20 0.67
0.18 0.07 0.04 0.02 -0.01 -0.02 -0.04 0.09 -0.08 -0.18 0.18 0.20 0.28 0.60
0.17 0.10 0.09 0.10 0.14 0.11 0.02 0.13 0.12 -0.01 0.17 0.20 0.21 0.20 0.55
Table 2
Proportions of rejections, $\chi^2$ goodness of fit statistics, means and variances, for various tests based on two-step estimators.^a
Experiment 2 (T = 72, K = 5)
1% 0.007 0.014 0.008 0.016 0.018 0.010 0.010 0.014 0.015 0.009 0.021 0.025 0.010 0.010
5% 0.046 0.061 0.049 0.068 0.075 0.050 0.048 0.062 0.066 0.052 0.074 0.080 0.056 0.054
10% 0.100 0.119 0.107 0.123 0.132 0.104 0.103 0.118 0.126 0.103 0.136 0.146 0.102 0.103
$\chi^2$ 14.7 48.1 19.3 35.0 62.4 26.9 7.08 32.8 48.1 19.3 68.7 ** 26.9 5.12
Mean 3.03 3.18 3.07 3.21 3.30 1.04 2.98 4.25 4.31 4.13 4.39 4.52 1.05 4.07
Var. 5.52 6.63 5.76 7.25 7.67 0.76 5.83 8.69 9.16 7.71 10.4 11.0 0.59 8.17
Experiment 3 (T = 72, K = 10)
1% 0.000 0.019 0.006 0.035 0.039 0.011 0.010 0.018 0.022 0.006 0.040 0.046 0.010 0.010
5% 0.000 0.076 0.042 0.103 0.113 0.048 0.046 0.068 0.079 0.041 0.117 0.129 0.051 0.051
10% 0.001 0.140 0.095 0.171 0.186 0.099 0.094 0.133 0.147 0.091 0.188 0.202 0.100 0.100
$\chi^2$ ** ** 39.8 ** ** 10.9 11.5 ** ** 42.6 ** ** 6.95 8.64
Mean 2.51 8.72 8.09 9.15 9.42 1.03 7.87 9.71 9.90 9.12 10.5 10.8 1.03 9.00
Var. 3.84 19.2 14.3 24.9 26.3 0.32 15.7 20.5 22.1 15.8 29.6 31.3 0.29 18.3
Experiment 4 (T = 72, K = 15)
1% 0.000 0.028 0.003 0.074 0.085 0.009 0.009 0.022 0.030 0.002 0.0?? 0.097 0.009 0.010
5% 0.000 0.112 0.035 0.182 0.205 0.050 0.047 0.090 0.112 0.030 0.200 0.218 0.050 0.053
10% 0.000 0.193 0.092 0.269 0.288 0.108 0.103 0.171 0.197 0.089 0.291 0.312 0.105 0.105
$\chi^2$ ** ** 30.0 ** ** 7.72 8.06 ** ** 41.6 ** ** 3.94 3.63
Mean 2.91 14.9 13.2 16.3 16.8 1.04 12.9 15.7 16.1 14.3 17.9 18.4 1.04 14.1
Var. 5.09 34.1 21.5 52.1 55.1 0.21 25.1 33.0 36.7 22.4 58.0 61.4 0.20 28.1
Experiment 5 (T = 72, K = 15)
1% 0.000 0.028 0.002 0.076 0.085 0.006 0.006 0.020 0.029 0.003 0.086 0.098 0.008 0.008
5% 0.000 0.106 0.035 0.184 0.198 0.050 0.047 0.096 0.117 0.032 0.196 0.217 0.048 0.049
10% 0.001 0.190 0.089 0.271 0.293 0.102 0.095 0.172 0.194 0.088 0.284 0.316 0.104 0.104
$\chi^2$ ** ** 54.1 ** ** 4.48 2.70 ** ** 46.4 ** ** 6.14 4.58
Mean 2.99 14.9 13.3 16.4 16.9 1.05 12.9 15.8 16.2 14.3 17.9 18.4 1.04 14.1
Var. 5.21 32.5 20.6 49.5 52.3 0.20 24.5 33.5 36.2 22.1 57.3 60.6 0.19 27.7
^a T is the time-series sample size. K is the cross-sectional sample size (number of assets). FM and BJS refer to the Fama-MacBeth and
Black-Jensen-Scholes models. W is the Wald test. LR is the likelihood ratio test. LR* is LR with Bartlett's correction. LM is the Lagrange multiplier
test. $Q^A$ is Shanken's adjusted Q statistic. $Q^*$ is $Q^A$ without a degree of freedom correction. CSR is $Q^A$ multiplied by a correction factor reflecting
use of Hotelling's $T^2$. ** refers to $\chi^2$ > 100.
the test statistic. [A double asterisk (**) indicates a $\chi^2$ value in excess of 100.]
The proportions of rejections differ from the expected proportions
(0.01, 0.05, 0.10) by an amount which is significant at the 5% level if they do
not fall in the intervals [0.007, 0.013], [0.044, 0.056], and [0.091, 0.109], respectively.
(For experiments 4, 5, 7, 8, 11, 12 and 13, which used only 2000
replications, the intervals become [0.006, 0.014], [0.041, 0.059] and
[0.087, 0.113].) Similarly, a $\chi^2$ statistic exceeding 16.9 indicates a lack of fit to
the hypothesized null distribution which is significant at the 5% level.
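The text does not say how these intervals were obtained. A normal approximation to the binomial distribution of the rejection count gives bands that agree with them up to rounding at the third decimal, and the threshold 16.9 is the upper 5% point of a $\chi^2$ distribution with nine degrees of freedom, which corresponds to ten cells less one. The Python sketch below rests on those assumptions; the function names are hypothetical, and the goodness of fit routine takes as input the test statistics already transformed by their hypothesized null distribution function.

```python
import math


def acceptance_interval(p, n, z=1.96):
    """Two-sided 5% band for a rejection proportion (normal approximation)."""
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


def gof_chi2(pit_values, cells=10):
    """Pearson goodness of fit statistic over equiprobable cells.

    pit_values: each simulated test statistic passed through its hypothesized
    null CDF, so the values are uniform on [0, 1] if that distribution is right.
    """
    n = len(pit_values)
    counts = [0] * cells
    for u in pit_values:
        counts[min(int(u * cells), cells - 1)] += 1
    expected = n / cells
    return sum((c - expected) ** 2 / expected for c in counts)


for n in (4000, 2000):
    for p in (0.01, 0.05, 0.10):
        lo, hi = acceptance_interval(p, n)
        print(f"n={n} p={p:.2f}: [{lo:.4f}, {hi:.4f}]")
```

A statistic returned by `gof_chi2` would then be compared with 16.9, in the same way as the tabulated goodness of fit values.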
Table 2 reports the results of our first five experiments. Roughly speaking,
these are an attempt to replicate the results of Stambaugh (1981). Experiment 1
uses exactly the parameters of Stambaugh's case 1: T = 72, K = 5, $\mu_m$ = 0.0125,
$\sigma_m$ = 0.0328, $\gamma_1$ = 0.0012, $\gamma_2$ = 0.0130, $\beta = \beta_1$ of table 1, $\Sigma = \Sigma_1$ of table 1.
Since $\gamma_2 - \mu_m \neq -\gamma_1$, however, the null hypothesis (3) specified by the BJS model
is not true for this experiment. (This explains the large number of rejections
reported in table 2 for experiment 1, for tests of the BJS model.) In experiment
2 (and thereafter) we fix this problem by leaving all parameters unchanged
except now $\mu_m$ = 0.0142. Thus both (2) and (3) - the null hypotheses specified
by the FM and BJS models respectively - will hold, and we investigate the
behavior of the tests under their null hypotheses. (Comparing the tests of the
FM model in experiments 1 and 2 shows that our change in $\mu_m$ does not affect
the results, anyway.)
Similarly, our experiments 3, 4 and 5 are exactly the same as Stambaugh's
experiments 2, 3 and 4, except for the above change in $\mu_m$. For experiment 3
we have K = 10, $\beta = \beta_2$ of table 1, $\Sigma = \Sigma_2$ of table 1. For experiment 4 we
have K = 15, $\beta = \beta_3$ of table 1, $\Sigma = \Sigma_3$ of table 1, whereas for experiment 5 we
have K = 15, $\beta = \beta_4$ of table 1, $\Sigma = \Sigma_4$ of table 1. All other parameters are as
in experiment 2. Stambaugh's parameter values basically represent historical
values in his data, as reported in Stambaugh (1981, p. 145).
Our results in table 2 are very similar to those of Stambaugh (1981, table 37,
pp. 155-156). The LM test appears to be more accurate than the W or LR test.
The W test rejects too seldom for tests of the FM model (basically never for
experiments 3,4 and 5) and too often for tests of the BJS model, while the LR
test rejects too often in both cases. The LM test appears to reject too seldom,
in most cases, but it generally does better than the W and LR tests.
The $Q^A$, $Q^*$ and CSR tests of Shanken were not part of Stambaugh's
experiment, nor was Jobson and Korkie's LR* test. The $Q^A$ and $Q^*$ tests
perform poorly; they reject far too often. However, the CSR test and the LR*
test are very accurate, generating proportions of rejections close to the desired
levels and (generally) acceptably low $\chi^2$ goodness of fit values. There is not
much reason to prefer the CSR test to the LR* test or vice-versa.
In table 3, we report the results of some more experiments designed to see
how fast the tests' performances improve when T increases. The parameter
values here are the same as in experiment 3 (i.e., Stambaugh's experiment 2),
Table 3
Proportions of rejections, $\chi^2$ goodness of fit statistics, means and variances, for various tests based on two-step estimators.
Experiment 6 (T = 30, K = 10)
1% 0.000 0.055 0.000 0.133 0.159 0.013 0.012 0.040 0.062 0.000 0.171 0.201 0.013 0.015
5% 0.001 0.156 0.032 0.246 0.283 0.063 0.053 0.133 0.178 0.030 0.291 0.338 0.058 0.063
10% 0.010 0.240 0.098 0.331 0.374 0.115 0.104 0.218 0.264 0.089 0.383 0.425 0.118 0.123
$\chi^2$ ** ** ** ** ** 14.1 10.7 ** ** ** ** ** 18.0 32.7
Mean 2.34 10.3 8.38 12.1 12.9 1.13 7.87 11.1 11.9 9.45 14.4 15.4 1.14 9.29
Var. 4.50 27.7 12.4 59.9 68.8 0.53 16.3 26.6 32.5 13.1 78.9 90.6 0.50 20.0
Experiment 3 (T = 72, K = 10)
1% 0.000 0.019 0.006 0.035 0.039 0.011 0.010 0.018 0.022 0.006 0.040 0.046 0.010 0.010
5% 0.000 0.076 0.042 0.103 0.113 0.048 0.046 0.068 0.079 0.041 0.117 0.129 0.051 0.051
10% 0.001 0.140 0.095 0.171 0.186 0.099 0.094 0.133 0.147 0.091 0.188 0.202 0.100 0.100
$\chi^2$ ** ** 39.8 ** ** 10.9 11.5 ** ** 42.6 ** ** 6.95 8.64
Mean 2.51 8.72 8.09 9.15 9.42 1.03 7.87 9.71 9.90 9.12 10.5 10.8 1.03 9.00
Var. 3.84 19.2 14.3 24.9 26.3 0.32 15.7 20.5 22.1 15.8 29.6 31.3 0.29 18.3
7 1%
5% 0.000
K= 2
M:an ** 1 .Ol 7.95
3.85 15.1
9.24 17.1 12.9 36.5 45.4 14.3 17.1
9.29 9.03 9.46 9.55 1.01 8.98
Var. 3.92 16.4 14.9 17.8 18.1 0.26 15.3 18.2 18.5 16.6 20.3 20.7 0.23 17.3
1%
5% 0.000
0.113 0.109
K= 10 2
h&an II 12.7
Table 4
Proportions of rejections, $\chi^2$ goodness of fit statistics, means and variances, for various tests based on two-step estimators.^a
Experiment 9 (T = 72, K = 5)
1% 0.006 0.015 0.010 0.020 0.021 0.010 0.010 0.013 0.014 0.010 0.020 0.023 0.012 0.012
5% 0.037 0.061 0.048 0.067 0.074 0.049 0.047 0.061 0.067 0.051 0.076 0.085 0.054 0.054
10% 0.090 0.127 0.111 0.132 0.140 0.107 0.104 0.119 0.125 0.106 0.134 0.143 0.105 0.106
$\chi^2$ 15.9 44.4 17.9 54.3 88.2 13.3 4.60 34.1 46.9 18.7 62.8 ** 31.7 8.50
Mean 2.95 3.22 3.11 3.26 3.35 1.06 3.02 4.26 4.31 4.13 4.39 4.51 1.05 4.07
Var. 5.04 6.83 5.92 7.50 7.94 0.79 6.01 8.88 9.31 7.82 10.6 11.2 0.61 8.31
Experiment 10 (T = 72, K = 10)
1% 0.001 0.019 0.008 0.037 0.042 0.011 0.010 0.016 0.022 0.006 0.040 0.048 0.012 0.012
5% 0.014 0.075 0.044 0.104 0.113 0.049 0.047 0.070 0.082 0.043 0.118 0.131 0.053 0.053
10% 0.035 0.140 0.095 0.169 0.186 0.100 0.094 0.131 0.148 0.096 0.188 0.203 0.103 0.103
$\chi^2$ ** ** 30.7 ** ** 8.04 9.79 92.3 ** 35.6 ** ** 8.63 6.90
Mean 6.99 8.70 8.08 9.14 9.40 1.03 7.85 9.64 9.91 9.13 10.5 10.8 1.03 9.02
Var. 9.53 19.5 14.4 25.2 26.6 0.32 15.9 20.2 22.4 16.1 29.9 31.7 0.29 18.5
Experiment 11 (T = 72, K = 15)
1% 0.00? 0.025 0.004 0.067 0.080 0.007 0.007 0.014 0.024 0.003 0.080 0.090 0.007 0.007
5% 0.005 0.105 0.032 0.178 0.198 0.045 0.042 0.078 0.113 0.027 0.194 0.215 0.046 0.050
10% 0.012 0.184 0.087 0.277 0.296 0.101 0.094 0.152 0.190 0.081 0.288 0.320 0.100 0.100
$\chi^2$ ** ** 29.8 ** ** 11.4 11.0 ** ** 48.1 ** ** 14.6 8.67
Mean 10.6 14.7 13.1 16.2 16.6 1.03 12.8 15.3 16.0 14.2 17.8 18.3 1.04 14.0
Var. 12.9 32.9 20.9 50.0 52.9 0.20 24.8 30.4 35.7 22.0 56.1 59.3 0.19 27.4
Experiment 12 (T = 72, K = 20)
1% 0.042 0.002 0.130 0.145 0.007 0.007 0.042 0.003 0.196 0.167 0.002 0.007
5% 0.132 0.025 0.268 0.299 0.047 0.043 0.141 0.025 0.296 0.327 0.045 0.049
10% 0.225 0.065 0.372 0.412 0.081 0.080 0.236 0.065 0.404 0.437 0.078 0.087
$\chi^2$ ** 81.3 ** ** 22.4 16.4 ** 87.7 ** ** 23.1 12.8
Mean 18.1 24.3 25.0 1.02 17.6 22.5 19.1 26.2 26.9 1.02 18.9
Var. 49.0 26.1 89.7 94.9 0.16 34.0 51.8 26.6 98.8 104.0 0.15 36.6
Experiment 13 (T = 72, K = 30)
1% 0.126 0.001 0.420 0.453 0.011 0.012 0.146 0.001 0.452 0.487 0.013 0.014
5% 0.295 0.016 0.605 0.634 0.047 0.050 0.316 0.014 0.645 0.670 0.046 0.060
10% 0.416 0.053 0.692 0.727 0.100 0.107 0.430 0.048 0.722 0.749 0.101 0.126
$\chi^2$ ** ** ** ** 12.6 5.75 ** ** ** ** 10.4 32.9
Mean 36.8 28.4 47.9 49.2 1.05 28.1 38.5 29.4 50.7 52.2 1.05 29.7
Var. 101.0 34.5 300.0 318.0 0.14 59.1 107.0 35.0 333.0 352.0 0.14 63.7
^a Wald tests are not available for K = 20 or K = 30, to avoid having to invert matrices of dimension 40 or 60.
T is the time-series sample size. K is the cross-sectional sample size
Table 5
Proportions of rejections, $\chi^2$ goodness of fit statistics, means and variances, for various tests based on two-step estimators.^a
Experiment 2 (T = 72, K = 5)
1% 0.007 0.014 0.008 0.016 0.018 0.010 0.010 0.014 0.015 0.009 0.021 0.025 0.010 0.010
5% 0.046 0.061 0.049 0.068 0.075 0.050 0.048 0.062 0.066 0.052 0.074 0.080 0.056 0.054
10% 0.100 0.119 0.107 0.123 0.132 0.104 0.103 0.118 0.126 0.103 0.136 0.146 0.102 0.103
$\chi^2$ 14.7 48.1 19.3 35.0 62.4 26.9 7.08 32.8 48.1 19.3 68.7 ** 26.9 5.12
Mean 3.03 3.18 3.07 3.21 3.30 1.04 2.98 4.25 4.31 4.13 4.39 4.52 1.05 4.07
Var. 5.52 6.63 5.76 7.25 7.67 0.76 5.83 8.69 9.16 7.71 10.4 11.0 0.59 8.17
Experiment 14 (T = 72, K = 5)
1% 0.001 0.013 0.009 0.016 0.018 0.010 0.010 0.001 0.016 0.008 0.020 0.022 0.012 0.012
5% 0.008 0.063 0.048 0.067 0.074 0.049 0.047 0.017 0.067 0.054 0.074 0.080 0.056 0.054
10% 0.035 0.115 0.103 0.120 0.131 0.102 0.099 0.051 0.124 0.105 0.133 0.145 0.104 0.105
$\chi^2$ ** 18.9 8.22 25.1 51.1 3.56 6.65 ** 38.9 13.0 58.6 ** 32.1 4.88
Mean 2.43 3.14 3.03 3.17 3.26 1.03 2.94 3.57 4.29 4.10 4.36 4.49 1.04 4.05
Var. 2.93 6.72 5.83 7.36 7.79 0.77 5.90 4.91 9.33 7.82 10.6 11.2 0.61 8.32
Experiment 15 (T = 72, K = 5)
1% 0.007 0.014 0.008 0.016 0.018 0.010 0.010 0.006 0.014 0.009 0.019 0.022 0.012 0.012
5% 0.046 0.062 0.049 0.068 0.075 0.050 0.048 0.037 0.064 0.048 0.073 0.083 0.050 0.049
10% 0.100 0.119 0.106 0.122 0.132 0.104 0.103 0.086 0.126 0.106 0.136 0.145 0.104 0.106
$\chi^2$ 14.7 32.6 19.9 35.0 62.4 3.15 7.08 24.5 50.5 21.7 67.8 ** 46.8 8.20
Mean 3.03 3.18 3.07 3.21 3.30 1.04 2.98 3.94 4.31 4.12 4.38 4.51 1.05 4.07
Var. 5.52 6.63 5.76 7.25 7.66 0.76 5.83 6.80 9.39 7.86 10.7 11.3 0.61 8.40
Experiment 16 (T = 72, K = 5)
1% 0.006 0.012 0.008 0.014 0.018 0.009 0.008 0.012 0.014 0.007 0.020 0.022 0.009 0.009
5% 0.045 0.062 0.049 0.068 0.073 0.050 0.047 0.059 0.062 0.052 0.070 0.078 0.053 0.053
10% 0.096 0.112 0.101 0.118 0.129 0.099 0.092 0.121 0.128 0.105 0.133 0.142 0.103 0.105
$\chi^2$ 8.93 18.6 10.0 19.9 45.8 11.3 3.98 37.5 55.2 20.7 64.8 ** 29.5 5.47
Mean 3.01 3.17 3.06 3.20 3.29 1.04 2.97 4.24 4.30 4.12 4.38 4.50 1.05 4.06
Var. 5.42 6.56 5.70 7.16 7.58 0.75 5.76 8.55 9.05 7.62 10.3 10.8 0.59 8.07
^a T is the time-series sample size. K is the cross-sectional sample size (number of assets). FM and BJS refer to the Fama-MacBeth and
Black-Jensen-Scholes models. W is the Wald test. LR is the likelihood ratio test. LR* is LR with Bartlett's correction. LM is the Lagrange multiplier
test. $Q^A$ is Shanken's adjusted Q statistic. $Q^*$ is $Q^A$ without a degree of freedom correction. CSR is $Q^A$ multiplied by a correction factor reflecting
use of Hotelling's $T^2$. ** refers to $\chi^2$ > 100.
except that for experiments 6, 7 and 8 we have T = 30, 200 and 400, respectively.
The results generally support our expectation that the tests should become
more accurate as T increases (since all of the tests are valid asymptotically).
The LR, LM, LR* and CSR tests are all reasonably accurate by T = 200.
For T = 30, only the CSR and LR* tests are at all reliable. The Wald test of
the FM model continues to do very poorly (almost no rejections) even for
T=400.
In table 4 we report the results of five experiments designed to test the effect
of changing K (the number of assets). Experiment 9 has K = 5 and is identical
to experiment 2 except that now $\Sigma$ is a diagonal matrix with diagonal elements
equal to 0.0003. (Incidentally, comparing experiments 2 and 9 shows that this
change makes very little difference.) A diagonal matrix is convenient in this
context because we can change its size (from 5 to 30) without changing its
character. Experiment 10 has K = 10, and $\beta$ is the same as for experiment 9
but with each element of $\beta$ repeated twice. Similarly for experiments 11, 12
and 13, with K = 15, 20 and 30, respectively, we repeat each element of $\beta$
three, four, or six times. Thus we attempt to change K without changing the
nature of $\Sigma$ and $\beta$.
The results support our expectation (based on the results for table 2) that the
tests should perform less well when K increases, for T fixed. This clearly
happens, except for the CSR and LR* tests, which continue to perform
reasonably well even for K = 30.
Finally, in table 5 we report the result of three more experiments which test
the effects of changes in other parameters. These experiments are variations of
our experiment 2 (Stambaugh's experiment 1) and hence have T = 72 and
K = 5. In experiment 14 we shrink the range of $\beta$ by moving each element of $\beta$
halfway to the mean value; explicitly, $\beta$ is as given in table 1. In experiment 15 we
use the same $\beta$ as in experiment 2, but with 0.2 added to each element. (Thus
in these two experiments we have changed the variance and the mean of the
$\beta$s, respectively.) In experiment 16 we revert to the $\beta$ of experiment 2, but use
$\gamma_1$ = 0.006 (and $\gamma_2$ = 0.0082 so $\gamma_2 - \mu_m = -\gamma_1$). None of these changes makes
any real difference, except for the Wald test.
As mentioned above, the results for the tests based on the actual MLEs are
very similar to the results for the tests based on the Newton-Raphson
estimates. The proportions of rejections typically differ by only 0.001 or 0.002,
and such small differences are not worth a separate discussion. However, the
fact that the choice of estimator makes so little difference is itself interesting. It
confirms the existing feeling that CAPM tests based on the Newton-Raphson
estimates are as reliable as tests based on the MLEs.
Although the point of the paper is the accuracy of CAPM tests, not the
properties of parameter estimates, we did calculate summary statistics for
Table 6
Means, variances, mean squared errors and average absolute errors, for various estimators;
estimates of $\gamma_1$ (FM model) and $\gamma$ (BJS model).
T is the time-series sample size. K is the cross-sectional sample size (number of assets). FM
and BJS refer to the Fama-MacBeth and Black-Jensen-Scholes models. $\gamma_1$ and $\gamma$ are parameters
of these models (risk premia). OLS 2PASS is the two-stage cross-sectional regression estimator.
LINMLE is the linearized maximum likelihood estimator (Newton-Raphson estimator). MLE is
the maximum likelihood estimator. Variance bound is the asymptotic variance from the informa-
tion matrix. MSE and AAE are abbreviations of mean squared error and average absolute error.
each estimator. We also display the variance bound, which is the asymptotic
variance from the information matrix. The true value of $\gamma_1$ (FM model) or $\gamma$
(BJS model) is 0.0012 in each of these five experiments, so asymptotically
mean $\times 10^2$ should equal 0.1200.
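The summary statistics named in the note to table 6 can be computed from the replicated estimates as in the Python sketch below. The function name and the sample values are hypothetical, and the variance is taken with divisor n (an assumption; the text does not state the divisor), which makes the mean squared error equal to the variance plus the squared bias.

```python
def summarize(estimates, true_value):
    """Mean, variance, MSE and AAE of replicated estimates of one parameter."""
    n = len(estimates)
    mean = sum(estimates) / n
    variance = sum((g - mean) ** 2 for g in estimates) / n   # divisor n (assumed)
    mse = sum((g - true_value) ** 2 for g in estimates) / n
    aae = sum(abs(g - true_value) for g in estimates) / n
    return {"mean": mean, "variance": variance, "mse": mse, "aae": aae}


# Made-up estimates around the true value 0.0012, for illustration only.
stats = summarize([0.0010, 0.0014, 0.0012, 0.0016], true_value=0.0012)
print(stats["mean"] * 100)  # reported on the "mean x 10^2" scale
```

On the reporting scale an unbiased estimator would show 0.0012 x 100 = 0.12, which is the benchmark quoted in the text.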
Although we will not discuss these results in detail, some general conclusions
are clear. First, the Newton-Raphson estimator generally outperforms the
OLS two-pass estimator in terms of variance, MSE and AAE, but not in terms
of bias. Second, the Newton-Raphson estimator generally outperforms the
MLE. The MLE is generally biased downward, while the Newton-Raphson
estimator is generally biased upward. The MLE always has a higher variance
than the Newton-Raphson estimator. The MLE occasionally was off by
spectacularly large amounts, due to a few outliers, though this did not occur in
any of the experiments reported in table 6. The only potential reason to prefer
the MLE is on the basis of smaller bias when K is large relative to T.
Otherwise the Newton-Raphson estimator is generally the preferred one.
6. Conclusions
The main results of our experiment are clear and easily summarized:
References
Bartlett, M.S., 1938, Further aspects of the theory of multiple regression, Proceedings of the
Cambridge Philosophical Society 34, 33-47.
Black, F., M. Jensen and M. Scholes, 1972, The capital asset pricing model: Some empirical
findings, in: M. Jensen, ed., Studies in the theory of capital markets (Praeger, New York).
Fama, E. and J. MacBeth, 1973, Risk, return and equilibrium: Empirical tests, Journal of Political
Economy 81, 607-636.
Friend, I., R. Westerfield, and M. Granito, 1978, New evidence on the capital asset pricing model,
Journal of Finance 33, 903-917.
Gibbons, M.R., 1980, Econometric methods for testing a class of financial models: An application
of the nonlinear multivariate regression model, Unpublished doctoral dissertation (University
of Chicago, Chicago, IL).
Gibbons, M.R., 1982, Multivariate tests of financial models: A new approach, Journal of Financial
Economics 10, 3-27.
Jensen, M.C., 1972, Capital markets: Theory and evidence, Bell Journal of Economics and
Management Science 3, 357-398.
Jobson, J.D. and B. Korkie, 1982, Potential performance and tests of portfolio efficiency, Journal
of Financial Economics 10, 433-466.
Kandel, S., 1984, The likelihood ratio test statistic of mean-variance efficiency without a riskless
asset, Journal of Financial Economics 13, 575-592.
Lintner, J., 1965, Security prices, risk, and maximal gains from diversification, Journal of Finance
20, 587-616.
MacKinlay, A.C., 1984, An analysis of multivariate financial tests, Unpublished manuscript
(University of Chicago, Chicago, IL).
Merton, R.C., 1973, An intertemporal capital asset pricing model, Econometrica 41, 867-887.
Roll, R.R., 1985, A note on the geometry of Shanken's CSR $T^2$ test for mean/variance efficiency,
Journal of Financial Economics, this issue.
Shanken, J., 1982a, An analysis of the traditional risk-return model, Unpublished doctoral
dissertation (Carnegie-Mellon University, Pittsburgh, PA).
Shanken, J., 1982b, An asymptotic analysis of the traditional risk-return model, Unpublished
manuscript (University of California, Berkeley, CA).
Shanken, J., 1983, Maximum likelihood methods and the traditional asset pricing empirical
paradigm, Unpublished manuscript (University of California, Berkeley, CA).
Shanken, J., 1985, Multivariate tests of the zero-beta CAPM, Journal of Financial Economics, this
issue.
Sharpe, W., 1964, Capital asset prices: A theory of market equilibrium under conditions of risk,
Journal of Finance 19, 425-442.
Stambaugh, R.F., 1981, Missing assets, measuring the market, and testing the capital asset pricing
model, Unpublished doctoral dissertation (University of Chicago, Chicago, IL).
Stambaugh, R.F., 1982, On the exclusion of assets from tests of the two-parameter model: A
sensitivity analysis, Journal of Financial Economics 10, 237-268.