
Akaike's Information Criterion in Generalized Estimating Equations

Wei Pan

Biometrics, Vol. 57, No. 1. (Mar., 2001), pp. 120-125.

Stable URL:
http://links.jstor.org/sici?sici=0006-341X%28200103%2957%3A1%3C120%3AAICIGE%3E2.0.CO%3B2-Q

Biometrics is currently published by International Biometric Society.

Wei Pan
Division of Biostatistics, University of Minnesota,

MMC 303, 420 Delaware Street SE, Minneapolis, Minnesota 55455, U.S.A.

email: weip@biostat.umn.edu

SUMMARY. Correlated response data are common in biomedical studies. Regression analysis based on the
generalized estimating equations (GEE) is an increasingly important method for such data. However, there
seem to be few model-selection criteria available in GEE. The well-known Akaike Information Criterion
(AIC) cannot be directly applied since AIC is based on maximum likelihood estimation while GEE is
nonlikelihood based. We propose a modification to AIC, where the likelihood is replaced by the quasi-
likelihood and a proper adjustment is made for the penalty term. Its performance is investigated through
simulation studies. For illustration, the method is applied to a real data set.
KEY WORDS: Akaike Information Criterion; Generalized estimating equations; Generalized linear models;
Model selection; Quasi-likelihood.

1. Introduction

Correlated response data arise often from biomedical studies. An example to be studied is the Wisconsin Epidemiologic Study of Diabetic Retinopathy (WESDR) (Klein et al., 1984), where a binary response variable is the presence of diabetic retinopathy in each of the two eyes from each participant in the study. Since the two observations on the two eyes from the same participant tend to be correlated, statistical analyses have to take proper account of this correlation. Since the publication of the seminal paper by Liang and Zeger (1986), the generalized estimating equation (GEE) approach has become increasingly important in handling such correlated data.

Model selection is an important issue in almost any practical data analysis. A common problem is variable selection in regression: given a large group of covariates (including some higher order terms), one needs to select a subset to be included in the regression model. In the WESDR, 13 potential risk factors were collected, and we need to determine which of these factors are to be included. It is well known that, in observational studies such as the WESDR, excluding some important risk factors (i.e., confounders) may result in misleading estimates of the effects of other risk factors. On the other hand, including all covariates may lead to a too complex model with difficulty in interpretation and with less precise parameter estimates.

There is an extensive model-selection literature in statistics (e.g., Miller, 1990, and references therein) but mainly for the classic linear regression with independent data. One powerful and widely used model-selection criterion is Akaike's Information Criterion (AIC) (Akaike, 1973). AIC is based on the likelihood and asymptotic properties of the maximum likelihood estimator (MLE). Since no distribution is assumed in generalized estimating equations (GEE), there is no likelihood defined; thus, AIC cannot be directly used. On the other hand, the issue of model selection in GEE has been largely neglected. The goal of this article is to propose an extension of AIC to GEE. It involves using the quasi-likelihood constructed from the estimating equations (Wedderburn, 1974). Since in general the GEE estimator has different asymptotic properties from those of the MLE, a modification to the penalty term in the usual AIC is also necessary.

This article is organized as follows. In Section 2, we first briefly review the GEE and quasi-likelihood; then we propose a modification to AIC in GEE. Simulation results are presented in Section 3 to show its performance in selecting the working correlation matrix and selecting covariates in GEE. Section 4 applies the method to the WESDR data, followed by a brief discussion.

2. AIC in GEE

2.1 GEE

Suppose we have a random sample of observations from $n$ individuals. For each individual $i$, we have a vector of responses $Y_i = (Y_{i1}, \ldots, Y_{in_i})'$ and corresponding covariates $X_i = (X_{i1}, \ldots, X_{in_i})'$, where each $Y_{ij}$ is a scalar and $X_{ij}$ is a $p$-vector. In general, the components of $Y_i$ are correlated but $Y_i$ and $Y_k$ are independent for any $i \neq k$ (conditional on the covariates). We use $\mathcal{D} = \{(Y_1, X_1), \ldots, (Y_n, X_n)\}$ to denote the data at hand. To model the relation between the response and covariates, one can use a regression model similar to the generalized linear models, $g(\mu_i) = X_i \beta$, where $\mu_i = E(Y_i \mid X_i)$, $g$ is a specified link function, and $\beta = (\beta_1, \ldots, \beta_p)'$ is a vector of unknown regression coefficients to be estimated. The GEE approach estimates $\beta$ through solving the following

estimating equations (Liang and Zeger, 1986):

$$S(\beta; R, \mathcal{D}) = \sum_{i=1}^{n} D_i' V_i^{-1} (Y_i - \mu_i) = 0, \tag{1}$$

where $D_i = D_i(\beta) = \partial \mu_i(\beta) / \partial \beta'$ and $V_i$ is a working covariance matrix of $Y_i$. $V_i$ can be expressed in terms of a working correlation matrix $R = R(\alpha)$, $V_i = A_i^{1/2} R(\alpha) A_i^{1/2}$, where $A_i$ is a diagonal matrix with elements $\mathrm{var}(Y_{ij}) = \phi V(\mu_{ij})$, which is specified as a function of the mean $\mu_{ij}$. The $\alpha$ may be some unknown parameters involved in the working correlation structure, which can be estimated through the method of moments or another set of estimating equations.

An attractive point of the GEE approach is that it yields a consistent estimator of $\beta$, $\hat{\beta}$, even when the working correlation matrix $R$ is misspecified (Liang and Zeger, 1986). For instance, it is often convenient to use a working independence model where $R = I$. Some other popular choices include compound symmetry (CS) (i.e., exchangeable) with $R_{ij} = \rho$ for any $i \neq j$ or first-order autoregressive (AR-1) with $R_{ij} = \rho^{|i-j|}$, where $R_{ij}$ denotes the $(i,j)$th element of $R$. Due to its simplicity, the working independence model is attractive. Many studies have shown that $\hat{\beta}$ obtained under the independence model is relatively efficient (Zeger, 1988; McDonald, 1993), at least when the correlation between responses is not large. Another compelling reason for using the working independence model is in partly conditional modeling of means for longitudinal data (Pepe and Anderson, 1994). However, for time-varying or cluster-specific covariates, Fitzmaurice (1995) showed that the resulting estimator from the independence model may be very inefficient; its efficiency may be as low as 60% compared with the estimator obtained by using the correct correlation structure. Hence, this poses a model-selection problem in selecting the working correlation structure. Of course, we may also need to decide which covariates are to be included in the regression model $g(\mu_i)$. Below we propose a quasi-likelihood-based model-selection criterion that can be applied to address the above issues.

2.2 Quasi-Likelihood

Now we need to briefly review the quasi-likelihood. For the moment, suppose we only have a scalar response variable, $y$. We first construct the quasi-likelihood function for the mean parameter $\mu = E(y)$ (and dispersion parameter $\phi$); then we will write it in terms of the regression parameter $\beta$.

Based on the model specification $E(y) = \mu$ and $\mathrm{var}(y) = \phi V(\mu)$, the (log) quasi-likelihood function is (McCullagh and Nelder, 1989, p. 325)

$$Q(\mu, \phi; y) = \int_{y}^{\mu} \frac{y - t}{\phi V(t)} \, dt. \tag{2}$$

For instance, with grouped binary data, $y \sim \mathrm{Bin}(n, \mu/n)$, it is often specified that $V(\mu) = \mu(1 - \mu/n)$; then (up to a constant) $Q(\mu, \phi; y) = L(\mu, \phi; y)/\phi$, where $L(\mu, \phi; y) = y \log[\mu/(n - \mu)] + n \log(n - \mu)$ is the log likelihood for the binomial distribution. When $\phi = 1$, the quasi-likelihood $Q$ reduces to $L$. However, $\phi > 1$ is extremely useful in modeling overdispersion that commonly occurs in practice. Some common examples of the quasi-likelihood are given in McCullagh and Nelder (1989, p. 326).

With a $1 \times p$ covariate $x$ and a specified regression model $E(y) = \mu = g^{-1}(x\beta)$ and $\mathrm{var}(y) = \phi V(\mu)$, the quasi-likelihood can be written as a function of the regression coefficients $\beta$, i.e., $Q(\beta, \phi; (y, x)) = Q(g^{-1}(x\beta), \phi; y)$.

In the current context, if the working independence model $R = I$ is used, the working assumption is that the paired observations $(Y_{ij}, X_{ij})$ in $\mathcal{D}$ are independent. Hence, the quasi-likelihood based on $\mathcal{D}$ is

$$Q(\beta, \phi; I, \mathcal{D}) = \sum_{i=1}^{n} \sum_{j=1}^{n_i} Q(\beta, \phi; (Y_{ij}, X_{ij})).$$

It is easy to verify that the left-hand side of the GEE $S(\beta; I, \mathcal{D})$ in (1) is equivalent to $\partial Q(\beta, \phi; I, \mathcal{D}) / \partial \beta$. Thus, the GEE (1) can be regarded as a quasi-likelihood score equation.

However, if we use a more general working correlation matrix $R$, there is no guarantee that a corresponding quasi-likelihood exists unless certain conditions are satisfied (McCullagh and Nelder, 1989, p. 333-335). Furthermore, even if it exists, in general it is difficult to construct. How to construct a quasi-likelihood with a general working correlation matrix is beyond the scope of this article. The main goal of this article is to propose a criterion based on $Q(\beta, \phi; I, \mathcal{D})$, the quasi-likelihood under the working independence model with an estimated $\beta$, using any general working correlation structure in GEE.

2.3 AIC and a Modification to AIC in GEE

We first briefly review the derivation of AIC, which will motivate our modification to AIC. A more rigorous and general discussion is available from Linhart and Zucchini (1986). For simplicity of notation, we first assume that the dispersion parameter $\phi$ is known; hence, we can ignore it in the (quasi-)likelihood function. At the end of this section, we will discuss the situation when $\phi$ is unknown.

Suppose we have a candidate model $M_1$ and the true model $M_*$ with log-likelihood functions $L(\beta; \mathcal{D})$ and $L(\beta_*; \mathcal{D})$, respectively. Throughout, we assume that each model can be indexed by the parameter vector $\beta$. A well-known measure of separation between two models is given by the Kullback-Leibler information (Kullback and Leibler, 1951), also known as the cross entropy. The Kullback-Leibler information between $M_1$ and $M_*$ is

$$\Delta_0(\beta, \beta_*) = E_{M_*}[-2 L(\beta; \mathcal{D})], \tag{3}$$

where the expectation $E_{M_*}$ is taken with respect to the true distribution of $\mathcal{D}$ (i.e., under model $M_*$). From a set of candidate models $\mathcal{M}$, in which each can be indexed by $\beta$, we would like to choose the model with the smallest $\Delta_0(\beta, \beta_*)$. However, in practice, since both $\beta$ and $\beta_*$ are unknown, we have to estimate $\Delta_0(\beta, \beta_*)$. AIC was motivated as an asymptotically unbiased estimator of $E_{M_*}[\Delta_0(\hat{\beta}, \beta_*)]$, where $\hat{\beta}$ is the maximum likelihood estimator (MLE) under any candidate model in $\mathcal{M}$ and the expectation is taken over the random $\hat{\beta}$. Akaike proposed using AIC as a model-selection criterion, i.e.,

$$\mathrm{AIC} = -2 L(\hat{\beta}; \mathcal{D}) + 2p, \tag{4}$$

where $p$ is the dimension of $\beta$. Model selection is accomplished by selecting from $\mathcal{M}$ the one that minimizes AIC.

Since GEE is nonlikelihood based, we do not have a likelihood function in this context. However, we may have a quasi-

likelihood. We propose replacing the likelihood $L$ in (3) by the quasi-likelihood $Q$ under the working independence model and define a new discrepancy as

$$\Delta(\beta, \beta_*, I) = E_{M_*}[-2 Q(\beta; I, \mathcal{D})]. \tag{5}$$

We assume that any quasi-likelihood model in $\mathcal{M}$ can be indexed by the parameter vector $\beta$ and that $\beta_*$ is the corresponding parameter for the quasi-likelihood model induced by the true data-generating model $M_*$. For simplicity, with a slight abuse of notation, we suppress the dependence of $\Delta(\beta, \beta_*, I)$ on the true model $M_*$. It is well known that

$$E_{M_*}\left[ \left. \frac{\partial Q(\beta; I, \mathcal{D})}{\partial \beta} \right|_{\beta = \beta_*} \right] = 0, \qquad E_{M_*}\left[ \left. -\frac{\partial^2 Q(\beta; I, \mathcal{D})}{\partial \beta \, \partial \beta'} \right|_{\beta = \beta_*} \right] \equiv \Omega_I,$$

and the latter is positive semidefinite. Under suitable conditions, one can exchange the order of the integration and differentiation. Then $\beta_*$ is a local minimizer of $\Delta(\beta, \beta_*, I)$ with regard to $\beta$. In other words, for any $\beta$ in a neighborhood of $\beta_*$, we have

$$\Delta(\beta, \beta_*, I) \geq \Delta(\beta_*, \beta_*, I). \tag{6}$$

This implies that the discrepancy $\Delta(\beta, \beta_*, I)$ is well defined for all the models close to the true model. Though we cannot prove $\beta_*$ is in general a global minimizer of $\Delta(\beta, \beta_*, I)$, in the common situation that the marginal quasi-likelihood $Q(\beta; (Y_{ij}, X_{ij}))$ is equal to the log likelihood $L(\beta; (Y_{ij}, X_{ij}))$, it is straightforward to verify that then $\beta_*$ is indeed a global minimizer of $\Delta(\beta, \beta_*, I)$ due to the fact that $E_{M_*}[L(\beta_*; (Y_{ij}, X_{ij}))] > E_{M_*}[L(\beta; (Y_{ij}, X_{ij}))]$ for any $\beta \neq \beta_*$ (cf., Lehmann, 1983, p. 409).

Now suppose the GEE estimator $\hat{\beta} = \hat{\beta}(R)$ is obtained using any general working correlation structure $R$. Following the idea of deriving Proposition 2 of Linhart and Zucchini (1986, p. 241, which is for minimum discrepancy estimators), we can approximate $E_{M_*}[\Delta(\hat{\beta}, \beta_*, I)]$ as

$$E_{M_*}[\Delta(\hat{\beta}, \beta_*, I)] \approx -2 E_{M_*}[Q(\hat{\beta}; I, \mathcal{D})] + 2 E_{M_*}[(\hat{\beta} - \beta_*)' S(\hat{\beta}; I, \mathcal{D})] + 2\,\mathrm{trace}(\Omega_I J), \tag{7}$$

where $J = \mathrm{cov}(\hat{\beta})$, which can be consistently estimated by the robust or sandwich covariance estimator, say, $\hat{V}_r$ (Liang and Zeger, 1986). $\Omega_I$ can also be consistently estimated by its empirical estimator $\hat{\Omega}_I = -\partial^2 Q(\beta; I, \mathcal{D}) / \partial \beta \, \partial \beta' \,|_{\beta = \hat{\beta}}$. Note that, for $\hat{\beta} = \hat{\beta}(R)$, we have $S(\hat{\beta}; R, \mathcal{D}) = 0$ but not necessarily $S(\hat{\beta}; I, \mathcal{D}) = 0$ unless $R = I$. By ignoring the second term that is difficult to estimate, we have an estimator of the right-hand side of (7),

$$\mathrm{QIC}(R) \equiv -2 Q(\hat{\beta}(R); I, \mathcal{D}) + 2\,\mathrm{trace}(\hat{\Omega}_I \hat{V}_r). \tag{8}$$

This is our proposed quasi-likelihood under the independence model criterion (QIC) for GEE. Our simulation results (see Section 3) show that ignoring the second term in (7) does not dramatically, but does somewhat, influence the performance of QIC($R$), and QIC($I$) is the best. Note that, if the working independence model is used in GEE, by the consistency of $\hat{\beta}$, $\hat{\Omega}_I$, and $\hat{V}_r$ and that $S(\hat{\beta}; I, \mathcal{D}) = 0$, we know QIC($I$) is an asymptotically unbiased estimator of (7). Furthermore, $\hat{\Omega}_I$ and $\hat{V}_r$ are directly available from the model fitting results in many statistical packages, such as SAS and S-Plus. Hence, we recommend the routine use of QIC($I$) whenever possible. QIC can also be applied to select a working correlation structure in GEE: one needs to calculate the QIC for various candidate working correlation structures and then pick the one with the smallest QIC. Note that here the goal of selecting a working correlation structure is to estimate $\beta$ more efficiently.

In practice, since $\phi$ is unknown, we plug in $\hat{\phi}$, which is estimated from the largest model available. In variable selection, that means we estimate $\phi$ based on the regression model including all covariates. This is similar to estimating the dispersion parameter in linear regression with Mallows' (1973) $C_p$. A more general but also more difficult approach is to use the extended quasi-likelihood (McCullagh and Nelder, 1989, p. 349), which we do not pursue here.

2.4 Remarks

When all modeling specifications in GEE are correct, $\hat{\Omega}_I^{-1}$ and $\hat{V}_r$ are asymptotically equivalent and $\mathrm{trace}(\hat{\Omega}_I \hat{V}_r) \approx \mathrm{trace}(I) = p$. Then QIC reduces to AIC. In GEE with correlated data, one may take $\mathrm{QIC}_u(R) \equiv -2 Q(\hat{\beta}(R); I, \mathcal{D}) + 2p$ as an approximation to QIC($R$), and thus $\mathrm{QIC}_u(R)$ can be potentially useful in variable selection. However, it is easy to see that $\mathrm{QIC}_u(R)$ cannot be applied to select the working correlation matrix $R$.

Our main motivation of defining the discrepancy $\Delta(\beta, \beta_*, I)$ using $Q(\beta; I, \mathcal{D})$ is the latter's simplicity and uniqueness. However, as suggested by one referee, it may be possible to define a more general discrepancy as $\Delta(\beta, \beta_*, R) = E_{M_*}[-2 Q(\beta; R, \mathcal{D})]$. But note that $Q(\beta; R, \mathcal{D})$ may not be unique and in general can be calculated as a path-dependent line integral (McCullagh and Nelder, 1989, Section 9.3.2). Nevertheless, according to Theorem 1 of Hanfelt and Liang (1995; see also Li, 1993), $\Delta(\beta, \beta_*, R)$ is still a well-defined discrepancy in the sense of (6).

3. Simulations

Simulation studies were conducted to investigate the performance of our proposed model-selection criterion QIC in selecting the working correlation structure and selecting the covariates in a marginal logistic regression model. We used the same true model as in Fitzmaurice (1995). The response variable $Y_{it}$ is binary and its marginal mean is $\mu_{it}$, with

$$\mathrm{logit}(\mu_{it}) = \beta_0 + \beta_1 x_{1,it} + \beta_2 (t - 1), \qquad t = 1, 2, 3 \text{ and } i = 1, \ldots, n,$$

where the $x_{1,it}$ are i.i.d. Bernoulli, i.e., $x_{1,it} = 0$ or 1 with probability 1/2 and $\beta_0 = 0.25 = -\beta_1 = -\beta_2$. The true correlation matrix is CS. We used a large correlation, $\rho = 0.5$, and moderate sample size, $n = 50$ or 100. The joint distribution of the $Y_i$ was simulated from Bahadur's (1961) representation (see Fitzmaurice, 1995, for more details).

For each sample size, $n = 50$ or 100, our proposed method is most likely to correctly select the CS from the three given correlation structures (Table 1). Since the distribution form of the data is known, we can also compute the MLE and thus AIC. For comparison, we also attach the results of using AIC by assuming various correlation matrices. Unsurprisingly,

AIC is more efficient than is QIC, probably for two reasons. First, the MLE of $\beta$ is more efficient than the GEE estimator. Second, information on the true correlation structure is embedded in the likelihood function in AIC but not directly in the quasi-likelihood $Q(\beta; I, \mathcal{D})$ in QIC. As mentioned earlier, the strength of QIC is that it is nonlikelihood based, whereas in practice the likelihood approach is often too restrictive with its strong distributional assumption for correlated categorical data.

Table 1
Frequency of the working correlation matrix selected by QIC versus AIC for the marginal logistic model from 1000 independent replications. The true correlation matrix is CS.

                  n = 50                n = 100
Criterion    Ind    CS   AR-1      Ind    CS   AR-1
QIC          138   678    184      140   721    139
AIC            0   836    164        0   946     54

Now we consider variable selection with an expanded full model,

$$\mathrm{logit}(\mu_{it}) = \beta_0 + \beta_1 x_{1,it} + \beta_2 (t - 1) + \beta_3 x_{3,it} + \beta_4 x_{4,it},$$

where $x_{1,it}$, $\beta_0$, $\beta_1$, and $\beta_2$ are as before, $x_{3,it}$ and $x_{4,it}$ are i.i.d. uniform $U(-1, 1)$ and independent of $x_{1,it}$, and $\beta_3 = \beta_4 = 0$. For simplicity, we consider five nonnested candidate models with various subsets of covariates included. The results of using QIC with different working correlation matrices are shown in Table 2. The performance of the three QICs with different working correlation matrices is close, but QIC(Ind) appears to be the best. This is probably related to the error introduced by ignoring the second term in (7) for QIC(CS) and QIC(AR-1). For comparison, we also list the results of using AIC under the correct and incorrect correlation structures. Surprisingly, QIC(Ind) turns out to be comparable with AIC/CS. When the distributional assumptions are violated, the performance of AIC deteriorates, as demonstrated by AIC/Ind and AIC/AR-1, which incorrectly assume the independence and AR-1 correlation matrices, respectively.

We also did simulation studies to investigate the QIC's performance in selecting the working correlation matrix in modeling a partly conditional mean for longitudinal data (Pepe and Anderson, 1994) and in variable selection for correlated overdispersed (grouped) binary data. The results (not shown here) also appeared to be promising.

4. An Example

We apply the method to the WESDR (Klein et al., 1984). The study goal was to determine the risk factors for diabetic retinopathy. The binary response is the presence of diabetic retinopathy in each of two eyes from each of 720 individuals in the study. There are 13 potential risk factors. As shown in Barnhart and Williamson (1998), a univariate analysis was conducted to investigate the marginal association between the response variable and each risk factor. It was found that eight of them are marginally associated with the response variable. Barnhart and Williamson included only four risk factors, i.e., duration of diabetes (years), glycosylated hemoglobin level, diastolic blood pressure, and body mass index, plus the two quadratic terms of duration of diabetes and body mass index in their final model. Now we consider adding all or some of the four removed covariates (i.e., intraocular pressure, systolic blood pressure, pulse rate, and proteinuria) into Barnhart and Williamson's model. Hence, we have 16 candidate models. Note that these models cannot be ordered as a nested sequence, and one advantage of using a flexible model-selection criterion such as QIC is its ability to compare nonnested models. Due to the nature of the possible correlation between the two observations on the two eyes from the same participant, GEE is used to fit the marginal logistic regression model and QIC is applied to do model selection, all under the working independence model. The selected top four models, along with the full model (ranked 8) and Barnhart and Williamson's model (ranked 10), are listed in Table 3. The p-values associated with GEE estimates are also presented. According to the QIC values, the top four models are very close but different from Barnhart and Williamson's model in that proteinuria is included in the former four models. From Table 3, we can see that proteinuria is an important (and statistically significant) risk factor, and adding intraocular pressure or systolic blood pressure into the model may also improve its performance.

5. Discussion

For likelihood-based methods, there are many well-studied model-selection criteria, such as AIC. But for nonlikelihood-based methods, such as GEE, there is a lack of literature on model selection. In this article, we have proposed a new criterion QIC that works for GEE. The QIC involves using

Table 2
Frequency of the set of variables selected by QIC versus AIC for the marginal logistic model from 1000 independent replications. The true model has $\{x_1, x_2\}$, and AIC/CS is calculated correctly using the CS correlation matrix.

[Table body not recoverable from the source. The surviving header shows a Criterion column followed by the candidate covariate subsets, repeated for n = 50 and n = 100.]

Table 3
QIC and robust p-values for each covariate in the top four models and the other two models with the WESDR data

                                        Model
Covariate                     1    2    3    4    8    10
Intraocular pressure
Systolic blood pressure
Pulse rate
Proteinuria
Duration of diabetes
Glycosylated hemoglobin
Diastolic blood pressure
Body mass index
(Duration of diabetes)^2
(Body mass index)^2
QIC(Ind)

the quasi-likelihood constructed under the working independence model and the naive and robust covariance estimates of estimated regression coefficients. Although using other more general quasi-likelihood seems possible, we choose to use the quasi-likelihood under the working independence model due to its simplicity. However, QIC allows one to use any general working correlation structure to estimate the parameters in GEE. In simulation studies, we found that the QIC works well in variable selection and selecting the working correlation matrix. We were particularly impressed with the performance of QIC($I$) in variable selection. Further applications warrant future studies.

ACKNOWLEDGEMENTS

The author thanks Dr Huiman Barnhart for providing the WESDR data set. The author is grateful to Dr Lynn Eberly, two referees, an associate editor, and the editor for extremely thorough and helpful comments that greatly improved the article.

RÉSUMÉ

Correlated response data are common in biomedical studies. Regression analysis based on generalized estimating equations (GEE) is a method of growing importance for such data. However, few model-selection criteria seem to be available for GEE. The well-known Akaike Information Criterion (AIC) cannot be applied directly, since AIC is based on maximum likelihood estimation whereas GEE is based on the quasi-likelihood. We propose a modification of AIC in which the likelihood is replaced by the quasi-likelihood and a suitable adjustment is made to the penalty term. Its performance is evaluated through simulation studies. For illustration, the method is applied to a real data set.

REFERENCES

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Proceedings of the Second International Symposium on Information Theory, B. N. Petrov and F. Csaki (eds), 267-281. Budapest: Akademiai Kiado.
Bahadur, R. R. (1961). A representation of the joint distribution of responses to n dichotomous items. In Studies in Item Analysis and Prediction, Volume VI, Stanford Mathematical Studies in the Social Sciences, H. Solomon (ed.), 158-168. Stanford, California: Stanford University Press.
Barnhart, H. X. and Williamson, J. M. (1998). Goodness-of-fit tests for GEE modeling with binary data. Biometrics 54, 720-729.
Fitzmaurice, G. M. (1995). A caveat concerning independence estimating equations with multiple multivariate binary data. Biometrics 51, 309-317.
Hanfelt, J. J. and Liang, K.-Y. (1995). Approximate likelihood ratios for general estimating functions. Biometrika 82, 461-477.
Klein, R., Klein, B. E. K., Moss, S. E., Davis, M. D., and DeMets, D. L. (1984). The Wisconsin Epidemiologic Study of Diabetic Retinopathy: II. Prevalence and risk of diabetic retinopathy when age at diagnosis is less than 30 years. Archives of Ophthalmology 102, 520-526.
Kullback, S. and Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics 22, 79-86.
Lehmann, E. L. (1983). Theory of Point Estimation. New York: Wiley.
Li, B. (1993). A deviance function for the quasi-likelihood method. Biometrika 80, 741-753.
Liang, K.-Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73, 13-22.
Linhart, H. and Zucchini, W. (1986). Model Selection. New York: Wiley.
Mallows, C. L. (1973). Some comments on $C_p$. Technometrics 15, 661-675.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd edition. London: Chapman and Hall.
McDonald, B. W. (1993). Estimating logistic regression parameters for bivariate binary data. Journal of the Royal Statistical Society, Series B 55, 391-397.

Miller, A. J. (1990). Subset Selection in Regression. London: Chapman and Hall.
Pepe, M. S. and Anderson, G. (1994). A cautionary note on inference for marginal regression models with longitudinal data and general correlated response data. Communications in Statistics, Series B 23, 939-951.
Wedderburn, R. W. M. (1974). Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika 61, 439-447.
Zeger, S. L. (1988). The analysis of discrete longitudinal data: Commentary. Statistics in Medicine 7, 161-168.
Zeger, S. L., Liang, K.-Y., and Albert, P. S. (1988). Models for longitudinal data: A generalized estimating equation approach. Biometrics 42, 121-130.

Received June 1999. Revised December 1999 and June 2000. Accepted June 2000.
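
The criterion of Section 2.3 may be easier to follow with a small numerical illustration. The sketch below is not the author's code. It assumes a marginal logistic model with the dispersion parameter fixed at 1, fits the regression coefficients under working independence by Newton steps, and then evaluates QIC under independence as minus twice the quasi-likelihood plus twice the trace of the empirical information times the sandwich covariance, alongside the simpler variant that uses 2p as the penalty. The function names `fit_independence_logit` and `qic_independence`, the cluster layout, and the simulated data are all hypothetical. The responses are drawn independently within clusters purely for simplicity, whereas the simulations in Section 3 used Bahadur's representation with a CS correlation.

```python
import numpy as np


def fit_independence_logit(X, y, n_iter=50, tol=1e-10):
    """Solve the GEE under working independence (R = I) for a logit link."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - mu)                           # S(beta; I, D), phi = 1
        omega = X.T @ (X * (mu * (1.0 - mu))[:, None])   # -d2Q / dbeta dbeta'
        step = np.linalg.solve(omega, score)             # Newton update
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def qic_independence(X, y, cluster, beta):
    """Return (QIC, QIC_u) for beta fitted under working independence."""
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    # Bernoulli quasi-likelihood with V(mu) = mu(1 - mu) and phi = 1;
    # for 0/1 responses this equals the Bernoulli log likelihood.
    q = np.sum(y * np.log(mu) + (1.0 - y) * np.log(1.0 - mu))
    omega = X.T @ (X * (mu * (1.0 - mu))[:, None])       # empirical Omega_I
    # Sandwich covariance: Omega^{-1} (sum_i s_i s_i') Omega^{-1},
    # where s_i is the score contribution of cluster i.
    meat = np.zeros_like(omega)
    for c in np.unique(cluster):
        idx = cluster == c
        s_i = X[idx].T @ (y[idx] - mu[idx])
        meat += np.outer(s_i, s_i)
    omega_inv = np.linalg.inv(omega)
    v_robust = omega_inv @ meat @ omega_inv
    qic = -2.0 * q + 2.0 * np.trace(omega @ v_robust)
    qic_u = -2.0 * q + 2.0 * X.shape[1]
    return qic, qic_u


# Hypothetical data loosely following the Section 3 design:
# n clusters, t = 1, 2, 3, binary x1, beta = (0.25, -0.25, -0.25).
rng = np.random.default_rng(0)
n, T = 200, 3
cluster = np.repeat(np.arange(n), T)
t = np.tile(np.arange(1, T + 1), n)
x1 = rng.integers(0, 2, size=n * T).astype(float)
X = np.column_stack([np.ones(n * T), x1, t - 1.0])
beta_true = np.array([0.25, -0.25, -0.25])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

beta_hat = fit_independence_logit(X, y)
qic, qic_u = qic_independence(X, y, cluster, beta_hat)
print(f"QIC(I) = {qic:.2f}, QIC_u(I) = {qic_u:.2f}")
```

To compare candidate covariate sets in this setting, one would refit with each candidate design matrix and keep the model with the smallest QIC. Because the simulated responses here are independent within clusters, the trace penalty should be close to 2p, in line with the remark in Section 2.4 that QIC reduces to AIC when all modeling specifications are correct. Extending the sketch to a general working correlation would require an estimator fitted under that structure together with its own sandwich covariance.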