
American Journal of Epidemiology, Vol. 135, No. 9
Copyright © 1992 by The Johns Hopkins University School of Hygiene and Public Health
Printed in U.S.A.
All rights reserved

Selection of Controls in Case-Control Studies


II. Types of Controls

Sholom Wacholder,1 Debra T. Silverman,1 Joseph K. McLaughlin,1 and Jack S. Mandel2

Types of control groups are evaluated using the principles described in paper 1 of
the series, "Selection of Controls in Case-Control Studies" (S. Wacholder et al. Am J
Epidemiol 1992;135:1019-28). Advantages and disadvantages of population controls,
neighborhood controls, hospital or registry controls, medical practice controls, friend
controls, and relative controls are considered. Problems with the use of deceased
controls and proxy respondents are discussed. Am J Epidemiol 1992;135:1029-41.

bias (epidemiology); epidemiologic methods; retrospective studies

In this paper, we apply the comparability principles of study base, deconfounding, and comparable accuracy presented in our previous paper (1) to the practical problem of choosing a control group. A number of choices for sources of controls are discussed and evaluated within the framework of the principles. We also offer specific suggestions that we believe are useful in choosing controls.

POPULATION CONTROLS

In a study with a primary base, where the focus is the disease experience of a population during a specified time interval in a defined geographic area, randomly sampled controls from that population satisfy the study base principle. More complex sampling schemes, such as frequency matching or cluster sampling (2, 3), are also appropriate, as long as the analysis properly accounts for the sampling plan. When a roster identifying all members of the base is available, controls can be selected simply as a random sample from that roster as in a nested case-control study (2) or a case-cohort study (2).

When the probability of case identification among members of a primary base depends on a variable, the study base principle is violated and there can be selection bias, unless control selection depends proportionally on values of that variable. For example, when the probability of disease diagnosis depends on access to medical care, a hospital control series with similar dependence on access might be more appropriate than population controls (4). Or, in a case-control study of occupational risk factors using population controls, investigators might consider excluding cases diagnosed at smaller hospitals in the catchment region for logistic reasons. If these hospitals tend to serve rural communities, however, urban occupations may be overrepresented among the cases.
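The occupational example can be made concrete with a small expected-count calculation. The sketch below is only an illustration: the population sizes, the split between urban and other occupations, the disease risks, and the identification probabilities are all invented and are not taken from the paper. It assumes occupation has no effect on risk, then compares the exposure odds ratio obtained with controls drawn from the whole population against the ratio obtained when control selection depends on residence in the same proportion as case identification, and also looks within each residence stratum.

```python
# Hypothetical base: number of people by (residence, occupation type).
# All figures are invented for illustration only.
base = {
    ("urban", "urban_job"): 60_000,
    ("urban", "other_job"): 40_000,
    ("rural", "urban_job"): 10_000,
    ("rural", "other_job"): 90_000,
}
# Assumed disease risk by occupation: identical, so the true ratio is 1.
risk = {"urban_job": 0.001, "other_job": 0.001}
# Assumed probability that a case is identified, by residence
# (most cases seen at small rural hospitals are left out).
p_identified = {"urban": 1.0, "rural": 0.2}


def exposure_odds(cells):
    """Odds of an urban job versus another job, given per-cell weights."""
    exposed = sum(w for (_, job), w in cells.items() if job == "urban_job")
    unexposed = sum(w for (_, job), w in cells.items() if job == "other_job")
    return exposed / unexposed


# Expected number of identified cases in each cell.
cases = {k: n * risk[k[1]] * p_identified[k[0]] for k, n in base.items()}

# Population controls: selection ignores residence.
population_controls = dict(base)
# Controls selected in proportion to the same residence-dependent probability.
proportional_controls = {k: n * p_identified[k[0]] for k, n in base.items()}

or_population = exposure_odds(cases) / exposure_odds(population_controls)
or_proportional = exposure_odds(cases) / exposure_odds(proportional_controls)


def stratum_or(area):
    """Odds ratio within one residence stratum, using population controls."""
    case_cells = {k: v for k, v in cases.items() if k[0] == area}
    base_cells = {k: v for k, v in base.items() if k[0] == area}
    return exposure_odds(case_cells) / exposure_odds(base_cells)


print(f"population controls:   OR = {or_population:.2f}")
print(f"proportional controls: OR = {or_proportional:.2f}")
for area in ("urban", "rural"):
    print(f"within {area} stratum:  OR = {stratum_or(area):.2f}")
```

With these invented inputs the ratio based on population controls comes out near 2 even though the assumed risks are equal, while the proportionally selected controls and the within-stratum comparisons both return a ratio of 1.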
An alternative type of control, such as hospital controls, or stratification by geographic factors in the design or analysis may alleviate this problem.

Received for publication May 8, 1991, and in final form February 11, 1992.
Abbreviation: RDD, random digit dialing.
1 Biostatistics Branch, National Cancer Institute, Bethesda, MD.
2 Department of Environmental and Occupational Health, School of Public Health, University of Minnesota, Minneapolis, MN.
Reprint requests to Dr. Sholom Wacholder, Biostatistics Branch, National Cancer Institute, 6130 Executive Blvd., EPN 403, Rockville, MD 20892.

Advantages of population controls

There are a number of advantages of population controls.

Same study base. Selection of population controls from a primary base ensures that the controls are drawn from the same source population as the case series (the study base principle (1)). The mechanisms for sampling from a population are similar to those commonly used in survey research.

Exclusions. The definition of the base can encompass the exclusions, e.g., not being in the catchment area at the appropriate time, being a previous case of the disease under study, or not being at risk of the disease under study, such as women who have had a hysterectomy in a uterine cancer study.

Extrapolation to base. The distribution of exposures in the controls can be readily extrapolated to the base for purposes such as calculations of absolute or attributable risk (5) or to learn about the distribution of exposures in the population; for example, a detailed diet questionnaire could be used to study differences in food consumption between blacks and whites. By contrast, a hospital control series that is appropriate for a study using cases drawn from the same hospital cannot be used for estimating attributable risk in a population without making assumptions about the representativeness of the case series in that population.

Disadvantages of population controls

Population controls can be inappropriate when there is incomplete case ascertainment or when even approximate random sampling of the study base is impossible because of nonresponse or inadequacies of the sampling frame (6). When case ascertainment is incomplete and probability of ascertainment depends on the factors being studied, hospital-based or other controls may be preferable (7).

Population controls have some other disadvantages.

Inconvenience. Sampling from the population instead of using a more readily available series, such as other hospitalized patients, can be less convenient and more expensive.

Recall bias. Differences between cases and healthy controls can lead to violation of the comparable accuracy principle. Despite an interviewer's best efforts to have the subject's response refer to the period before disease, the responses by a previously hospitalized case may reflect modifications in exposure due to the disease itself, such as drinking less coffee or alcohol after an ulcer, or due to changes in perception of past habits after becoming ill.

Less motivation. Population controls may be less motivated to cooperate than hospital controls.

Selection of population controls when a roster exists

Selection of population controls is simplest when there is a complete listing of the study base. It is useful if the roster has a telephone number or at least an address with which to make contact with the subject. Rosters that have been used include the following: annual residence lists, compiled by law in some areas such as Massachusetts (8) and several European and Asian countries (9); birth certificate records for studies of disease in children (10); Health Care Financing Administration files for Medicare recipients, with coverage of about 98 percent of the United States population aged 65 years and over (11); and electoral lists prepared for each election by a door-to-door survey in such countries as Great Britain and Canada. It is important to remember that, when the cases are identified by a method other than follow-up of subjects on the roster, such as through a disease registry, cases who are not included on the roster, such as those who are not citizens when using electoral lists, should be excluded from the study.

Selection of population controls when no roster exists

When no roster exists, it can be difficult to ensure that every eligible subject in the study base has the same chance of selection. Bias can be induced when methods rely on contacts with a household, either by telephone (12) or in person, and only one eligible control per household is selected. Consider a study of a disease in children where controls must be within 2 years of age of the case to which they are matched and only one control per household will be selected. A child with a sibling of similar age is less likely to be selected as a control than one with no siblings. This violates the study base principle and, since cases are selected regardless of the ages of other family members, can lead to bias in estimating the effects of variables related to family size (12). Independent determination of whether to select each subject would eliminate this problem, or more than one control can be selected from the same home, and the possible dependence in their responses can be accounted for in the analysis (3).

Random digit dialing. Random digit dialing (RDD) can be used to select population controls when no roster exists. RDD and its variants generate sets of telephone numbers without relying on a directory that would not have new or unpublished numbers (13). The aim of RDD is to ensure that each residential number has an equal chance of selection, while minimizing the number of phone calls to nonresidential or inappropriate numbers (13).

Investigators have flexibility in the details of RDD (13, 14). In the standard method (13), a random sample is drawn from working sets of telephone exchanges provided by the telephone company, i.e., the first several numbers of the complete telephone number, typically eight of ten (including area code) in the United States. The number is then completed with two random numbers. The complete number is dialed to determine whether it is a working residential phone. If it is, a predetermined number of calls are made to that exchange; if not, the exchange is discarded. The extra steps are included in order to reduce the numbers of calls to exchanges that have relatively few residential numbers.

Does RDD satisfy the study base principle or can it generate a selection bias? It is easy to show that each phone number in the area has the same chance of being reached. The probability that an exchange is selected is proportional to the number of working numbers in the exchange, but the probability that a particular number is chosen from that exchange is inversely proportional to the number of working numbers in the exchange. However, the goal in control selection is a random sample of eligible subjects, not of telephone numbers. Incomplete phone coverage, residences that can be reached by more than one phone number, more than one person in the household who is eligible to be a control, and nonresponse can all lead to possible selection bias, unless accommodation is made in the design or analysis. Stratification on numbers of telephone lines and eligible residents in the household can alleviate some of the problems. Advances in technology, such as answering machines and call forwarding, have added complications to this method.

The first contact with a household is most often used for screening and to obtain a census of the household. Information on the address of the house and on the name, age, sex, and race of each household member is obtained. Based on the responses, a sampling frame is generated, and a random sample of identified eligible subjects is selected to be controls. These individuals can then be contacted by letter, by telephone, or in person for interview. When the sampling scheme is simple, such as when it is based simply on age and sex, the census and the interview itself can be done in a one-step process (14, 15), thereby reducing the overall percentage of refusal.

When telephone coverage is low, RDD will miss a substantial proportion of subjects and can result in biased estimates of the effects of exposures related to telephone coverage, such as socioeconomic status. While telephone coverage in the United States is 93 percent, it is lower for residents of the South, for blacks, and for the poor; in 1986, only 56 percent of Southern blacks living in households with yearly income below $5,000 had telephones (16). In the United States, the difference in the proportion of smokers in households with phones and in those without phones is 1 percent, though this difference is 4 percent among those with income below $5,000 (16). Incomplete telephone coverage is often, therefore, a smaller problem than nonresponse and refusal.

The clustering of exchanges in RDD results in a sample that is not random, since not every possible subset of the population can be chosen. This clustering does not bias estimates of effects but can lead to overly optimistic estimates of the precision of point estimates unless addressed in the analysis (3).

Sometimes a variation of RDD is used with controls selected using the exchange of the case in order to generate subjects matched on factors that are difficult to measure but are believed to be related to geographic area (17, 18). It is not clear, however, whether "neighborhood matching" would satisfy the study base criterion. Would controls selected to have similar phone numbers actually become cases (e.g., be admitted to the study hospital, if the cases were a hospital series) had they been diagnosed with disease? Moreover, the extent to which such a procedure matches on area or region has not been empirically demonstrated. Matching on primary care practice, discussed below, may be a better approach in some situations.

RDD can be expensive and time-consuming when targeting subgroups of the population. For example, an average of almost 35 households must be screened to identify one black male, 64 households to identify one Hispanic male between 20 and 29 years of age (19), and almost 70 to identify one male aged 75 or older (13). Hence, some studies in the United States use Health Care Financing Administration rosters as a substitute to identify controls over age 65.

RDD is an option for identifying controls from a particular ethnic group who tend to be clustered in certain neighborhoods because less effort is expended in nonproductive exchanges (13). The Donnelley data base (Donnelley Marketing Information Services, Stamford, Connecticut), which classifies exchanges by race and income, can also be used for stratification to identify blacks and Hispanics more cost effectively (19); it is not so helpful, however, for identifying groups such as Asian Americans whose residential patterns are less concentrated geographically (19). However, this technique may violate the study base principle and lead to bias when the exposures are related to cultural factors associated with the diversity of the subject's neighborhood, since eligible subjects living in heterogeneous areas may be underrepresented in the control group. For example, Asian Americans living in so-called Chinatowns are likely to have life-styles and diets different from those living in an ethnically heterogeneous neighborhood.

Neighborhood controls. Population controls can also be selected using residences, rather than telephone numbers, as the sampling unit. This strategy can be particularly useful when telephone coverage is low. In area probability sampling, controls are selected randomly from a roster of residences, perhaps obtained from a recent census. However, creating a roster when one is not available can be extremely expensive.

Instead of using a random sample from a roster of residences, "neighborhood controls" are typically selected from residences in the same city block or other geographic area as the case, in an attempt to reduce the variability of factors such as access to medical care and socioeconomic status. For neighborhood controls to satisfy the study base principle, one must consider the base as divided into geographically defined strata. Use of neighborhood controls in a study with a secondary base may not satisfy the principle; for example, a neighbor who would not be admitted to the same hospital under the circumstances that led to admission of the case would be outside the base. Thus, bias could result if the source of cases is a religiously affiliated hospital and the neighborhood is religiously heterogeneous.

Neighborhood controls are usually chosen deterministically (nonrandomly) within a stratum defined geographically. If the selection is not random, one must rely on an assumption that the selection process is independent of the exposure, which is equivalent to the exposure distribution being the same as that in the study base (1, 20, 21). To protect that independence, the interviewer should not be given the flexibility to
choose which house to select. Instead, a particular algorithm, such as the one depicted in reference 22 or perhaps based on a reverse directory (sorted by address), should be used in order to avoid bias arising from interviewer selection of residences that appear more likely to cooperate.

Ideally, the neighborhood control should have been a resident of the house when the index case was diagnosed. Controls who recently moved into a neighborhood and are chosen to match cases from the neighborhood diagnosed several years earlier should be excluded since they are outside the study base. Excluding controls who have moved into the neighborhood since diagnosis of the case reduces the study base problem but does not solve it, since people who moved out of the neighborhood will still be missed (1). Whenever cases are diagnosed several years before control selection, use of current residents (or any other current roster) raises the possibility of distortion of the distributions of factors associated with mortality and migration, particularly if socioeconomic or ethnic characteristics of the neighborhood have changed. Old reverse directories, visits to long-term residents, property tax records, old plat maps, and registers of deeds can be used to historically reconstruct neighborhoods (23) in order to find neighborhood controls contemporaneous with the case.

Neighborhood controls have two main advantages. 1) Control selection does not require the existence of a roster or use of a telephone, and 2) confounding factors associated with neighborhood may be balanced between cases and controls.

Thus, neighborhood controls can be an attractive alternative for studies with a primary base when no roster of the population is available or, possibly, for studies where the cases are obtained from hospital lists. The disadvantages of neighborhood controls include the potential for not satisfying the study base principle, particularly in studies with a secondary base; the high cost associated with contacting each potential control (24); the use of the household as the sampling unit, as in selection of controls by telephone; and the difficulty in documenting nonresponse, since one does not know the number of eligible subjects in homes for which there is no response. In addition, there may be overmatching on the study exposure because of similarities between cases and controls from the same neighborhood on exposures related to residence, particularly in buildings with more than one household unit. These multiunit dwellings, especially apartment buildings, present additional problems because of the difficulty of enumerating all household units within the building for sampling. Moreover, access to the buildings themselves can be a problem in many large cities.

HOSPITAL OR DISEASE REGISTRY CONTROLS

When a list of admissions or discharges from a hospital or clinic is the source of the cases, the same list can be used as the source of controls, too. The points below regarding hospital controls also apply to disease registry controls, drawn, for example, from a tumor or malformation registry. One attraction of hospital controls is that one can reasonably assume that patients admitted to the same hospital as the cases are members of the same (secondary) base (1, 20). The most serious danger with hospital controls is that choosing subjects with other diseases may jeopardize the assumption of representativeness of exposure (1, 20, 21), namely, that the distribution of the exposures under study in the controls is the same as that in a random sample from the base that produced the cases. This is equivalent to assuming that there is no relation between exposure and the diagnoses used to determine inclusion of controls. For example, use of other women who undergo dilatation and curettage as controls for a study of endometrial cancer (25) probably meets the assumption regarding membership in the secondary base but fails the representativeness of exposure assumption when estrogen use is related to conditions indicating dilatation and curettage (26-28).

The representativeness of exposure assumption is not quite as difficult to meet as
it may seem because it must hold only within strata used in the analysis or conditionally on factors for which adjustment will be made in the analysis (1, 20). Thus, for example, stratification by sex eliminates bias for an exposure even if the exposure is associated with a control disease unconditionally, as long as it is unassociated with the control disease separately for males and females.

Hospital or registry controls are usually more appropriate than are population controls if a sizable fraction of diseased subjects in the base will not become cases in the study and if the ones who do have different exposures from those who do not. For example, multiple sclerosis patients referred to an academic center about 60 miles (about 100 km) away were found to have demographic and severity characteristics different from those of other multiple sclerosis patients at the center from the same area who were not referred (29). It would be difficult to reflect this heterogeneous referral pattern using population controls. Use of hospital controls from another disease with a similar referral pattern might provide more assurance that all subjects share the same study base; alternatively, stratifying by geography or by referral status might be effective.

Use of a hospital control series consisting of subjects with a disease the outward manifestations of which are identical to those of the disease of interest can eliminate one source of selection bias. When differential diagnosis is made on these subjects, the ones with the index disease become cases and the ones with the "imitation" disease become controls. If the imitation disease is unrelated to the exposure of interest, these controls would be appropriate; Miettinen (20, p. 79) describes them as "ideal."

Hospital controls have other advantages.

Comparable quality of information. A major advantage is that generally hospital controls are more comparable to cases with respect to quality of information, since they too have been ill and hospitalized. However, careful consideration of the environment where information is gathered, the content and phrasing of the questions, and the diseases to be included in the control series is needed, since simply being sick does not necessarily entail comparable accuracy and avoidance of recall bias. Selecting hospital controls with conditions that are believed to lead to similar errors in recall may alleviate some of the problems that cause this form of information bias. For example, in a study of birth-related risk factors for testicular cancer in men treated at a military or tertiary care hospital, controls were age-matched men with other cancers, presumed to be unrelated to the study exposures, at the same hospitals (30). Since subjects' mothers provided information on the key exposures, which occurred during early childhood, the use of hospital controls was particularly appropriate because it ensured that the sons of all the mothers interviewed for the study had had malignancies (30). Further, the study hospitals drew patients from across the United States, so these controls were likely to have referral patterns similar to those of the cases (30).

Convenience. Hospital controls may be the most convenient choice when controls will be asked to provide bodily fluids or to undergo a physical examination, as when looking for dysplastic nevi in a study of skin melanoma.

In addition to the need to satisfy the representativeness of exposure assumptions noted above, hospital controls have other difficulties.

Different catchments. Even when controls are identified from the same registry or hospital as the cases, the catchments for different diseases within the same hospital may be different (31), violating the study base principle. For example, an urban teaching hospital associated with a medical center may provide primary medical care to poor people in the neighborhood and also serve as a tertiary referral center providing sophisticated services for certain medical conditions. Restricting the study base to people living in the vicinity of the hospital can alleviate the problem (20, 21) but may reduce the number of cases substantially.
Stratification by distance between hospital and residence or by referral status might be an effective alternative.

Berkson's bias. If the study exposure is related to the risk of being hospitalized for the control disease, the exposure distribution in the series may not reflect the base. For example, diabetics are more likely to be admitted to the hospital with heart disease than are nondiabetics, which could bias studies focusing on diet. This is an example of Berkson's bias, which is caused by selection of subjects into a study differentially on factors related to exposure (32).

Composition of a hospital control series

We believe that the best strategy regarding the selection of diseases to form a hospital- or disease registry-based control group is to exclude from the control series all conditions likely to be related to exposure (20, 33). The payoff for the extra effort in the study design will be more confidence in the validity of the results. If there is an association, subjects admitted to the hospital for the disease need to be excluded from the control series (34); however, a previous history of the disease should not be grounds for exclusion, unless the exclusion is also applied to cases (35). These exclusion rules apply regardless of whether the association is positive or negative, causal or not.

In theory, a possible association of exposure with a control disease should be assessed after controlling for confounders included in the analysis. If adjustment for a confounder eliminates a crude association between exposure and a potential control disease, adjustment for that same confounder in the analysis of the study will eliminate the bias caused by using that disease as a source of controls.

Similarly, an association between the control disease and a confounder is acceptable, if the effect of the confounder is controlled in the analysis. Also, patients with any disease that cannot be clearly distinguished from the study disease should be excluded from the control series to reduce bias due to misclassification of disease.

If there is complete confidence that a single disease is unrelated to the exposure of interest, the entire control series may be selected from among patients with that disease. However, only rarely is there convincing evidence that the assumption of independence of the study exposure and a control disease is satisfied. Therefore, inclusion of patients with several diseases minimizes potential bias if any one disease turns out to be related to exposure (36, 37). When related diseases, such as other cancers for a study of a particular cancer or other perinatal outcomes for a study of birth defects, are used, the possibility of information bias may be reduced (38, 39). Again, however, any of the diseases that are related to the exposure (based on a priori knowledge) should be excluded (33). Overall, we recommend using more diseases rather than fewer; this protects the investigators if later evidence links one or more of the control diseases positively or negatively to an exposure.

CONTROLS FROM A MEDICAL PRACTICE

Choosing controls from the primary medical practice of the cases can be a useful strategy when it is otherwise difficult to find controls who are comparable to cases on access to medical care or referral to specialized clinics. For example, medical practice controls may be more appropriate than hospital controls when cases are drawn from an urban teaching hospital, because potential subjects admitted to this hospital may be mixtures of poor clinic patients and high-socioeconomic-status private patients in far different ratios from those of the case series. Controls selected from the same medical practices as the cases are drawn from the appropriate secondary base (1, 20), if one can make the assumption that two patients in the same primary care practice with the same presentation would follow the same pathway through the medical care system. A disadvantage of medical practice controls is
the extra complexity entailed by a random selection process for controls within several different practices.

The study base principle can be jeopardized with medical practice controls, since the exposure distribution for controls may not be the same as that in the study base, as when patients choose a physician for reasons relating to particular conditions that are themselves related to exposure. Similarly, if those conditions lead to modification of exposure subsequent to symptoms or treatment, the study base principle can be violated, unless the timing of the changes can be ascertained. For example, medical practice controls were used in a study that reported a positive association between coffee drinking and pancreatic cancer risk (40). Because the controls were patients with gastrointestinal disorders, some of them had conditions that were either caused by coffee drinking or treated by removing coffee from the diet, and thus the level of coffee drinking was not representative of the study base. The magnitude of the bias introduced by inclusion of such controls is dependent on the proportion of controls with diet-altering conditions and the relations of these diseases to coffee consumption (41). Control conditions associated with a confounder do not need to be excluded, if there will be adjustment for the confounder in the analysis. Thus, smoking-related conditions not related to coffee drinking might be included in the control series (34).

FRIEND CONTROLS

Friends of cases may be a more convenient and inexpensive source of controls than are other alternatives. Controls can be selected from a list of friends or associates obtained from the case at little extra effort while the case is being interviewed. Friends may be likely to use the medical system in similar ways. Moreover, biases due to social class are reduced since usually the case and friend control will be of a similar socioeconomic background.

Nonetheless, we have strong reservations about the use of friend controls. A possible theoretical justification of friend controls is that the base is divided into mutually exclusive "friendship strata" and that the exposure of a friend control is representative of that of the friendship stratum. Alternative justifications for friend controls are as a nonrandom sample from the base, if friends will all be in the same study base, or, if not, as an indirect way to probe the base (1, 20). None of these rationales is very persuasive. It is unrealistic to believe that the study base is divided into mutually exclusive friendship strata and that the controls are selected from only within the case's stratum (42). Even if this were true, the control selection within the stratum is deterministic and possibly related to exposure (1, 42). The credibility of representativeness of exposure is low for factors related to sociability, such as gregariousness or, possibly, smoking, diet, or alcohol consumption, because sociable people are more likely to be selected as controls than are loners (42).

"Friendly control" bias was suspected in a case-control study of patients with insulin-dependent diabetes mellitus, where friend controls were designated by the parents of cases (43). Insulin-dependent diabetes mellitus cases were found to be more likely to have learning problems, to have few friends, to dislike school, and to have recent illness in the family. While these findings could be due to true risk factors or to recall bias, they are more likely to be due to selection bias, since children perceived to have problems may be less likely to be identified as friends and, therefore, as controls (44). Further, there was some evidence suggesting that parents gave names of children from families with "mainstream" social characteristics (44).

A less serious problem is that the use of friend controls can lead to overmatching, since friends tend to be similar with regard to life-style and occupational exposures of interest, as in a study of head trauma and seizures (45); the loss of efficiency due to overmatching depends on how strongly head trauma is correlated among friends (e.g.,
motorcycle racers and boxers) and how closely it is related to gregariousness.

These problems can be alleviated to some extent by asking cases for the names of several friends and choosing controls randomly from the list or by asking for names of associates rather than friends (31, 46) so that the control will not be the case's closest, and perhaps most sociable, friend. However, those on the list will still tend to be more sociable than is a loner who is not on anyone's list (but can become a case), and there is no reason to believe that the extra friends named will have different characteristics from those who would be named on a shorter list (47). In addition, some cases may not be willing to provide names of friends (48), increasing nonresponse.

Despite serious shortcomings, friend controls may be useful in some exceptional circumstances, such as in a study of exposures unrelated to friendship characteristics, as is likely in a study of a genetically determined metabolic disorder (48, 49).

RELATIVE CONTROLS

The choice of relative controls is motivated by the deconfounding principle, not the study-base principle (1). When genetic factors confound the effect of exposure, blood relatives of the case have been used as a source of controls (50) in an attempt to match on genetic background. Spousal and sibship relationships form strata and meet the reciprocity requirement (1, 42), but the theoretical justification for other relatives is more tenuous. Spouses might be a suitable control group if matching on adult environmental risk factors is sought. When sibling controls are used in studies of the association between genetic markers and the risk of cancer, confounding by factors related to ethnicity is minimized (51); however, cases and controls may be overmatched on a variety of genetic and environmental factors that are not risk factors but are related to the exposure under study. For example, effects of risk factors associated with family size cannot be assessed in a study using sibling controls because of overmatching (31).

Trade-offs in using relative controls are illustrated in a study of the association between tonsillectomy and Hodgkin's disease (52) that used two control groups, siblings and spouses, to control for socioeconomic status in childhood and adulthood, respectively. A higher risk for tonsillectomy was found with spousal controls than with sibling controls, suggesting either positive confounding by childhood socioeconomic status or negative confounding by adult socioeconomic status.

THE CASE SERIES AS THE SOURCE OF CONTROLS

An individual can serve as his own control for a study of an acute event when the effect of an exposure is transient (53), such as the effect of a possible triggering activity on myocardial infarction. The impact of the exposure is evaluated by comparing the proportions of events occurring during the putative period of elevated risk and the proportions of time each individual has been at elevated risk. This "case-crossover" design (53) can be thought of as a case-control design where each stratum consists of a single individual (or as a cohort study with many noninformative strata). The study base principle is clearly satisfied. Although there is no possibility of between-subject confounding, a second exposure that tends to occur at the same time or at different times from the study exposure can cause confounding (53). This design has the advantages that only patients need to be studied (53) and that recurrences can be handled easily.

However, for studies of chronic diseases where the main focus is on more stable time-dependent covariates, the use of a study series of cases only, as might be found in a disease registry, requires a complete and accurate exposure history and the strong assumption that the exposure of interest is unrelated to overall mortality (54). This
study design may also have lower power than more conventional studies (54).

PROXY RESPONDENTS AND DECEASED CONTROLS

Interviews with proxy respondents are often used when subjects are deceased or too sick to answer questions or for persons with perceptual or cognitive disorders (55). Because proxy respondents will tend to be used more often for cases than for healthy controls, violation of the comparable accuracy principle is likely. Surrogates, particularly spouses and children, generally provide accurate responses for broad categories of exposure information, although more detailed information is usually less reliable (56-60). For some variables, such as cigarette smoking, and consumption of coffee and alcohol, spouses and children are remarkably accurate, even when compared with reinterviewed living subjects (61, 62). Proxies may even provide better information than the index subjects, such as in nutritional studies among older subjects, where a subject's wife may have prepared much of her husband's food (63).

When feasible, reducing the time interval between diagnosis and interview of cases can reduce the number of proxy interviews required. When information is obtained from a surrogate because the case is dead, using a living control sampled properly from the base can violate the comparable accuracy principle. However, insisting on a dead control (64) violates the study base principle, since the base consists of living subjects and subjects who die represent a special sample from that base. In order to use dead controls, one needs to assume representativeness of exposure (1), that the dead controls have the same distribution of exposure variables as does the base. This assumption has been demonstrated to be incorrect for a number of personal behavior variables, including use of tobacco and alcohol (65), even after deaths from causes believed to be associated with the study exposure are excluded (66).

Interviews with surrogates of appropriately selected living controls do not make accuracy fully comparable since the controls are still alive while the cases are deceased, and responses by their surrogates may be influenced by factors associated with the subject's death (67). Nonetheless, validation studies (68, 69), in which the responses of a proportion of the living subjects and their proxies are obtained, can be used to reduce the bias due to errors from proxy responses.

In studies with deceased cases, the use of proxy interviews for appropriately selected live controls is usually preferable to the use of dead controls, particularly if the study exposure is likely to be associated with overall mortality. The advisability of insisting on a proxy interview for a live control depends on what information will be obtained from the interview. When exposure is assessed directly, comparable accuracy for cases and controls in an interview designed primarily to elicit information on confounders does not necessarily reduce the bias in the estimate of the effect of exposure (1, 67, 70); therefore, a proxy interview for the control may not help. Using proxy interviews for live controls should be considered only when 1) information about a key study exposure is to be obtained by interview and 2) a proxy report for the case is likely to be substantially less accurate than the control's self-report about the key study exposure.

Controls in proportionate mortality studies

A proportionate mortality study can be viewed as a case-control study with controls obtained from a registry consisting of deaths (71). The underlying assumption is that the distribution of the exposure under study among subjects who died from other causes is the same as that in the base, which consists of living persons only. Just as in other registry-based studies, deaths from causes related to the study exposure must be excluded from the control series (21). Thus, this kind of study may not be suitable for investigating exposures such as smoking that are risk factors for causes of death representing a high proportion of mortality. One advantage of the approach is that a roster for the eligible controls can be established conveniently; any absences from the base typically will not lead to selection bias, since the efficiency of the system for registering deaths from most causes is unlikely to vary substantially with cause of death. However, errors in attribution of cause of death do occur, for example for AIDS, suicide, or cirrhosis of the liver, resulting in misclassification bias and over- or underexclusion of subjects.

NUMBER OF CONTROL GROUPS

Some researchers have suggested choosing more than one control group (72, 73). It certainly is reassuring when the results are concordant across control series. However, when the results are discordant (25, 27, 52), the investigators must decide which result is "correct" and essentially discard the other. We therefore believe that doubt is not a good basis for choosing an additional control group. Rather, the best strategy usually is to decide which series is preferable at the design stage.

Multiple control groups might be helpful when each serves a different purpose, as when each control group provides the ability to control for a particular confounder. In this situation, the second control group can act as a form of replication.

REFERENCES

1. Wacholder S, McLaughlin JK, Silverman DT, et al. Selection of controls in case-control studies. I. Principles. Am J Epidemiol 1992;135:1019-28.
2. Wacholder S, Silverman DT, McLaughlin JK, et al. Selection of controls in case-control studies. III. Design options. Am J Epidemiol 1992;135:1042-50.
3. Graubard B, Fears TR, Gail MH. Effects of cluster sampling on epidemiologic analysis in population-based case-control studies. Biometrics 1989;45:1053-71.
4. Cole P. Introduction. In: Breslow NE, Day NE, eds. Statistical methods in cancer research. Vol 1. The analysis of case-control studies. Lyon: International Agency for Research on Cancer, 1980:14-40. (IARC scientific publication no. 32).
5. Whittemore AS. Estimating attributable risk from case-control studies. Am J Epidemiol 1983;117:76-85.
6. Gail M, Lubin JH, Silverman DT. Elements of design in epidemiologic studies. In: DeLisi C, Eisenfeld J, eds. Statistical methods in cancer epidemiology. Amsterdam: North Holland Publishing Co, 1985:313-23.
7. Savitz DA, Pearce N. Control selection with incomplete case ascertainment. Am J Epidemiol 1988;127:1109-17.
8. Cole P, Monson RR, Haning H, et al. Smoking and cancer of the lower urinary tract. N Engl J Med 1971;284:129-34.
9. Adami HO, Rimsten A, Stenkvist B, et al. Reproductive history and risk of breast cancer. Cancer 1978;41:747-57.
10. Gold EB, Diener MD, Szklo M. Parental occupations and cancer in children. J Occup Med 1982;24:578-84.
11. Hatten J. Medicare's common denominator: the covered population. Health Care Finan Rev 1980;2:53-64.
12. Greenberg ER. Random digit dialing for control selection. A review and a caution on its use in studies of childhood cancer. Am J Epidemiol 1990;131:1-5.
13. Waksberg J. Sampling methods for random digit dialing. J Am Stat Assoc 1978;73:40-6.
14. Harlow BL, Davis S. Two one-step methods for household screening and interviewing using random digit dialing. Am J Epidemiol 1988;127:857-63.
15. Harlow B, Hartge P. Telephone household screening and interviewing. Am J Epidemiol 1983;117:632-3.
16. Thornberry TO, Massey JT. Trends in United States telephone coverage across time and subgroups. In: Groves RM, Biemer PB, Lyberg LE, et al., eds. Telephone survey methodology. New York: John Wiley & Sons, Inc, 1988:25-49.
17. Ward EM, Kramer S, Meadows AT. The efficacy of random digit dialing in selecting matched controls for a case-control study of pediatric cancer. Am J Epidemiol 1984;120:582-91.
18. Robison LL, Daigle A. Control selection using random digit dialing for cases of childhood cancer. Am J Epidemiol 1984;120:164-6.
19. Mohadjer L. Stratification of prefix areas for sampling rare populations. In: Groves RM, Biemer PB, Lyberg LE, et al., eds. Telephone survey methodology. New York: John Wiley & Sons, Inc, 1988:161-73.
20. Miettinen OS. Theoretical epidemiology: principles of occurrence research in medicine. New York: John Wiley & Sons, Inc, 1985.
21. Miettinen OS. The "case-control" study: valid selection of subjects. J Chronic Dis 1985;38:543-8.
22. Cohen BH. Family patterns of longevity and mortality. In: Neel JV, Shaw MW, Schull WJ, eds. Genetics and the epidemiology of chronic diseases. Washington, DC: US Department of Health, Education, and Welfare, 1963:237-63. (DHEW publication no. 1163).
23. Massey FJ, Bernstein FS, O'Fallon WM, et al. Vasectomy and health. JAMA 1984;252:1023-9.
24. Vernick LJ, Vernick SL, Kuller KH. Selection of
neighborhood controls: logistics and fieldwork. J Chronic Dis 1984;37:177-82.
25. Horwitz RI, Feinstein AR. Alternative analytic methods for case-control studies of estrogens and endometrial cancer. N Engl J Med 1978;299:1089-94.
26. Hutchison GB, Rothman KJ. Correcting a bias. N Engl J Med 1978;299:1129-30.
27. Hulka BS, Grimson RC, Greenberg BG, et al. "Alternative" controls in a case-control study of endometrial cancer and exogenous estrogen. Am J Epidemiol 1980;112:376-87.
28. Rothman KJ. Modern epidemiology. Boston: Little, Brown & Company, 1986.
29. Nelson LM, Franklin GM, Hamman RF, et al. Referral bias in multiple sclerosis research. J Clin Epidemiol 1988;41:187-92.
30. Brown LM, Pottern LM, Hoover RN. Prenatal and perinatal risk factors for testicular cancer. Cancer Res 1986;46:4812-16.
31. MacMahon B, Pugh TF. Epidemiology: principles and methods. Boston: Little, Brown & Company, 1970.
32. Flanders WD, Boyle CA, Boring JR. Bias associated with differential hospitalization rates in incident case-control studies. J Clin Epidemiol 1989;42:395-401.
33. Wacholder S, Silverman DT. Re: "Case-control studies using other diseases as controls: problems of excluding exposure-related diseases." (Letter). Am J Epidemiol 1990;132:1017-18.
34. MacMahon B, Yen S, Trichopoulos D, et al. Coffee and cancer of the pancreas. (Letter). N Engl J Med 1981;304:1605-6.
35. Lubin JH, Hartge P. Excluding controls: misapplications in case-control studies. Am J Epidemiol 1984;120:791-3.
36. Jick H, Vessey MP. Case-control studies in the evaluation of drug-induced illness. Am J Epidemiol 1978;107:1-7.
37. Axelson O. The case-referent study: some comments on its structure, merits, and limitations. Scand J Work Environ Health 1985;11:207-13.
38. Linet MS, Brookmeyer R. Use of cancer controls in case-control cancer studies. Am J Epidemiol 1987;125:1-11.
39. Smith AH, Pearce NE, Callas PW. Cancer case-control studies with other cancers as controls. Int J Epidemiol 1988;17:298-306.
40. MacMahon B, Yen S, Trichopoulos D, et al. Coffee and cancer of the pancreas. N Engl J Med 1981;304:630-3.
41. Silverman DT, Hoover RN, Swanson GM, et al. The prevalence of coffee drinking among hospitalized and population-based control groups. JAMA 1983;249:1877-80.
42. Robins J, Pike M. The validity of case-control studies with nonrandom selection of controls. Epidemiology 1990;1:273-84.
43. Siemiatycki J, Colle S, Campbell S, et al. Case-control study of insulin-dependent (type I) diabetes mellitus. Diabetes Care 1989;12:209-16.
44. Siemiatycki J. Friendly control bias. J Clin Epidemiol 1989;42:687-8.
45. Hochberg F, Toniolo P, Cole P. Head trauma and seizures as risk factors of glioblastoma. Neurology 1984;34:1511-14.
46. Kelsey JL, Thompson WD, Evans AS. Methods in observational epidemiology. New York: Oxford University Press, 1986.
47. Thompson WD. Nonrandom yet unbiased. Epidemiology 1990;1:262-5.
48. Shaw GL, Tucker MA, Kase RG, et al. Problems ascertaining friend controls in a case-control study of lung cancer. Am J Epidemiol 1991;133:636.
49. Flanders WD, Austin H. Possibility of selection bias in matched case-control studies using friend controls. Am J Epidemiol 1986;124:150-3.
50. Goldstein AM, Hodge SE, Haile RW. Selection bias in case-control studies using relatives as the controls. Int J Epidemiol 1989;18:985-9.
51. Petrakis NL, King MC. Genetic markers and cancer epidemiology. Cancer 1977;39:1861-6.
52. Gutensohn N, Li F, Johnson R, et al. Hodgkin's disease, tonsillectomy, and family size. N Engl J Med 1975;292:22-5.
53. Maclure M. The case-crossover design: a method for studying transient effects on the risk of acute events. Am J Epidemiol 1991;133:144-53.
54. Prentice RL, Vollmer WM, Kalbfleisch JD. On the use of the case series to identify disease risk factors. Biometrics 1984;40:445-58.
55. Nelson LM, Longstreth WT Jr, Koepsell TD, et al. Proxy respondents in epidemiologic research. Epidemiol Rev 1990;12:71-86.
56. Rogot E, Reid DD. The validity of data from next of kin in studies of mortality among migrants. Int J Epidemiol 1975;4:51-4.
57. Kolonel LN, Hirohata T, Nomura AMY. Adequacy of survey data collected from substitute respondents. Am J Epidemiol 1977;106:476-84.
58. Marshall J, Priore R, Haughey B, et al. Spouse-subject interviews and the reliability of diet studies. Am J Epidemiol 1980;112:675-83.
59. Lerchen ML, Samet JM. An assessment of the validity of questionnaire responses provided by a surviving spouse. Am J Epidemiol 1986;123:481-9.
60. Blot WJ, McLaughlin JK. Practical issues in the design and conduct of case-control studies: use of next-of-kin interviews. In: Blot WJ, Hirayama T, Hoel DG, eds. Statistical methods in cancer epidemiology. Hiroshima: Radiation Effects Research Foundation, 1985:49-62.
61. McLaughlin JK, Dietz MS, Mehl ES, et al. Reliability of surrogate information on cigarette smoking by type of informant. Am J Epidemiol 1987;126:144-6.
62. McLaughlin JK, Mandel JS, Mehl ES, et al. Reliability of next-of-kin and self-respondents for cigarette, coffee, and alcohol consumption. Epidemiology 1990;1:408-12.
63. Samet JM. Surrogate sources of dietary information. In: Willett W, ed. Nutritional epidemiology. New York: Oxford University Press, 1990:133-42.
64. Gordis L. Should dead cases be matched to dead controls? Am J Epidemiol 1982;115:1-5.
65. McLaughlin JK, Blot WJ, Mehl ES, et al. Problems in the use of dead controls in case-control studies. I. General results. Am J Epidemiol 1985;121:131-9.
66. McLaughlin JK, Blot WJ, Mehl ES, et al. Problems in the use of dead controls in case-control studies. II. Effect of excluding certain causes of death. Am J Epidemiol 1985;122:485-94.
67. Walker AM, Velema JP, Robins JM. Analysis of case-control data derived in part from proxy respondents. Am J Epidemiol 1988;127:905-14.
68. Armstrong BG, Whittemore AS, Howe GR. Analysis of case-control data with covariate measurement error: application to diet and colon cancer. Stat Med 1989;8:1151-65.
69. Armstrong BG. The effects of measurement errors on relative risk regressions. Am J Epidemiol 1990;132:1176-84.
70. Greenland S. The effect of misclassification in the presence of covariates. Am J Epidemiol 1980;112:564-9.
71. Miettinen OS, Wang J-D. An alternative to the proportionate mortality ratio. Am J Epidemiol 1981;114:144-8.
72. Ibrahim MA, Spitzer WO. The case-control study: the problem and the prospect. J Chronic Dis 1979;32:139-44.
73. The case-control study. (Editorial). Br Med J 1979;2:884-5.
