Perspectives

Perspective
Comparative Effectiveness: Asking The Right
Questions, Choosing The Right Method
Methodological choice should be driven by policy goals.
by Steven M. Teutsch, Marc L. Berger, and Milton C. Weinstein

ABSTRACT: The Medicare Prescription Drug, Improvement, and Modernization Act (MMA) of 2003 has placed renewed focus on assessing the comparative effectiveness of various therapeutic options. Unfortunately, all of the evidence needed to fully assess these options is rarely available to drug formulary decisionmakers. Comparative randomized trials frequently fail to find differences when there indeed are some, while decision-modeling approaches are more likely to identify differences where there are none. We consider the consequences of these strategies. This paper proposes a framework for using different methods to assess available evidence. We contend that choosing the appropriate method can occur only when there are clear policy goals.

Steven Teutsch (steven_teutsch@merck.com) is executive director, outcomes research and management, at Merck and Company Inc., in West Point, Pennsylvania. Marc Berger is vice president in that department. Milton Weinstein is the Henry J. Kaiser Professor of Health Policy and Management in the Department of Health Policy and Management, Harvard School of Public Health, in Boston, Massachusetts.

January/February 2005
DOI 10.1377/hlthaff.24.1.128 ©2005 Project HOPE–The People-to-People Health Foundation, Inc.

The escalating cost of medical care in general, and prescription drugs in particular, has fostered the demand for mechanisms to assure that money is being spent wisely. The Medicare Prescription Drug, Improvement, and Modernization Act (MMA) of 2003, the advent of the Academy of Managed Care Pharmacy’s Format for Formulary Submissions, and Oregon Medicaid’s evidence-based formulary decision process are manifestations of the increasing demand for good comparative information on effectiveness. We discuss the information that is needed for effective decision making and how decisionmakers’ needs should inform priorities and the choice of scientific method.

Methods For Assessing Evidence
While evidence-based decision making as currently practiced may seem like an approach that should be universally acclaimed, it has important limitations. Perhaps of greatest importance is the relative scarcity of randomized controlled trials (RCTs) to answer critical questions about comparative effectiveness.

Randomized controlled trials. When assessing efficacy, RCTs are considered to be the gold standard. Even though typically at least two pivotal RCTs are performed to obtain Food and Drug Administration (FDA) clearance to market a new drug, these rarely provide all of the information practitioners need; they usually compare the new drug to placebo and frequently use surrogate or intermediate measures of efficacy, such as blood pressure or low-density lipoprotein (LDL) cholesterol rather than outcomes such as cardiovascular mortality. The comparison against placebo is based on regulatory requirements and the desire to minimize the uncertainty surrounding efficacy assessments. Postlaunch, comparative efficacy studies of alternative treatments employing intermediate measures of efficacy are more common, since their completion requires longer time frames and larger sample sizes and since they may address competitive marketplace issues. Larger, longer-term RCTs using true health outcomes (such as mortality) have been increasingly performed during the past decade. The importance of these trials has been dramatically illustrated by their impact on clinical practice and evidence-based guidelines; examples include the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT), the Women’s Health Initiative, and the Heart Protection Study.

Limitations. Even so, groups developing guidelines that demand evidence of the highest quality often find that a sufficient number of studies of important clinical questions are not available. This situation will never disappear, as the full range of benefits and risks associated with therapeutic decisions across the range of potential clinical applications is not known until long after the technologies have been widely adopted. In addition, high-quality RCTs are frequently limited by their focus on carefully selected, adherent populations, which maximizes the opportunity to demonstrate benefits, and they potentially underestimate harms when used by patients and in settings more typically encountered in real-world practice.

Moreover, the majority of studies are designed with intermediate measures as the trial endpoints and do not provide information on the critical health outcomes that really interest decisionmakers. Thus, extrapolation from one population to another, from efficacy studies to effectiveness (that is, results that would be expected in more typical practice settings), or from one technology to another is usually required. This introduces additional uncertainty, which belies the aura of rigor associated with evidence-based decision making. The reality is that it will never be feasible to perform enough RCTs to assess the relative effectiveness of all important treatment strategies for different target populations. Moreover, often RCTs cannot provide other critical information. For example, harms of treatment are frequently assessed by long-term comprehensive surveillance, and information on impact among groups at different risks can be assessed from observational studies.

Value of certainty. A salient characteristic of RCTs is their ability to reduce the uncertainty surrounding estimates of efficacy. Thus, RCTs are most needed where decisionmakers require a high level of certainty and provide the greatest value where the burden of illness (including cost) as well as potential benefits, risks, and costs of interventions are high. In other circumstances, observational studies, systematic reviews, and decision models may provide enough certainty for decision making, and results may be available more rapidly if new studies are needed.1 The level of certainty required for particular efficacy judgments must be determined by decisionmakers and stakeholders. Methodologists can then assess the strategies that can most efficiently provide the requisite information.

The subjectivity factor. Subjectivity is introduced into all decision making, even when it is based on evidence. This reflects in part differences among various stakeholders as to the level of certainty that should be required for particular decisions, differences based on variations in preferences and values. These legitimate differences of opinion can be addressed only by assuring that both deliberations and decision processes are transparent, issues are fully vetted, and conflicts of interest are addressed.2 In doing so, the assumptions underlying projections of risks, benefits, and costs should be clearly identified, so that all stakeholders can provide appropriate input.

HEALTH AFFAIRS ~ Volume 24, Number 1

When best available evidence suffices. Although this is not ideal, important decisions (such as coverage decisions) must be based on the best available information. Delaying decisions until definitive information is available is usually not an option. Indeed, Mark McClellan, administrator of the Centers for Medicare and Medicaid Services (CMS), has said that the government needs to find alternative approaches to obtaining information in a timely fashion, including use of systematic reviews and data from real-world clinical practice.3

Observational studies. Although RCTs are not available for many key effectiveness questions, a wealth of useful information does exist in observational studies and other sources. It can be assembled through rigorous literature synthesis, including meta-analysis. The full spectrum of information can then be incorporated into a formal analytic framework, such as a decision tree or Markov model, which can be used to assess the benefits, harms, and costs of alternative interventions.4 Thus, analysts can extrapolate from one population, time frame, or technology to another. They can take advantage of expert opinion and make reasonable assumptions to link diverse information. Organizing and combining this information allows the net benefit of alternatives to become apparent. The risks associated with this approach arise from assumptions that are not well founded and from combining information that turns out to be inaccurate or inappropriate.

Transparency and conflicts of interest. The issues of transparency and conflicts of interest may be more acute for decision models because of the many obvious and less-than-obvious choices that are made. For example, extrapolation from one therapeutic agent to another because of similar effects on surrogate markers (for example, LDL cholesterol, based on assumptions of analogous mechanisms of action) may ignore effects on other important outcomes. The harms associated with cerivastatin (Baycol) led to its recall in 2001, even though it was in the same class (statins) as other highly effective drugs with proven outcomes.5

Comparison with RCTs. When RCTs are available, the specificity of an evidence-based standard is very high, and false-positive conclusions are uncommon. However, this highly specific standard carries a price of reduced sensitivity. Adhering to a policy that technology will be adopted only when there is unambiguous evidence that it has an appropriate benefit, risk, and cost profile would delay the adoption of new technologies by many years and, while it might stimulate some additional studies, would likely discourage continued medical innovation and public health benefit. Use of systematic evidence reviews can fill the information gap and support more rapid adoption and dissemination of medical innovations, but with additional risk.

A strict requirement for good-quality RCTs minimizes the risk of a “Type I error” or “error of commission.” If multiple high-quality RCTs find that “drug A” is safer and more effective than “drug B,” one can have high confidence that this is so. Because detailed data are often not available on drug A and drug B for specific subpopulations, however, advantages of drug B for those patients may not be identified by RCT-level evidence. Thus the probability of a “Type II error,” or “error of omission,” may be large. Real differences among alternatives may not be found because there are insufficient studies available. Conversely, by admitting more varied information, observational studies and decision modeling can often elucidate real differences even though head-to-head RCTs are not available or have limitations. This information comes at a price, however. While the risk of a Type II error may be small, there is a very real possibility of making Type I errors, that is, finding differences when none exists.

Toward A Coherent Policy
Approaching a coherent policy regarding how to address the lack of critical evidence to assess comparative effectiveness requires a more nuanced approach and should be tailored to the situation. For someone with a serious health condition such as cancer, for which long-term comparative treatment trials are not available, patients are generally willing to choose treatment based on imperfect information. Conversely, for a preventive service in the
asymptomatic population, such as screening for the breast cancer (BRCA) gene followed by treatment with tamoxifen, one would expect a high degree of certainty that benefits outweigh harms before subjecting the population to the service. In such a case, a strict requirement of good-quality RCT evidence may make sense.

Whether more evidence should be obtained at any stage ought to depend on whether the value of the information obtained, in terms of getting people on or off treatment, outweighs the cost and time of conducting the necessary studies. It will be important to recognize the trade-offs among different methodological approaches to informed decision making, so that choices made will be as closely aligned with intended goals as possible. This will require an open dialogue with all stakeholders, as others have described.6

When decisions are made on imperfect information, processes need to be in place to reevaluate those decisions as new information becomes available, and decisionmakers need to consider the costs and benefits of changing those policies once adopted. The recent decision by the CMS on narrow coverage of positron emission tomography (PET) scans for Alzheimer’s disease represents a provisional decision, which is based on available data and which will be revisited and revised after a clinical trial has been completed.

Comparative effectiveness of drugs. Over the past few years, interest in the comparative effectiveness of drugs has increased. The Oregon Medicaid program has contracted with the Oregon Health and Science University (OHSU) Evidence-based Practice Center (EPC) to conduct evidence reviews for use in Medicaid formulary decisions. The EPC uses analytic frameworks and evidence-based methods adapted from the U.S. Preventive Services Task Force (USPSTF).7 Additional states will be using these reviews for their Medicaid formularies. MMA directs the CMS to work with the Agency for Healthcare Research and Quality (AHRQ) on conducting studies of comparative effectiveness.

Because outcomes trials often take many years to complete and existing resources allow for only a limited number of head-to-head comparative outcomes studies, there has been increasing interest in using observational data from cohort studies, claims data, and a hoped-for electronic medical record to complement RCTs. This call is tempered by the understanding that even well-done observational studies can lead to erroneous conclusions. The experience with combination hormone therapy is a recent example, in which observational studies suggested substantial cardiovascular benefits. Only with the recent Women’s Health Initiative, a large randomized trial, was the overall net increase in harms apparent.

One policy goal should be to ensure that adequate data are developed to assess the true benefit-risk-cost profiles of new therapies. Given the growing number of examples where “true outcome” RCTs have yielded results unfavorable to the manufacturers that were the study sponsors, it is an open question whether such studies will routinely be performed in the future. To encourage this, incentives could be provided to manufacturers, such as giving formulary preference to drugs with endpoint outcome data or extensive drug experience, or both. There could be more stringent requirements for information regarding effectiveness or safety, or both, from new entrants into an established class. Alternatively, these studies will need to be funded through the National Institutes of Health (NIH), AHRQ, or other governmental sources.

Criteria for government priority list. The government will need a systematic approach to prioritizing what studies need to be done. MMA calls for AHRQ to develop a priority listing of where additional research is needed; AHRQ has conducted hearings and is considering various criteria.

We propose the following criteria: (1) What is the value of gaining additional information? (2) What do we really need to know to make a good policy decision regarding the use of one technology or another in the treatment of a particular health condition? Aligned with this is the corollary question: (3) How certain do we need to be about what we know?


How these questions are answered can permit researchers to decide upon the appropriate methods to assess comparative effectiveness.

Choosing the “right” method. It is important that methodological choices be driven by policy goals. Strict reliance on good-quality RCTs may be most appropriate where there are treatment options with excellent benefit-risk-cost profiles or where there exist high levels of confidence in the ability to identify real differences among therapies. These differences are important. For example, for potentially hazardous interventions among patients at low risk of poor outcomes, demanding long-term outcomes data is an incentive that rewards innovators who conduct the required RCTs to demonstrate the true benefit-risk-cost profile of a new treatment option.

Alternatively, comparative effectiveness studies that use observational data and modeling may be preferable when addressing therapies for which there are no available treatments with acceptable benefit-risk-cost profiles or where less certainty in efficacy estimates is required. This may commonly apply to the development of novel cancer chemotherapeutic agents for patients for whom no therapeutic options exist.

When a decision must be made regardless of the quality of data available or when it can be made only when high-quality data are available, “choosing the right method” to develop the evidence (and the appropriate level of certainty) is relatively simple. Of course, most technologies and interventions lie between these extremes, and in these instances the importance of stakeholders’ preferences and values in making decisions increases.

We propose that a taxonomy of types of decisions should be developed. The taxonomy would define what level of evidentiary certainty and generalizability should be required for a category of decision and how that level was determined. It would make explicit the influence of stakeholder values and preferences that established the parameters of each category. Such a taxonomy across the spectrum of decisions would promote greater consistency than would probably emerge from individual decisions over time. This process would increase the likelihood of good decision making, raise public confidence in the decision process, and provide guidance for the use or conduct of the appropriate types of studies.

There is no single “right” answer on which approach along the continuum from observational data to strict evidence-based decisions is correct, but rigid adherence to one approach or another will clearly lead to suboptimal decision making. The proper choice of method requires that stakeholders clearly assess the purposes, harms, and benefits of alternative approaches and establish criteria against which different technologies should be evaluated.

NOTES
1. Observational studies, unlike RCTs, do not involve any intervention to study participants; they measure health outcomes as they naturally occur in real-world populations. Systematic evidence reviews are structured analyses of available evidence from a comprehensive literature search with a detailed evaluation of the quality of studies found and a summary of their findings. Decision models are formal analytic frameworks, which may incorporate costs, the value of outcomes, and the probabilities for particular benefits and harms to assess the overall benefits and costs of treatment alternatives.
2. N. Daniels and J. Sabin, “Limits to Health Care: Fair Procedures, Democratic Deliberation, and the Legitimacy Problem for Insurers,” Philosophy and Public Affairs 26, no. 4 (1997): 303–350.
3. J.D. Kleinke, “Think Globally, Protect Locally: A Conversation with Mark McClellan,” Health Affairs 23, no. 3 (2004): 177–185.
4. S.J. Goldie and P.S. Corso, “Decision Analysis,” in Prevention Effectiveness: A Guide to Decision Analysis and Economic Evaluation, 2d ed., ed. A.C. Haddix, S.M. Teutsch, and P.S. Corso (New York: Oxford University Press, 2003), 103–126.
5. See Food and Drug Administration, “Baycol Information,” August 2001, www.fda.gov/cder/drug/infopage/baycol/default.htm (23 November 2004).
6. Daniels and Sabin, “Limits to Health Care.”
7. Methods, reviews, recommendations, and reports of the USPSTF are available at Agency for Healthcare Research and Quality, “Preventive Services,” www.preventiveservices.ahrq.gov (21 October 2004).
