
Choice of controls in a case-control study (AS03)
EPM304 Advanced Statistical Methods in Epidemiology

Course: PG Diploma/ MSc Epidemiology

This document contains a copy of the study material located within the computer
assisted learning (CAL) session.
If you have any questions regarding this document or your course, please contact
DLsupport via DLsupport@lshtm.ac.uk.
Important note: this document does not replace the CAL material found on your
module CDROM. When studying this session, please ensure you work through the
CDROM material first. This document can then be used for revision purposes to
refer back to specific sessions.
These study materials have been prepared by the London School of Hygiene & Tropical Medicine as part of
the PG Diploma/MSc Epidemiology distance learning course. This material is not licensed either for resale
or further copying.
London School of Hygiene & Tropical Medicine September 2013 v2.0

Section 1: Choice of controls in a case-control study


Aim

To learn about the different sampling schemes for a case-control study and
know what the exposure odds ratio estimates.

Objectives
By the end of this session students will be able to:

list the 3 different approaches to sampling controls


describe how the choice of approach determines which measure of effect
(risk, rate, or odds ratio) is estimated by a case-control study
choose an appropriate sampling scheme for a given situation
state the basic principles which should always underlie the choice of control
group, to avoid bias (whichever measure of effect we choose to estimate)
list the advantages and disadvantages for the chosen control group in a
specific case-control study

This session should take you between 1.5 and 2.5 hours to complete.

Section 2: Planning your study


In this session you will learn about the appropriate choice of controls for a
case-control study. The key issue is that the way in which your controls are selected
affects what is estimated by the exposure odds ratio, which is calculated from a
case-control study.
To work through the material you should be familiar with the design of case-control
studies, measures of incidence, prevalence and measures of effect. You should also
understand that a case-control study can only measure the exposure odds ratio.
Students who completed the core units and EP202 can refer to the sessions listed
below. Occasional students should refer to their preferred text for these subjects.
Case-control studies: FE16
Measures of effect: SM01
Case-control studies: SM04

2.1: Planning your study


This session is different from most of the other sessions in this unit because we will
not present methods of data analysis but will discuss design issues. Specifically, we
will discuss the design of a case-control study and the relevance of this to the
interpretation of the exposure odds ratio.
Four different situations that characterise both the disease and the risk factor are
used for illustration. Appropriate sampling schemes are given for each situation.
During this session, you will refer to Mahmood's (1989) paper on infant feeding and
risk of severe diarrhoea, which is in your reader.

Section 3: Introduction
The tabs below discuss the origin and main advantages of case-control studies.
Interaction: Tabs: 1:
Case-control studies originated within research on the causes of rare diseases.
Can you think why this design was (and still is) particularly efficient for this area of
research?
Interaction: Button: clouds picture (pop up box appears and card on RHS
appears):
With a rare disease only a small proportion of the population are cases, as shown
opposite.
With a case-control study you can usually focus on all the cases, and then study only
a small fraction of the non-case population.

The controls are a sample of the non-cases, selected from the population that
produced the cases.
Interaction: Tabs: 2:
Case-control studies are also useful for diseases with long latent or incubation
periods. This is because they are retrospective in nature, so there is no need to wait
for a disease to develop.
Interaction: Tabs: 3:
Case-control studies have also been used to assess the effects of health
interventions where, for example, randomised controlled trials (RCTs) are not
possible.
For what reasons might it not be possible to use RCTs?
Interaction: Button: clouds picture (pop up box appears ):
It is usually because of ethical or logistic reasons that it is not possible to use
randomised controlled trials.
Examples
Some examples of where case-control studies have been used are given below.
1. Investigation of cervical cancer and women within a screening programme.
2. The assessment of risk factors for Legionnaires' disease.
3. The assessment of the efficacy of BCG vaccination against leprosy.

3.1: Introduction
Case-control studies, for many understandable reasons, originated from research
into rare diseases. However, more recently case-control studies have been applied
within the research of 'common' diseases.
Can you think why using a case-control design within common disease research
could be an advantage over cohort or intervention studies?
Interaction: Button: clouds picture (pop up box appears and card appears on
RHS):
A case-control study can avoid many of the ethical issues that arise with prospective
cohort and intervention studies. This is because the disease status of study
individuals is already determined. Another advantage is that a case-control study
may also be simpler and quicker to conduct, if it can, for example, be conducted
entirely within a health facility.
Example
Case-control studies have been applied to common diseases, such as the study of
childhood diseases like diarrhoea and acute respiratory illness.

3.2: Introduction
In case-control studies you must decide at the design stage how to recruit controls.
This is one of the most important decisions for an investigator designing a
case-control study.
Why do you need to choose the 'correct' controls?
Interaction: Button: clouds picture (box appears and card appears on RHS):
Controls are chosen in order to give an unbiased estimate of the appropriate
measure of effect, i.e. the measure defined in the study objective.
The way in which controls are recruited determines both what is estimated (risk
ratio, rate ratio, odds ratio) and the validity of the estimate.
Before the 1950s, the analysis of case-control studies was restricted to a simple
comparison of the proportion of cases exposed and the proportion of controls
exposed. No attempt was made to estimate the size of the exposure effect.
Click below for an example of this.
Interaction: Button: Example (pop up box appears):
In a case-control study to examine the association between smoking and lung
cancer, initially the analysis would address the question:
Do more lung cancer patients have a history of smoking than controls?

The analysis would not attempt to answer the question of the size of the exposure
effect:
How much does smoking increase the risk of lung cancer?

3.3: Introduction
The first estimate for the magnitude of the exposure effect was suggested by
Cornfield in 1951.
He pointed out for the first time that it was possible to estimate the relative risk
associated with exposure, using the ratio of the odds of exposure of cases to the
odds of exposure of controls. It was suggested that this estimate was only possible if
the disease was rare.

In referring to relative risk, no distinction was made between the risk ratio,
rate ratio and odds ratio of disease.
The rare disease assumption remained until 1976 when Miettinen argued that it only
applied to case-control studies:
"...in which the subjects are ascertained after the end of the entire risk-period of
interest."
It did not apply to the case-control studies in chronic disease epidemiology in which
incident cases and controls are recruited concurrently. Miettinen showed that in this
case it was possible to obtain an exposure odds ratio which estimated the rate ratio,
and that no rarity assumption was needed for this.
When these ideas were explored in more detail (by Smith et al 1984, International
Journal of Epidemiology 13: 87-93; and by Greenland and Thomas 1982; American
Journal of Epidemiology 116: 547-553), it was found that it was possible to design a
case-control study for which the exposure odds ratio gave an estimate of the risk
ratio.

3.4: Introduction
On the following pages we will review the concepts of risk ratio, rate ratio and odds
ratio of disease. If the choice of the sampling scheme for controls is suitable, each of
these measures can be estimated directly in a case-control study by the exposure
odds ratio. This applies to both 'rare' and 'common' diseases.

Section 4: The different ratio measures of incidence


Let's first review the measurement of relative incidence in terms of a population
cohort and look at the implications for case-control designs.

The diagram opposite illustrates a typical closed cohort. There is a gradual
accumulation of cases over a period of time (0 to T) among persons initially
disease-free.
Click 'show' opposite to see this.
For simplicity, it is assumed that the population is closed over the period. What do
we mean by a closed population?
Interaction: Button: clouds picture (pop up box appears):
In a closed population (over time) we assume no migration in or out of the cohort
and no change in exposure status over time.

4.1: The different ratio measures of incidence


The diagrams opposite illustrate cohorts for:
a) the exposed individuals
b) the unexposed individuals
If we consider the parameters in both diagrams, we can obtain a measure for the
size of the exposure effect. The size of the effect of exposure to the risk factor can
be measured by the ratio of the incidence among those exposed to that among those
who are not exposed.
There are two conceptually different ways of defining incidence; it may be measured
either as a risk or as a rate. We can also measure the odds. Go on to the next page
to review these measures.

4.2: The different ratio measures of incidence


The risk (or cumulative incidence) is the probability that a person initially free from
(and at risk of) the disease develops it at some time during the period of
observation.
How do we calculate the risk? Drag the appropriate terms from the diagram opposite
to complete the formula below.
Risk =

Interaction: Drag and Drop: Risk = numerator:


Correct Response D (text appears in bottom LHS):
That's right, the numerator of the risk is the number of cases at the end of
follow-up.
Incorrect Response N (text appears in bottom LHS):
No, although the risk does depend on the number of individuals initially at
risk, that is not correct. Please try again.
Incorrect Response Y (text appears in bottom LHS):

Sorry, that's not right. The person-time at risk is not used in the risk
formula. Please try again.
Incorrect Response N-D (text appears in bottom LHS):
Sorry, that's not right. This term is not used in the risk formula. Please try
again.
Interaction: Drag and Drop: Risk = denominator:
Correct Response N:
That's right, the denominator of the risk is the number of individuals initially
at risk.
Incorrect Response Y:
Sorry, that's not right. The person-time at risk is not used in the risk
formula. Please try again.
Incorrect Response D:
No, although the risk does depend on the number of cases, that is not
correct. Please try again.
Incorrect Response N-D:
Sorry, that's not right. This term is not used in the risk formula. Please try
again.
When both the numerator and denominator have been placed correctly the following
text appears in a pop up box:
Well done
That's right, the risk is given by the number of cases at the end of follow-up divided
by the number of individuals initially at risk.

4.3: The different ratio measures of incidence


The incidence rate is the rate of contracting the disease among those still at risk;
when a person contracts the disease they are no longer at risk.
How do we calculate the rate? Drag the appropriate terms from the diagram as
before.
Rate =
Interaction: Button: Note (pop up box appears):
Note
The incidence rate is often multiplied by 1000 and expressed as 'per 1000
person-years at risk'.
Interaction: Drag and Drop: Rate = numerator:
Correct Response D (text appears in bottom LHS):
Yes, the numerator of the rate formula is the number of new cases.
Incorrect Response N (text appears in bottom LHS):
Sorry, this term is not involved in the rate calculation. Please try again.
Incorrect Response Y:
The total person-time at risk (i.e. the shaded area under the graph) is
involved in the calculation of the rate, but it is not the numerator. Please try
again.
Incorrect Response N-D:

Sorry, this term is not involved in the rate calculation. Please try again.
Interaction: Drag and Drop: Rate = denominator:
Correct Response: Y:
That's right, the total person-time at risk (i.e. the shaded area under the
graph) is the denominator of the rate formula.
Incorrect Response: N:
Sorry, this term is not involved in the rate calculation. Please try again.
Incorrect Response: D:
Although rate does depend on the number of new cases, this is not the
correct position of this term. Please try again.
Incorrect Response: N-D:
Sorry, this term is not involved in the rate calculation. Please try again.
When both the numerator and denominator have been placed correctly the following
text appears in a pop up box:
That's right
The number of new cases is related not to the number initially at risk but to the sum
of the lengths of time that each person remained at risk during the period of
observation. This sum is called the total person-time at risk (often expressed as
person-years). The shaded areas in the figures represent the number of person-years
at risk for exposed and unexposed populations. So the rate is calculated as the
number of new cases divided by the person-time at risk. The ratio of the incidence
rate in the exposed group (D1/Y1) to that in the unexposed group (D0/Y0) is called
the rate ratio.

4.4: The different ratio measures of incidence


A third measure of incidence is the odds of disease.
How do we calculate the odds of disease? Drag the appropriate terms from the
diagram as before.
Odds =
Interaction: Drag and Drop: Odds = numerator:
Correct Response D (text appears on bottom LHS):
Yes, the number of cases is the numerator of the odds formula.
Incorrect Response N (text appears on bottom LHS):
Sorry, this term is not involved in the odds calculation.
Incorrect Response Y (text appears on bottom LHS):
Sorry, the total person-time at risk is not involved in the odds calculation.
Incorrect Response N-D (text appears on bottom LHS):
Although the odds does depend on the number of non-cases, that is not
correct. Please try again.
Interaction: Drag and Drop: Odds = denominator:
Correct Response N-D (text appears on bottom LHS):
That's right, the denominator of the odds formula is the number of non-cases
at the end of follow-up. The odds of disease is defined as the risk
divided by (1-risk), and is estimated as the total number of cases (D)
divided by the number of people who remain without disease at the end of
the study period (N-D).
Incorrect Response N (text appears on bottom LHS):
Sorry, this term is not involved in the odds calculation.
Incorrect Response Y (text appears on bottom LHS):
Sorry, the total person-time at risk is not involved in the odds calculation.
Incorrect Response D (text appears on bottom LHS):
Although the odds does depend on the number of cases, that is not correct.
Please try again.
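The three single-group measures reviewed above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the figures are those quoted later in this session for the exposed group in Situation A (D1 = 15, N1 = 2000, Y1 = 3985).

```python
def incidence_measures(d, n, y):
    """Return (risk, rate, odds) for one group.

    d: number of new cases during follow-up (D)
    n: number of individuals initially at risk (N)
    y: total person-time at risk (Y)
    """
    risk = d / n          # cases / number initially at risk
    rate = d / y          # cases / person-time at risk
    odds = d / (n - d)    # cases / non-cases at the end of follow-up
    return risk, rate, odds


# Exposed-group figures quoted for Situation A later in this session.
risk, rate, odds = incidence_measures(d=15, n=2000, y=3985)
print(f"risk = {risk:.4f}, rate = {rate:.4f} per unit person-time, odds = {odds:.4f}")

# The odds can equivalently be written as risk / (1 - risk).
assert abs(odds - risk / (1 - risk)) < 1e-12
```

The final assertion checks the definition given in the feedback above: D / (N - D) is the same quantity as risk / (1 - risk).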

4.5: The different ratio measures of incidence


Consider again the diagrams opposite. How would you calculate the risk for:
The exposed group?
Interaction: Button: clouds picture:
The risk in the exposed group is D1 / N1.
The unexposed group?
Interaction: Button: clouds picture:
The risk in the unexposed group is D0 / N0.

4.6: The different ratio measures of incidence


Ratios
The ratio of the incidence in the exposed population to that in the unexposed
population is a measure of the size of the effect of exposure to the risk factor.

The ratio of risk in the exposed to the risk in the unexposed is called the risk
ratio.
The ratio of the incidence rate in the exposed group to the incidence rate in the
unexposed group is called the rate ratio.
The ratio of the odds in the exposed group to the odds in the unexposed group is
called the odds ratio.
These relative measures are summarised in the table opposite.
Measure       Definition
Risk ratio    (D1 / N1) / (D0 / N0)
Rate ratio    (D1 / Y1) / (D0 / Y0)
Odds ratio    [D1 / (N1 - D1)] / [D0 / (N0 - D0)]

4.7: The different ratio measures of incidence


Rare vs. common diseases
For rare diseases, the number at risk will approximately equal the total population
at all times.
The number of persons initially at risk, the average number at risk during the study
and the number still at risk at the end will all approximately equal the total
population. This means that all 3 measures of relative incidence (risk ratio, rate
ratio, and odds ratio) will be virtually identical.
For a common disease, however, the three measures of incidence are different.
Each measure will give a different estimate of relative incidence if used to assess an
association between an exposure and a disease. Which of these measures is
estimated by a case-control study depends upon the way in which controls are
sampled.
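The contrast between rare and common diseases can be checked numerically. The sketch below assumes a closed cohort in which each group has a constant disease rate, so that the risk by time T is 1 - exp(-rate x T); this model and the rates used are assumptions chosen for illustration, not values from the session.

```python
import math


def ratio_measures(rate_exposed, rate_unexposed, years):
    """Risk, rate and odds ratios in a closed cohort with constant rates."""
    # Risk of disease by the end of follow-up under a constant rate.
    risk1 = 1 - math.exp(-rate_exposed * years)
    risk0 = 1 - math.exp(-rate_unexposed * years)
    risk_ratio = risk1 / risk0
    rate_ratio = rate_exposed / rate_unexposed
    odds_ratio = (risk1 / (1 - risk1)) / (risk0 / (1 - risk0))
    return risk_ratio, rate_ratio, odds_ratio


# Hypothetical rates per person-year; the rate ratio is 2 in both scenarios.
for label, r1, r0 in [("rare", 0.002, 0.001), ("common", 0.4, 0.2)]:
    rr, ir, orr = ratio_measures(r1, r0, years=1)
    print(f"{label:>6}: risk ratio {rr:.2f}, rate ratio {ir:.2f}, odds ratio {orr:.2f}")
```

Under these assumptions the three ratios agree to two decimal places in the rare scenario, while in the common scenario the risk ratio falls below the rate ratio and the odds ratio rises above it.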

4.8: The different ratio measures of incidence


Although a case-control study does not directly estimate incidence among the
exposed and non-exposed populations, it does give a measure of relative incidence
by comparing the odds of exposure among the cases with the odds of exposure
among the controls, through the exposure odds ratio.
Results for an unmatched case-control study can be presented in the form of the
table opposite.

Click 'show' below to see how the exposure odds ratio is calculated from this table.
Interaction: Button: Show (text appears on card on RHS):
Exposure OR = exposure odds (cases) / exposure odds (controls)
            = (D1 / D0) / (H1 / H0)
            = (D1 x H0) / (D0 x H1)

This exposure odds ratio estimates one of the 3 measures of relative incidence;
which one depends on the way in which controls are sampled. We usually refer to it
simply as the exposure odds ratio.

            Exposed   Unexposed   Total
Cases       D1        D0          D
Controls    H1        H0          H
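As a minimal sketch of the exposure odds ratio calculation, the function below takes the four cells of the unmatched case-control table (D1, D0, H1, H0) and returns the ratio of the exposure odds in cases to the exposure odds in controls. The function name and the counts are hypothetical and used only for illustration.

```python
def exposure_odds_ratio(d1, d0, h1, h0):
    """Exposure odds ratio from an unmatched case-control 2x2 table.

    d1, d0: exposed and unexposed cases
    h1, h0: exposed and unexposed controls
    """
    odds_cases = d1 / d0        # exposure odds among cases
    odds_controls = h1 / h0     # exposure odds among controls
    return odds_cases / odds_controls


# Hypothetical counts, for illustration only.
d1, d0, h1, h0 = 40, 60, 25, 75
or_ratio_form = exposure_odds_ratio(d1, d0, h1, h0)
or_cross_product = (d1 * h0) / (d0 * h1)
print(f"exposure OR = {or_ratio_form:.2f}")

# The ratio-of-odds form and the cross-product form give the same value.
assert abs(or_ratio_form - or_cross_product) < 1e-12
```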

4.9: The different ratio measures of incidence


The formulae for all three measures of relative incidence can be written in a form
where the numerator is the ratio of exposed to non-exposed cases, D1 / D0, i.e. the
odds of exposure for cases.
Click below to highlight this in the table opposite.
Interaction: Button: Show (highlights text on table on RHS):
Measures of relative incidence
Measure       Definition                             Alternative formulation
Risk ratio    (D1 / N1) / (D0 / N0)                  (D1 / D0) / (N1 / N0)
Rate ratio    (D1 / Y1) / (D0 / Y0)                  (D1 / D0) / (Y1 / Y0)
Odds ratio    [D1 / (N1 - D1)] / [D0 / (N0 - D0)]    (D1 / D0) / [(N1 - D1) / (N0 - D0)]

This is useful because D1 / D0 is easily estimated from the cases group in a study.
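The rearrangement behind the alternative formulation is a single algebraic step: a ratio of two fractions equals the ratio of their numerators divided by the ratio of their denominators. It is written out here in LaTeX using only the session's own symbols.

```latex
\begin{align*}
\text{Risk ratio} &= \frac{D_1 / N_1}{D_0 / N_0}
                   = \frac{D_1 / D_0}{N_1 / N_0} \\[4pt]
\text{Rate ratio} &= \frac{D_1 / Y_1}{D_0 / Y_0}
                   = \frac{D_1 / D_0}{Y_1 / Y_0} \\[4pt]
\text{Odds ratio} &= \frac{D_1 / (N_1 - D_1)}{D_0 / (N_0 - D_0)}
                   = \frac{D_1 / D_0}{(N_1 - D_1) / (N_0 - D_0)}
\end{align*}
```

In each line the numerator on the right is D1 / D0, the exposure odds among cases, so the three measures differ only in their denominators.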

4.10: The different ratio measures of incidence


Exercise
The denominators of the three measures vary. Can you match up the boxes below to
show what each denominator represents?

(N1 - D1)/(N0 - D0)
N1 / N0
Y1 / Y0

The ratio of exposed to non-exposed for people who are disease-free at the end of the study.
The ratio of exposed to non-exposed for person-time at risk during the study.
The ratio of exposed to non-exposed for people at risk at the start of the study.

Interaction: Hotspot: (N1 - D1)/(N0 - D0):
Correct Response: The ratio of exposed to non-exposed for people who are
disease-free at the end of the study. (text appears on bottom RHS):
That's right, (N - D) is the number of people who are still disease-free at
the end of the study period.
Incorrect Response: The ratio of exposed to non-exposed for person-time at risk
during the study. (text appears on bottom RHS):
Sorry, (N - D) does not represent person-time. Please try again.
Incorrect Response: The ratio of exposed to non-exposed for people at risk at the
start of the study. (text appears on bottom RHS):
Sorry, (N - D) is not the number of people at risk at the start of the study.
Please try again.
Interaction: Hotspot: N1 / N0:
Correct Response The ratio of exposed to non-exposed for people at risk at the start
of the study. (text appears on bottom RHS):

That's right, N is the number of people who are at risk at the start of the
study period.
Incorrect Response: The ratio of exposed to non-exposed for people who are
disease-free at the end of the study. (text appears on bottom RHS):
Sorry, N does not represent the number of people still disease-free at the
end of the study period. Please try again.
Incorrect Response: The ratio of exposed to non-exposed for person-time at risk
during the study. (text appears on bottom RHS):
Sorry, N does not represent person-time. Please try again.
Interaction: Hotspot: Y1 / Y0:
Correct Response: The ratio of exposed to non-exposed for person-time at risk
during the study. (text appears on bottom RHS):
That's right, Y is the total person-time over the duration of the study period.
Incorrect Response: The ratio of exposed to non-exposed for people who are
disease-free at the end of the study. (text appears on bottom RHS):
Sorry, Y does not represent the number of disease-free people at the end of
the study. Please try again.
Incorrect Response: The ratio of exposed to non-exposed for people at risk at the
start of the study. (text appears on bottom RHS):
Sorry, Y does not represent the number of people at risk at the start of the
study. Please try again.

Section 5: The different sampling schemes


In a case-control study we look at the odds of exposure in the cases, and the odds of
exposure in the controls. This gives the exposure odds ratio.
Depending on the design of the case-control study, the odds of exposure in controls
estimates one of:
(i) Persons at risk at the start of the study (N1/N0)
(ii) Person-years at risk for the duration of the study (Y1/Y0)
(iii) Persons still disease-free at the end of the study ((N1-D1)/(N0-D0))
So, depending on the design of the case-control study, the exposure odds ratio is
an estimate of:
1. the disease odds ratio, or
2. the risk ratio, or
3. the rate ratio.

For any case-control study we take two samples from a population: a sample (or
maybe all) of the cases and a sample of controls. Therefore, all case-control studies
can be thought of as being 'nested' within the total population of interest. Click below
to review the illustration of this that you saw earlier.
Interaction: Button: View (card appears on RHS):

The controls are a sample of the non-cases, selected from the population that
produced the cases.
Conceptually, all case-control studies are 'nested' within a cohort, the cohort being
the population from which the cases and controls were drawn.
The sampling schemes outlined on the following pages illustrate this idea.

5.1: The different sampling schemes


1. Exclusive sampling
This is where controls are sampled from the population still at risk at the end of the
study period.
In this case the odds of exposure among controls will provide an estimate of
(N1 - D1)/(N0 - D0).
What does the exposure odds ratio estimate in this sampling scheme?
Interaction: Button: clouds picture (text box appears and graph on RHS is altered):
With this sampling scheme, the exposure odds ratio we calculate from the
case-control study will provide an estimate of the disease odds ratio. This is because
(N1 - D1)/(N0 - D0) is the denominator of the disease odds ratio in the alternative
formulation you saw earlier.

5.2: The different sampling schemes


2. Inclusive sampling
Controls may be chosen from all individuals in the population, i.e. those at risk at
the start of the study. This is regardless of whether or when they go on to develop
disease, hence the name 'inclusive'.
Thus, the odds of exposure in this control group will provide an estimate of the odds
of exposure in the population as a whole, (N1 / N0).

What does the exposure odds ratio estimate in this sampling scheme?
Interaction: Button: clouds picture (text appears and graph on RHS is altered):
With an inclusive sampling scheme the exposure odds ratio gives an estimate of the
risk ratio. This is because
N1 / N0 is the denominator of the risk ratio in the alternative formulation you saw
earlier.

What is a major advantage of an inclusive sampling scheme?


Interaction: Button: clouds picture:
Advantage

A major advantage of this sampling scheme is that it is not necessary to obtain
disease histories of selected controls since it is not necessary to exclude past cases.

5.3: The different sampling schemes


3. Concurrent sampling
In concurrent sampling, controls are selected at the same time that the cases occur.
Selection is made from those still at risk when each new case is diagnosed.
A person originally selected as a control can, therefore, become a case at a later
date. A case can only become a control in situations where the disease is of short
duration, followed by recovery without the induction of immunity.
Individuals selected as controls who later become cases are included in both groups.
Concurrent sampling

5.4: The different sampling schemes


3. Concurrent sampling (continued)
In this design, the odds of exposure in
the control group provides an estimate of
Y1 / Y0 (the ratio of person time at risk).
So, what do you think the exposure odds ratio estimates within a concurrent
sampling scheme?
Interaction: Button: clouds picture (text appears on bottom LHS, text box appears
and card on RHS changes):

With a concurrent sampling scheme the exposure odds ratio gives an estimate of a
rate ratio, because (Y1 / Y0) is the denominator of the rate ratio in the alternative
formulation you saw earlier.
The analysis needs to be matched on time of selection, and in our estimate of the
rate ratio we are assuming that it is constant over the study period. If an unmatched
analysis were carried out then the concurrent sampling scheme would give you an
unbiased estimate of the rate ratio only if:
(i) the proportion of the at risk population in the exposed and unexposed groups is
constant over time, and
(ii) the rate ratio is constant over time.
Condition (i) will not generally be true in a closed population if
the rate ratio is not equal to 1, but may hold if the population is open (i.e. new
individuals enter the population during the course of the study). If the disease is rare
this will not matter in practice.
Usually case-control studies with concurrent sampling are analysed using a matched
analysis.
Note: Excluding controls who later become cases will turn a concurrent design into an
exclusive one, leading to an estimate of the disease odds ratio rather than the rate
ratio.
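Condition (i) for an unmatched analysis can be illustrated with a short calculation. The sketch below assumes a closed cohort with constant disease rates in each group (an assumption made here for illustration, with hypothetical numbers) and tracks the ratio of exposed to unexposed people still at risk over time.

```python
import math


def at_risk_exposure_odds(n1, n0, rate1, rate0, t):
    """Ratio of exposed to unexposed people still at risk at time t.

    Assumes a closed cohort with constant disease rates rate1 and rate0.
    """
    return (n1 * math.exp(-rate1 * t)) / (n0 * math.exp(-rate0 * t))


# Hypothetical closed cohort: 1000 exposed, 1000 unexposed, rate ratio 2.
for t in (0, 1, 2, 3):
    odds = at_risk_exposure_odds(1000, 1000, rate1=0.2, rate0=0.1, t=t)
    print(f"year {t}: exposed/unexposed still at risk = {odds:.3f}")
```

With a rate ratio above 1 the exposed group leaves the at-risk population faster, so the ratio falls over time. Setting the two rates equal keeps it constant, which is the case where condition (i) holds in a closed population.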

5.5: The different sampling schemes


Sampling of controls
In summary:
1. Exclusive sampling samples controls at the end of the time period, and the
exposure odds ratio gives an estimate of the disease odds ratio.

2. Inclusive sampling samples controls at the beginning of the time period and the
exposure odds ratio gives an estimate of the risk ratio.
3. Concurrent sampling samples controls over time and, hence, the exposure odds
ratio gives an estimate of the rate ratio.
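The summary above can be checked with expected values rather than a real sample. The sketch below assumes a hypothetical closed cohort with constant disease rates (so that person-time Y equals D / rate) and, following the session, takes the control exposure odds under each scheme to be (N1 - D1)/(N0 - D0), N1/N0 or Y1/Y0. All numbers and names are illustrative assumptions.

```python
import math


def cohort_totals(n, rate, years):
    """Expected cases (D) and person-time (Y) in a closed cohort of size n."""
    d = n * (1 - math.exp(-rate * years))
    y = d / rate          # person-time at risk under a constant rate
    return d, y


# Hypothetical closed cohort followed for 3 years.
N1, N0 = 1000, 1000
D1, Y1 = cohort_totals(N1, rate=0.2, years=3)
D0, Y0 = cohort_totals(N0, rate=0.1, years=3)

case_odds = D1 / D0   # exposure odds among cases

# Expected exposure odds among controls under each sampling scheme.
control_odds = {
    "exclusive (end of period)": (N1 - D1) / (N0 - D0),
    "inclusive (start of period)": N1 / N0,
    "concurrent (over time)": Y1 / Y0,
}

for scheme, odds in control_odds.items():
    print(f"{scheme}: exposure OR = {case_odds / odds:.2f}")
```

Because the disease in this hypothetical cohort is common, the three schemes return three different exposure odds ratios, corresponding to the disease odds ratio, the risk ratio and the rate ratio respectively.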

Section 6: Which sampling scheme to choose


So, now you know the 3 different ways of sampling controls.
1. Exclusive: sample controls at the end
2. Inclusive: sample controls at the beginning
3. Concurrent: sample controls over time

How do you know which of these sampling schemes to use? It depends on:
whether the disease is rare or not
what you need to measure to meet the study objectives
Interaction: Tabs: Rare disease?:
For a rare disease, the exposure odds ratio obtained with each of the three different
control sampling schemes will be virtually identical.
For a common disease, the three sampling schemes will yield different values for the
exposure odds ratio.
Interaction: Tabs: Which measure?:
Each sampling scheme produces an exposure odds ratio that estimates either risk
ratio, rate ratio, or odds ratio. The choice of sampling is determined by whichever
measure is most appropriate for the study objectives. This will depend on the type
of disease under study and the nature of the exposure under investigation.

6.1: Which sampling scheme to choose


To illustrate how to choose which sampling scheme to use in any given case-control
study we will now look at four different situations.
Each choice depends on:
1. the characteristics of the disease
2. the mode of action of the risk factor under study

In these examples we assume a closed population and no change of exposure


status for individuals during the study period.

6.2: Which sampling scheme to choose


Situation A: All rare disorders
'All rare disorders' refers to most cancers, congenital malformations, diabetes, and
accidents.
Quantitatively there is no precise definition of 'rare'. It may be considered to include
any disease with an incidence risk below 1% over the study period.

6.3: Which sampling scheme to choose


This is the classic case-control situation. As the disease is rare, the cases constitute a
negligible fraction of the population, and the proportion of the population exposed to
the risk factor remains approximately constant over time.
Use the diagrams opposite (which are not drawn to scale) to calculate the 3 ratio
measures below, to 1 decimal place:
Odds ratio =
Risk ratio =
Rate ratio =

Interaction: Button: Hint (pop up box appears):
Formulae for ratio measures
Measure       Definition
Risk ratio    (D1 / N1) / (D0 / N0)
Rate ratio    (D1 / Y1) / (D0 / Y0)
Odds ratio    [D1 / (N1 - D1)] / [D0 / (N0 - D0)]

Interaction: Calculation: Odds ratio =____:


Correct Response 2.0 (pop up box appears):
Correct
That's right, the odds ratio is given by:
OR = [D1 / (N1 - D1)] / [D0 / (N0 - D0)] = (15 / 1985) / (30 / 7970) = 2.0
Incorrect Response (pop up box appears):
Sorry, that's not correct. Remember that the odds ratio is the odds in the exposed
population (top plot) divided by the odds in the unexposed population (bottom plot).
Therefore:
OR = [D1 / (N1 - D1)] / [D0 / (N0 - D0)] = (15 / 1985) / (30 / 7970) = 2.0

Interaction: Calculation: Risk ratio =____:


Correct Response 2.0 (pop up box appears):
That's right, the risk ratio is given by:
Risk ratio = (D1 / N1) / (D0 / N0) = (15 / 2000) / (30 / 8000) = 2.0
Incorrect Response (pop up box appears):
Sorry, that's not correct. Remember that the risk ratio is the risk in the exposed
population (top plot) divided by the risk in the unexposed population (bottom plot).
Therefore:
Risk ratio = (D1 / N1) / (D0 / N0) = (15 / 2000) / (30 / 8000) = 2.0
Interaction: Calculation: Rate ratio = ____:
Correct Response 2.0 (pop up box appears):
That's right, the rate ratio is given by:
Rate ratio = (D1 / Y1) / (D0 / Y0) = (15 / 3985) / (30 / 15970) = 2.0
Incorrect Response (pop up box appears):
Sorry, that's not correct. Remember that the rate ratio is the rate in the exposed
population (top plot) divided by the rate in the unexposed population (bottom plot).
Therefore:
Rate ratio = (D1 / Y1) / (D0 / Y0) = (15 / 3985) / (30 / 15970) = 2.0
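The three Situation A calculations can be reproduced with a few lines of Python. The values are those given in the feedback above; the variable names are chosen here for illustration.

```python
# Values quoted in the Situation A calculations above.
D1, N1, Y1 = 15, 2000, 3985      # exposed: cases, initially at risk, person-time
D0, N0, Y0 = 30, 8000, 15970     # unexposed: cases, initially at risk, person-time

odds_ratio = (D1 / (N1 - D1)) / (D0 / (N0 - D0))
risk_ratio = (D1 / N1) / (D0 / N0)
rate_ratio = (D1 / Y1) / (D0 / Y0)

for name, value in [
    ("Odds ratio", odds_ratio),
    ("Risk ratio", risk_ratio),
    ("Rate ratio", rate_ratio),
]:
    print(f"{name} = {value:.1f}")
```

All three values round to 2.0 at one decimal place, in line with the answers given in the session.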

6.4: Which sampling scheme to choose


You have seen that, in this situation, the disease odds ratio, risk ratio and rate ratio
are equal.
So, which sampling scheme do you think is appropriate for Situation A?
Exclusive

Inclusive

Concurrent

Interaction: Hotspot: Exclusive:


Correct Response (pop up box appears):
Yes, this sampling scheme is appropriate for this situation but since the three
measures of relative incidence are all equal, all three sampling schemes for controls
are appropriate. Therefore any sampling scheme can be used for rare diseases.
Interaction: Hotspot: Inclusive:
Correct Response (pop up box appears):
Yes, this sampling scheme is appropriate for this situation but since the three
measures of relative incidence are all equal, all three sampling schemes for controls
are appropriate. Therefore any sampling scheme can be used for rare diseases.

Interaction: Hotspot: Concurrent:


Correct Response (pop up box appears):
Yes, this sampling scheme is appropriate for this situation but since the three
measures of relative incidence are all equal, all three sampling schemes for controls
are appropriate. Therefore any sampling scheme can be used for rare diseases.

6.5: Which sampling scheme to choose


Situation B: Common, non-recurrent diseases with risk factors which affect
all exposed persons equally and permanently
Common, non-recurrent diseases are those that either give immunity from further
attacks or lead to death. The type of exposures for this situation are vaccines that
give the same protection to all those immunised.
The diagrams on the tabs opposite show an example of a non-recurrent disease with
an exposure affecting all the exposed equally.
λ represents the instantaneous disease rate in each population, and is assumed
constant over time.
Interaction: Tabs: Exposed:

Interaction: Tabs: Unexposed:

6.6: Which sampling scheme to choose


On the next page, you will calculate ratio measures for years 1, 2 and 3 using the
diagrams opposite. These should be cumulative ratio measures, so you will use totals
across the intervals. For example, the risk ratio for year 2 would be:
Risk ratio = [(D11 + D21) / N11] / [(D10 + D20) / N10]
= [(181.3 + 148.4) / 1000] / [(95.2 + 86.1) / 1000]
= 329.7 / 181.3 = 1.82
Now go on to calculate some other ratios.

6.7: Which sampling scheme to choose


Complete the missing values in the table below, giving your answers to 2 decimal
places. You can click on the already-completed cells to see how they were calculated.
             Risk ratio   Rate ratio   Odds ratio
Year 1       1.90         2.00         2.10
Year 2       1.82         2.00         ____
Year 3       1.74         ____         2.35

Interaction: Button: 1.90 (Year 1/Risk ratio):
Risk ratio = (D11 / N11) / (D10 / N10)
= (181.3 / 1000) / (95.2 / 1000) = 1.90
Interaction: Button: 2.00 (Year 1/Rate ratio):
Rate ratio = (D11 / Y11) / (D10 / Y10)
= (181.3 / 906.3) / (95.2 / 951.6) = 2.00
Interaction: Button: 2.10 (Year 1/Odds ratio):
Odds ratio = [D11 / (N11 - D11)] / [D10 / (N10 - D10)]
= (181.3 / 818.7) / (95.2 / 904.8) = 2.10
Interaction: Button: 1.82 (Year 2/Risk ratio):
Risk ratio = [(D11 + D21) / N11] / [(D10 + D20) / N10]
= [(181.3 + 148.4) / 1000] / [(95.2 + 86.1) / 1000] = 1.82
Interaction: Button: 2.00 (Year 2/Rate ratio):
Rate ratio = [(D11 + D21) / (Y11 + Y21)] / [(D10 + D20) / (Y10 + Y20)]
= [(181.3 + 148.4) / (906.3 + 742.0)] / [(95.2 + 86.1) / (951.6 + 861.0)] = 2.00
Interaction: Button: 1.74 (Year 3/Risk ratio):
Risk ratio = [(D11 + D21 + D31) / N11] / [(D10 + D20 + D30) / N10]
= [(181.3 + 148.4 + 121.5) / 1000] / [(95.2 + 86.1 + 77.9) / 1000] = 1.74
Interaction: Button: 2.35 (Year 3/Odds ratio):
Odds ratio = [(D11 + D21 + D31) / N41] / [(D10 + D20 + D30) / N40]
= [(181.3 + 148.4 + 121.5) / 548.8] / [(95.2 + 86.1 + 77.9) / 740.8] = 2.35
Interaction: Calculation: Year 2/Odds ratio:
Correct Response 2.22:
Yes, the odds ratio at the end of year 2 is given by:
Odds ratio = [(181.3 + 148.4) / 670.3] / [(95.2 + 86.1) / 818.7] = 2.22
Incorrect Response:
No, that's not right. Remember that the odds is the total number of cases divided by
the number of individuals still at risk. Therefore, at the end of year 2, this is:
Odds ratio = [(D11 + D21) / N31] / [(D10 + D20) / N30]
= [(181.3 + 148.4) / 670.3] / [(95.2 + 86.1) / 818.7] = 2.22
Interaction: Calculation: Year 3/Rate ratio:
Correct Response 2.00:
Correct
Make sure that you reached the correct answer using the following calculation:
Rate ratio = [(D11 + D21 + D31) / (Y11 + Y21 + Y31)] / [(D10 + D20 + D30) / (Y10 + Y20 + Y30)]
= [(181.3 + 148.4 + 121.5) / (906.3 + 742.0 + 607.5)] / [(95.2 + 86.1 + 77.9) / (951.6 + 861.0 + 779.1)] = 2.00
Incorrect Response:
Sorry, that's not right. The rate ratio for years 1, 2 and 3 is calculated as follows:
Rate ratio = [(D11 + D21 + D31) / (Y11 + Y21 + Y31)] / [(D10 + D20 + D30) / (Y10 + Y20 + Y30)]
= [(181.3 + 148.4 + 121.5) / (906.3 + 742.0 + 607.5)] / [(95.2 + 86.1 + 77.9) / (951.6 + 861.0 + 779.1)] = 2.00

6.8: Which sampling scheme to choose


How do the 3 ratio measures compare in this situation?
Interaction: Button: clouds picture (pop up box appears):
In this situation,
1. the risk ratio decreases with increasing length of follow-up and is smaller than the
rate ratio even in the first year.
2. the disease odds ratio increases with time and is consistently larger than the rate
ratio.
3. the rate ratio is constant over time at 2.0.
In this cohort situation the rate ratio is constant over time. Which sampling scheme
do you think is appropriate for choosing controls for a case-control study?
Exclusive
Inclusive
Concurrent

Interaction: Hotspot: Concurrent:
Correct Response:
Correct
For the rate ratio, concurrent sampling is the appropriate sampling scheme.

             Risk ratio   Rate ratio   Odds ratio
Year 1       1.90         2.00         2.10
Year 2       1.82         2.00         2.22
Year 3       1.74         2.00         2.35

Interaction: Hotspot: Exclusive:
Incorrect Response:
No, remember that exclusive sampling is used to estimate the disease odds ratio.
Since the disease odds ratio is not constant in this situation, exclusive sampling is
not suitable. Please try again.
Interaction: Hotspot: Inclusive:
Incorrect Response:
No, remember that inclusive sampling is used to estimate the risk ratio. Since the
risk ratio is not constant in this situation, inclusive sampling is not suitable. Please
try again.

Note that an exception to the choice of using the rate ratio for Situation B is if the
focus is on the increased risk to an individual over a specific time period, such as in
an investigation of risk factors for death during infancy. In this case the risk ratio is
the summary measure of choice and the issue of invariance does not arise.
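
The pattern in the Situation B table can be reproduced with a short Python sketch. It assumes a closed cohort of 1000 people per group, a non-recurrent disease, and constant instantaneous rates of 0.2 (exposed) and 0.1 (unexposed) per person-year. These rates are an assumption made here because they reproduce the counts shown in the diagrams. The function names are illustrative only.

```python
import math

N_START = 1000.0  # people at risk at the start of follow-up in each group


def cumulative_figures(rate, year):
    """Cumulative cases, number still at risk, and person-years by end of `year`.

    Assumes a closed cohort, a non-recurrent disease and a constant
    instantaneous rate, so the number at risk decays exponentially.
    """
    at_risk = N_START * math.exp(-rate * year)
    cases = N_START - at_risk
    person_years = cases / rate  # area under the at-risk curve from 0 to `year`
    return cases, at_risk, person_years


def ratios_by_year(rate_exposed, rate_unexposed, years=3):
    """List of (year, risk ratio, rate ratio, odds ratio), all cumulative."""
    rows = []
    for year in range(1, years + 1):
        d1, n1, y1 = cumulative_figures(rate_exposed, year)
        d0, n0, y0 = cumulative_figures(rate_unexposed, year)
        risk_ratio = (d1 / N_START) / (d0 / N_START)
        rate_ratio = (d1 / y1) / (d0 / y0)
        odds_ratio = (d1 / n1) / (d0 / n0)
        rows.append((year, risk_ratio, rate_ratio, odds_ratio))
    return rows


# Assumed rates: 0.2 (exposed) and 0.1 (unexposed) per person-year.
situation_b = ratios_by_year(0.2, 0.1)
for year, risk_ratio, rate_ratio, odds_ratio in situation_b:
    print(f"Year {year}: risk {risk_ratio:.2f}  rate {rate_ratio:.2f}  "
          f"odds {odds_ratio:.2f}")
```

The sketch works from unrounded counts, whereas the session's table is worked from figures rounded to one decimal place, so the last digit can differ slightly (the year 1 odds ratio comes out at about 2.105 rather than the tabulated 2.10). The qualitative pattern is the same: the risk ratio falls, the odds ratio rises, and the rate ratio stays at 2.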

6.9: Which sampling scheme to choose


Situation C: Common, non-recurrent diseases with protective factors which
do not affect all exposed persons equally
Some protective factors do not equally affect all individuals, e.g. a vaccine may have
an "all or nothing" effect. So instead of giving partial protection to everyone it may
give complete protection to some individuals and none to others.
In this situation, the exposed group consists of two distinct sub-populations:
1. one which is totally protected against disease
2. the other of which is at the same risk as the unexposed population
Interaction: Tabs: Vaccinated:

Interaction: Tabs: Unvaccinated:

6.10: Which sampling scheme to choose


The diagrams opposite show the vaccinated and unvaccinated populations over a 3-year
period. Among the vaccinated, 80% were completely protected and 20% had no
protection.
Complete the table below, giving the ratios at the end of each year, to 2 decimal
places.

             Risk ratio   Rate ratio   Odds ratio
Year 1       0.20         0.18         0.17
Year 2       0.20         ____         0.14
Year 3       ____         0.16         0.12

Interaction: Button: Hint:
For the vaccinated population you can use combined values for protected and
unprotected numbers in your calculations. Click on the completed cells to see how to
do your calculations.
Interaction: Hotspot: 0.20 (Year 1/Risk ratio):
Risk ratio = [D11 / (N11u + N11p)] / (D10 / N10)
= (36.3 / 1000) / (181.3 / 1000) = 0.20
Interaction: Hotspot: 0.18 (Year 1/Rate ratio):
Rate ratio = [D11 / (Y11u + Y11p)] / (D10 / Y10)
= (36.3 / 981.3) / (181.3 / 906.3) = 0.18
Interaction: Hotspot: 0.17 (Year 1/Odds ratio):
Odds ratio = [D11 / (N21u + 800)] / (D10 / N20)
= (36.3 / 963.7) / (181.3 / 818.7) = 0.17
Interaction: Hotspot: 0.20 (Year 2/Risk ratio):
Risk ratio = [(D11 + D21) / (N11u + N11p)] / [(D10 + D20) / N10]
= [(36.3 + 29.7) / 1000] / [(181.3 + 148.4) / 1000] = 0.20
Interaction: Hotspot: 0.14 (Year 2/Odds ratio):
Odds ratio = [(D11 + D21) / (N31u + 800)] / [(D10 + D20) / N30]
= [(36.3 + 29.7) / 934.0] / [(181.3 + 148.4) / 670.3] = 0.14
Interaction: Hotspot: 0.16 (Year 3/Rate ratio):
Rate ratio = [(D11 + D21 + D31) / (Y11u + Y11p + Y21u + Y21p + Y31u + Y31p)] / [(D10 + D20 + D30) / (Y10 + Y20 + Y30)]
= [(36.3 + 29.7 + 24.3) / (981.3 + 948.4 + 921.5)] / [(181.3 + 148.4 + 121.5) / (906.3 + 742.0 + 607.5)] = 0.16
Interaction: Hotspot: 0.12 (Year 3/Odds ratio):
Odds ratio = [(D11 + D21 + D31) / (N41u + 800)] / [(D10 + D20 + D30) / N40]
= [(36.3 + 29.7 + 24.3) / 909.7] / [(181.3 + 148.4 + 121.5) / 548.8] = 0.12
Interaction: Calculation: Year 2/Rate ratio:
Correct Response 0.17:
Correct
You should have reached your answer in the following way:
Rate ratio = [(D11 + D21) / (Y11u + Y11p + Y21u + Y21p)] / [(D10 + D20) / (Y10 + Y20)]
= [(36.3 + 29.7) / (981.3 + 948.4)] / [(181.3 + 148.4) / (906.3 + 742.0)] = 0.17
Incorrect Response:
Sorry, that's not right. The rate over the first two years is given by the total number
of cases divided by the total person-time at risk across those 2 years.
Therefore the rate ratio is calculated as follows:
Rate ratio = [(D11 + D21) / (Y11u + Y11p + Y21u + Y21p)] / [(D10 + D20) / (Y10 + Y20)]
= [(36.3 + 29.7) / (981.3 + 948.4)] / [(181.3 + 148.4) / (906.3 + 742.0)] = 0.17
Interaction: Calculation: Year 3/Risk ratio:
Correct Response 0.20:
Correct
That's right, the risk ratio is given by:
Risk ratio = [(D11 + D21 + D31) / (N11u + N11p)] / [(D10 + D20 + D30) / N10]
= [(36.3 + 29.7 + 24.3) / 1000] / [(181.3 + 148.4 + 121.5) / 1000] = 0.20
Incorrect Response:
Sorry, that's not correct. Remember that the risk is given by the total number of
cases at the end of year 3, divided by the total number of individuals at risk at the
start of the follow-up period (i.e. both protected and unprotected individuals).
Therefore, the risk ratio is given by:
Risk ratio = [(D11 + D21 + D31) / (N11u + N11p)] / [(D10 + D20 + D30) / N10]
= [(36.3 + 29.7 + 24.3) / 1000] / [(181.3 + 148.4 + 121.5) / 1000] = 0.20

6.11: Which sampling scheme to choose


In this situation it is the risk ratio which is constant over time. Which sampling
scheme is most appropriate for Situation C?
Inclusive

Exclusive

Concurrent

Interaction: Hotspot: Inclusive:


Correct Response:
Correct
The inclusive sampling scheme is the appropriate choice. The risk ratio remains
constant over time and the case-control study may be based on cases diagnosed
during any period following the point of definition of exposure. It is not necessary to
accumulate cases continuously from this point. In this example, a case-control study
could be based on cases diagnosed during Year 3 only, with controls selected from
the total population.
Interaction: Hotspot: Exclusive:
Incorrect Response:
No, remember that exclusive sampling is used to estimate the disease odds ratio.
Since the disease odds ratio is not constant in this situation, exclusive sampling is
not suitable. Please try again.
Interaction: Hotspot: Concurrent:
Incorrect Response:
No, remember that concurrent sampling is used to estimate the rate ratio. Since the
rate ratio is not constant in this situation, concurrent sampling is not suitable. Please
try again.
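
The "all or nothing" figures for Situation C can be reproduced with the sketch below. It assumes 1000 people per group, a constant rate of 0.2 per person-year in anyone who is susceptible (an assumption made here because it matches the counts in the diagrams), and complete protection in 80% of the vaccinated. The function and constant names are illustrative only.

```python
import math

RATE = 0.2        # assumed rate per person-year in anyone who is susceptible
N_START = 1000.0  # size of each group at the start of follow-up
PROTECTED = 0.8   # fraction of the vaccinated who are completely protected


def all_or_nothing_ratios(year):
    """Cumulative (risk, rate, odds) ratios, vaccinated versus unvaccinated."""
    p_case = 1 - math.exp(-RATE * year)  # risk in a susceptible person

    # Unvaccinated: everyone is susceptible.
    d0 = N_START * p_case
    n0 = N_START - d0
    y0 = d0 / RATE

    # Vaccinated: only the unprotected 20% can become cases.
    protected = N_START * PROTECTED
    susceptible = N_START - protected
    d1 = susceptible * p_case
    n1 = N_START - d1                  # protected plus still-susceptible
    y1 = protected * year + d1 / RATE  # protected people contribute full time

    risk_ratio = (d1 / N_START) / (d0 / N_START)
    rate_ratio = (d1 / y1) / (d0 / y0)
    odds_ratio = (d1 / n1) / (d0 / n0)
    return risk_ratio, rate_ratio, odds_ratio


situation_c = {year: all_or_nothing_ratios(year) for year in (1, 2, 3)}
for year, (risk_ratio, rate_ratio, odds_ratio) in situation_c.items():
    print(f"Year {year}: risk {risk_ratio:.2f}  rate {rate_ratio:.2f}  "
          f"odds {odds_ratio:.2f}")
```

The risk ratio stays at 0.20 because the unprotected vaccinated people behave exactly like the unvaccinated, so the ratio of cumulative cases is fixed by the unprotected fraction. The rate and odds ratios drift downwards because the protected people keep adding to the vaccinated denominator.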

6.12: Which sampling scheme to choose


In a study of the effect of vaccination we wish to know the protective effect of the
vaccine. This is called the vaccine efficacy (VE), and is calculated as follows:

VE (%) = (1 - risk ratio) x 100


The table opposite shows the VE for all 3 years of the study. You can click on these
cells to see how they were calculated.

             Risk ratio   Rate ratio   Odds ratio   VE
Year 1       0.20         0.18         0.17         80%
Year 2       0.20         0.17         0.14         80%
Year 3       0.20         0.16         0.12         80%

Interaction: Button: 80% (Year 1/VE):
Year 1
VE = 1 - risk ratio
= 1 - 0.20 = 0.80
= 80%
So, the vaccine was 80% effective in the first year.
Interaction: Button: 80% (Year 2/VE):
Year 2
VE = 1 - risk ratio
= 1 - 0.20 = 0.80
= 80%
So, the vaccine was 80% effective over the first two years.
Interaction: Button: 80% (Year 3/VE):
Year 3
VE = 1 - risk ratio
= 1 - 0.20 = 0.80
= 80%
So, the vaccine was 80% effective over the three years.
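
As a small illustration of why the choice of measure matters here, the sketch below applies the VE formula to the three year 3 ratios from the table. Only the risk ratio version is the VE defined in this session. Applying the formula to the rate ratio and odds ratio is an illustration added here, to show what a study estimating the wrong measure would report. The function name vaccine_efficacy is hypothetical.

```python
def vaccine_efficacy(ratio):
    """Vaccine efficacy in percent: VE = (1 - ratio) x 100."""
    return (1 - ratio) * 100


# Year 3 ratios for Situation C, as tabulated in this section.
year3 = {"risk ratio": 0.20, "rate ratio": 0.16, "odds ratio": 0.12}

ve_by_measure = {name: vaccine_efficacy(value) for name, value in year3.items()}
for name, ve in ve_by_measure.items():
    print(f"1 - {name}: {ve:.0f}%")
```

With an all-or-nothing vaccine that protects 80% of recipients, only the risk ratio returns 80%. The other two measures would suggest a higher efficacy, and the gap widens as follow-up lengthens.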

6.13: Which sampling scheme to choose


Situation D: Common, recurrent diseases
Common diseases are those affecting many people.
Recurrent diseases are those that an individual may experience more than once.
Some examples of common, recurrent diseases are diarrhoea, acute respiratory
infections and malaria.
Can you think what the appropriate measure is for a study of these types of
diseases?
Interaction: Button: clouds picture (pop up box appears and card appears on
RHS):
The appropriate measure for a study of these types of diseases is the rate ratio.

The rate ratio takes account of not only whether the person experiences the disease
or not, but also how often they experience it.
So why do you think the risk ratio and odds ratio are not appropriate?
Interaction: Button: clouds picture (pop up box appears):
Since the disease is recurrent and cases can return to the population at risk, the
numerator for the measure of incidence is not individuals, but episodes. Therefore
the risk ratio and odds ratio are not valid.
Imagine an extreme situation, in which all children experience an episode of
diarrhoea in the first two years of life, whether or not they were exposed. The risk
ratio will tend towards unity as the period of follow-up increases to two years. This is
because the risk in both the exposed group and unexposed group will tend to 1.
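
The extreme situation described above can be sketched numerically. The example assumes that episodes occur at a constant rate in each child, so that the risk of at least one episode by time t is 1 - exp(-rate x t). The rates of 4 and 2 episodes per child-year are hypothetical values chosen only to make the disease very common.

```python
import math


def risk_of_any_episode(rate, years):
    """Probability of at least one episode by `years`, at a constant rate."""
    return 1 - math.exp(-rate * years)


RATE_EXPOSED = 4.0    # hypothetical episodes per child-year
RATE_UNEXPOSED = 2.0  # hypothetical episodes per child-year

follow_up_years = [0.25, 0.5, 1.0, 2.0]
risk_ratios = [
    risk_of_any_episode(RATE_EXPOSED, t) / risk_of_any_episode(RATE_UNEXPOSED, t)
    for t in follow_up_years
]
rate_ratio = RATE_EXPOSED / RATE_UNEXPOSED

for t, rr in zip(follow_up_years, risk_ratios):
    print(f"follow-up {t:4.2f} years: risk ratio {rr:.2f}, rate ratio {rate_ratio:.1f}")
```

As follow-up lengthens, almost every child in both groups has had at least one episode, so the risk ratio approaches 1 even though exposed children continue to have twice as many episodes.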

6.14: Which sampling scheme to choose


The diagrams opposite show the exposed and unexposed populations for a common
disease of short duration, over a 3-year period. The instantaneous disease rates are
0.2 in the exposed and 0.1 in the unexposed.
Calculate the rate ratio for each year of follow-up, to one decimal place. Remember
that cases return to the population at risk.
Rate ratio
year 1: ____
year 2: ____
year 3: ____
Interaction: Calculation: Rate ratio year 1:___:
Correct Response 2.0:
Correct
That's right, the rate ratio is given by:
Rate ratio = (200 / 1000) / (100 / 1000) = 2.0
Incorrect Response:
Sorry, that's not correct. Remember that the rate ratio is given by:
Rate ratio = (D1 / Y1) / (D0 / Y0) = (200 / 1000) / (100 / 1000) = 2.0
Interaction: Calculation: Rate ratio year 2:___:
Correct Response 2.0:
Correct
Yes, but make sure that you have calculated it correctly. Remember to use the total
number of cases and the total amount of person-time since the start of follow-up.
Thus,
Rate ratio = (400 / 2000) / (200 / 2000) = 2.0
Incorrect Response:
Sorry, that's not correct. Remember that the rate ratio is given by:
Rate ratio = (D1 / Y1) / (D0 / Y0) = (400 / 2000) / (200 / 2000) = 2.0
Interaction: Calculation: Rate ratio year 3:___:
Correct Response 2.0:
Yes, but make sure that you have calculated it correctly. Remember to use the total
number of cases and the total amount of person-time since the start of follow-up.
Thus,
Rate ratio = (600 / 3000) / (300 / 3000) = 2.0
Incorrect Response:
Sorry, that's not correct. Remember that the rate ratio is given by:
Rate ratio = (D1 / Y1) / (D0 / Y0) = (600 / 3000) / (300 / 3000) = 2.0
Interaction: Tabs: Exposed:

Interaction: Tabs: Unexposed:

6.15: Which sampling scheme to choose


So, which sampling scheme do you think is appropriate for Situation D (common,
recurrent diseases)?
Exclusive
Inclusive
Concurrent

Interaction: Button: Concurrent:
Correct Response (pop up box appears):
Correct
Yes, the rate ratio is constant over time, so concurrent sampling is appropriate.
Interaction: Hotspot: Exclusive:
Incorrect Response (pop up box appears):
No, exclusive sampling is not appropriate in this situation since the odds ratio is not
valid. It is the rate ratio that is constant. Please try again.
Interaction: Hotspot: Inclusive:
Incorrect Response (pop up box appears):
No, inclusive sampling is not appropriate in this situation since the risk ratio is not
valid. It is the rate ratio that is constant. Please try again.

6.16: Which sampling scheme to choose


If the focus were on individuals who suffered at least one episode, rather than on
episodes, then cases would no longer return to the at-risk population. How do you
think this would change the approach for this situation?
Interaction: Button: clouds picture (pop up box appears and card appears on
RHS):
If the focus were on individuals then the situation would be the same as for common
non-recurrent diseases, i.e. Situation B.
This would also apply if you wanted to look at the more susceptible individuals who
experienced, say, 3 or more episodes.

6.17: Which sampling scheme to choose


Summary
The tables opposite summarise the choice of sampling scheme according to the
situation.
Interaction: Tabs: A:
Situation: A
Type of disease: Rare
Type of risk/protective factor: All
Example(s): Most cancers
Cases return to the population at risk: Does not matter
Proportion exposed constant: As near as makes no difference
Exposed group at uniform risk: ?
Invariant measure: All three
Appropriate sampling: Any of the three

Interaction: Tabs: B:
Situation: B
Type of disease: Common, non-recurrent
Type of risk/protective factor: Risk/protective factors that affect all exposed equally
Example(s): Vaccines that give partial protection to those vaccinated
Cases return to the population at risk: No
Proportion exposed constant: No
Exposed group at uniform risk: Yes
Invariant measure: Rate ratio
Appropriate sampling: Concurrent

Interaction: Tabs: C:
Situation: C
Type of disease: Common, non-recurrent
Type of risk/protective factor: Protective factor that does not give equal protection to all
Example(s): Vaccines that give 'all or nothing' protection
Cases return to the population at risk: No
Proportion exposed constant: No
Exposed group at uniform risk: No
Invariant measure: Risk ratio
Appropriate sampling: Inclusive

Interaction: Tabs: D:
Situation: D
Type of disease: Common, recurrent
Type of risk/protective factor: All
Example(s): Diarrhoea, acute respiratory illness, malaria
Cases return to the population at risk: Yes
Proportion exposed constant: Yes
Exposed group at uniform risk: ?
Invariant measure: Rate ratio
Appropriate sampling: Concurrent
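
For revision, the four summary tabs can be held in a small lookup table. This is just one possible way to organise the summary: the dictionary layout and the helper name recommended_sampling are invented here, while the entries themselves are copied from the tabs.

```python
SAMPLING_GUIDE = {
    "A": {"disease": "rare",
          "invariant_measure": "all three",
          "sampling": "any of the three"},
    "B": {"disease": "common, non-recurrent; factor affects all exposed equally",
          "invariant_measure": "rate ratio",
          "sampling": "concurrent"},
    "C": {"disease": "common, non-recurrent; all-or-nothing protective factor",
          "invariant_measure": "risk ratio",
          "sampling": "inclusive"},
    "D": {"disease": "common, recurrent",
          "invariant_measure": "rate ratio",
          "sampling": "concurrent"},
}


def recommended_sampling(situation):
    """Return the sampling scheme listed in the summary for situation A to D."""
    return SAMPLING_GUIDE[situation.upper()]["sampling"]


for label, entry in SAMPLING_GUIDE.items():
    print(f"Situation {label}: {entry['invariant_measure']} is invariant, "
          f"use {entry['sampling']} sampling")
```

The lookup does not capture the exceptions noted in the text, such as preferring the risk ratio in Situation B when the interest is in risk over a fixed period.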

Section 7: Principles underlying the choice of controls


You have just looked at how to sample controls. However, before actually sampling
controls, you need to determine how to identify potential controls.
How might you do this? Click below for some examples.
Interaction: Button: Examples (timed pop up text appears on bottom LHS and card
appears on RHS):
Interaction: Timed pop up 1 (bottom LHS):
People using the same health services as the case?
Interaction: Timed pop up 2 (bottom LHS):

Random digit dialling?

Interaction: timed pop up 3 (bottom LHS):


Friends/relatives/neighbours of the cases?
Interaction: Timed pop up 4 (RHS card):
There are many possibilities, and this is usually the most important decision the
investigator will have to take, because the validity of the study findings depends
on it.
The most appropriate choice will depend on:
1. the research question being addressed
2. the setting in which the study is to be performed.
Therefore it is not possible to give simple "rules" that will apply in all situations.

7.1: Principles underlying the choice of controls


There are no simple rules. However, we can identify two basic principles that the
investigator should always attempt to uphold. These are shown opposite.
Principle 1
Controls should be individuals who would have been identified as cases had they
suffered the disease of interest during the study period.
Principle 2
The exposure history of controls should be representative of the exposure history
of the population from which the cases are drawn.

7.2: Principles underlying the choice of controls


Related to Principle 1, recruitment of controls by random sampling of the population
is appropriate if ascertainment of cases is complete and you have reasonably
accurate population data. If either of these conditions is not met then this approach
may lead to selection bias.
Related to Principle 2, neighbourhood controls have several potential advantages.
They provide a relatively simple rule for identifying controls from the community
when you do not have reliable population data. Matching on neighbourhood may
help you to control for environmental and social factors that would be difficult to
measure. Matching on neighbourhood may increase the likelihood that, had the
control suffered the disease of interest, they would have ended up in the same
health facility as the case (when cases are recruited in health facilities). However, if
cases come from a wide area, recruitment of neighbourhood controls may create
serious logistic difficulties.
Go on to the next page to see more of what these principles mean in practice.

7.3: Principles underlying the choice of controls


Let's think about how these principles work in practice with a specific scenario:
Imagine cases are recruited from individuals attending a particular health facility.
The controls should be recruited from individuals who would have attended the same
health facility had they suffered the disease of interest during the study period.
To become a case, a person must suffer the disease of interest and attend the health
facility in which the study is being performed.
How would you recruit controls in this situation?
Interaction: Button: clouds picture (card appears on RHS):
In a study recruiting cases in a health facility, Principle 1 implies restricting
recruitment of controls to individuals who use the health facility where the study is
being performed. This is likely to be difficult if there are many alternative services
available. One way in which this problem has been addressed is by recruiting as
controls individuals presenting to the same health facility but with a different
complaint.
If one adopts this approach and recruits individuals with other diseases as controls,
then one should choose diseases of similar perceived severity (to try to ensure
similar reporting patterns) and one must be sure that the "control diseases" are not
directly or indirectly related to the factors under study (the second principle).

7.4: Principles underlying the choice of controls


Example
Consider a case-control study of the effectiveness of measles vaccine which recruits
measles cases attending a particular health facility.
In order to uphold Principle 1, the investigator might recruit as controls children
presenting to the same health facility with other complaints.
Then, in order to uphold Principle 2, the investigator should exclude from the control
group children presenting with other vaccine preventable diseases.
For example, a child with polio is less likely to have been vaccinated against polio
(and therefore probably against measles too) than a child drawn at random from the
population. Taking polio cases as controls would therefore tend to underestimate the
proportion of the population vaccinated against measles.

Section 8: Exercise
Exercise 1
The following 3 examples are case-control studies looking at risk factors for cervical
cancer. The population consists of women participating in a screening programme.
Click on each button below to see the example.

Example A

Example B

Example C

Interaction: Button: Example A (pop up box appears):


Example A
Cases are all women diagnosed with cancer of the cervix in the previous year, and
controls are women selected at random from all women registered in the
programme.
Interaction: Button: Example B (pop up box appears):
Example B
Cases are all women diagnosed with cancer of the cervix in the previous year.
Controls are selected (one for each case, individually matched) from women who had
a normal smear taken at the time of the diagnosis of the case to whom she is
matched.
Interaction: Button: Example C (pop up box appears):
Example C
Cases are all women diagnosed with cancer of the cervix in the previous year.
Controls are a random sample of women who had a normal smear after the end of
the previous year.
Consider each example; which measure is estimated by the exposure odds ratio in
each of these studies? Choose from the pull down menu for each example below.

Example A:
Example B:
Example C:
Interaction: Pulldown: Example A: _____:
Incorrect Response Disease odds ratio:
No, note that in this example, the controls are selected from all women registered in
the programme. That means that the sampling is inclusive. So what does that imply
about the measure that is estimated by the exposure odds ratio? Please try again.
Correct Response: Risk ratio:
Correct
Yes, the controls are selected from all women registered in the programme, in other
words inclusive sampling. Therefore the exposure odds ratio in this case estimates
the risk ratio.
Incorrect Response: Rate ratio:
No, note that in this example, the controls are selected from all women registered in
the programme. That means that the sampling is inclusive. So what does that imply
about the measure that is estimated by the exposure odds ratio? Please try again.
Interaction: Pulldown: Example B:____:
Incorrect Response: Disease odds ratio:
No, note that in this example, the controls are selected at the time of diagnosis of
each case. That means that the sampling is concurrent. So what does that imply
about the measure that is estimated by the exposure odds ratio? Please try again.
Incorrect Response: Risk ratio:
No, note that in this example, the controls are selected at the time of diagnosis of
each case. That means that the sampling is concurrent. So what does that imply
about the measure that is estimated by the exposure odds ratio? Please try again.
Correct Response: Rate ratio:
Correct
Yes, the controls are selected at the time of diagnosis, in other words this is
concurrent sampling. Therefore the exposure odds ratio in this case estimates the
rate ratio.
Interaction: Pulldown: Example C:___:

Correct Response: Disease odds ratio:


Correct
Yes, the controls are selected at the end of follow-up, in other words this is exclusive
sampling. Therefore the exposure odds ratio in this case estimates the disease odds
ratio.
Incorrect Response: Risk ratio:
No, note that in this example, the controls are selected at the end of follow-up. That
means that the sampling is exclusive. So what does that imply about the measure
that is estimated by the exposure odds ratio? Please try again.
Incorrect Response: Rate ratio:
No, note that in this example, the controls are selected at the end of follow-up. That
means that the sampling is exclusive. So what does that imply about the measure
that is estimated by the exposure odds ratio? Please try again.

Do you expect these measures to be very different for each of the examples
opposite?
Interaction: Button: clouds picture (pop up box appears):
They are likely to be very similar because cervical cancer is a rare disease.

8.1: Exercise
Exercise 2
This exercise refers to a paper in your reader, Mahmood et al (1989).
Read the extracts from the paper until you reach the section 'Materials and Methods'
and then think about the questions opposite. Click the button at the bottom of each
page to read the (suggested) answer in each case.
Interaction: Tabs: Q1:
From what population were the controls recruited?
Interaction: Button: clouds picture (pop up box appears):
Controls were recruited from infants visiting MCHCs (maternal & child health clinics)
for immunisation and/or routine check-up.
Interaction: Tabs: Q2:
What are the potential advantages of this approach?

Interaction: Button: clouds picture (pop up box appears):


Potential advantages
Recruitment is relatively easy (you don't need to travel all over the city as you
might with neighbourhood controls).
Breast feeding may be associated with risk of other diseases and therefore
"diseased" controls would not be suitable.
Mothers of controls use government health services, so we would expect them to
take the child to hospital if the child had severe diarrhoea: i.e. controls are
likely to be potential cases.
Interaction: Tabs: Q3:
What are the potential disadvantages of this approach?
Interaction: Button: clouds picture (pop up box appears):
Potential disadvantages
If immunisation coverage/attendance for routine check-ups is low then controls
might not be representative of the population which uses curative services (cases):
i.e. cases might not be potential controls. However, in this particular population this
should not be a major problem since immunisation is compulsory and without it a
person cannot get an ID card.
Mothers at clinics may be anxious to leave and go home, so they may be less cooperative, or their responses may be less careful.
Interaction: Tabs: Q4:
How was the sampling of controls performed?
Interaction: Button: clouds picture (pop up box appears):
Recruitment was concurrent. Potential controls were not excluded if they had
previously been cases, nor if they subsequently became cases.
Interaction: Tabs: Q5:
Which measure of relative incidence will the study estimate?
Interaction: Button: clouds picture (pop up box appears):
The study estimates the rate ratio.
Interaction: Tabs: Q6:
Why do you think this measure was chosen? Does it matter?
Interaction: Button: clouds picture (pop up box appears):
Diarrhoea is a relatively common complaint which (almost) all children suffer at
some time. However, severe diarrhoea leading to hospitalisation is (hopefully) a
relatively rare event and so for the outcome investigated in this study the rate ratio,
risk ratio and odds ratio are all likely to be similar.
Concurrent sampling has the practical virtue that the investigator does not need to
worry whether the control has ever had the outcome of interest nor whether they will
develop it at some time in the future. In this study, though, there was a relatively
wide exclusion window of 1 month.
Interaction: Tabs: Q7:
Why do you think infants "3 months of age and older, with no history of being taken
to an MCHC for immunisation" were excluded from the cases?
Interaction: Button: clouds picture (pop up box appears):
Cases who had never been to an MCHC were excluded to ensure that all cases were
potential controls.

Section 9: Summary
The main points of this session will appear below as you click through the pages
opposite. Click on any of the list entries below to go back to that page.
What does the exposure odds ratio measure?
A case-control study provides an estimate of the exposure odds ratio. Depending on
the design of the study, e.g. the sampling of controls, the exposure odds ratio
estimates:
1. the disease odds ratio
2. the risk ratio
3. the rate ratio
The sampling of controls
Controls are sampled from the same population of interest as cases. The 3 sampling
schemes for control selection are:
Interaction: Tabs: Exclusive:
Exclusive sampling
Controls are sampled at the end of the time period, and the exposure odds ratio
gives an estimate of the disease odds ratio.
Interaction: Tabs: Inclusive:
Inclusive sampling

Controls are sampled at the beginning of the time period and the exposure odds ratio
gives an estimate of the risk ratio.
Interaction: Tabs: Concurrent:
Concurrent sampling
Controls are sampled over time and, hence, the exposure odds ratio gives an
estimate of the rate ratio.
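
The link between sampling scheme and measure can be checked against the three-year Situation B figures from Section 6.7. The sketch below assumes that controls are drawn exactly in proportion to the group they are sampled from (expected values, with no sampling error). The variable and function names are illustrative only.

```python
# Three-year cumulative cases for Situation B (Section 6.7).
cases = {"exposed": 181.3 + 148.4 + 121.5, "unexposed": 95.2 + 86.1 + 77.9}

# What each scheme samples controls in proportion to.
control_source = {
    # everyone at risk at the start of follow-up
    "inclusive": {"exposed": 1000.0, "unexposed": 1000.0},
    # people still disease-free at the end of follow-up
    "exclusive": {"exposed": 548.8, "unexposed": 740.8},
    # person-years at risk accumulated over follow-up
    "concurrent": {"exposed": 906.3 + 742.0 + 607.5,
                   "unexposed": 951.6 + 861.0 + 779.1},
}


def exposure_odds_ratio(case_counts, control_counts):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    case_odds = case_counts["exposed"] / case_counts["unexposed"]
    control_odds = control_counts["exposed"] / control_counts["unexposed"]
    return case_odds / control_odds


estimates = {
    scheme: exposure_odds_ratio(cases, source)
    for scheme, source in control_source.items()
}
for scheme, value in estimates.items():
    print(f"{scheme:10s} sampling: exposure odds ratio = {value:.2f}")
```

Under these assumptions the same set of cases gives three different exposure odds ratios, matching the year 3 risk ratio, disease odds ratio and rate ratio respectively, purely because of where the controls are drawn from.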
Which situation, which sample
The choice of sampling scheme depends on:
1) whether the disease is rare or not
2) the characteristics of the disease - is it recurrent or not
3) the characteristics of the exposure - e.g. does vaccination confer all-or-nothing
protection, or partial protection to all
You must assess these things before deciding which sampling scheme to use.
