
EVIDENCE BASED MEDICINE

Najirman
Clinical Epidemiology and Evidence Based Medicine (CEEBM) Study Group, FK Unand
Head of the Rheumatology Subdivision, Department of Internal Medicine, FK Unand/RSUP Dr. M. Djamil
INTRODUCTION
A 51-year-old man visits a doctor complaining of chest pain. Two weeks earlier he felt perfectly well, until he began to feel short of breath when walking and climbing; the breathlessness resolves after he rests for 2-3 minutes. It also appears during exercise. He smokes a pack of cigarettes a day and has high blood pressure. He has never sought treatment because he had no complaints. Lately, however, he has become worried about his health, especially his heart. Physical examination and resting ECG are within normal limits, while his blood pressure is 150/96 mmHg.
In the case above, many questions arise for the patient, for example: Am I ill? What is causing the illness? How will the illness affect me? What should I do, and if I seek treatment, how much will it cost? As doctors, we would of course take a history and carry out additional examinations before making any decision or taking any action. The information we need can be obtained from many sources: textbooks, the internet, the opinions of colleagues, or our own experience. That information should be up to date and supported by evidence; in other words, it should be evidence based.
DEFINITION
The term evidence-based medicine was first introduced by David Sackett et al. from McMaster University in Ontario, Canada, in the early 1990s. According to them, evidence-based medicine (EBM) is the integration of the best research evidence, clinical experience/expertise, and the patient's circumstances, used to manage patients as well as possible. In other words, EBM is the use of current, valid evidence in patient management. EBM thus helps a doctor avoid information overload while at the same time helping the doctor obtain the information needed for patient management.

Figure 1. Components of EBM


When EBM was first introduced, only a few journals applied its principles; since 2000, however, more than 1000 journals have applied them.
STEPS IN EBM
To find the valid evidence we need in the journal literature, several steps must be followed:
1. Formulating a question
Before framing a question, a doctor must first understand the problem at hand, and only then frame the question. A clinical question is worth asking when it is:
Important for the patient's recovery
Likely to have an answer that can be found
Frequently encountered in practice
Interesting to us
Possibly related to a medicolegal issue

Suppose we are managing a patient with chest pain who has been diagnosed with acute myocardial infarction. Two kinds of question can be asked: general questions (background questions) and specific questions (foreground questions). The longer and broader one's clinical experience, the more specific the questions become, as shown in the following figure.

Figure 2. Background vs foreground

For the case above, the question could be: What is the best drug for relieving chest pain in patients with acute myocardial infarction? (general). Searching the internet with a question like this returns a large number of journal articles, but few of them are useful. A more specific question would be: Is clopidogrel better than isosorbide dinitrate for relieving chest pain in patients with acute myocardial infarction? (more specific). A question like this is easier to search for on the internet, and the results are more readily applicable to the patient.
A clinical question is formulated in 4 parts, known by the acronym PICO:
P = Population/problem/patients
I = Intervention/indicator
C = Comparison
O = Outcome
As an illustration, see the example below.
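The PICO breakdown can also be written down as a small data structure. The sketch below is only an illustration and not part of the original material: the class name `PicoQuestion` and the method `to_search_string` are hypothetical, and joining the four parts with AND is an assumed, simplistic way of turning the question into a search string. It uses the clopidogrel versus isosorbide dinitrate question from the text.

```python
from dataclasses import dataclass


@dataclass
class PicoQuestion:
    """Hypothetical container for the four PICO parts of a clinical question."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def to_search_string(self) -> str:
        # Assumption: AND-join the non-empty parts, quoting multi-word terms.
        parts = [self.population, self.intervention, self.comparison, self.outcome]
        quoted = [f'"{p}"' if " " in p else p for p in parts if p]
        return " AND ".join(quoted)


question = PicoQuestion(
    population="acute myocardial infarction",
    intervention="clopidogrel",
    comparison="isosorbide dinitrate",
    outcome="chest pain",
)
print(question.to_search_string())
```

Writing the four parts out separately makes it obvious when one of them, often the comparison, has been left unspecified.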

However, clinical questions are not limited to interventions; they also cover:
Etiology
Risk factors
Frequency
Diagnosis
Prognosis
2. Searching for valid evidence
Once the question has been formulated, the next step is to search the journals for the data or studies we need. One website frequently used to search for journal articles is PubMed.

Figure 3. General view of PubMed

After opening PubMed, a page like the one in the figure above appears; we then click Clinical Queries, and a page like the one in the figure below appears.

Figure 4. PubMed view

We then type what we want to search for, for example SLE, and click Search, which produces a page like the one in the figure below.

Figure 5. Search results in PubMed

The figure above shows that 5 of the 265 articles found are systematic reviews or meta-analyses.
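The same kind of search can also be prepared programmatically. The sketch below does not reproduce the Clinical Queries page described above; it swaps in NCBI's E-utilities ESearch endpoint and only builds the request URL, without sending it. The helper name `build_pubmed_search_url` is hypothetical, and restricting results with the `systematic[sb]` subset filter is an assumption about how one might narrow a search to systematic reviews.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (a programmatic interface, not the Clinical Queries page).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term: str, systematic_reviews_only: bool = False) -> str:
    """Hypothetical helper: build an ESearch URL for a PubMed query."""
    if systematic_reviews_only:
        # Assumption: narrow the search with PubMed's systematic review subset filter.
        term = f"({term}) AND systematic[sb]"
    params = {"db": "pubmed", "term": term, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(build_pubmed_search_url("SLE"))
print(build_pubmed_search_url("SLE", systematic_reviews_only=True))
```

Comparing the result counts of the two queries would give the same kind of "how many of these are systematic reviews" figure that the search page shows.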
3. Critically appraising the evidence
Once found, the article must be critically appraised to confirm that it really meets our needs. A commonly used format for this is known as VIA (Validity, Importance and Applicability).
Validity:
Whether a study is valid can be judged from its methodology: whether randomisation and blinding were performed, whether follow-up was complete, and whether an intention-to-treat analysis was used, in which patients who dropped out are also included in the analysis.

Importance
The results reported in a journal article do not necessarily match what we are looking for. What we need to check is whether our patient's condition is the same as, or close to, that of the patients in the article.

Applicability
This means applying the results from the article to the patient or case at hand. To do so, the key requirement is that the conditions reported in the article match the condition of the patient we are treating.
4. Integrating the evidence with clinical experience in the care of the patient
Once we have found what we were looking for, the next step is to apply the results to the patient being treated. This is done by combining what we have learned from the article with the experience we already have, so that the patient is managed optimally.
5. Evaluating the application for future improvement
At this stage, everything we apply to a patient on the basis of a literature search must be evaluated: is it still beneficial, or does it no longer suit the patient's current condition? This is necessary because medical science keeps developing, and new studies can be found on the internet at any time.

Figure 6. Steps in EBM


HIERARCHY OF STUDY DESIGNS IN EBM
Not all studies published in journals use the same method or design; designs include case reports, case-control, cohort and experimental studies. The general level of validity of each design is shown in the figure below.

Figure 7. Hierarchy of study designs

The figure above shows that meta-analysis ranks highest among the study designs published in journals. However, the best evidence also depends on the type of question we ask, or on the type of study, for example whether it addresses prognosis, etiology or an intervention. This is shown in the table below:
Table 1. Types of clinical question and study design

HIERARCHY OF RECOMMENDATIONS IN EBM

Treatment guidelines for a disease list several levels of recommendation, from level I downward, or graded from A to D. A level I or grade A recommendation is based on evidence of very high validity, so its use in choosing a treatment is strongly advised. Conversely, a lower level of recommendation, for example III or grade C or D, indicates that the evidence base for the recommendation is weak.
Table 2. Levels of evidence of a study

Table 3. Levels of recommendation of a study

CRITICAL APPRAISAL IN EBM

To critically appraise a study, whether its design is case-control, cohort or experimental, several points must be understood. Many checklists are available for this purpose, as shown in the appendix.
Example checklist for critical appraisal of a therapy study
THERAPY STUDY: Are the results of the trial valid? (Internal Validity)

What question did the study ask?
Patients:
Intervention:
Comparison:
Outcome(s):

1a. R - Was the assignment of patients to treatments randomised?
What is best?
Centralised computer randomisation is ideal and often used in multi-centred trials. Smaller trials may use an independent person (e.g., the hospital pharmacy) to police the randomisation.
Where do I find the information?
The Methods should tell you how patients were allocated to groups and whether or not randomisation was concealed.
This paper: Yes No Unclear
Comment:

1b. R - Were the groups similar at the start of the trial?
What is best?
If the randomisation process worked (that is, achieved comparable groups) the groups should be similar. The more similar the groups the better it is. There should be some indication of whether differences between groups are statistically significant (i.e., p values).
Where do I find the information?
The Results should have a table of "Baseline Characteristics" comparing the randomised groups on a number of variables that could affect the outcome (e.g., age, risk factors etc). If not, there may be a description of group similarity in the first paragraphs of the Results section.
This paper: Yes No Unclear
Comment:

2a. A - Aside from the allocated treatment, were groups treated equally?
What is best?
Apart from the intervention the patients in the different groups should be treated the same, e.g., additional treatments or tests.
Where do I find the information?
Look in the Methods section for the follow-up schedule and permitted additional treatments, etc., and in Results for actual use.
This paper: Yes No Unclear
Comment:

2b. A - Were all patients who entered the trial accounted for, and were they analysed in the groups to which they were randomised?
What is best?
Losses to follow-up should be minimal, preferably less than 20%. However, if few patients have the outcome of interest, then even small losses to follow-up can bias the results. Patients should also be analysed in the groups to which they were randomised (intention-to-treat analysis).
Where do I find the information?
The Results section should say how many patients were randomised (e.g., Baseline Characteristics table) and how many patients were actually included in the analysis. You will need to read the Results section to clarify the number and reason for losses to follow-up.
This paper: Yes No Unclear
Comment:

3. M - Were measures objective or were the patients and clinicians kept blind to which treatment was being received?
What is best?
It is ideal if the study is double-blinded, that is, both patients and investigators are unaware of treatment allocation. If the outcome is objective (e.g., death) then blinding is less critical. If the outcome is subjective (e.g., symptoms or function) then blinding of the outcome assessor is critical.
Where do I find the information?
First, look in the Methods section to see if there is some mention of masking of treatments, e.g., placebos with the same appearance or sham therapy. Second, the Methods section should describe how the outcome was assessed and whether the assessor/s were aware of the patients' treatment.
This paper: Yes No Unclear
Comment:

What were the results?

1. How large was the treatment effect?
Most often results are presented as dichotomous outcomes (yes or no outcomes that happen or don't happen) and can include such outcomes as cancer recurrence, myocardial infarction and death. Consider a study in which 15% (0.15) of the control group died and 10% (0.10) of the treatment group died after 2 years of treatment. The results can be expressed in many ways as shown below.

Relative Risk (RR)
What is the measure?
RR = risk of the outcome in the treatment group / risk of the outcome in the control group.
In our example, the RR = 0.10/0.15 = 0.67
What does it mean?
The relative risk tells us how many times more likely it is that an event will occur in the treatment group relative to the control group. An RR of 1 means that there is no difference between the two groups, thus the treatment had no effect. An RR < 1 means that the treatment decreases the risk of the outcome. An RR > 1 means that the treatment increased the risk of the outcome.
Since the RR < 1, the treatment decreases the risk of death.

Absolute Risk Reduction (ARR)
What is the measure?
ARR = risk of the outcome in the control group - risk of the outcome in the treatment group. This is also known as the absolute risk difference.
In our example, the ARR = 0.15 - 0.10 = 0.05 or 5%
What does it mean?
The absolute risk reduction tells us the absolute difference in the rates of events between the two groups and gives an indication of the baseline risk and treatment effect. An ARR of 0 means that there is no difference between the two groups, thus the treatment had no effect.
The absolute benefit of treatment is a 5% reduction in the death rate.

Relative Risk Reduction (RRR)
What is the measure?
RRR = absolute risk reduction / risk of the outcome in the control group. An alternative way to calculate the RRR is to subtract the RR from 1 (i.e., RRR = 1 - RR).
In our example, the RRR = 0.05/0.15 = 0.33 or 33%, or RRR = 1 - 0.67 = 0.33 or 33%
What does it mean?
The relative risk reduction is the complement of the RR and is probably the most commonly reported measure of treatment effects. It tells us the reduction in the rate of the outcome in the treatment group relative to that in the control group.
The treatment reduced the risk of death by 33% relative to that occurring in the control group.

Number Needed to Treat (NNT)
What is the measure?
NNT = inverse of the ARR, calculated as 1 / ARR.
In our example, the NNT = 1/0.05 = 20
What does it mean?
The number needed to treat represents the number of patients we need to treat with the experimental therapy in order to prevent 1 bad outcome and incorporates the duration of treatment. Clinical significance can be determined to some extent by looking at the NNTs, but also by weighing the NNTs against any harms or adverse effects (NNHs) of therapy.
We would need to treat 20 people for 2 years in order to prevent 1 death.
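The four measures can be computed together from the two group risks. The following is a minimal sketch rather than a reference implementation: the function name `treatment_effect` is hypothetical, exact fractions are used only to avoid floating-point rounding in 0.15 - 0.10, and rounding the NNT up to a whole patient is an assumed convention.

```python
from fractions import Fraction
import math


def treatment_effect(control_risk: Fraction, treatment_risk: Fraction) -> dict:
    """Return RR, ARR, RRR and NNT for a dichotomous outcome."""
    rr = treatment_risk / control_risk   # relative risk
    arr = control_risk - treatment_risk  # absolute risk reduction
    rrr = arr / control_risk             # relative risk reduction, equals 1 - RR
    # NNT is only meaningful when the treatment lowers the risk (ARR > 0);
    # rounding up to a whole patient is an assumed convention.
    nnt = math.ceil(1 / arr) if arr > 0 else None
    return {"RR": rr, "ARR": arr, "RRR": rrr, "NNT": nnt}


# Risks from the worked example: 15% of controls and 10% of treated patients died.
effect = treatment_effect(Fraction("0.15"), Fraction("0.10"))
for name, value in effect.items():
    print(name, round(float(value), 2))
```

With the example risks this reproduces the figures in the checklist: an RR of about 0.67, an ARR of 0.05, an RRR of about 0.33 and an NNT of 20.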

2. How precise was the estimate of the treatment effect?


The true risk of the outcome in the population is not known and the best we can do is estimate the true risk based on the
sample of patients in the trial. This estimate is called the point estimate. We can gauge how close this estimate is to
the true value by looking at the confidence intervals (CI) for each estimate. If the confidence interval is fairly narrow then
we can be confident that our point estimate is a precise reflection of the population value. The confidence interval also
provides us with information about the statistical significance of the result. If the value corresponding to no effect falls
outside the 95% confidence interval then the result is statistically significant at the 0.05 level. If the confidence interval
includes the value corresponding to no effect then the results are not statistically significant.
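To make the link between interval width and significance concrete, the sketch below computes an approximate 95% confidence interval for the ARR of the worked example. It rests on assumptions that are not in the source: the group sizes (100 and 1000 per arm) are invented for illustration, the interval uses a simple normal (Wald) approximation, and the function names are hypothetical. For the ARR, the value corresponding to no effect is 0.

```python
import math


def arr_confidence_interval(control_risk, treatment_risk, n_control, n_treatment, z=1.96):
    """Approximate 95% CI for the ARR using a normal (Wald) approximation."""
    arr = control_risk - treatment_risk
    se = math.sqrt(
        control_risk * (1 - control_risk) / n_control
        + treatment_risk * (1 - treatment_risk) / n_treatment
    )
    return arr - z * se, arr + z * se


def is_significant(ci, no_effect_value=0.0):
    """Significant at the 0.05 level if the 95% CI excludes the no-effect value."""
    lower, upper = ci
    return not (lower <= no_effect_value <= upper)


# Group sizes are invented for illustration; the source example does not state them.
for n in (100, 1000):
    ci = arr_confidence_interval(0.15, 0.10, n, n)
    print(n, [round(x, 3) for x in ci], is_significant(ci))
```

Under these assumed sizes the point estimate is the same 5% in both cases, but only the larger trial yields an interval narrow enough to exclude zero.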
Will the results help me in caring for my patient? (External Validity/Applicability)
The questions that you should ask before you decide to apply the results of the study to your patient are:
Is my patient so different to those in the study that the results cannot apply?
Is the treatment feasible in my setting?
Will the potential benefits of treatment outweigh the potential harms of treatment for my patient?

Example checklist for critical appraisal of a diagnostic test study

Step 1: Are the results of the study valid?
Was the diagnostic test evaluated in a representative spectrum of patients (like those in whom it would be used in practice)?
What is best?
It is ideal if the diagnostic test is applied to the full spectrum of patients - those with mild, severe, early and late cases of the target disorder. It is also best if the patients are randomly selected or consecutive admissions so that selection bias is minimized.
Where do I find the information?
The Methods section should tell you how patients were enrolled and whether they were randomly selected or consecutive admissions. It should also tell you where patients came from and whether they are likely to be representative of the patients in whom the test is to be used.
This paper: Yes No Unclear
Comment:

Was the reference standard applied regardless of the index test result?
What is best?
Ideally both the index test and the reference standard should be carried out on all patients in the study. In some situations where the reference standard is invasive or expensive there may be reservations about subjecting patients with a negative index test result (and thus a low probability of disease) to the reference standard. An alternative reference standard is to follow up people for an appropriate period of time (dependent on the disease in question) to see if they are truly negative.
Where do I find the information?
The Methods section should indicate whether or not the reference standard was applied to all patients or if an alternative reference standard (e.g., follow-up) was applied to those who tested negative on the index test.
This paper: Yes No Unclear
Comment:

Was there an independent, blind comparison between the index test and an appropriate reference ('gold') standard of diagnosis?
What is best?
There are two issues here. First, the reference standard should be appropriate - as close to the 'truth' as possible. Sometimes there may not be a single reference test that is suitable and a combination of tests may be used to indicate the presence of disease. Second, the reference standard and the index test being assessed should be applied to each patient independently and blindly. Those who interpreted the results of one test should not be aware of the results of the other test.
Where do I find the information?
The Methods section should have a description of the reference standard used, and if you are unsure of whether or not this is an appropriate reference standard you may need to do some background searching in the area. The Methods section should also describe who conducted the two tests and whether each was conducted independently and blinded to the results of the other.
This paper: Yes No Unclear
Comment:

Step 2: What were the results?


Are test characteristics presented?
There are two types of results commonly reported in diagnostic test
studies. One concerns the accuracy of the test and is reflected in the
sensitivity and specificity. The other concerns how the test performs in the
population being tested and is reflected in predictive values (also called
post-test probabilities). To explore the meaning of these terms, consider a
study in which 1000 elderly people with suspected dementia undergo an
index test and a reference standard. The prevalence of dementia in this
group is 25%. 240 people tested positive on both the index test and the
reference standard and 600 people tested negative on both tests. The first
step is to draw a 2 x 2 table as shown below. We are told that the
prevalence of dementia is 25% therefore we can fill in the last row of
totals - 25% of 1000 people is 250 - so 250 people will have dementia and
750 will be free of dementia. We also know the number of people testing
positive and negative on both tests and so we can fill in two more cells of
the table.
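The remaining cells follow by subtraction from the column totals. The sketch below works this out for the dementia example; the function name `fill_two_by_two` and the TP/FP/FN/TN labels are illustrative choices, not taken from the source.

```python
def fill_two_by_two(total, prevalence, true_positives, true_negatives):
    """Derive all four cells of a 2 x 2 table from the figures given in the text."""
    with_condition = round(total * prevalence)   # column total: condition present
    without_condition = total - with_condition   # column total: condition absent
    return {
        "TP": true_positives,
        "FN": with_condition - true_positives,
        "FP": without_condition - true_negatives,
        "TN": true_negatives,
    }


table = fill_two_by_two(total=1000, prevalence=0.25, true_positives=240, true_negatives=600)
print("               Dementia  No dementia  Total")
print(f"Test positive  {table['TP']:8d}  {table['FP']:11d}  {table['TP'] + table['FP']:5d}")
print(f"Test negative  {table['FN']:8d}  {table['TN']:11d}  {table['FN'] + table['TN']:5d}")
print(f"Total          {table['TP'] + table['FN']:8d}  {table['FP'] + table['TN']:11d}  {1000:5d}")
```

The derived counts are 10 false negatives and 150 false positives, giving row totals of 390 positive and 610 negative tests.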

Sensitivity (Sn)
What is the measure?
Sn = the proportion of people with the condition who have a positive test result.
In our example, the Sn = 240/250 = 0.96
What does it mean?
The sensitivity tells us how well the test identifies people with the condition. A highly sensitive test will not miss many people.
10 people (4%) with dementia were falsely identified as not having it. This means the test is fairly good at identifying people with the condition.

Specificity (Sp)
What is the measure?
Sp = the proportion of people without the condition who have a negative test result.
In our example, the Sp = 600/750 = 0.80
What does it mean?
The specificity tells us how well the test identifies people without the condition. A highly specific test will not falsely identify many people as having the condition.
150 people (20%) without dementia were falsely identified as having it. This means the test is only moderately good at identifying people without the condition.

Positive Predictive Value (PPV)
What is the measure?
PPV = the proportion of people with a positive test who have the condition.
In our example, the PPV = 240/390 = 0.62
What does it mean?
This measure tells us how well the test performs in this population. It is dependent on the accuracy of the test (primarily specificity) and the prevalence of the condition.
Of the 390 people who had a positive test result, 62% will actually have dementia.

Negative Predictive Value (NPV)
What is the measure?
NPV = the proportion of people with a negative test who do not have the condition.
In our example, the NPV = 600/610 = 0.98
What does it mean?
This measure tells us how well the test performs in this population. It is dependent on the accuracy of the test and the prevalence of the condition.
Of the 610 people with a negative test, 98% will not have dementia.
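All four test characteristics come from the same four cell counts. The sketch below is illustrative only: the function name `diagnostic_measures` is hypothetical, and the counts passed in (240 true positives, 150 false positives, 10 false negatives, 600 true negatives) are the ones implied by the dementia example.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Compute sensitivity, specificity, PPV and NPV from 2 x 2 cell counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among people with the condition
        "specificity": tn / (tn + fp),  # negatives among people without the condition
        "ppv": tp / (tp + fp),          # people with the condition among positive tests
        "npv": tn / (tn + fn),          # people without the condition among negative tests
    }


measures = diagnostic_measures(tp=240, fp=150, fn=10, tn=600)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity divide by the column totals (250 and 750), whereas the predictive values divide by the row totals (390 and 610), which is why only the latter change with prevalence.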
Step 3: Applicability of the results
Were the methods for performing the test described in sufficient detail to permit replication?
What is best?
The article should have sufficient description of the test to allow its replication and also interpretation of the results.
Where do I find the information?
The Methods section should describe the test in detail.
This paper: Yes No Unclear
Comment: