David Moher a ,*, Sally Hopewell b , Kenneth F. Schulz c , Victor Montori d , Peter C. Gøtzsche e , PJ f
Devereaux ,
Diana Elbourne g , Matthias Egger h , Douglas G. Altman b
a
Pusat Metode Ottawa, Program Epidemiologi Klinis, Institut Penelitian Rumah Sakit Ottawa, Rumah Sakit Ottawa, Ottawa, Ontario K1H 8L6, Kanada b Pusat Statistik
dalam Kedokteran, Universitas Oxford, Wolfson College, Oxford , UK c Family Health International, Research Triangle Park, NC 27.709, USA d UK Pengetahuan dan
Encounter unit Penelitian, Mayo Clinic, Rochester, MN, USA eThe Nordic Cochrane Centre, Rigshospitalet, Blegdamsvej 9, Copenhagen, Denmark f McMaster University
Health Sciences Centre, Hamilton, Kanada g Unit Statistik Medis, London School of Hygiene dan Tropical Medicine, London, UK hInstitut Ilmu Sosial dan Pencegahan
asa menunjukkan kualitas pelaporan uji coba terkontrol secara acak (RCT) tidak optimal. Tanpa pelaporan
Setelah pertemuan ahli pada Januari 2007, pernyataan CONSORT telah direvisi lebih lanjut dan diterbitkan sebagai Pernyataan CONSORT 2010. Pembaruan ini
meningkatkan susunan kata dan kejelasan daftar periksa sebelumnya dan memasukkan rekomendasi yang berkaitan dengan topik
yang baru saja menerima pengakuan, seperti bias pelaporan hasil yang selektif.
dokumen jelas dan elaborasi iniDdimaksudkan untuk meningkatkan penggunaan, pemahaman, dan penyebaran pernyataan CONSORTdjuga telah secara ekstensif
direvisi. Ini menyajikan makna dan alasan untuk setiap item daftar periksa yang baru dan diperbarui memberikan contoh
pelaporan yang baik dan, jika mungkin, referensi untuk studi empiris yang relevan. Beberapa contoh alirdiagram disertakan.
Pernyataan CONSORT 2010, dokumen penjelas dan penjabaran yang direvisi ini, dan situs web terkait
(www.consort-statement.org) harus menjadi sumber daya yang bermanfaat untuk meningkatkan pelaporan uji coba acak. © 2010
Moher et al, dicetak ulang oleh Elsevier Ltd atas nama Surgical Associates Ltd. Ini adalah artikel akses terbuka yang
didistribusikan berdasarkan ketentuan Lisensi Atribusi Creative Commons (http: // creative- commons.org/licenses/by/2.0.0 )
yang memungkinkan penggunaan, distribusi, dan reproduksi tanpa batas dalam media apa pun, asalkan penulis asli dan
sumbernya dikreditkan.
q Ini adalah artikel akses terbuka yang didistribusikan di bawah ketentuan Lisensi Creative Commons Atribusi Non-komersial, yang memungkinkan penggunaan, distribusi, dan reproduksi dalam media apa
pun, asalkan karya aslinya dikutip dengan benar, penggunaannya tidak komersial dan jika tidak sesuai dengan lisensi. Lihat: http: // creativecommons.org/licenses/by-nc/2.0/ dan
http://creativecommons.org/licenses/by-nc/2.0/legalcode. qq Dalam rangka mendorong penyebaran Pernyataan CONSORT 2010, artikel ini diakses secara bebas di bmj.com dan juga telah dipublikasikan
dalam JournalofKlinik.
EpidemiologiPenulis bersama-sama memegang hak cipta dari artikel ini. Untuk perincian tentang penggunaan lebih lanjut, lihat situs web CONSORT (www.consort-statement.org).
* Penulis yang sesuai. Tel .: þ1 613 738 3591.
Alamat email: dmoher@ohri.ca (D. Moher).
GUIDELINE
"Seluruh obat tergantung pada pelaporan transparan uji klinis".1 ffikeampuhanintervensi kesehatan, namun uji coba dengan metode yang tidak
2e5
Dirancang dengan baik dan benar dilaksanakan percobaan terkontrolmadai terkait dengan Bias, efek pengobatan terutama berlebihan. Hasil yang
acak (RCT) memberikan bukti yang paling dapat diandalkans dari uji coba yang dirancang dan dilaporkan dengan buruk dapat menyesatkan
pengambilan keputusan dalam perawatan kesehatan di semua tingkatan, dari Tujuan CONSORT adalah untuk memberikan panduan kepada penulis
keputusan perawatan untuk pasien hingga perumusan kebijakan kesehatan tang cara meningkatkan pelaporan uji coba mereka. Laporan percobaan
masyarakat nasional. Kotak 1. Alokasi perawatan. Apa yang istimewa dari pengacakan?
Penilaian kritis terhadap kualitas uji klinis hanya mungkin jika desain,
pelaksanaan, dan analisis RCT dijelaskan secara menyeluruh dan akurat dalam Metode yang digunakan untuk menetapkan intervensi untuk peserta
laporan. Jauh dari transparan, pelaporan RCT seringtidak lengkap,6e9 uji coba adalah aspek penting dari desain uji klinis. Tugas acak
masalahperacikan yang timbul dari metodologi miskin.10e15 adalah metode yang disukai; telah sukses digunakan secara penuh
dalam uji coba selama lebih dari 50 tahun.24 Ranisasi memiliki tiga
keunggulan utama.25 Pertama, ketika diimplementasikan dengan
1. pelaporan tidak lengkap dan tidak akurat
benar, ia menghilangkan bias seleksi, menyeimbangkan faktor
prognostik yang diketahui dan tidak diketahui, dalam penugasan
Banyak ulasan telah mendokumentasikandefisiensidalam laporan perawatan. Tanpa pengacakan, perbandingan pengobatan dapat
percobaan ical clin-. Misalnya, informasi tentang metode yang digunakan dalam uji berprasangka, baik secara sadar atau tidak, dengan memilih peserta
coba untuk menetapkan peserta ke kelompok pembanding dilaporkan hanya dalam dari jenis tertentu untuk menerima pengobatan tertentu. Kedua,
21% dari 519 laporan uji coba yang diindeks di PubMed pada tahun 2000,16 dan penugasan acak memungkinkan penggunaan teori probabilitas untuk
hanya 34% dari 616 laporan yang diindeks pada tahun 2006.17 Demikian pula, menyatakan kemungkinan bahwa setiap perbedaan hasil antara
hanya 45 % dari laporan percobaan diindeks di PubMed pada tahun 200016 dan kelompok intervensi hanya mencerminkan peluang.26 Ketiga, alokasi
17 acak, dalam beberapa situasi, memfasilitasi membutakan identitas
53% pada tahun 2006 didefinisikantitik akhir primer, dan hanya 27% pada tahun
perawatan kepada penyelidik, peserta, dan evaluator, mungkin
2000 dan 45% pada tahun 2006 melaporkan perhitungan ukuran sampel. Pelaporan
dengan menggunakan plasebo, yang mengurangi bias setelah
tidak hanya sering tidak lengkap tetapi juga terkadang tidak akurat. Dari 119
laporan yang menyatakan bahwa semua peserta dimasukkan dalam analisis dalam penugasan perawatan.27 Dari ketiga keunggulan ini, mengurangi bias
kelompok-kelompok yang semula mereka ditugaskan (analisis intention-to-treat), seleksi pada saat masuk persidangan biasanya yang paling penting.28
15 (13%) mengecualikan pasien atau tidak menganalisis semua pasien sesuai
Pengacakan yang berhasil dalam praktiknya bergantung pada dua
alokasi.18 Banyak ulasan lain menemukan bahwa pelaporan yang tidak memadai aspek yang saling terkait, yaitu tidak memadainya urutan alokasi yang
adalah umum di jurnal khusus16,19 dan jurnal yang diterbitkan dalam bahasa selain tidak terduga dan penyembunyian urutan itu sampai penugasan
bahasa Inggris.20,21 terjadi.2,23 Masalah utama adalah apakah jadwal diketahui atau dapat
Pengacakan yang tepat mengurangi bias seleksi pada entri percobaan diprediksi oleh orang-orang yang terlibat dalam mengalokasikan
dan merupakan komponen penting dari RCT berkualitas tinggi.22 Pengacakan yang peserta ke kelompok pembanding.29 Sistem alokasi perawatan harus
berhasil bergantung pada dua langkah: pembuatan urutan alokasi yang tidak ditetapkan sehingga orang yang mendaftarkan peserta tidak tahu
terduga dan penyembunyian urutan ini dari simpatisan yang mendaftarkan peserta sebelumnya perawatan yang akan didapatkan orang berikutnya,
sebuah proses yang disebut penyembunyian alokasi.2,23 Penyembunyian
(lihat Kotak 1).2,23
alokasi yang tepat melindungi pengetahuan tentang penugasan yang
akan datang, sedangkan urutan acak yang tepat mencegah antisipasi
2. Meningkatkan pelaporan RCT: pernyataan CONSORT yang benar terhadap penugasan di masa mendatang berdasarkan
pengetahuan penugasan sebelumnya.
DerSimonian dan rekan menyarankan bahwa "editor dapat sangat
Sayangnya, terlepas dari peran sentral itu, pelaporan metode yang
meningkatkan pelaporan uji klinis dengan memberikan kepada penulis daftar item
digunakan untuk alokasi peserta ke intervensi juga umumnya tidak
Di awal tahun 1990-an, dua
yang mereka harapkan akan dilaporkan secara ketat.”34 memadai. Sebagai contoh, 5% dari 206 laporan dugaan RCT di jurnal
kelompok editor jurnal, persidangan, dan ahli metodologi menerbitkan kebidanan dan ginekologi menggambarkan penelitian yang tidak
rekomendasi tentang pelaporan persidangan secara independen.35,36 Dalam benar-benar acak.23 Perkiraan ini konservatif, karena sebagian besar
editorial berikutnya, Rennie mendesak kedua kelompok untuk bertemu dan laporan saat ini tidak memberikan informasi yang memadai tentang
mengembangkan seperangkat rekomendasi37; hasilnya adalah pernyataan metode alokasi.20,23,30e33
CONSORT (Standar Konsolidasi Uji Coba Pelaporan).38
Pernyataan CONSORT (atau hanya CONSORT) terdiri dari daftar
barang-barang penting yang harus dimasukkan dalam laporan RCT dan diagram
untuk mendokumentasikan aliranpeserta melalui pengadilan. Ini ditujukan untuk harus jelas, lengkap, dan transparan. Pembaca, peninjau sejawat, dan editor juga
laporan utama RCT dengan dua kelompok, desain paralel. Sebagian besar dapat menggunakan CONSORT untuk membantu mereka menilai dan menafsirkan
CONSORT juga relevan dengan kelas desain uji coba yang lebih luas, seperti laporan RCT secara kritis. Namun, CONSORT tidak dimaksudkan untuk digunakan
non-inferiority, equivalence, factorial, cluster, dan crossover. Ekstensi untuk sebagai instrumen penilaian kualitas. Sebaliknya, konten CONSORT berfokus pada
checklist CONSORT untuk melaporkan percobaan dengan beberapa desain ini item yang terkait dengan validitas uji coba internal dan eksternal. Banyak item yang
tidak disebutkan secara eksplisit dalam CONSORT juga harus dimasukkan dalam
telah dipublikasikan,39e41 sebagai memiliki orang-orang untuk melaporkan jenis
laporan, seperti informasi tentang persetujuan oleh komite etika, memperoleh
tertentu dari data (merugikan42),jenis intervensi (pengobatan non farmakologis,43 persetujuan berdasarkan informasi dari para peserta, dan, jika relevan, keberadaan
intervensi herbal44),dan abstrak.45 komite keamanan dan pemantauan data. Selain itu, setiap aspek lain dari uji coba
yang disebutkan harus dilaporkan dengan benar, seperti informasi yang berkaitan
jurnal dikaitkan dengan peningkatan kualitas laporan RCT.17,50,51 Namun,
46e48
dengan analisis efektivitas biaya. CONSORT adalah inisiatif yang berkelanjutan, dan pernyataan CONSORT direvisi
secara berkala.3 CONSORT terakhir kali direvisi sembilan tahun lalu, di
Sejak diterbitkan pada tahun 1996, CONSORT telah didukung oleh lebih
dari 400 jurnal (www.consort-statement.org) dan beberapa kelompok editorial,
seperti Komite Internasional Editor Jurnal Medis.49 Pengenalan CONSORT dalam
GUIDELINE
55 29
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
2001.52e54Sejak itu basis bukti untuk menginformasikan CONSORT telah tumbuh Contohd“Merokokpengurangan dengan inhaler nikotin lisan:
pesat; data empiris telah menyoroti kekhawatiran baru tentang pelaporan RCT, e blind, percobaan klinis acak dariefficacydan keselamatan.”63
seperti pelaporan hasil selektif.55e57 Oleh karena itu pertemuan Grup CONSORT PenjelasandKemampuan untuk mengidentifikasi laporan
diadakan pada bulan Januari 2007, di Kanada, untuk merevisi pernyataan baan acak dalam database elektronik sangat tergantung pada bagaimana
CONSORT 2001 dan penjelasannya serta dokumen elaborasi yang menyertainya. s itu diindeks. Pengindeks mungkin tidak mengklasifikasikan laporan sebagai
Checklist revisi ini ditunjukkan pada Tabel 1 dan alir,diagram tidak direvisi, pada
baan acak jika penulis tidak secara eksplisit melaporkan informasi ini.64
Gambar.1.52e54 k membantu memastikan bahwa studi yang tepat diindeks dan mudah iDEN-
i, penulis harus menggunakan kata “acak” dalam judul untuk menunjukkan
3. Pernyataan CONSORT 2010: penjelasan dan elaborasi a peserta secara acak ditugaskan untuk kelompok perbandingan mereka.
Selama revisi CONSORT 2001, menjadi jelas bahwa penjelasan Butir 1b. Ringkasan terstruktur desain percobaan, metode, hasil, dan
dan elaborasi prinsip-prinsip yang mendasari pernyataan CONSORT akan pulan
membantu penyelidik dan orang lain untuk menulis atau menilai laporan percobaan. kspesifikyangpanduan melihat CONSORT untuk abstrak.45,65
Penjelasan dan elaborasi artikel CONSORT58 diterbitkan pada tahun 2001 lasandClear, transparan, dan suffiabstrak sien rinci adalah penting karena
bersamaan dengan versi pernyataan CONSORT tahun 2001. Ini membahas aca sering mendasarkan penilaian mereka dari percobaan pada informasi
pemikiran dan ilmiahfilatar belakangc untuk setiap item dan memberikan contoh ut. Beberapa pembaca menggunakan abstrak sebagai alat skrining untuk
yang diterbitkan dari pelaporan yang baik. Alasan untuk merevisi artikel itu sama utuskan apakah akan membaca artikel lengkap. Namun, karena tidak semua uji
dengan merevisi pernyataan, yang dijelaskan di atas. Kami singkatsecaray ersedia secara bebas dan beberapa profesional kesehatan
jelaskan di bawah penambahan utama dan penghapusan untuk versi penjelasan dan dak memiliki akses ke laporan uji coba lengkap, keputusan layanan kesehatan
elaborasi artikel.
erkadang dibuat berdasarkan abstrak dari uji coba acak.66
A abstrak jurnal harus berisisuffiinformasisiententang percobaan untuk
4. CONSORT 2010 penjelasan dan elaborasi: perubahan
melayani sebagai catatan yang akurat dari perilaku dannya fitemuan, memberikan
nformasi yang optimal tentang percobaan dalam batasan ruang dan format jurnal.
Kami telah membuat beberapa perubahan kosmetik substantif dan ebuah abstrak benar dibangun dan ditulis membantu individu untuk menilai
beberapa untuk versi ini CONSORT dokumen jelas (rincian lengkap yang disorot engan cepat relevansi temuandan membantu pengambilan laporan yang relevan
dalam versi 2010 dari pernyataan CONSORT59).Beberapa ari database tronic elektroforesis.67 Abstrak harussecara akuratmencerminkanapa
remencerminkanperubahan pada daftar CONSORT; ada tiga item daftar periksa ang termasuk dalam artikel jurnal lengkap dan tidak boleh menyertakan informasi
baru dalam daftar periksa CONSORT 2010dseperti item 24, yang meminta penulis ang tidak muncul di badan makalah. Studi yang membandingkan keakuratan
untuk melaporkan di mana protokol uji coba mereka dapat diakses. Kami juga telah nformasi yang dilaporkan dalam jurnal abstrak dengan yang dilaporkan dalam teks
memperbarui beberapa penjelasan yang ada, termasuk menambahkan referensi ublikasi lengkap telah menemukan klaim yang tidak konsisten dengan, atau hilang
terbaru ke bukti metodologis, dan menggunakan beberapa contoh yang lebih baik.
ari, isi artikel lengkap.68e71 Sebaliknya, menghilangkan bahaya penting dari
Kami telah menghapus glosarium, yang sekarang tersedia di situs web CONSORT
42,72
(www. Consort-statement.org). Bila memungkinkan, kita menggambarkan bstrak serius bisa menyesatkanseseoranginterpretasi darisidang temuanfi.
temuandari studi empiris yang relevan. Banyak buku bagus tentang uji klinis Perpanjangan baru-baru ini untuk pernyataan CONSORT menyediakan
menawarkan diskusi yang lebih lengkap tentang masalah metodologis.60e62 aftar item penting yang penulis harus sertakan ketika melaporkan hasil utama dari
45
Akhirnya, untuk kenyamanan, kami kadang-kadang merujuk pada "perawatan" dan ercobaan acak dalam jurnal (atau konferensi) abstrak (lihat Tabel 2). Kami
"pasien", meskipun kami menyadari bahwa tidak semua intervensi yang dievaluasi angat merekomendasikan penggunaan abstrak terstruktur untuk melaporkan uji
dalam RCT adalah perawatan dan tidak semua peserta adalah pasien. oba secara acak. Mereka memberi pembaca informasi tentang uji coba di bawah
erangkaian judul yang berkaitan dengan desain, perilaku, analisis, dan
5. Daftar periksa item nterpretasi.73 Beberapa studi telah menemukan bahwa abstrak terstruktur yang
erkualitas lebih tinggi dari abstrak deskriptif lebih tradisional74,75 dan bahwa
5.1. Judul dan abstrak mereka memungkinkan pembaca untuk find informasi dengan lebih mudah.76 Kami
menyadari bahwa banyak jurnal telah mengembangkan struktur dan batas kata
5.1.1. Butir 1a.Identifikasisebagai uji coba secara acak dalam judul mereka sendiri untuk melaporkan abstrak. Bukan niat kami untuk menyarankan
perubahan pada format ini, tetapi untuk merekomendasikan informasi apa yang
ada kanker residual oleh faktor-faktor pertumbuhan yang dikeluarkan oleh operasi
harus dilaporkan. an oleh penyembuhan luka selanjutnya . Oleh karena itu uji coba saat ini dibuat
ntuk membandingkan, pada pasien dengan NSCLC yang dapat direseksi, operasi
5.2. Pendahuluan aja versus tiga siklus kemoterapi berbasis platinum diikuti oleh operasi dalam hal
elangsungan hidup secara keseluruhan, kualitas hidup, pementasan patologis,
5.2.1. Butir 2a.Ilmiahlatar belakang dan penjelasan rasional ngkat kehormatan, tingkat operasi, dan waktu ke dan lokasi relaps”.77
Contohd“Pembedahanadalah pengobatan pilihan untuk pasien dengan PenjelasandBiasanya, pendahuluan terdiri daribebas flow- ing teks, di
stadium penyakit I dan II non-kecil kanker paru-paru sel (NSCLC). Sebuah mana penulis menjelaskan ilmiahfilatar belakangc dan pemikiran untuk percobaan
NSCLC meta-analisis gabungan hasil dari delapan percobaan acak operasi terhadapmereka, dan garis umum. Mungkin juga tepat untuk memasukkan di sini tujuan
operasi ditambah ajuvan berbasis cisplatin kemoterapi dan menunjukkan kecil, tapiercobaan (lihat butir 2b). Alasannya mungkin jelas (misalnya, untuk menilai
mungkin dalammemengaruhiobat pada fungsi ginjal) atau pragmatis (misalnya,
/4 0,08), kelangsungan hidup mutlakdiuntungkandari sekitar
tidaksignifikan(p 1
5% pada 5 tahun (dari 50% hingga 55%). Pada saat uji coba saat ini dirancang ntuk memandu praktek dengan membandingkanbenefitsdan bahaya dua
(pertengahan 1990-an), kemoterapi ajuvan belum menjadi praktik klinis standarerawatan).
. Penulis harus melaporkan bukti bene- fitsdan bahaya intervensi aktif
ermasuk dalam sidang dan harus menyarankan penjelasan yang masuk akal untuk
Dasar pemikiran klinis untuk kemoterapi neo-adjuvant adalah tiga kali lipat: regresi
kanker primer dapat dicapai sehingga memudahkan dan menyederhanakan atau agaimana intervensi mungkin bekerja, jika hal ini tidak jelas.78
mengurangi operasi selanjutnya; metastasis mikro yang tidak terdeteksi dapat
ditangani pada awal pengobatan; dan mungkin ada penghambatan stimulus diduga
PANDUAN
55 3 0
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
Deklarasi Helsinki menyatakan bahwa penelitian yang melibatkan biomedis orang harus didasarkan pada pengetahuan yang mendalam tentang
ilmiahfisastrac.79 Artinya, adalah tidak etis untuk mengekspos manusia yang tidak perlu terhadap risiko penelitian. Beberapa uji klinis telah
GUIDELINE
55 3 1
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
Tabel 1 CONSORT 2010 daftar periksa informasi untuk memasukkan ketika melaporkan percobaan acak.Section
/ Topik Butir item yang ada Checklist Dilaporkan pada halaman ada
Judul danabstrak
1aIdentifikasi sebagai uji coba secara acak dalam judul 1b Ringkasan Terstruktur desain percobaan, metode, hasil,
Pengantar
dan Pendahuluan
kesimpulan (untukspesifikyangbimbingan melihat CONSORT untuk abstrak4565)
latar belakang dan tujuan 2a Ilmiahfilatar belakangc dan penjelasan tentang alasan
2b Specificfitujuanc atau hipotesis metode percobaan desain 3a Deskripsi desain percobaan (seperti paralel, faktorial)
termasuk alokasi rasio 3b perubahan penting untuk metode setelah dimulainya sidang ( seperti kriteria kelayakan), dengan alasan peserta 4a kriteria kelayakan untuk peserta
4b Pengaturan dan lokasi di mana data dikumpulkan intervensi 5 intervensi untuk setiap kelompok denganmencukupirincian
untuk memungkinkan replikasi, termasuk bagaimana dan kapan mereka benar-benar diberikan Hasil 6A Sepenuhnya definedpra-dispesifikasikanprimer dansekunder,
ukuran hasil termasuk bagaimana dan kapan y dinilai 6b Setiap perubahan hasil uji coba setelah uji coba dimulai, dengan alasan Ukuran sampel 7a Bagaimana ukuran sampel
ditentukan
7b Bila berlaku, penjelasan tentangsementara
analisisdan pedoman penghentian Pengacakan: Pembuatan
urutan 8a Metode yang digunakan untuk menghasilkan urutan alokasi acak
8b Jenis pengacakan; perincian segala batasan
(seperti pemblokiran dan ukuran blok)alokasi
Mekanisme penyembunyian
Judul danabstrak
Metode
9 Mekanisme yang digunakan untuk mengimplementasikan urutan alokasi acak (seperti wadah yang diberi nomor urut), menjelaskan setiap langkah yang diambil untuk
menyembunyikan urutan sampai intervensi ditetapkan. Implementasi 10 Siapa yang menghasilkan urutan alokasi acak, siapa yang
mendaftarkan peserta, dan siapa yang menugaskan peserta untuk intervensi. Blinding 11a Jika dilakukan, siapa yang dibutakan setelah penugasan ke intervensi
(misalnya, peserta, penyedia perawatan, mereka yang menilai hasil) dan bagaimana 11b Jika relevan, deskripsi dari kesamaan intervensi Metode statistik 12a Metode statistik
digunakan untuk membandingkan kelompok untuk hasil primer dan sekunder.
Hasil
12b Hasil
Metode untuk analisis tambahan, seperti analisis subkelompok dan analisis yang disesuaikan
peserta .Aliran
13a Untuk setiap kelompok, jumlah peserta yang (diagram) adalah
acak, treatmen dimaksudkan diterima t, sangat disarankan)
dan dianalisis untuk hasil primer 13b Untuk setiap kelompok, kehilangan dan pengecualian setelah pengacakan,
bersama dengan alasan Perekrutan 14a Tanggalmenentukanperiode perekrutan dan tindak lanjut
14b Mengapa uji coba berakhir atau dihentikan Data dasar 15 Tabel yang menunjukkan karakteristik demografi dan klinis dasar untuk setiap kelompok. Angka dianalisis 16 Untuk
setiap kelompok, jumlah peserta (penyebut) dimasukkan dalam
setiap analisis dan apakah analisis dilakukan oleh kelompok yang ditugaskan asli. Hasil dan estimasi 17a Untuk setiap hasil primer dan sekunder, hasil untuk masing-masing
kelompok, danefeknya
ukuran diperkirakan dan presisi (seperti 95%kepercayaandiri interval) 17b untuk hasil biner, presentasi dari kedua efekrelatif dan mutlak
ukurandianjurkan Tambahan analisis 18 Hasil dari setiap analisis lainnya dilakukan, termasuk analisis subkelompok
dan analisis yang disesuaikan, yang membedakanpra-dispesifikasikandari eksplorasi Harms 19 Semua kerugian penting atau effe yang tidak diinginkan cts dalam setiap kelompok
Discussion
(untuk Diskusi
spesifikbimbingan c melihat CONSORT untuk merugikan42)
Keterbatasan 20 keterbatasan Trial, mengatasi sumber-sumber bias potensial,
ketidaktepatan, dan, jika relevan, banyaknya analisis generalisability 21 generalisability (validitas eksternal, penerapan) dari percobaan temuanInterpretasi 22 Interpretasi konsisten
dengan hasil, menyeimbangkanmanfaat
informasi lainnya Pendaftaran 23 dan Pendaftaran lain bahayats, informasi dan nomor mempertimbangkan dan nama lain dari percobaanregistri yang relevan
bukti
ClinicalTrial.gov protokol
24 Dimana protokol sidang lengkap dapat diakses, jika tersedia pendanaan 25 Sumber pendanaan dan dukungan lainnya (seperti pasokan obat-obatan),
peran penyandang dana
sebuah
Kami sangat menyarankan membaca pernyataan ini dalam hubungannya dengan CONSORT 2010 Penjelasan dan Elaborasi untukklari pentingfikationpada semua item. Jika
40 39 43
relevan, kami juga merekomendasikan membaca ekstensi CONSORT untuk uji coba cluster acak,uji coba non-inferioritas dan ekivalensi, perawatan non-farmakologis, intervensi
herbal,44 dan uji pragmatis.41 Ekstensi tambahan akan datang: untuk referensi terbaru dan relevan dengan daftar periksa ini, lihat www.consort-statement.org.
terbukti tidak perlu karena pertanyaan yang mereka ajukan telah atau bisa dijawab dengan tinjauan sistematis literatur yang ada.80,81 Dengan
demikian, kebutuhan untuk sidang baru harusdibenarkandalam pendahuluan. Idealnya, itu harus mencakup
neffikeampuhandari intervensi terapeutik atau preventif tertentu. Hipotesis
praspesifikpertanyaaned yang diuji untuk membantu memenuhi tujuan.
esis yang lebihspesifikdari tujuan dan setuju untuk
2 Produk untuk menyertakan ketika melaporkan uji coba secara acak dalam abstrak
Deskripsi
lis Rincian kontak untuk penulis yang sesuai Desain percobaan Deskripsi desain
obaan (seperti paralel,
klaster,
nferiority) Metode:
serta Kriteria kelayakan untuk peserta dan pengaturan di
mana data dikumpulkan Intervensi Intervensi
maksudkan untuk masing-masing kelompok Tujuan Spesifikfic objektif atau hipotesis Hasil
asdidefinisikanhasil utama untuk laporan ini Pengacakan Bagaimana peserta dialokasikan
tuk intervensi membutakan (masking) Apakah peserta, pemberi perawatan, dan mereka
menilai hasil yang buta untuk tugas kelompok Hasil:
omor acak Jumlah peserta secara acak untuk setiap kelompok. Rekrutmen Status percobaan.
mlah yang dianalisis. Jumlah peserta yang dianalisis dalam setiap kelompok. Hasil. Untuk
sil utama, hasil untuk setiap
kelompok dan perkiraan efek ukuran dan ketepatannya.
ahaya Efek samping atau efek samping yang penting. Kesimpulan Interpretasi umum dari hasil
aftaran percobaan Pendaftaran registrasi pada jumlah dan nama daftar uji coba Sumber
anaan pendanaan
referensi ke tinjauan sistematis uji coba serupa sebelumnya atau catatan tidak valuasi statistik eksplisit. Dalam praktiknya, tujuan dan hipotesis tidak selalu
adanya uji coba tersebut. 82 mudah dibedakan. Sebagian besar laporan RCT memberikan informasi yang
memadai tentang tujuan dan hipotesis percobaan.84
5.2.2. Butir 2b. Spesifikfitujuanc atau hipotesis
Contohd“Dalamstudi saat ini kami menguji hipotesis bahwa .3. Metode
kebijakan manajemen aktif persalinan nulipara akan: 1. mengurangi tingkat operasi
caesar, 2. mengurangi tingkat persalinan lama; 3. tidakmemengaruhiibu kepuasan .3.1. Butir 3a. Deskripsi desain percobaan (seperti paralel, faktorial) termasuk
dengan pengalamankelahiran”.83 asio alokasi
PenjelasandTujuan adalah pertanyaan-pertanyaan yang dirancang Contohd“Iniadalah multicenter, stratified (6e1 1 tahun dan 12e17 tahun,
untuk dijawab oleh persidangan. Mereka sering berhubungan engan pengacakan seimbang [2: 1]), double blind , studi terkontrol plasebo,
kelompok paralel dilakukan di Amerika Serikat (41 situs)”. 85
aktorial. Juga, sebagian besar uji coba ditetapkan untuk mengidentifikasi
eunggulan intervensi baru, jika ada, tetapi yang lain dirancang untuk menilai
PenjelasandKata “desain” sering digunakan untuk merujuk pada semua
aspek tentang bagaimana uji coba diatur, tetapi juga memiliki interpretasi yang on-inferioritas atau kesetaraan.39 Adalah penting bahwa para peneliti dengan jelas
lebih sempit. Banyakspesifikyangaspek dari desain percobaan yang lebih luas, menggambarkan aspek-aspek percobaan mereka, termasuk unit pengacakan (seperti
termasuk rincian pengacakan dan menyilaukan, dibahas di tempat lain dalam daftar asien, praktik dokter umum, lesi). Diinginkan juga untuk memasukkan perincian
CONSORT. Di sini kami mencari informasi tentang jenis persidangan, seperti ni dalam abstrak (lihat item 1b).
kelompok paralel atau faktorial, dan kerangka kerja konseptual, seperti superioritasJika desain yang kurang umum digunakan, penulis didorong untuk menjelaskan
atau non-inferioritas, dan masalah terkait lainnya yang tidak dibahas di bagian lain
pilihan mereka, terutama karena desain seperti itu mungkin menyiratkan perlunya
dalam daftar periksa. ukuran sampel yang lebih besar atau analisis dan interpretasi yang lebih kompleks.
Pernyataan CONSORT berfokus terutama pada uji coba dengan celana Meskipun sebagian besar percobaan menggunakan pengacakan yang sama (seperti
1: 1 untuk dua kelompok), akan sangat membantu untuk menyediakan rasio alokasi
partici- individual acak salah satu dari dua “paralel” kelompok. Faktanya, sedikit
secara eksplisit. Untuk uji coba obat, menentukan fase uji coba (IeIV) juga
lebih dari setengah percobaan yang diterbitkan memiliki desain seperti itu.16
mungkin relevan.
Alternatif desain utama adalah multi-arm parallel, crossover, cluster,40 dan desain
PANDUAN
55 3 2
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
Gbr. 1. Diagram alir kemajuan melalui fase percobaan acak paralel dari dua kelompok (yaitu, pendaftaran, alokasi intervensi, tindak lanjut, dan analisis data)52e54
5.3.2. Butir 3b. Perubahan penting pada metode setelah dimulainya miliki protokol yang specifiessecara detail bagaimana persidangan akan
percobaan (seperti kriteria kelayakan), dengan alasan akukan. There may be deviations from the original protocol, as it is impossible
Contohd “Pasien secara acak ditugaskan ke salah satu dari enampredict every possible change in circumstances during the course of a trial. Some
kelompok paralel, awalnya dalam rasio 1: 1: 1: 1: 1: 1: 1, untuk menerima salah als will therefore have important changes to the methods after trial
satu dari lima otamixaban. rejimen . atau kontrol aktif heparin yang tidak mmencement.
terpusat . Komite Pemantau Data independen mengkaji data yang tidak tertutup Changes could be due to external information becoming avail- able
untuk keselamatan pasien; ada sementara analisis untukefficacyatau kesia-siaanm other studies, or internal financial difficulties, or could be due to a
yang dilakukan. Selama persidangan, komite ini merekomendasikan agar kelompokappointing recruitment rate. Such protocol changes should be made without
yang menerima ota-mixaban dosis terendah (0$035 mg / kg / jam) dihentikan eaking the blinding on the accumulating data on participants' outcomes. In some
karena bukti klinis antikoagulasi yang tidak adekuat. Protokol segera diubah sesuai
als, an independent data monitoring committee will have as part of its remit the
dengan rekomendasi itu, dan peserta kemudian secara acak ditugaskan dalam rasio ssibility of recommending protocol changes based on seeing unblinded data.
2: 2: 2: 2: 1 untuk masing-masing otamixaban dan kelompok kontrol,”.86 ch changes might affect the study methods (such as changes to treatment
gimens, eligibility criteria, randomisation ratio, or duration of follow-up) or trial
PenjelasandBeberapa uji coba dapat memulai tanpa
yangrencanatetap (yaitu, sepenuhnya eksplorasi), tetapi sebagian besar akan nduct (such as dropping a centre with poor data quality).87
Some trials are set up with a formal “adaptive” design. There is no adequately. For example, eight published trials leading to clinical alerts by the
universally accepted definition of these designs, but a working definition might be National Institutes of Health specified an average of 31 eligibility criteria in their
“a multistage study design that uses accumu- lating data to decide how to modify protocols, but only 63% of the criteria were mentioned in the journal articles, and
96
aspects of the study without undermining the validity and integrity of the trial.”88 only 19% were mentioned in the clinical alerts. Similar deficiencies were found
The modifi- cations are usually to the sample sizes and the number of treatment for HIV clinical trials.97 Among 364 reports of RCTs in surgery, 25% did not
arms and can lead to decisions being made more quickly and with more efficient specify any eligibility criteria.98
use of resources. There are, however, important ethical, statistical, and practical
issues in considering such a design.89,90 5.3.4. Item 4b. Settings and locations where the data were collected
Whether the modifications are explicitly part of the trial design or in Exampled“The study took place at the antiretroviral therapy clinic of
response to changing circumstances, it is essential that they are fully reported to Queen Elizabeth Central Hospital in Blantyre, Malawi, from January 2006 to April
help the reader interpret the results. Changes from protocols are not currently well 2007. Blantyre is the major commercial city of Malawi, with a population of
reported. A review of comparisons with protocols showed that about half of journal 1000000 and an estimated HIV prevalence of 27% in adults in 2004.”93
articles describing RCTs had an unexplained discrepancy in the primary
Explanation d
A long with the eligibility criteria for participants (see item
outcomes.57 Frequent unexplained discrepancies have also been observed for details 4a) and the description of the interventions (see item 5), information on the settings
of randomisation, blinding,91 and statis- tical analyses.92 and locations is crucial to judge the applicability and generalisability of a trial.
Were participants recruited from primary, secondary, or tertiary healthcare or from
5.3.3. Item 4a. Eligibility criteria for participants the community? Healthcare institutions vary greatly in their organisation,
Exampled“Eligible participants were all adults aged 18 or over with experience, and resources and the baseline risk for the condition under
HIV who met the eligibility criteria for antiretroviral therapy according to the investigation. Other aspects of the setting (including the social, economic, and
Malawian national HIV treatment guidelines (WHO clinical stage III or IV or any cultural environment and the climate) may also affect a study's external validity.
WHO stage with a CD4 count <250/mm3) and who were starting treatment with a Authors should report the number and type of settings and describe the
BMI <18.5. Exclusion criteria were pregnancy and lactation or participation in care providers involved. They should report the loca- tions in which the study was
carried out, including the country, city if applicable, and immediate environment
another supplementary feeding programme.”93
(for example, commu- nity, office practice, hospital clinic, or inpatient unit). In
ExplanationdA comprehensive description of the eligibility criteria
particular, it should be clear whether the trial was carried out in one or several
used to select the trial participants is needed to help readers interpret the study. In
centres (“multicentre trials”). This description should provide enough information
particular, a clear understanding of these criteria is one of several elements required
so that readers can judge whether the results of the trial could be relevant to their
to judge to whom the
own setting. The environment in which the trial is conducted may differ
results of a trial applydthat is, the trial's generalisability (appli- cability) and
considerably from the setting in which the trial's results are later used to guide
relevance to clinical or public health practice (see item 21).94 A description of the
practice and policy.94,99 Authors should also report any other information about the
method of recruitment, such as by referral or self selection (for example, through
settings and locations that could have influenced the observed results, such as
advertisements), is also important in this context. Because they are applied before
problems with transportation that might have affected patient participation or
randomisation, eligibility criteria do not affect the internal validity of a trial, but
delays in administering interventions.
they are central to its external validity.
Typical and widely accepted selection criteria relate to the nature and stage
5.3.5. Item 5. The interventions for each group with sufficient details to allow
of the disease being studied, the exclusion of persons thought to be particularly replication, including how and when they were actually administered
vulnerable to harm from the study intervention, and to issues required to ensure that
Examplesd“In POISE, patients received the first dose of the study drug (ie,
the study satisfies legal and ethical norms. Informed consent by study participants,
oral extended-release metoprolol 100 mg or matching placebo) 2e4 h before
for example, is typically required in intervention studies. The common distinction
surgery. Study drug administration required a heart rate of 50 bpm or more and a
between inclusion and exclusion criteria is unnecessary; the same criterion can be
systolic blood
phrased to include or exclude participants.95
Despite their importance, eligibility criteria are often not re- ported
GUIDELINE
55 3 3
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
pressure of 100 mm Hg or greater; these haemodynamics were checked before each 45 bpm or their systolic blood pressure dropped below 100 mm Hg, study
administration. If, at any time during the first 6 h after surgery, heart rate was 80 was withheld until their heart rate or systolic blood pressure recovered; the
bpm or more and systolic blood pressure was 100 mm Hg or higher, patients drug was then restarted at 100 mg once daily. Patients whose heart rate was
received their first postoperative dose (extended-release metoprolol 100 mg or stently 45e49 bpm and systolic blood pressure exceeded 100 mm Hg delayed
matched placebo) orally. If the study drug was not given during the first 6 h, g the study drug for 12 h”. 100
patients received their first postoperative dose at 6 h after surgery. 12 h after the
tients were randomly assigned to receive a custom-made neoprene splint to be
first postoperative dose, patients started taking oral extended-release metoprolol
at night or to usual care. The splint was a rigid rest orthosis recommended for
200 mg or placebo every day for 30 days. If a patient's heart rate was consistently
only at night. It covered the base of the thumb and the thenar eminence but not
the wrist (Fig. 1). Splints were made by 3 trained occupational thera- pists, who ffects of the intervention (see item 19), although harms should always be viewed
adjusted the splint for each patient so that the first web could be opened and the s important whether they are labelled primary or secondary.
thumb placed in opposition with the first long finger. Patients were encouraged to All outcome measures, whether primary or secondary, should be
contact the occupational therapist if they felt that the splint needed adjustment, pain dentified and completely defined. The principle here is that the information
increased while wearing the splint, or they had adverse effects (such as skin
rovided should be sufficient to allow others to use the same outcomes.102 When
erosion). Because no treatment can be considered the gold standard in this situation,
utcomes are assessed at several time points after randomisation, authors should
patients in the control and inter- vention groups received usual care at the discretion
lso indicate the pre- specified time point of primary interest. For many non-
of their physician (general practitioner or rheumatologist). We decided not to use a
harmacological interventions it is helpful to specify who assessed outcomes (for
placebo because, to our knowledge, no placebo for splinting has achieved
xample, if special skills are required to do so) and how many assessors there
ExplanationdAuthors should
successful blinding of patients, as recommended.”101
were.43
describe each intervention thor- oughly, including control interventions. The
Where available and appropriate, the use of previously devel- oped and
description should allow a clinician wanting to use the intervention to know exactly
102 alidated scales or consensus guidelines should be re- ported,104,105 both to enhance
how to administer the intervention that was evaluated in the trial. For a drug
106
intervention, information would include the drug name, dose, method of uality of measurement and to assist in comparison with similar studies. For
administration (such as oral, intrave- nous), timing and duration of administration, xample, assessment of quality of life is likely to be improved by using a validated
conditions under which interventions are withheld, and titration regimen if appli- nstru- ment.107 Authors should indicate the provenance and properties of scales.
cable. If the control group is to receive “usual care” i t is important to describe More than 70 outcomes were used in 196 RCTs of non-steroidal
thoroughly what that constitutes. If the control group or intervention group is to
nti-inflammatory drugs for rheumatoid arthritis,108 and 640 different instruments
receive a combination of interventions the authors should provide a thorough
ad been used in 2000 trials in schizo- phrenia, of which 369 had been used only
description of each interven- tion, an explanation of the order in which the
nce.33 Investigation of 149 of those 2000 trials showed that unpublished scales
combination of interventions are introduced or withdrawn, and the triggers for their
were a source of bias. In non-pharmacological trials, a third of the claims of
introduction if applicable.
reatment superiority based on unpublished scales would not have been made if a
Specific extensions of the CONSORT statement address the
ublished scale had been used.109 Similar data have been reported elsewhere.110,111
reporting of non-pharmacologic and herbal interventions and their particular
Only 45% of a cohort of 519 RCTs published in 2000 specified the primary
reporting requirements (such as expertise, details of how the interventions were
utcome16; this compares with 53% for a similar cohort of 614 RCTs published in
standardised).43,44 We recommend readers consult the statements for
non-pharmacologic and herbal inter- ventions as appropriate. 006.17
5.3.6. Item 6a. Completely defined pre-specif ied primary and secondary outcome .3.7. Item 6b. Any changes to trial outcomes after the trial commenced,
measures, including how and when they were assessed with reasons
Exampled“The primary endpoint with respect to efficacy in Exampled“The original primary endpoint was all-cause mortality, but,
psoriasis was the proportion of patients achieving a 75% improve- ment in psoriasis uring a masked analysis, the data and safety monitoring board noted that overall
activity from baseline to 12 weeks as measured by the PASI [psoriasis area and mortality was lower than had been predicted and that the study could not be
severity index] Additional analyses were done on the percentage change in PASI ompleted with the sample size and power originally planned. The steering
ommittee therefore decided to adopt co-primary endpoints of all-cause mortality
scores and improve- ment in target psoriasis lesions.”103
the original primary endpoint), together with all-cause mortality or cardiovascular
ExplanationdAll RCTs assess response variables, or outcomes
ospital admissions (the first pre- specified secondary endpoint).”112
(endpoints), for which the groups are compared. Most trials have
several outcomes, some of which are of more interest than others. The primary Explanation d
T here are many reasons for departures from the initial
outcome measure is the pre-specified outcome considered to be of greatest tudy protocol (see item 24). Authors should report all major changes to the
importance to relevant stakeholders (such a patients, policy makers, clinicians,rotocol, including unplanned changes to eligibility criteria, interventions,
funders) and is usually the one used in the sample size calculation (see item xaminations,
7). data collection, methods of analysis, and outcomes. Such information
Some trials may have more than one primary outcome. Having several primary s not always reported.
outcomes, however, incurs the problems of interpretation associ- ated with As indicated earlier (see item 6a), most trials record multiple outcomes,
multiplicity of analyses (see items 18 and 20) and is not recommended. Primary
with the risk that results will be reported for only a selected subset (see item 17).
re-specification and reporting of primary and secondary outcomes (see item 6a)
outcomes should be explicitly indicated as such in the report of an RCT. Other
outcomes of interest are secondary outcomes (additional outcomes). There mayhould
be remove such a risk. In some trials, however, circumstances require a change
several secondary outcomes, which often include unanticipated or unin- tended
n
GUIDELINE
55 3 4
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
the way an outcome is assessed or even, as in the example above, a switch todpointa based on unblinded data is much more problematic, although it may be
different outcome. For example, there may be external evidence from other trials eci
or fied in the context of an adaptive trial design.88 Authors should identify and
systematic reviews suggesting the endpoint might not be appropriate, or recruitment
plain any such changes. Likewise, any changes after the trial began of the
or the overall event rate in the trial may be lower than expected.112 Changing an
signation of outcomes as primary or secondary should be reported and explained.
A comparison of protocols and publications of 102 randomised trials should be given of any allowance made for attrition or non-compliance during the
found that 62% of trials reports had at least one primary outcome that was changed, study.
introduced, or omitted compared with the protocol.55 Primary outcomes also Some methodologists have written that so called underpowered trials may
differed between protocols and publications for 40% of a cohort of 48 trials funded be acceptable because they could ultimately be combined in a systematic review
117e119
by the Canadian Institutes of Health Research.113 Not one of the subse- quent 150 and meta-analysis, and because some information is better than no
trial reports mentioned, let alone explained, changes from the protocol. Similar information. Of note, important caveats applydsuch as the trial should be unbiased,
results from other studies have been reported recently in a systematic review of reported properly, and published irrespective of the results, thereby becoming
empirical studies examining outcome reporting bias.57 available for meta-analysis.118 On the other hand, many medical researchers worry
that underpowered trials with indeterminate results will remain unpublished and
5.3.8. Item 7a. How sample size was determined insist that all trials should individually have “s ufficient power.” This debate will
continue, and members of the CONSORT Group have varying views. Critically
Examplesd“To detect a reduction in PHS (postoperative hospital stay)
however, the debate and those views are immaterial to reporting a trial. Whatever
of 3 days (SD 5 days), which is in agreement with the study of Lobo et al.17 with a the power of a trial, authors need to properly report their intended size with all their
two-sided 5% significance level and a power of 80%, a sample size of 50 patients
methods and assumptions.118 That transparently reveals the power of the trial to
per group was necessary, given an anticipated dropout rate of 10%. To recruit this
readers and gives them a measure by which to assess whether the trial attained its
number of patients a 12-month inclusion period was anticipated.”114
planned size.
“Based on an expected incidence of the primary composite endpoint of In some trials, interim analyses are used to help decide whether to stop
11% at 2.25 years in the placebo group, we calculated that we would need 950 early or to continue recruiting sometimes beyond the planned trial end (see item
primary endpoint events and a sample size of 9650 patients to give 90% power to 7b). If the actual sample size differed from the originally intended sample size for
detect a significant difference between ivabradine and placebo, corresponding to a some other reason (for example, because of poor recruitment or revision of the
19% reduction of relative risk (with a two-sided type 1 error of 5%). We initially target sample size), the explanation should be given.
designed an event-driven trial, and planned to stop when 950 primary endpoint
Reports of studies with small samples frequently include the erroneous
events had occurred. However, the incidence of the primary endpoint was higher
conclusion that the intervention groups do not differ, when in fact too few patients
than predicted, perhaps because of baseline characteristics of the recruited patients, 120
who had higher risk than expected (eg, lower proportion of NYHA class I and were studied to make such a claim. Reviews of published trials have consistently
higher rates of diabetes and hypertension). We calculated that when 950 primary found that a high proportion of trials have low power to detect clinically
121e123
endpoint events had occurred, the most recently included patients would only have meaningful treatment effects. In reality, small but clinically meaningful true
been treated for about 3 months. Therefore, in January 2007, the executive differences are much more likely than large differences to exist, but large trials are
committee decided to change the study from being event-driven to time-driven, and required to detect them.124
to continue the study until the patients who were randomised last had been followed In general, the reported sample sizes in trials seem small. The median
up for 12 months. This change did not alter the planned study duration of 3
sample size was 54 patients in 196 trials in arthritis,108 46 patients in 73 trials in
years.”115
dermatology,8 and 65 patients in 2000 trials in schizophrenia.33 These small sample
ExplanationdFor scientific and ethical reasons, the sample size for a sizes are consistent with those of a study of 519 trials indexed in PubMed in
trial needs to be planned carefully, with a balance between medical and statistical
December 200016 and a similar cohort of trials (n 1 ⁄4 616) indexed in PubMed in
considerations. Ideally, a study should be large enough to have a high probability 17
(power) of detecting as statistically significant a clinically important difference of a 2006, where the median number of patients recruited for parallel group trials was
given size if such a difference exists. The size of effect deemed important is 80 across both years. Moreover, many reviews have found that few authors report
8,14,32,33,123
inversely related to the sample size necessary to detect it; that is, large samples are how they determined the sample size.
necessary to detect small differences. Elements of the sample size calculation are There is little merit in a post hoc calculation of statistical power using the
(1) the estimated outcomes in each group (which implies the clinically important results of a trial; the power is then appropriately indi- cated by confidence intervals
target difference between the intervention groups); (2) the a (type I) error level; (3) (see item 17).125
the statistical power (or the b (type II) error level); and (4), for continuous
outcomes, the standard deviation of the measure- ments.116 The interplay of these 5.3.9. Item 7b. When applicable, explanation of any interim analyses and
elements and their reporting will differ for cluster trials40 and non-inferiority and stopping guidelines
equivalence trials.39 Examplesd“Two interim analyses were performed during the trial. The
Authors should indicate how the sample size was determined. If a formal levels of significance maintained an overall P v alue of 0.05 and were calculated
power calculation was used, the authors should identify the primary outcomeaccording
on to the O'Brien-Fleming stopping boundaries. This final analysis used a Z
126
which the calculation was based (see item 6a), all the quantities used in score the of 1.985 with an associated P value of 0.0471.”
calculation, and the resulting target sample size per study group. It is preferable to “An independent data and safety monitoring board periodically reviewed
quote the expected result in the control group and the difference between the theefficacy and safety data. Stopping rules were based on modified Haybittle-Peto
groups one would not like to overlook. Alternatively, authors could presentboundaries
the of 4 SD in the first half of the
percentage with the event or mean for each group used in their calculations. Details
GUIDELINE
55 3 5
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
study and 3 SD in the second half for efficacy data, and 3 SD in the first half of the andom methods generally yield biased results.2e4,136 Bias presumably arises
study and 2 SD in the second half for safety data. Two formal interim analyses of the inability to conceal these allocation systems adequately (see item 9).
efficacy were performed when 50% and 75% of the expected number of primary Many methods of sequence generation are adequate. However, readers
events had accrued; no correction of the reported P value for these interim tests was annot judge adequacy from such terms as “r andom alloca- tion,” “randomisation,”
performed.”127
r “r andom” without further elaboration. Authors should specify the method of
ExplanationdMany trials recruit participants over a long period. If equence generation, such as a random-number table or a computerised random
an intervention is working particularly well or badly, the study may need to be umber generator. The sequence may be generated by the process of minimisation,
ended early for ethical reasons. This concern can be addressed by examining results non- random but generally acceptable method (see Box 2).
as the data accumulate, preferably by an independent data monitoring committee. In some trials, participants are intentionally allocated in unequal numbers to each
However, performing multiple statistical examinations of accu- mulating data ntervention: for example, to gain more experience with a new procedure or to limit
without appropriate correction can lead to erroneous results and interpretations.128 costs of the trial. In such cases, authors should report the randomisation ratio (for
If the accumulating data from a trial are examined at five interim analyses that use a example, 2:1 or two treatment participants per each control participant) (see item
P v alue of 0.05, the overall false positive rate is nearer to 19% than to the nominal 3a). In a representative sample of PubMed indexed trials in 2000, only 21%
5%. reported an adequate approach to random sequence generation16; this increased to
17
Several group sequential statistical methods are available to adjust 34% for a similar cohort of PubMed indexed trials in 2006. In more than 90% of
for multiple analyses, 129e131
and their use should be pre- specified in the trial these cases, researchers used a random number generator on a computer or a
protocol. With these methods, data are compared at each interim analysis, and a P random number table.
value less than the critical value specified by the group sequential method indicates
statistical significance. Some trialists use group sequential methods as an aid to .3.11. Item 8b. Type of randomisation; details of any restriction (such as
decision making,132 whereas others treat them as a formal stopping rule (with the locking and block size)
intention that the trial will cease if the observed P v alue is smaller than the critical Examplesd“Randomization sequence was created using Stata 9.0
value). StataCorp, College Station, TX) statistical software and was stratified by center
137
Authors should report whether they or a data monitoring committee with a 1:1 allocation using random block sizes of 2, 4, and 6.”
took multiple “looks” at the data and, if so, how many there were, what triggered “Participants were randomly assigned following simple randomization
them, the statistical methods used (including any formal stopping rule), and rocedures (computerized random numbers) to 1 of 2 treatment groups.”138
whether they were planned before the start of the trial, before the data monitoring
ExplanationdIn trials of several hundred participants or more simple
committee saw any interim data by allocation, or some time thereafter. This
andomisation can usually be trusted to generate similar numbers in the two trial
information is often not included in published trial reports,133 even in trials that
roups139 and to generate groups that are roughly comparable in terms of known
report stopping earlier than planned.134
nd unknown prognostic variables.140 For smaller trials (see item 7a)dand even for
rials that are not intended to be small, as they may stop before reaching their target
5.3.10. Item 8a. Method used to generate the random allocation sequence
izedsome restricted randomisation (procedures to help achieve balance between
Examplesd“Independent pharmacists dispensed either active or roups in size or characteristics) may be useful (see Box 2).
placebo inhalers according to a computer-generated random- isation list.”63
It is important to indicate whether no restriction was used, by stating
“For allocation of the participants, a computer-generated list of uch or by stating that “simple randomisation” was done. Otherwise, the methods
random numbers was used.” 135 sed to restrict the randomisation, along with the method used for random
election, should be specified. For block randomisation, authors should provide
ExplanationdParticipants should be assigned to comparison groups
etails on how the blocks were generated (for example, by using a permuted block
in the trial on the basis of a chance (random) process characterised by
esign with a computer random number generator), the block size or sizes, and
unpredictability (see Box 1). Authors should provide sufficient information that the
whether the block size was fixed or randomly varied. If the trialists became aware
reader can assess the methods used to generate the random allocation sequence and
f the block size(s), that information should also be reported as such knowledge
the likelihood of bias in group assignment. It is important that infor- mation on the
ould lead to code breaking. Authors should specify whether stratification was used,
process of randomisation is included in the body of the main article and not as a
nd if so, which factors were involved (such as recruitment site, sex, disease stage),
separate supplementary file; where it can be missed by the reader.
he categorisation cut-off values within strata, and the method used for restriction.
The term “random” has a precise technical meaning. With random
Although stratification is a useful technique, especially for smaller trials, it is
allocation, each participant has a known probability of receiving each intervention
omplicated to imple- ment and may be impossible if many stratifying factors are
before one is assigned, but the assigned intervention is determined by a chance
sed. If minimisation (see Box 2) was used, it should be explicitly identi- fied, as
process and cannot be predicted. However, “random” is often used inappropriately
hould the variables incorporated into the scheme. If used, a random element
in the literature to describe trials in which non-random, deterministic allocation
hould be indicated.
methods were used, such as alternation, hospital numbers, or date of birth. When
investigators use such non- random methods, they should describe them precisely Only 9% of 206 reports of trials in specialty journals23 and 39% of 80
32
and should not use the term “random” or any variation of it. Even the term rials in general medical journals reported use of stratification. In each case, only
“quasi-random” is unacceptable for describing such trials. Trials based on bout half of the reports mentioned the use of restricted randomisation. However,
the fication had not been mentioned. One possible explanation for the close
these studies and that of Ade- tugbo and Williams8 found that the sizes of trati-
treatment groups in many trials were the same or quite similar, yet blockingalance
or in numbers is underreporting of the use of restricted randomisation.
GUIDELINE
55 3 6
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
GUIDELINE
55 3 7
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
sequence.151 Enclosing assignments in sequentially numbered, opaque, sealed
5.3.12. Item 9. Mechanism used to implement the random allocation sequence envelopes can be a good allocation concealment mechanism if it is developed and
(such as sequentially numbered containers), describing any steps taken to monitored diligently. This method can be corrupted, however, particularly if it is
conceal the sequence until interventions were assigned poorly executed. Investigators should ensure that the envelopes are opaque when
Examplesd“The doxycycline and placebo were in capsule form held and to the light, and opened sequentially and only after the participant's name and
identical in appearance. They were prepacked in bottles and consecutively other details are written on the appropriate envelope.143
numbered for each woman according to the ran- domisation schedule. Each woman
A number of methodological studies provide empirical evidence to support
was assigned an order number and received the capsules in the corresponding 152,153
these precautions. Trials in which the allocation sequence had been
prepacked bottle.”146
inadequately or unclearly concealed yielded larger estimates of treatment effects
“The allocation sequence was concealed from the researcher (JR) enrolling
than did trials in which authors reported adequate allocation concealment. These
and assessing participants in sequentially numbered, opaque, sealed and stapled findings provide strong empirical evidence that inadequate allocation concealment
envelopes. Aluminium foil inside the envelope was used to render the envelope contributes to bias in estimating treatment effects.
impermeable to intense light. To prevent subversion of the allocation sequence,Box the2. Randomisation and minimisation
name and date of birth of the participant was written on the envelope and a video
tape made of the sealed envelope with participant details visible. Carbon paper Simple randomisationdPure randomisation based on a single
inside the envelope transferred the infor- mation onto the allocation card inside allocation
the ratio is known as simple random- isation. Simple
envelope and a second researcher (CC) later viewed video tapes to ensure randomisation with a 1:1 allocation ratio is analogous to a coin toss,
envelopes were still sealed when participants' names were written on them. Cor- although we do not advocate coin tossing for randomisation in an
responding envelopes were opened only after the enrolled partic- ipants completed RCT. “Simple” is somewhat of a misnomer. While other
all baseline assessments and it was time to allocate the intervention.”147
randomisation schemes sound complex and more sophisticated, in
ExplanationdItem 8a discussed generation of an unpredictable sequencereality,
of simple randomisation is elegantly sophisticated in that it is
assignments. Of considerable importance is how this sequence is applied when more unpredictable and surpasses the bias prevention levels of all
other alternatives. Restricted randomisationdAny randomised
participants are enrolled into the trial (see Box 1). A generated allocation schedule
23
should be imple- mented by using allocation concealment, a critical mechanism approach that is not simple randomisation. Blocked random- isation
that prevents foreknowledge of treatment assignment and thus shields those who is the most common form. Other means of restricted randomisation
include replacement, biased coin, and urn randomisation, although
enroll participants from being influenced by this knowledge. The decision to accept
or reject a participant should be made, and informed consent should be obtained these are used much less frequently.141 Blocked
randomisationdBlocking is used to ensure that comparison groups
from the partici- pant, in ignorance of the next assignment in the sequence.148
will be generated according to a predetermined ratio, usually 1:1 or
The allocation concealment should not be confused with blinding (see item
groups of approximately the same size. Blocking can be used to
11). Allocation concealment seeks to prevent selection bias, protects the assignment
ensure close balance of the numbers in each group at any time
sequence until allocation, and can always be successfully implemented.2 during In the trial. For every block of eight participants, for example,
contrast, blinding seeks to prevent performance and ascertainment bias, protects four
the would be allocated to each arm of the trial.142 Improved
sequence after allocation, and cannot always be implemented.23 Without adequate
balance comes at the cost of reducing the unpredictability of the
allocation concealment, however, even random, unpre- dictable assignment sequence. Although the order of interventions varies randomly
sequences can be subverted.2,149 within each block, a person running the trial could deduce some of
143
the next treatment allocations if he or she knew the block size.
Centralised or “third-party” assignment is especially desirable. Many good
allocation concealment mechanisms incorporate external involvement. Use ofBlinding a the interventions, using larger block sizes, and randomly
pharmacy or central telephone randomisation system are two common techniques. varying the block size can ameliorate this problem. Stratified
Automated assignment systems are likely to become more common.150 When randomisationdStratification is used to ensure good balance of
external involvement is not feasible, an excellent method of allo- cation participant characteristics in each group. By chance, particularly in
concealment is the use of numbered containers. The inter- ventions (often drugs) small trials, study groups may not be well matched for baseline
are sealed in sequentially numbered identical containers according to the allocation charac- teristics, such as age and stage of disease. This weakens
the trial's credibility.144 Such imbalances can be avoided without
sacrificing the advantages of ran- domisation. Stratification ensures had obtained the patient's consent, she telephoned a contact who was
that the numbers of participants receiving each intervention are endent of the recruitment process for allocation consignment.”155
closely balanced within each stratum. Stratified randomisation is
ExplanationdAs noted in item 9, concealment of the allocated
achieved by performing a separate randomisation procedure within
ention at the time of enrolment is especially important. Thus, in addition to
each of two or more subsets of participants (for example, those
ing the methods used, it is also important to understand how the random
defining each study centre, age, or disease severity). Stratification
nce was implementeddspe- cifically, who generated the allocation sequence,
by centre is common in multicentre trials. Stratification requires
nrolled participants, and who assigned participants to trial groups.
some form of restriction (such as blocking within strata).
Stratification without blocking is ineffective. The process of randomising participants into a trial has three
ent steps: sequence generation, allocation concealment, and implementation
MinimisationdMinimisation ensures balance between intervention
Box 3). Although the same people may carry out more than one process under
groups for several selected patient factors (such as age).22,60 The
heading, investigators should strive for complete separation of the people
first patient is truly randomly allocated; for each subsequent
ved with generation and allocation concealment from the people involved in
participant, the treat- ment allocation that minimises the imbalance
mplementation of assignments. Thus, if someone is involved in the sequence
on the selected factors between groups at that time is identi- fied.
ation or allocation concealment steps, ideally they should not be involved in
That allocation may then be used, or a choice may be made at
mplementation step.
random with a heavy weighting in favour of the intervention that
would minimise imbalance (for example, with a probability of 0.8). Even with flawless sequence generation and allocation
The use of a random component is generally preferable. alment, failure to separate creation and concealment of the allocation
Minimisation has the advantage of making small groups closely nce from assignment to study group may introduce bias. For example, the
similar in terms of participant characteristics at all stages of the trial. n who generated an allocation sequence could retain a copy and consult it
Minimisation offers the only acceptable alternative to interviewing potential participants for a trial. Thus, that person could bias the
145 ment or assignment process, regardless of the unpredictability of the
randomisation, and some have argued that it is superior. On the
nment sequence. Investigators must then ensure that the assignment schedule
other hand, minimisation lacks the theoretical basis for eliminating
predictable and locked away (such as in a safe deposit box in a building rather
bias on all known and unknown factors. Nevertheless, in general,
essible to the enrolment location) from even the person who generated it. The
trials that use minimisation are considered methodologically
of the trial should specify where the investigators stored the allocation list.
equivalent to randomised trials, even when a random element is not
Box 3. Steps in a typical randomisation process
incorporated.
Despite the importance of the mechanism of allocation
concealment, published reports often omit such details. The mechanism used to Sequence generation
allocate interventions was omitted in reports of 89% of trials in rheumatoid Generate allocation sequence by some random
108 23
arthritis, 48% of trials in obstetrics and gynaecology journals, and 44% of trials procedure Allocation
concealment
in general medical journals.32 In a more broadly representative sample of all rando-
Develop allocation concealment mechanism (such as numbered,
mised trials indexed on PubMed, only 18% reported any allocation concealment
identical bottles or sequentially numbered, sealed, opaque
mechanism, but some of those reported mechanisms were inadequate.16
envelopes) Prepare the allocation concealment mechanism using
the allocation sequence from the sequence generation step
5.3.13. Item 10. Who generated the allocation sequence, who enrolled Implementation
participants, and who assigned participants to interventions
Enrol participants: B
Examplesd“Determination of whether a patient would be treated by Assess eligibility B Discuss the trial
streptomycin and bed-rest (S case) or by bed-rest alone (C case) was made by B Obtain informed consent B
reference to a statistical series based on random sampling numbers drawn up for Enrol participant in trial
each sex at each centre by Professor Bradford Hill; the details of the series were Ascertain intervention assignment (such as opening
unknown to any of the investigators or to the coordinator . After acceptance of a
next envelope) Administer
patient by the panel, and before admission to the streptomycin centre, the
intervention
appropriate numbered envelope was opened at the central office; the card inside
told if the patient was to be an S or a C case, and this information was then given to
the medical officer of the centre.”24
“Details of the allocated group were given on coloured cards
contained in sequentially numbered, opaque, sealed envelopes. These were
prepared at the NPEU and kept in an agreed location on each ward. Randomisation .3.14. Item 11a. If done, who was blinded after assignment to interventions
took place at the end of the 2nd stage of labour when the midwife considered a for example, participants, care providers, those assessing outcomes) and
vaginal birth was imminent. To enter a women into the study, the midwife opened ow
the next consecutively numbered envelope.”154
Examplesd“Whereas patients and physicians allocated to the intervention group
were aware of the allocated arm, outcome assessors and data analysts were kept
“Block randomisation was by a computer-generated random
number list prepared by an investigator with no clinical involve- ment in the trial. blinded “Blinding and equipoise were strictly maintained by
to the allocation.”156
We stratified by admission for an oncology related procedure. After the research emphasis- ing to intervention staff and participants that each diet adheres to healthy
principles, and each is advocated by certain experts to be superior for long-term Benjamin Franklin has been credited as being the first to use blinding in
weight-loss. Except for the interventionists (dieticians and behavioural
scientific experiment.158 He blindfolded participants so they would not know
psychologists), investigators and staff were kept blind to diet assignment of the
when he was applying mesmerism (a popular “healing fluid” of the 18th century)
participants. The trial adhered to established procedures to maintain separation
nd in so doing showed that mesmerism was a sham. Based on this experiment, the
between staff that take outcome measurements and staff that deliver the
cientific community recognised the power of blinding to reduce bias, and it has
intervention. Staff members who obtained outcome measurements were not
emained a commonly used strategy in scientific experiments.
informed of the diet group assignment. Intervention staff, dieticians and
Box 4, on blinding terminology, defines the groups of individuals (that
behavioural psychologists who delivered the inter- vention did not take outcome
s, participants, healthcare providers, data collectors, outcome adjudicators, and
measurements. All investigators, staff, and participants were kept masked to
ata analysts) who can potentially introduce bias into a trial through knowledge of
outcome measure- ments and trial results.”157
he treatment assignments. Participants may respond differently if they are aware of
ExplanationdThe term “blinding” or “m
asking” refers to withholding
heir treatment assignment (such as responding more favourably when they receive
information about the assigned interventions from people involved in the trial who
he new treatment).153 Lack of blinding may also
may potentially be influenced by this knowledge. Blinding is an important
safeguard against bias, particularly when assessing subjective outcomes.153
GUIDELINE
55 3 8
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
influence compliance with the intervention, use of co- interventions, and risk of
dropping out of the trial. .15. Item 11b. If relevant, description of the similarity of
Unblinded healthcare providers may introduce similar biases, and erventions
unblinded data collectors may differentially assess outcomes (such as frequency or Exampled“Jamieson Laboratories Inc provided 500-mg immediate
timing), repeat measurements of abnormal findings, or provide encouragement ease niacin in a white, oblong, bisect caplet. We independently confirmed caplet
during performance testing. Unblinded outcome adjudicators may differentially ntent using high performance liquid chromatography. The placebo was
assess subjec- tive outcomes, and unblinded data analysts may introduce biastched to the study drug for taste, colour, and size, and contained microcrystalline
through the choice of analytical strategies, such as the selection of favourable time
lulose, silicon dioxide, dicalcium phosphate, magnesium stea- rate, and stearic
points or outcomes, and by decisions to remove patients from the analyses. These d.”172
biases have been well documented.71,153,159e162 ExplanationdJust as we seek evidence of concealment to assure us
Blinding, unlike allocation concealment (see item 10), may not always t assignment was truly random, we seek evidence of the method of blinding. In
be appropriate or possible. An example is a trial comparing levels of pain als with blinding of participants or healthcare providers, authors should state the
associated with sampling blood from the ear or thumb.163 Blinding is particularly
milarity of the characteristics of the interventions (such as appearance, taste,
important when outcome measures involve some subjectivity, such as assessment ell, and method of administration).35,173
of pain. Blinding of data collectors and outcome adjudicators is unlikely to matter
Some people have advocated testing for blinding by asking
for objective outcomes, such as death from any cause. Even then, however, lack of rticipants or healthcare providers at the end of a trial whether they think the
participant or healthcare provider blinding can lead to other problems, such as
rticipant received the experimental or control intervention.174 Because
differential attrition.164 In certain trials, especially surgical trials, blinding of rticipants and healthcare providers will usually know whether the participant has
participants and surgeons is often difficult or impossible, but blinding of data perienced the primary outcome, this makes it difficult to determine if their
collectors and outcome adjudicators is often achievable. For example, lesions can ponses reflect failure of blinding or accurate assumptions about the effi- cacy of
be photographed before and after treatment and assessed by an external observer.165intervention.175 Given the uncertainty this type of
Regardless of whether blinding is possible, authors can and should always state Box 4. Blinding terminology
who was blinded (that is, participants, healthcare providers, data collectors, and
outcome adjudicators).
In order for a technical term to have utility it must have consistency in
Unfortunately, authors often do not report whether blinding was its use and interpretation. Authors of trials commonly use the term
used.166 For example, reports of 51% of 506 trials in cystic fibrosis,167 33% of 196 “double blind” and, less commonly, the terms “single blind”or “triple
trials in rheumatoid arthritis,108 and 38% of 68 trials in dermatology8 did not state blind.” A problem with this lexicon is that there is great variability in
whether blinding was used. Until authors of trials improve their reporting of clinician interpretations and epidemiological textbook definitions of
blinding, readers will have difficulty in judging the validity of the trials that they these terms.169 Moreover, a study of 200 RCTs reported as double
may wish to use to guide their clinical practice. blind found 18 different combinations of groups actually blinded when
The term masking is sometimes used in preference to blinding to avoid the authors of these trials were surveyed, and about one in every five
confusion with the medical condition of being without sight. However, “blinding” of these tri- alsdreported as double blindddid not blind participants,
in its methodological sense seems to be understood worldwide and is acceptable for healthcare providers, or data collectors.170
165,168
reporting clinical trials. This research shows that terms are ambiguous and, as such, authors
and editors should abandon their use. Authors should instead
explicitly report the blinding status of the people involved for whomthe authors report any known compromises in blinding. For example, authors
blinding may influence the validity of a trial. should report if it was necessary to unblind any participants at any point during the
conduct of a trial.
Healthcare providers include all personnel (for example, physicians,
chiropractors, physiotherapists, nurses) who care for the participants
5.3.16. Item 12a. Statistical methods used to compare groups for primary and
during the trial. Data collectors are the individuals who collect data on
secondary outcomes
the trial outcomes. Outcome adjudicators are the individuals who
determine whether a participant did experience the outcomes of Exampled“The primary endpoint was change in bodyweight during the 20
interest. weeks of the study in the intention-to-treat pop- ulation . Secondary efficacy
endpoints included change in waist circumference, systolic and diastolic blood
Some researchers have also advocated blinding and reporting thepressure, prevalence of metabolic syndrome . We used an analysis of covariance
blinding status of the data monitoring committee and the manuscript(ANCOVA) for the primary endpoint and for secondary endpoints waist
writers.160 Blinding of these groups is uncommon, and the value ofcircumference, blood pressure, and patient-reported outcome scores; this was
blinding them is debated.171 supplemented by a repeated measures analysis. The ANCOVA model included
treatment, country, and sex as fixed effects, and bodyweight at randomisation as
Sometimes one group of individuals (such as the healthcare
covariate. We aimed to assess whether data provided evidence of superiority of
providers) are the same individuals fulfilling another role in a trial
each liraglutide dose to placebo (primary objective) and to orlistat (secondary
(such as data collectors). Even if this is the case, the authors should 176
explicitly state the blinding status of these groups to allow readers toobjective).”
judge the validity of the trial. ExplanationdData can be analysed in many ways, some of which may not
be strictly appropriate in a particular situation. It is essential to specify which
statistical procedure was used for each analysis, and further clarification may be
necessary in the results
GUIDELINE
55 3 9
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
section of the report. The principle to follow is to, “Describe statistical methods ted analyses
with enough detail to enable a knowledgeable reader with access to the original Examplesd“Proportions of patients responding were compared
data to verify the reported results” (www.icmje.org). It is also important to describe
en treatment groups with the Mantel-Haenszel c2 test, adjusted for the
details of the statistical analysis such as intention-to-treat analysis (see Box 6).
cation variable, methotrexate use.”103
Almost all methods of analysis yield an estimate of the treat- ment
“Pre-specified subgroup analyses according to antioxidant treatment
effect, which is a contrast between the outcomes in the comparison groups. Authors
nment(s), presence or absence of prior CVD, dietary folic acid intake,
should accompany this by a confi- dence interval for the estimated effect, which
ng, diabetes, aspirin, hormone therapy, and multivitamin use were performed
indicates a central range of uncertainty for the true treatment effect. The confidence
stratified Cox proportional hazards models. These analyses used baseline
interval may be interpreted as the range of values for the treatment effect that is
ure assess- ments and were restricted to participants with nonmissing
compatible with the observed data. It is customary to present a 95% confidence
181
interval, which gives the range expected to include the true value in 95 of 100 oup data at baseline.”
similar studies. ExplanationdAs is the case for primary analyses, the method of
Study findings can also be assessed in terms of their statistical oup analysis should be clearly specified. The strongest anal- yses are those
significance. The P v alue represents the probability that the observed data (or a ook for evidence of a difference in treatment effect in complementary
more extreme result) could have arisen by chance when the interventions did not oups (for example, older and younger participants), a comparison known as a
truly differ. Actual P v alues (for example, P 1⁄4 0.003) are strongly preferable to f interac- tion.182,183 A common but misleading approach is to compare P
imprecise threshold reports such as P < 0.05.48,177 s for separate analyses of the treatment effect in each group. It is incorrect to
a subgroup effect (interaction) from one significant and one non-significant P
Standard methods of analysis assume that the data are “inde-
184
pendent.” For controlled trials, this usually means that there is one observation per Such inferences have a high false positive rate.
participant. Treating multiple observations from one participant as independent data Because of the high risk for spurious findings, subgroup analyses
is a serious error; such data are produced when outcomes can be measured on ften discouraged.14,185 Post hoc subgroup comparisons (anal- yses done after
different parts of the body, as in dentistry or rheumatology. Data analysis should be ng at the data) are especially likely not to be confirmed by further studies.
based on counting each participant once178,179 or should be done by using more analyses do not have great credibility.
complex statistical procedures.180 Incorrect analysis of multiple observations per In some studies, imbalances in participant characteristics are
individual was seen in 123 (63%) of 196 trials in rheumatoid arthritis. 108 ed for by using some form of multiple regression analysis. Although the need
djustment is much less in RCTs than in epidemiological studies, an adjusted
sis may be sensible, especially if one or more variables is thought to be
5.3.17. Item 12b. Methods for additional analyses, such as subgroup analyses and
prognostic.186 Ideally, adjusted analyses should be specified in the study protocol Attrition as a result of loss to follow-up, which is often unavoidable,
(see item 24). For example, adjustment is often eeds to be distinguished from investigator- determined exclusion for such reasons
recommended for any stratification variables (see item 8b) on the principle that sthe
ineligibility, withdrawal from treatment, and poor adherence to the trial protocol.
analysis strategy should follow the design. In RCTs, the decision to adjust should Erroneous conclusions can be reached if participants are excluded from analysis,
not be determined by whether baseline differences are statistically significant (see
nd imbalances in such omissions between groups may be especially indicative of
item 16).183,187 The rationale for any adjusted analyses and the statistical methods
ias.192e194 Information about whether the investigators included in the analysis all
used should be specified. articipants who under- went randomisation, in the groups to which they were
riginally allocated (intention-to-treat analysis (see item 16 and Box 6)), is
Authors should clarify the choice of variables that were adjusted for,
indicate how continuous variables were handled, and specify whether the analysis herefore of particular importance. Knowing the number of participants who did not
188 eceive the intervention as allocated or did not complete treatment permits the
was planned or suggested by the data. Reviews of published studies show that
eader to assess to what extent the estimated efficacy of therapy might be
reporting of adjusted analyses is inadequate with regard to all of these
188e191
nderestimated in comparison with ideal circumstances.
aspects.
If available, the number of people assessed for eligibility should also be
eported. Although this number is relevant to external validity only and is arguably
5.4. Results
ess important than the other counts,195 it is a useful indicator of whether trial
articipants were likely to be representative of all eligible participants.
5.4.1. Item 13. Participant f low (a diagram is strongly recommended) 5.4.1.1. Item
13a. For each group, the numbers of participants who were randomly assigned, A review of RCTs published in five leading general and internal
received intended treatment, and were analysed for the primary outcome. medicine journals in 1998 found that reporting of the flow of participants was often
ExamplesdSee Figs 2 and 3. ncomplete, particularly with regard to the number of participants receiving the
ExplanationdThe design and conduct of some RCTs is straightforward,llocated intervention and the number lost to follow-up.51 Even information as
and the flow of participants, particularly were there are no losses to follow-upasicor as the number of participants who underwent randomisation and the number
xcluded from analyses was not available in up to 20% of articles.51 Reporting was
exclusions, through each phase of the study can be described adequately in a few
onsiderably more thorough in articles that included a diagram of the flow of
sentences. In more complex studies, it may be difficult for readers to discern
articipants through a trial, as recommended by CONSORT. This study informed
whether and why some participants did not receive the treatment as allocated, were
he design of the revised f low diagram in the revised CONSORT statement.52e54
lost to follow-up, or were excluded from the analysis.51 This information is crucial
The
for several reasons. Partici- pants who were excluded after allocation are unlikely to suggested template is shown in Fig. 1, and the counts required are described in
be in Table 3.
be representative of all participants in the study. For example, patients may notetail
available for follow-up evaluation because they expe- rienced an acute exacerbation
of their illness or harms of treatment.22,192
GUIDELINE
55 4 0
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
Fig. 2. Flow diagram of a multicentre trial of fractional flow reserve versus angiography for guiding percutaneous coronary intervention (PCI) (adapted from Tonino et al.313). The diagram includes detailed
information on the excluded participants.
GUIDELINE
55 4 1
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
Fig. 3. Flow diagram of minimal surgery compared with medical management for chronic gastro-oesophageal reflux disease (adapted from Grant et al.196). The diagram shows a multicentre trial with a parallel
non-randomised preference group.
ExplanationdKnowing Some information, such as the number of individuals assessed
when a study took place and over what for eligibility, may not always be known,14 and, depending on the
period participants were recruited places the study in historical nature of a trial, some counts may be more relevant than others. It
context. Medical and surgical therapies, including concurrent will sometimes be useful or necessary to adapt the structure of the
therapies, evolve continuously and may affect the routine care flow diagram to a particular trial. In some situations, other infor-
given to participants during a trial. Knowing the rate at which mation may usefully be added. For example, the flow diagram of
participants were recruited may also be useful, especially to other a parallel group trial of minimal surgery compared with medical
investigators. management for chronic gastro-oesophageal reflux also included
The length of follow-up is not always a fixed period after ran- a parallel non-randomised preference group (see Fig. 3).196
domisation. In many RCTs in which the outcome is time to an event, The exact form and content of the flow diagram may be varied
follow-up of all participants is ended on a specific date. This date according to specific features of a trial. For example, many trials of
should be given, and it is also useful to report the minimum, surgery or vaccination do not include the possibility of discontin-
maximum, and median duration of follow-up.200,201 uation. Although CONSORT strongly recommends using this
A review of reports in oncology journals that used survival graphical device to communicate participant flow throughout the
analysis, most of which were not RCTs,201 found that nearly 80% study, there is no specific, prescribed format.
(104 of 132 reports) included the starting and ending dates for accrual of patients, but only 24% (32 of 132 reports) also reported 5.4.1.2. Item 13b.
For each group, losses and exclusions after ran-
the date on which follow-up ended. domisation, together with reasons. Examplesd“There was only one protocol deviation, in a woman in the
study group. She had an
5.4.3. Item 14b. Why the trial ended or was stopped a bnormal pelvic measurement and was scheduled for elective
Examplesd“At the time of the interim analysis, the total follow- caesarean section. However, the attending obstetrician judged
up included an estimated 63% of the total number of patient-years a trial of labour acceptable; caesarean section was done when there
that would have been collected at the end of the study, leading to was no progress in the first stage of labour.”197
a threshold value of 0.0095, as determined by the Lan-DeMets “The monitoring led to withdrawal of nine centres, in which
alpha-spending function method . At the interim analysis, the existence of some patients could not be proved, or other serious
RR was 0.37 in the intervention group, as compared with the violations of good clinical practice had occurred.”198
control group, with ap value of 0.00073, below the threshold value. ExplanationdSome protocol deviations may be reported in the
The Data and Safety Monitoring Board advised the investigators to flow diagram (see item 13a)dfor example, participants who did not
interrupt the trial and offer circumcision to the control group, who receive the intended intervention. If participants were excluded
were then asked to come to the investigation centre, where MC after randomisation (contrary to the intention-to-treat principle)
(medical circumcision) was advised and proposed . Because the because they were found not to meet eligibility criteria (see item
study was interrupted, some participants did not have a full follow- 16), they should be included in the flow diagram. Use of the term
up on that date, and their visits that were not yet completed are “protocol deviation” in published articles is not sufficient to justify
exclusion of participants after randomisation. The nature of the
described as “planned” in this article.”202
“In January 2000, problems with vaccine supply necessitated protocol deviation and the exact reason for excluding participants
the temporary nationwide replacement of the whole cell compo- after randomisation should always be reported.
nent of the combined DPT/Hib vaccine with acellular pertussis vaccine. As this vaccine has a different local reactogenicity profile, 5.4.2. Item 14a.
Dates def ining the periods of recruitment and
follow-up
we decided to stop the trial early.”203
ExplanationdArguably, trialists who arbitrarily conduct Exampled“Age-eligible participants were recruited . from
unplanned interim analyses after very few events accrue using no February 1993 to September 1994 . Participants attended clinic
statistical guidelines run a high risk of “catching” the data at visits at the time of randomisation (baseline) and at 6-month
a random extreme, which likely represents a large overestimate of intervals for 3 years.”199
treatment benefit.204
Table 3 Information required to document the flow of participants through each stage of a randomised trial.
Stage Number of people included Number of people not included or excluded Rationale
Enrolment People evaluated for potential enrolment
These counts indicate whether trial participants were likely to be representative of all patients seen; they are relevant to assessment of external validity only, and they are often not
available. Randomisation Participants randomly assigned Crucial count for defining trial size and
assessing whether a trial has been analysed by intention to treat Treatment
allocation
People who did not meet the inclusion criteria or met the inclusion criteria but declined to be enrolled
Participants who received treatment
Participants who did not receive treatment
Important counts for assessment of internal as allocated, by study group
as allocated, by study group
validity and interpretation of results; reasons for not receiving treatment as allocated should be given. Follow-up Participants who completed treatment
Participants who did not complete treatment
Important counts for assessment of internal as allocated, by study group
as allocated, by study group
validity and interpretation of results; reasons Participants who completed follow-up
Participants who did not complete follow-up
for not completing treatment or follow-up as planned, by study group
as planned, by study group
should be given. Analysis Participants included in main analysis,
by study group
Participants excluded from main
Crucial count for assessing whether a trial analysis, by study group
has been analysed by intention to treat; reasons for excluding participants should be given.
GUIDELINE
55 4 2
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
Readers will likely draw weaker inferences from a trial that was egory.48,177
truncated in a data-driven manner versus one that reports its findings after reaching
Unfortunately significance tests of baseline differences are still
a goal independent of results. Thus, RCTs should indicate why the trial came to an
mmon23,32,210; they were reported in half of 50 RCTs trials pub- lished in leading
end (see Box 5). The report should also disclose factors extrinsic to the trial that
affected the decision to stop the trial, and who made the decision to stop the trial,neral journals in 1997.183 Such significance tests assess the probability that
including reporting the role the funding agency played in the deliberations and in served baseline differences could have occurred by chance; however, we already
134 ow that any differ- ences are caused by chance. Tests of baseline differences are
the decision to stop the trial.
t necessarily wrong, just illogical.211 Such hypothesis testing is superfluous and
A systematic review of 143 RCTs stopped earlier than planned for
n mislead investigators and their readers. Rather, comparisons at baseline should
benefit found that these trials reported stopping after accruing a median of 66
based on consideration of the prognostic strength of the variables measured and
events, estimated a median relative risk of 0.47 and a strong relation between the
211
number of events accrued and the size of the effect, with smaller trials with fewersize of any chance imbalances that have occurred.
events yielding the largest treatment effects (odds ratio 31, 95% confidence interval Box 5. Early stopping
12e82).134 While an increasing number of trials published in high impact medical
journals report stopping early, only 0.1% of trials reported stopping early for RCTs can end when they reach their sample size goal, their event
205 count goal, their length of follow-up goal, or when they reach their
benefit, which contrasts with estimates arising from simulation studies and
206
scheduled date of closure. In these situa- tions the trial will stop in a
surveys of data safety and monitoring committees. Thus, many trials accruing manner independent of its results, and stopping is unlikely to
few participants and reporting large treatment effects may have been stopped earlier introduce bias in the results. Alternatively, RCTs can stop earlier than
than planned but failed to report this action. planned because of the result of an interim analysis showing larger
than expected benefit or harm on the experimental inter- vention. Also
5.4.4. Item 15. A table showing baseline demographic and clinical characteristics RCTs can stop earlier than planned when investigators find evidence
for each group E xampledSee Table 4 ExplanationdA
lthough the eligibility of no important difference between experimental and control
criteria (see item 4a) indicate who was eligible for the trial, it is also important to interventions (that is, stopping for futility). In addition, trials may stop
know the characteristics of the participants who were actually included. This early because the trial becomes unviable: funding vanishes,
information allows readers, especially clinicians, to judge how relevant the results researchers cannot access eligible patients or study inter- ventions, or
of a trial might be to an individual patient. the results of other studies make the research question irrelevant.
Randomised trials aim to compare groups of participants that differ
Full reporting of why a trial ended is important for evidence based
only with respect to the intervention (treatment). Although proper random
decision making (see item 14b). Researchers exam- ining why 143
assignment prevents selection bias, it does not guarantee that the groups are
trials stopped early for benefit found that many failed to report key
equivalent at baseline. Any differ- ences in baseline characteristics are, however,
methodological information regarding how the decision to stop was
the result of chance rather than bias.32 The study groups should be compared at reacheddthe planned sample size (n 1⁄4 28), interim analysis after
base- line for important demographic and clinical characteristics so that readers can which the trial was stopped (n 1⁄4 45), or whether a stopping rule
assess how similar they were. Baseline data are espe- cially valuable for outcomes
informed the decision (n 1⁄4 48).134 Item 7b of the checklist requires
that can also be measured at the start of the trial (such as blood pressure).
the reporting of timing of interim analyses, what triggered them, how
Baseline information is most efficiently presented in a table (see Table many took place, whether these were planned or ad hoc, and whether
4). For continuous variables, such as weight or blood pressure, the variability of the there were statistical guidelines and stopping rules in place a priori.
data should be reported, along with average values. Continuous variables can be Furthermore, it is helpful to know whether an independent data moni-
summarised for each group by the mean and standard deviation. When continuous toring committee participated in the analyses (and who composed it,
data have an asymmetrical distribution, a preferable approach may be to quote the with particular attention to the role of the funding source) and who
median and a centile range (such as the 25th and 75th centi- les).177 Standard errors made the decision to stop. Often the data safety and monitoring
and confidence intervals are not appropriate for describing variabilitydthey are committee makes recom- mendations and the funders (sponsors) or
inferential rather than descriptive statistics. Variables with a small number of the investigators make the decision to stop.
ordered categories (such as stages of disease I to IV) should not be treated as
continuous variables; instead, numbers and proportions should be reported for each Trials that stop early for reasons apparently independent of trial
findings, and trials that reach their planned termina- tion, are unlikely Examplesd“The primary analysis was intention-to-treat and involved all
to introduce bias by stopping.207 In these cases, the authors shouldpatients who were randomly assigned.”212
report whether interim analyses took place and whether these results
“One patient in the alendronate group was lost to follow-up; thus data from
were available to the funder.
31 patients were available for the intention-to-treat analysis. Five patients were
The push for trials that change the intervention in response to interimconsidered protocol violators . consequently 26 patients remained for the per
results, thus enabling a faster evaluation of promising interventions forprotocol analyses.”213
rapidly evolving and fatal conditions, will require even more careful
ExplanationdThe number of participants in each group is an essential
reporting of the process and decision to stop trials early.208 element of the analyses. Although the flow diagram (see item 13a) may indicate the
numbers of participants analysed, these numbers often vary for different outcome
measures. The number of participants per group should be given for all analyses.
For binary
5.4.5. Item 16. For each group, number of participants (denominator) included
in each analysis and whether the analysis was by original assigned groups
GUIDELINE
55 4 3
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
me data are missing. The term “intention-to- treat analysis” is often
ropriately useddfor example, when those who did not receive the first dose of
drug are excluded from the analyses.18
Conversely, analysis can be restricted to only participants who fulfil
rotocol in terms of eligibility, interventions, and outcome assessment. This
sis is known as an “on-treatment” or “per protocol” analysis. Excluding
ipants from the analysis can lead to erroneous conclusions. For example, in a
hat compared medical with surgical therapy for carotid stenosis, analysis
d to participants who were available for follow-up showed that surgery
ed the risk for transient ischaemic attack, stroke, and death. However,
ion-to-treat analysis based on all partici- pants as originally assigned did not
a superior effect of surgery.214
Intention-to-treat analysis is generally favoured because it avoids
ssociated with non-random loss of participants.215e217 Regardless of whether
rs use the term “intention-to-treat,” they should make clear which and how
participants are included in each analysis (see item 13). Non-compliance with
ned therapy may mean that the intention-to-treat analysis underestimates the
tial benefit of the treatment, and
ExamplesdS ee Tables 5 and 6. ExplanationdFor each outcome, study results
hould be re- ported as a summary of the outcome in each group (for example, the
umber of participants with or without the event and the denominators, or the mean
nd standard deviation of measure- ments), together with the contrast between the
roups, known as the effect size. For binary outcomes, the effect size could be the
outcomes, (such as risk ratio and risk difference) the denominators or event rates
isk ratio (relative risk), odds ratio, or risk difference; for survival time data, it
should also be reported. Expressing results as frac- tions also aids the reader in
ould be the hazard ratio or difference in median survival time; and for continuous
assessing whether some of the randomly assigned participants were excluded from
ata, it is usually the difference in means. Confidence intervals should be presented
the analysis. It follows that results should not be presented solely as summary
or the contrast between groups. A common error is the presentation of separate
measures, such as relative risks.
onfidence intervals for the outcome in each group rather than for the treat- ment
Participants may sometimes not receive the full intervention, or ffect.233 Trial results are often more clearly displayed in a table rather than in the
some ineligible patients may have been randomly allocated in error. One widely ext, as shown in Tables 5 and 6.
recommended way to handle such issues is to analyse all participants according to
For all outcomes, authors should provide a confidence interval to
their original group assignment, regardless of what subsequently occurred (see Box
48,235
6). This “intention-to-treat” strategy is not always straightforward to implement. It ndicate the precision (uncertainty) of the estimate. A 95% confidence interval
is common for some patients not to complete a studydthey may dropout or be s conventional, but occasionally other levels are used. Many journals require or
236
withdrawn from active treat- mentdand thus are not assessed at the end. If the trongly encourage the use of confidence intervals. They are especially valuable
outcome is mortality, such patients may be included in the analysis based on n relation to differences that do not meet conventional statistical significance, for
register information, whereas imputation techniques may need to be used if other which they often indicate that the result does not rule out an important clinical
difference. The use of confidence intervals has increased markedly in recent years,
ffect sizes is recommended
although not in all medical specialties.233 Although P v alues may be provided in Exampled“The risk of oxygen dependence or death was reduced by
addition to confidence intervals, results should not be reported solely as6% P (95% CI 25%e7%). The absolute difference was À6.3% (95% CI À9.9% to
values.237,238 Results should be reported for all planned primary and secondaryÀ2.7%); early administration to an estimated 16 babies would therefore prevent 1
endpoints, not just for analyses that were statisti- cally significant or “interesting.
aby ” dying or being long-term dependent on oxygen” (also see Table 7).242
l 5.09 (1.18) 5.08 (1.15) LDL 3.02 (1.01) 3.03 (1.02) HDL 1.27 (0.37) 1.28 (0.41)
Selective reporting within a study is a widespread and serious problem.55,57 In trials
oronary artery disease 2211 (74.8%) 2207 (74.3%) Myocardial infarction 1381
in which interim analyses were performed, interpretation should focus on the fi8%) nal 1360 (45.8%) Angina pectoris 1412 (47.8%) 1412 (47.5%) Peripheral artery
results at the close of the trial, not the interim results.239 ase 349 (11.8%) 323 (10.9%) Hypertension 2259 (76.5%) 2269 (76.3%) Diabetes
1059 (35.8%) 1059 (35.6%)
For both binary and survival time data, expressing the results also as the
number needed to treat for benefit or harm can be helpful (see item 21).240,241 a are means (SD) or numbers (%).
5.4.7. Item 17b. For binary outcomes, presentation of both absolute and relative
GUIDELINE
55 4 4
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
allows strong inferences about cause and effect that are not justified
additional analyses, such as a per protocol analysis, may therefore with be other study designs. In order to preserve fully the huge benefit of
considered. 218,219
It should be noted, however, that such anal- yses are often randomisation we should include all randomised participants in the
220 anal- ysis, all retained in the group to which they were allocated.
considerably flawed.
Those two conditions define an “intention-to-treat” anal- ysis, which is
In a review of 403 RCTs published in 10 leading medical journals in
widely recommended as the preferred anal- ysis strategy.18,223
2002, 249 (62%) reported the use of intention-to-treat analysis for their primary
Intention-to-treat analysis corresponds to analysing the groups exactly
analysis. This proportion was higher for journals adhering to the CONSORT
as randomised. Strict intention-to-treat analysis is often hard to
statement (70% v 4 8%). Among articles that reported the use of intention-to-treat
achieve for two main reasonsdmissing outcomes for some
analysis, only 39% actually analysed all participants as randomised, with more than
participants and non-adherence to the trial protocol.
60% of articles having missing data in their primary analysis.221 Other studies show
similar findings.18,222,223 Trials with no re- ported exclusions are methodologically Missing outcomes
weaker in other respects than those that report on some excluded participants,173
strongly indicating that at least some researchers who have excluded participants do
Many trialists exclude patients without an observed outcome. Often
not report it. Another study found that reporting an intention-to-treat analysis was
this is reasonable, but once any rando- mised participants are
associated with other aspects of good study design and reporting, such as describing
excluded the analysis is not strictly an intention-to-treat analysis.
a sample size calculation.224 Indeed, most randomised trials have some missing observations.
Trialists effectively must choose between omitting the participants
5.4.6. Item 17a. For each primary and secondary outcome, results for each without final outcome data or imputing their missing outcome data.225
group, and the estimated effect size and its precision (such as 95% conf idence A “complete case” (or “available case”) analysis includes only those
interval) whose outcome is known. While a few missing outcomes will not
Table 4 Example of reporting baseline demographic and clinical characteristics.a (Adapted from Table 1 cause a problem, in half of trials more than 10% of randomised
of Yusuf et al.209).Telmisartan (N 1
⁄4 2954) Placebo (N 1⁄4 2972) patients may have missing outcomes.226 This common approach will
Age (years) 66.9 (7.3) 66.9 (7.4) Sex (female) 1280 (43.3%) 1267 (42.6%) Smoking lose power by reducing the sample size, and bias may well be
status: introduced if being lost to follow-up is related to a patient's response
Current 293 (9.9%) 289 (9.7%) Past 1273 (43.1%) 1283 (43.2%) Ethnic origin: to treatment. There should be concern when the frequency or the
Asian 637 (21.6%) 624 (21.0%) Arab 37 (1.3%) 40 (1.3%) African 51 (1.7%) 55 (1.9%) European causes of dropping out differ between the intervention groups.
1801 (61.0%) 1820 (61.2%) Native or Aboriginal 390 (13.2%) 393 (13.2%) Other 38 (1.3%) 40
(1.3%) Blood pressure (mm Hg) 140.7 (16.8/81.8) (10.1) 141.3 (16.4/82.0) (10.2) Heart rate (beats Participants with missing outcomes can be included in the analysis
per min) 68.8 (11.5) 68.8 (12.1) Cholesterol (mmol/l):
only if their outcomes are imputed (that is, their outcomes are
Total 5.09 (1.18) 5.08 (1.15) LDL 3.02 (1.01) 3.03 (1.02) HDL 1.27 (0.37) 1.28 (0.41)
estimated from other information that was collected). Imputation of the
Coronary artery disease 2211 (74.8%) 2207 (74.3%) Myocardial infarction 1381
(46.8%) 1360 (45.8%) Angina pectoris 1412 (47.8%) 1412 (47.5%) Peripheral artery missing data allows the anal- ysis to conform to intention-to-treat
disease 349 (11.8%) 323 (10.9%) Hypertension 2259 (76.5%) 2269 (76.3%) Diabetes analysis but requires strong assumptions, which may be hard to
1059 (35.8%) 1059 (35.6%) justify.227 Simple imputation methods are appealing, but their use may
Box 6. Intention-to-treat analysis be inadvisable. In particular, a widely used method is “last observation
carried forward” in which missing final values of the outcome variable
The special strength of the RCT is the avoidance of bias when a are replaced by the last known value before the participant was lost to
llocating interventions to trial participants (see Box 1). That strength follow-up. This is appealing through its simplicity, but the method may
introduce bias,228 and no allowance is made for the uncertainty of efficacy might vary across baseline risk, we pre-specified our primary subgroup
imputation. 229
Many authors have severely criticised last observation analysis on the basis of the revised cardiac risk index scoring system. We also did
pre-specified secondary subgroup analyses based on sex, type of surgery, and use of
carried forward.229e231
an epidural or spinal anaesthetic. For all subgroup analyses, we used Cox
proportional hazard models that incorporated tests for inter- actions, designated to
Non-adherence to the protocol
be significant at p < 0.05 . Fig. 3 shows the results of our pre-specified subgroup
analyses and indicates consistency of effects . Our subgroup analyses were
A separate issue is that the trial protocol may not have been followed underpow- ered to detect the modest differences in subgroup effects that one might
fully for some trial participants. Common exam- ples are participants expect to detect if there was a true subgroup effect.”100
who did not meet the inclusion criteria (such as wrong diagnosis, too
ExplanationdMultiple analyses of the same data create a risk for false
young), received a proscribed co-intervention, did not take all the
intended treatment, or received a different treatment or no inter- positive findings.246 Authors should resist the temptation to perform many subgroup
vention. The simple way to deal with any protocol devia- tions is to analyses.183,185,247 Analyses that were pre-specified in the trial protocol (see item
ignore them: all participants can be included in the analysis regardless 24) are much more reliable than those suggested by the data, and therefore authors
of adherence to the protocol, and this is the intention-to-treat should report which analyses were pre-specified. If subgroup analyses were
approach. Thus, exclusion of any participants for such reasons is undertaken, authors should report which subgroups were examined, why, if they
incompatible with intention-to-treat analysis. were pre-specified, and how many were pre-specified. Selective reporting of
subgroup analyses could lead to bias.248 When evaluating a subgroup the question
The term “modified intention-to-treat” is quite widely used to describe
is not whether the subgroup shows a statistically significant result but whether the
an analysis that excludes participants who did
subgroup treatment effects are significantly different from each other. To determine
ExplanationdWhen the primary outcome is binary, both the relative effect
this, a test of interaction is helpful, although the power for such tests is typically
(risk ratio (relative risk) or odds ratio) and the absolute effect (risk difference)
low. If formal evalu- ations of interaction are undertaken (see item 12b) they should
should be reported (with confi- dence intervals), as neither the relative measure nor
be reported as the estimated difference in the intervention effect in each subgroup
the absolute measure alone gives a complete picture of the effect and its
(with a confidence interval), not just as P values.
implications. Different audiences may prefer either relative or absolute risk, but
not adequately adhere to the protocol, in particular those who did not
both doctors and lay people tend to overestimate the effect when it is presented in
receive a defined minimum amount of the intervention.232 An
terms of relative risk.243e245 The size of the risk difference is less generalisable to
alternative term is “per protocol.” Though a per protocol analysis may
other populations than the relative risk since it depends on the baseline risk in the
be appropriate in some settings, it should be properly labelled as a
unexposed group, which tends to vary across populations. For diseases where the
non- randomised, observational comparison. Any exclusion of patients
outcome is common, a relative risk near unity might indicate clinically important
from the analysis compromises the randomisation and may lead to
differences in public health terms. In contrast, a large relative risk when the
bias in the results.
outcome is rare may not be so important for public health (although it may be
important to an individual in a high risk category). Like “intention-to-treat,” none of these other labels reliably clarifies
exactly which patients were included. Thus, in the CONSORT
5.4.8. Item 18. Results of any other analyses performed, including subgroup checklist we have dropped the specific request for intention-to-treat
analyses and adjusted analyses, distinguishing pre- specif ied from exploratory analysis in favour of a clear description of exactly who was included in
each analysis.
Exampled“On the basis of a study that suggested perioperative b -blocker
GUIDELINE
55 4 5
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
observed Table 5
during a trial are necessarily a consequence of the inter- Example of reporting of summary results for each study group (binary outcomes).a (Adapted from Table 2 of
Mease et al.103).
vention; some may be a consequence of the condition being treated. Randomised trials offer the best approach for providing Endpoint Number (%) Risk
difference
safety data as well as efficacy data, although they cannot detect rare
Etanercept
(95% CI) harms. ( n 1
⁄4 30)
Many reports of RCTs provide inadequate information on adverse events. A survey of 192 drug trials published from 1967 to 1999 showed that
only 39% had adequate reporting of clinical adverse events and 29% had adequate reporting of laboratory defined toxicity.72 More recently, a
comparison between the adverse event data submitted to the trials database of the National Cancer Institute, which sponsored the trials, and the
information reported in journal articles found that low grade adverse events were underreported in journal articles. High grade events (Common
Toxicity Criteria grades 3e5) were reported inconsis- tently in the articles, and the information regarding attribution to investigational drugs was
incomplete.251 Moreover, a review of trials published in six general medical journals in 2006e2007 found that, although 89% of 133 reports
mentioned adverse events, no information on severe adverse events and withdrawal of patients due to an adverse event was given on 27% and
48% of articles, respectively.252
An extension of the CONSORT statement has been developed to provide detailed recommendations on the reporting of harms in randomised
trials.42 Recommendations and examples of appro- priate reporting are freely available from the CONSORT website (www.consort-statement.org).
They complement the CONSORT 2010 Statement and should be consulted, particularly if the study of harms was a key objective. Briefly, if data
on adverse events were collected, events should be listed and defined, with reference to standardised criteria where appropriate. The methods used
for data collection and attribution of events should be described. For each study arm the absolute risk of each adverse event, using appro- priate
metrics for recurrent events, and the number of participants withdrawn due to harms should be presented. Finally, authors should provide a
balanced discussion of benefits and harms.42
5.5. Discussion
5.5.1. Item 20. Trial limitations, addressing sources of potential bias, imprecision, and, if relevant, multiplicity of analyses
Exampled“The preponderance of male patients (85%) is a limitation of our study . We used bare-metal stents, since drug- eluting stents were
not available until late during accrual. Although the latter factor may be perceived as a limitation, published data indicate no benefit (either
short-term or long-term) with respect to death and myocardial infarction in patients with stable coronary artery disease who receive drug-eluting
stents, as compared with those who receive bare-metal stents.”253
ExplanationdThe discussion sections of scientific reports are often filled with rhetoric supporting the authors' findings254 and provide little
measured argument of the pros and cons of the study and its results. Some journals have attempted to remedy this problem by encouraging more
structure to authors' discussion of their results.255,256 For example, Annals of Internal Medicine P lacebo (n 1⁄4 30)
Primary endpoint Achieved PsARC at 12 weeks 26 (87) 7 (23) 63% (44e83) Secondary endpoint Proportion of patients
meeting ACR criteria: ACR20 22 (73) 4 (13) 60% (40e80) ACR50 15 (50) 1 (3) 47% (28e66) ACR70 4 (13) 0 (0) 13% (1e26)
PsARC 1⁄4 psoriatic arthritis response criteria. ACR 1⁄4 American College of Rheumatology.
a
See also example for item 6a.
In one survey, 35 of 50 trial reports included subgroup analyses, of which only 42% used tests of interaction.183 It was often difficult to determine
whether subgroup analyses had been specified in the protocol. In another survey of surgical trials published in high impact journals, 27 of 72 trials
reported 54 subgroup analyses, of which 91% were post hoc and only 6% of subgroup analyses used a test of interaction to assess whether a
subgroup effect existed.249 Similar recommendations apply to analyses in which adjust- ment was made for baseline variables. If done, both
unadjusted and adjusted analyses should be reported. Authors should indicate whether adjusted analyses, including the choice of variables to
adjust for, were planned. Ideally, the trial protocol should state whether adjustment is made for nominated baseline variables by using analysis of
covariance.187 Adjustment for variables because they differ significantly at baseline is likely to bias the estimated treatment effect.187 A survey
found that unacknowledged discrep- ancies between protocols and publications were found for all 25 trials reporting subgroup analyses and for 23
of 28 trials reporting adjusted analyses.92
5.4.9. Item 19. All important harms or unintended effects in each groupFor specific guidance see CONSORT for harms.42
Exampled“The proportion of patients experiencing any adverse event was similar between the rBPI21 [recombinant
bactericidal/permeability-increasing protein] and placebo groups: 168 (88.4%) of 190 and 180 (88.7%) of 203, respectively, and it was lower in
patients treated with rBPI21 than in those treated with placebo for 11 of 12 body systems . the proportion of patients experiencing a severe
adverse event, as judged by the investigators, was numerically lower in the rBPI21 group than the placebo group: 53 (27.9%) of 190 versus 74
(36.5%) of 203 patients, respectively. There were only three serious adverse events reported as drug- related and they all occurred in the placebo
group.”250
ExplanationdReaders need information about the harms as well as the benefits of interventions to make rational and balanced decisions. The
existence and nature of adverse effects can have a major impact on whether a particular intervention will be deemed acceptable and useful. Not all
reported adverse events
Table 6 Example of reporting of summary results for each study group (continuous outcomes). (Adapted from Table 3 of van Linschoten234).
Exercise therapy (n 1⁄4 65) Control (n 1⁄4 66) Adjusted differencea
Baseline (mean (SD)) 12 months (mean (SD)) Baseline (mean (SD)) 12 months (mean (SD)) (95% CI) at 12 months Function score (0e100) 64.4 (13.9) 83.2 (14.8) 65.9 (15.2) 79.8
(17.5) 4.52 (À0.73e9.76) Pain at rest (0e100) 4.14 (2.3) 1.43 (2.2) 4.03 (2.3) 2.61 (2.9) À1.29 (À2.16 to À0.42) Pain on activity (0e100) 6.32 (2.2) 2.57 (2.9) 5.97 (2.3) 3.54 (3.38)
À1.19 (À2.22 to À0.16)
a
Function score adjusted for baseline, age, and duration of symptoms.
GUIDELINE
55 4 6
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
dically treated athletes, we covered a broad spectrum of injury severity. This
ggests that the present training programme can be implemented in the treatment
all athletes. Furthermore, as it is reasonable to assume that ankle sprains not
ated to sports are comparable with those in sports, the programme could benefit
general population.”260
“This replicates and extends the work of Clarke and colleagues and
demonstrates that this CB (cognitive behavioural) prevention program can be
reliably and effectively delivered in different settings by clinicians outside of the
group who originally developed the intervention. The effect size was consistent
recommends that authors structure the discussion section by pre- senting (1) a brief
with those of previously reported, single-site, indicated depression prevention
synopsis of the key findings, (2) consideration of possible mechanisms and
studies and was robust across sites with respect to both depressive disorders and
explanations, (3) comparison with rele- vant findings from other published studies
symptoms . In this generalisability trial, we chose a comparison condition that is
(whenever possible including a systematic review combining the results of the
relevant to public healthdu sual care . The sample also was predominantly
current study with the results of all previous relevant studies), (4) limita- tions of
working class to middle class with access to health insurance. Given evidence that
the present study (and methods used to minimise and compensate for those
CB therapy can be more efficacious for adolescents from homes with higher
limitations), and (5) a brief section that summarises the clinical and research
incomes, it will be important to test the effects of this prevention program with
implications of the work, as appropriate.255 We recommend that authors follow
more economically and ethnically diverse samples.”261
these sensible suggestions, perhaps also using suitable subheadings in the
discussion section. ExplanationdExternal validity, also called generalisability or applicability,
is the extent to which the results of a study can be generalised to other
Although discussion of limitations is frequently omitted from research
257 circumstances.262 Internal validity, the extent to which the design and conduct of
reports, identification and discussion of the weaknesses of a study have particular
the trial eliminate the possi- bility of bias, is a prerequisite for external validity: the
importance.258 For example, a surgical group reported that laparoscopic
results of a flawed trial are invalid and the question of its external validity becomes
cholecystectomy, a technically difficult procedure, had significantly lower rates of
irrelevant. There is no absolute external validity; the term is meaningful only with
complications than the more traditional open cholecystectomy for management of
regard to clearly specified conditions that were not directly examined in the trial.
acute cholecystitis.259 However, the authors failed to discuss an obvious bias in Can results be generalised to an individual participant or groups that differ from
their results. The study investigators had completed all the laparoscopic those enrolled in the trial with regard to age, sex, severity of disease, and comorbid
cholecystectomies, whereas 80% of the open cholecystectomies had been conditions? Are the results applicable to other drugs within a class of similar drugs,
completed by trainees. to a different dose, timing, and route of administration, and to different concomitant
Authors should also discuss any imprecision of the results. therapies? Can similar results be expected at the primary, secondary, and tertiary
Imprecision may arise in connection with several aspects of a study, including levels of care? What about the effect on related outcomes that were not assessed in
measurement of a primary outcome (see item 6a) or diagnosis (see item 4a). the trial, and the importance of length of follow-up and duration of treatment,
Perhaps the scale used was validated on an adult population but used in a paediatric especially with respect to harms?263
one, or the assessor was not trained in how to administer the instrument.
External validity is a matter of judgement and depends on the
The difference between statistical significance and clinical importance characteristics of the participants included in the trial, the trial setting, the treatment
should always be borne in mind. Authors should particularly avoid the common
regimens tested, and the outcomes assessed.5,136 It is therefore crucial that adequate
error of interpreting a non- significant result as indicating equivalence of
information be described about eligibility criteria and the setting and location (see
interventions. The confidence interval (see item 17a) provides valuable insight into
item 4b), the interventions and how they were administered (see item 5), the
whether the trial result is compatible with a clinically important effect, regardless of
definition of outcomes (see item 6), and the period of recruitment and follow-up
the P v alue.120 (see item 14). The proportion of control group participants in whom the outcome
Authors should exercise special care when evaluating the results of develops (control group risk) is also important. The proportion of eligible
trials with multiple comparisons. Such multiplicity arises from several participants who refuse to enter the trial as indicated on the flowchart (see item 13)
interventions, outcome measures, time points, subgroup analyses, and other factors. is relevant for the generalisability of the trial, as it may indicate preferences for or
In such circumstances, some statisti- cally significant findings are likely to result acceptability of an intervention. Similar consid- erations may apply to clinician
from chance alone. preferences.264,265
Several issues are important when results of a trial are applied to an
5.5.2. Item 21. Generalisability (external validity, applicability) of the trial
individual patient.266e268 Although some variation in treatment response between
findings
an individual patient and the patients in a trial or systematic review is to be
Examplesd“As the intervention was implemented for both sexes, all expected, the differences tend to be in magnitude rather than direction.
ages, all types of sports, and at different levels of sports, the results indicate that the
Although there are important exceptions,268 therapies (espe- cially
entire range of athletes, from young elite to intermediate and recreational senior
269
athletes, would benefit from using the presented training programme for the drugs ) found to be beneficial in a narrow range of patients generally have
prevention of recurrences of ankle sprain. By including non-medically treated and broader application in actual practice. Frameworks
GUIDELINE
55 4 7
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
Table 7 Example of reporting both absolute and relative effect sizes. (Adapted from Table 3 of The OSIRIS Collaborative Group242).
Primary outcome Percentage (No) Risk ratio (95% CI) Risk difference (95% CI)
281,282
Early administration (n 1⁄4 selective reporting of outcomes within trials, and of per protocol rather
1344) intention-to-treat analysis have been well documented.55,56,283 Covert
Delayed selective administration (n
dant publication of clinical trials can also cause problems, particularly for
1⁄4 1346)
rs of systematic reviews when results from the same trial are inadvertently
Death or oxygen dependence at “expected
date of delivery” ded more than once.284
0.75e0.93) À6.3 (À9.9 to À2.7) To minimise or avoid these problems there have been repeated calls
for the evaluation of external validity have been proposed, including qualitative he past 25 years to register clinical trials at their inception, to assign unique
and checklists.271 Measures dentification numbers, and to record other basic information about the trial so
studies, such as in integral “process evalua- tions”270
that incorporate baseline risk when calculating therapeutic effects, such as the ssential details are made publicly available.285e288 Provoked by recent serious
number needed to treat to obtain one additional favourable outcome and the number
ems of withholding data,289 there has been a renewed effort to register
needed to treat to produce one adverse effect, are helpful in assessing the
mised trials. Indeed, the World Health Organisation states that “the
benefit-to-risk balance in an individual patient or group with characteristics that
ration of all interventional trials is a scientific, ethical and moral
differ from the typical trial partici- pant.268,272,273 Finally, after deriving patient nsibility” (www.who.int/ ictrp/en). By registering a randomised trial, authors
centred estimates for the potential benefit and harm from an intervention, the lly report a minimal set of information and obtain a unique trial registration
clinician must integrate them with the patient's values and preferences for therapy. er.
Similar considerations apply when assessing the general- isability of results to In September 2004 the International Committee of Medical Journal
different settings and interventions. Editors (ICMJE) changed their policy, saying that they would consider trials for
ublication only if they had been registered before the enrolment of the first
5.5.3. Item 22. Interpretation consistent with results, balancing benef its and articipant.290 This resulted in a dramatic increase in the number of trials being
harms, and considering other relevant evidence
egistered.291 The ICMJE gives guidance on acceptable registries
Exampled“Studies published before 1990 suggested that
prophylactic immunotherapy also reduced nosocomial infections in www.icmje.org/faq. pdf).In a recent survey of 165 high impact factor medical
very-low-birth-weight infants. However, these studies enrolled small numbers of
instructions to authors, 44 journals specifically stated that all recent
patients; employed varied designs, preparations, and doses; and included diverse ournals'
study populations. In this large multicenter, randomised controlled trial, the linical trials must be registered as a requirement of submission to that journal.292
repeated prophylactic administration of intravenous immune globulin failed to Authors should provide the name of the register and the trial's unique
reduce the incidence of nosocomial infections significantly in premature infants egistration number. If authors had not registered their trial they should explicitly
weighing 501e1500 g at birth.”274
tate this and give the reason.
GUIDELINE
55 4 8
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
by other sources.297e300 A systematic review of 30 studies on funding found that de clear in both the main text and the article's abstract.
research funded by the pharmaceutical industry had four times the odds of having Multiarm (>2 group) parallel group trials need the least modi- fication of the
outcomes favouring the sponsor than research funded by other sources (odds ratio andard CONSORT guidance. The flow diagram can be extended easily. The main
4.05, 95% confi- dence interval 2.98e5.51).297 A large proportion of trial differences from trials with two groups relate to clarification of how the study
publications do not currently report sources of funding. The degree of under- hypotheses relate to the multiple groups, and the consequent methods of data
alysis and interpretation. For factorial trials, the possibility of interaction between
reporting is difficult to quantify. A survey of 370 drug trials found that 29% failed
301 the interventions generally needs to be considered. In addition to overall
to report sources of funding. In another survey, of PubMed indexed randomised
comparisons of participants who did or did not receive each intervention under
trials published in December 2000, source of funding was reported for 66% of the
study, investigators should consider also reporting results for each treatment
519 trials.16
combination.303 In crossover trials, each participant receives two (or more)
The level of involvement by a funder and their influence on the design,
treatments in a random order. The main additional issues to address relate to the
conduct, analysis, and reporting of a trial varies. It is therefore important that
aired nature of the data, which affect design and analysis.304 Similar issues affect
authors describe in detail the role of the funders. If the funder had no such
within-person comparisons, in which participants receive two treatments
involvement, the authors should state so. Similarly, authors should report any other
simultaneously (often to paired organs). Also, because of the risk of temporal or
sources of support, such as supply and preparation of drugs or equipment, or in the
systemic carryover effects, respectively, in both cases the choice of design needs
analysis of data and writing of the manuscript.302
justification.
The CONSORT Group intends to publish extensions to CONSORT to
6. Reporting RCTs that did not have a two group parallel design
ver all these designs. In addition, we will publish updates to existing guidance for
ster randomised trials and non-inferiority and equivalence trials to take account
The primary focus of the CONSORT recommendations is RCTs with this major update of the generic CONSORT guidance.
a parallel design and two treatment groups. Most RCTs have that design, but a 7. Discussion
substantial minority do not: 45% (233/519) of RCTs published in December
2000,16 and 39% (242/616) in December 2006.17 Assessment of healthcare interventions can be misleading unless
Most of the CONSORT statement applies equally to all trial designs, investigators ensure unbiased comparisons. Random allo- cation to study groups
but there are a few additional issues to address for each design. Before the remains the only method that eliminates selection and confounding biases.
publication of the revised CONSORT statement in 2001, the CONSORT Group Non-randomised trials tend to result in larger estimated treatment effects than
decided to develop extensions to the main CONSORT statement relevant to specific randomised trials.305,306
trial designs. Exten- sions have been published relating to reporting of cluster
Bias jeopardises even RCTs, however, if investigators carry out such trials
rando- mised trials40 and non-inferiority and equivalence trials.39 Lack of resources
improperly.307 A recent systematic review, aggregating the results of several
has meant that other planned extensions have not been completed; they will cover
methodological investigations, found that, for subjective outcomes, trials that used
trials with the following designs: multi- arm parallel, factorial, crossover,
inadequate or unclear allo- cation concealment yielded 31% larger estimates of
within-person.
effect than those that used adequate concealment, and trials that were not blinded
Authors reporting trials with a cluster design or using a non-
yielded 25% larger estimates.153 As might be expected, there was a strong
inferiority or equivalence framework should consult the CONSORT
association between the two.
recommendations in addition to those in this document. Here we make a few
interim comments about the other designs. In each case, the trial design should be The design and implementation of an RCT require methodolog- ical as
well as clinical expertise, meticulous effort,143,308 and a high level of alertnessadopted
for it.50e54 Other groups are using the CONSORT template to improve the
unanticipated difficulties. Reports of RCTs should be written with similarly close reporting of other research designs, such as diagnostic tests311 and observational
attention to reducing bias. Readers should not have to speculate; the methods used 312
studies.
should be complete and transparent so that readers can readily differentiate trials
The CONSORT website (www.consort-statement.org) has been established
with unbiased results from those with questionable results. Sound science
to provide educational material and a repository database of materials relevant to
encompasses adequate reporting, and the conduct of ethical trials rests on the
the reporting of RCTs. The site includes many examples from real trials, including
footing of sound science.309
all of the examples included in this article. We will continue to add good and bad
We hope this update of the CONSORT explanatory article will assist examples of reporting to the database, and we invite readers to submit further
authors in using the 2010 version of CONSORT and explain in general termssuggestions the by contacting us through the website. The CONSORT Group will
importance of adequately reporting of trials. The CONSORT statement can help continue to survey the literature to find relevant articles that address issues relevant
researchers designing trials in future310 and can guide peer reviewers and editors to inthe reporting of RCTs, and we invite authors of any such articles to notify us
their evalu- ation of manuscripts. Indeed, we encourage peer reviewers and editors about them. All of this information will be made accessible through the CONSORT
to use the CONSORT checklist to assess whether authors have reported on these website, which is updated regularly.
items. Such assessments will likely improve the clarity and transparency of More than 400 leading general and specialty journals and biomedical
published trials. Because CONSORT is an evolving document, it requires a
editorial groups, including the ICMJE, World Association of Medical Journal
dynamic process of continual assessment, refinement, and, if necessary, change, Editors, and the Council of Science Editors, have given their official support to
which is why we have this update of the checklist and explanatory article. As CONSORT.
new We invite other journals concerned about the quality of reporting of
evidence and critical comments accumulate, we will evaluate the need for future clinical trials to endorse the CONSORT statement and contact us through our
updates. website to let us know of their support. The ultimate benefactors of these collective
The first version of the CONSORT statement, from 1996, seems to have
efforts should be people who, for whatever reason, require intervention from the
led to improvement in the quality of reporting of RCTs in the journals that have
healthcare community.
GUIDELINE
55 4 9
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
Acknowledgements
ng
We are grateful to Frank Davidoff and Tom Lang for their We gratefully acknowledge financial support from United Kingdom
involvement in the 2001 version of CONSORT explanation and elaboration nal Institute for Health Research; Canadian Insti- tutes of Health Research;
document. A special thanks to Mary Ocampo, the Ottawa CONSORT coordinator, dents Fund, Canadian Institutes of Health Research; Johnson & Johnson; BMJ;
who helped deliver this explanation and elaboration paper and the CONSORT he American Society for Clinical Oncology.
statement.
The CONSORT Group contributors to CONSORT 2010: Douglas
ences
G Altman, Centre for Statistics in Medicine, University of Oxford, UK; Virginia
Barbour, PLoS Medicine, UK; Jesse A Berlin, Johnson & Johnson Pharmaceutical
ennie D. CONSORT reviseddimproving the reporting of randomized trials.
Research and Development, USA; Isabelle Boutron, University Paris 7 Denis J Am Med Assoc 2001;285:2006e7. 2. Schulz KF, Chalmers I, Hayes RJ, Altman DG.
Diderot, Assistance Publique des Hôpitaux de Paris, INSERM, France; PJ irical evidence of bias. Dimensions of methodological quality associated with estimates of
Devereaux, McMaster University, Canada; Kay Dickersin, Johns Hopkins ment effects in controlled trials. J Am Med Assoc 1995;273:408e12. 3. Moher D. CONSORT: an
Bloomberg School of Public Health, USA; Diana Elbourne, London School of ving tool to help improve the quality of reports of randomized controlled trials. Consolidated
dards of Reporting Trials. J Am Med Assoc 1 998;279:1489e91. 4. Kjaergard LL, Villumsen J,
Hygiene & Tropical Medicine, UK; Susan Ellenberg, University of Pennsylvania d C. Quality of randomised clinical trials affects estimates of intervention eff icacy. Rome, Italy:
School of Medicine, USA; Val Gebski, University of Sydney, Australia; Steven Cochrane Colloquium; 1999. 5. Jüni P, Altman DG, Egger M. Systematic reviews in health care:
Goodman, Journal of the Society for Clinical Trials, USA; Peter C Gøtzsche, ssing the
Nordic Cochrane Centre, Denmark; Trish Groves, BMJ, UK; Steven Grunberg, quality of controlled clinical trials. Br Med J 2 001;323:42e6. 6. Veldhuyzen van
en SJ, Cleary C, Talley NJ, Peterson TC, Nyren O, Bradley LA, et al. Drug treatment of
American Society of Clinical Oncology, USA; Brian Haynes, McMaster
tional dyspepsia: a systematic anal- ysis of trial methodology with recommendations for design
University, Canada; Sally Hopewell, Centre for Statistics in Medi- cine, University ture trials. Am J Gastroenterol 1996;91:660e73. 7. Talley NJ, Owen BK, Boyce P, Paterson K.
of Oxford, UK; Astrid James, Lancet; Peter Juhn, Johnson & Johnson, USA; hological treatments for irritable bowel syndrome: a critique of controlled treatment trials. Am J
Philippa Middleton, University of Ade- laide, Australia; Don Minckler, University roenterol 1 996;91:277e83. 8. Adetugbo K, Williams H. How well are randomized controlled
of California Irvine, USA; David Moher, Ottawa Methods Centre, Clinical reported
in the dermatology literature? Arch Dermatol 2000;136:381e5. 9. Kjaergard LL,
Epidemiology Program, Ottawa Hospital Research Institute, Canada; Victor M
lova D, Gluud C. Randomized clinical trials in HEPATOLOGY:
Montori, Knowledge and Encounter Research Unit, Mayo Clinic College of predictors of quality. Hepatology 1999;30:1134e8. 10. Schor S, Karten I. Statistical
Medicine, USA; Cynthia Mulrow, Annals of Internal Medicine, USA; Stuart ation of medical journal manuscripts. J Am
Pocock, London School of Hygiene & Tropical Medicine, UK; Drummond Rennie, Med Assoc 1966;195:1123e8. 11. Gore SM, Jones IG, Rytter EC. Misuse of
JAMA, USA; David L Schriger, Annals of Emergency Medicine, USA; Kenneth F cal methods: critical assessment
Schulz, Family Health International, USA; Iveta Simera, EQUATOR Network, f articles in BMJ from January to March 1976. Br Med J 1 977;1:85e7.
12. Hall JC, Hill D, Watts JM. Misuse of statistical methods in the Australasian
UK; Elizabeth Wager, Sideview, UK.
Ah-See KW, Molony NC. A qualitative assessment of randomized controlled
surgical literature. Aust NZJ Surg 1 982;52:541e3. 13. Altman DG. Statistics in medical
journals. Stat Med 1 982;1:59e71. 14. Pocock SJ, Hughes MD, Lee RJ. Statistical problems in the trials in otolaryngology. J Laryngol Otol 1 998;112:460e3. 32. Altman DG, Doré CJ.
reporting of clinical trials. A survey of three medical journals. N Engl J Med 1 987;317:426e32. 15. Randomisation and baseline comparisons in clinical
Altman DG. The scandal of poor medical research. Br Med J 1994;308 trials. Lancet 1990;335:149e53. 33. Thornley B, Adams C. Content and quality of 2000
:283e4. 16. Chan AW, Altman DG. Epidemiology and reporting of randomised trials controlled trials in
published in PubMed journals. Lancet 2005;365:1159e62. 17. Hopewell S, Dutton S, Yu schizophrenia over 50 years. Br Med J 1998;317:1181e4. 34. DerSimonian R, Charette
LM, Chan AW, Altman DG. The quality of reports of randomised trials in 2000 and 2006: comparative LJ, McPeek B, Mosteller F. Reporting on methods in
study of articles indexed in PubMed. Br Med J 2 010;340:c723. 18. Hollis S, Campbell F. What is clinical trials. N Engl J Med 1 982;306:1332e7. 35. A proposal for structured reporting of
meant by intention to treat analysis? Survey of randomized controlled trials. The Standards of Reporting Trials Group. J Am Med Assoc 1994;272
published randomised controlled trials. Br Med J 1999;319:670e4. 19. Lai TY, Wong 1926e31. 36. Call for comments on a proposal to improve reporting of clinical trials in the biomedical
VW, Lam RF, Cheng AC, Lam DS, Leung GM. Quality of reporting of key methodological items ofliterature. Working Group on Recommendations for Reporting of Clinical Trials in the Biomedical
randomized controlled trials in clinical ophthalmic journals. Ophthalmic Epidemiol 2007;14:390e8.Literature.
20. Ann Intern Med 1994;121:894e5. 37. Rennie D. Reporting randomized controlled trials. An
Moher D, Fortin P, Jadad AR, Jüni P, Klassen T, Le LJ, et al. Completeness of reporting of trials experiment and a call
published in languages other than English: implications for conduct and reporting of systematic for responses from readers. J Am Med Assoc 1995;273:1054e5. 38. Begg C, Cho M,
reviews. Lancet 1 996;347:363e6. 21. Junker CA. Adherence to published standards of reporting: a Eastwood S, Horton R, Moher D, Olkin I, et al. Improving the quality of reporting of randomized
comparison of placebo-controlled trials published in English or German. J Am Med Assoc controlled trials: the CONSORT statement. J Am Med Assoc 1 996;276:637e9. 39. Piaggio G, Elbourne
1998;280:247e9. 22. Altman DG. Randomisation. Br Med J 1 991;302:1481e2. 23. Schulz KF, DR, Altman DG, Pocock SJ, Evans SJ. Reporting of non- inferiority and equivalence randomized
Chalmers I, Grimes DA, Altman DG. Assessing the quality of randomization from reports of controlled trials: an extension of the CONSORT statement. J Am Med Assoc 2006;295:1152e60. 40. Campbell
trials published in obstetrics and gynecology journals. J Am Med Assoc 1 994;272:125e8. 24. MK, Elbourne DR, Altman DG. CONSORT statement: extension to
Streptomycin treatment of pulmonary tuberculosis: a Medical Research cluster randomised trials. Br Med J 2004;328:702e8. 41. Zwarenstein M, Treweek S,
Council investigation. Br Med J 1948;2:769e82. 25. Schulz KF. Randomized controlled
Gagnier JJ, Altman DG, Tunis S, Haynes B, et al. Improving the reporting of pragmatic trials: an
trials. Clin Obstet Gynecol extension of the CONSORT statement. Br Med J 2 008;337:a2390. 42. Ioannidis JP, Evans SJ,
1998;41:245e56. 26. Greenland S. Randomization, statistics, and causal inference. Gøtzsche PC, O'Neill RT, Altman DG, Schulz K, et al. Better reporting of harms in randomized trials:
Epidemiology an extension of the CONSORT statement. Ann Intern Med 2004;141:781e8. 43. Boutron I, Moher D,
1990;1:421e9. 27. Armitage P. The role of randomization in clinical trials. Stat Med Altman DG, Schulz KF, Ravaud P. Extending the CONSORT statement to randomized trials of
1982;1:345e52. 28. Kleijnen J, Gøtzsche PC, Kunz R, Oxman AD, Chalmers I. So what nonpharmacologic
's treatment: explanation and elaboration. Ann Intern Med 2008;148:295e309. 44.
so special about randomisation. In: Maynard A, Chalmers I, editors. Non-random reflec- tions on Gagnier JJ, Boon H, Rochon P, Moher D, Barnes J, Bombardier C. Reporting randomized, controlled
health services research. BMJ Books; 1997. hlm. 93e106. 29. Chalmers I. Assembling comparison trials of herbal interventions: an elaborated CONSORT statement. Ann Intern Med 2 006;144:364e7.
groups to assess the effects of health care. 45. Hopewell S, Clarke M, Moher D, Wager E, Middleton P, Altman DG, et al. CONSORT for
reporting randomized controlled trials in journal and confer- ence abstracts: explanation and
JR Soc Med 1997;90:379e86. 30. Nicolucci A, Grilli R, Alexanian AA, Apolone G, Torri
elaboration. PLoS Med 2008;5:e20.
V, Liberati A. Quality, evolution, and clinical implications of randomized, controlled trials on the
treatment of lung cancer. A lost opportunity for meta-analysis. J Am Med Assoc 1989;262:2101e7. 31.
GUIDELINE
GUIDELINE
GUIDELINE
GUIDELINE
GUIDELINE
55 54
D. Moher et al. / International Journal of Surgery 10 (2012) 28e
305. Deeks JJ, Dinnes J, D'Amico R, Sowden AJ, Sakarovitch C, Song F, et al. Evaluating dence- based approaches for the Ayurvedic traditional herbal formulations:
non-randomised intervention studies. Health Technol Assess 2 003;7(iii):173. 306. Kunz R, Vist G, toward an Ayurvedic CONSORT model. J Altern Complement Med 2008;14:769e76. 311. Bossuyt PM,
Oxman AD. Randomisation to protect against selection bias in Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Towards complete and accurate
healthcare trials. Cochrane Database Syst Rev 2 007. MR000012. 307. Collins R, MacMahon S. Reliable reporting of studies of diagnostic accuracy: the STARD Initiative. Ann Intern Med 2 003;138:40e4. 312.
assessment of the effects of treatment on von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. The Strengthening the
Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting
mortality and major morbidity, I: clinical trials. Lancet 2 001;357:373e80. 308. Schulz KF. Randomised
observational studies. Ann Intern Med 2007;147:573e7. 313. Tonino PA, De Bruyne B, Pijls NH,
trials, human nature, and reporting guidelines. Lancet
Siebert U, Ikeno F, van't Veer M, et al. Fractional flow reserve versus angiography for guiding
1996;348:596e8. 309. Murray GD. Promoting good research practice. Stat Methods percutaneous coro- nary intervention. N Engl J Med 2009;360:213e24.
Med Res
2000;9:17e24. 310. Narahari SR, Ryan TJ, Aggithaya MG, Bose KS, Prasanna KS.
GUIDELINE
55 55
D. Moher et al. / International Journal of Surgery 10 (2012) 28e