
Educational Research Methods 1

Introduction to the Research Process

• Why research matters
• Ways of acquiring knowledge
• Defining research
• The research process
• Ethics in research
• Skills needed to conduct research
Reflection: Ways of knowing
• The senses
• Sharing information with other people
• Being told by experts
• Logical reasoning

Educational Research 2e: Creswell

What Is Research?

• The researcher asks a question.
• The researcher collects data.
• The researcher shows how the data answer the question.
The Importance of Research

• Reason 1: Research adds to our knowledge about educational issues.
– It addresses gaps in knowledge.
– It expands knowledge.
– It allows the views and voices of individuals to count as knowledge.
The Importance of Research

• Reason 2: Research helps improve teaching and learning (P&P) practice.
– Educators gain new ideas.
– Educators gain deeper insight into particular methods.
– Educators gain deeper insight into their students.
The Importance of Research

• Reason 3: Research informs policy debates.
– Research helps people weigh and evaluate an issue from the perspective of research findings.
– Research explains to the public the outcomes of policy decisions that have been made.
The Importance of Research

• Reason 4: Research builds students' research skills.
– Organizational skills.
– Analytical skills.
– Writing skills.
– Presentation skills.
Problems with Research

• Contradictory or odd findings.
• Questionable data.
• Vague problem statements that leave unclear what exactly is to be studied.
• Imprecise data-collection methods and procedures.
The Research Cycle

1. Identify the research problem
2. Review the literature
3. Specify the purpose of the study
4. Collect data
5. Analyze and interpret the data
6. Evaluate the data and write the report
The Research Cycle:
Identifying the research problem

• Specify the problem.
• Justify the problem.
• Show the audience why the problem needs to be studied.
The Research Cycle:
Reviewing the literature
• Locate sources
– Books
– Journals
– Electronic sources
• Select sources
– Identify the sources relevant to the topic.
– Organize the sources by building a literature "map".
• Summarize the sources and draw conclusions from them in the literature review.
The Research Cycle: Specifying the purpose

• Identify the problem statement.
• Narrow the problem statement down to research questions.
– Quantitative: research questions and/or hypotheses
– Qualitative: central phenomenon and sub-questions
The Research Cycle:
Collecting data

• Decide on the data-collection method.
• Select the individuals to study.
• Design the data-collection instruments and outline the procedures.
• Obtain permission to collect data.
• Gather all the information to be analysed.
The Research Cycle:
Analyzing and interpreting data

• Break the data down.
• Present the data.
• Explain what the data show.
The Research Cycle:
Reporting and evaluating the research

• Decide who the report is for.
• Structure the report.
• Write the report accurately and sensitively.
Ethics in Research

• Respect the rights of the individuals involved in the study.
• Respect the rules and conditions of the research site.
• Report the findings fully and honestly.
Skills Needed
for Research

• Curiosity and a desire to solve problems.
• Sustained attention.
• Using the library and other resources.
• Writing and editing.
Tips for Finding a Valid Research Problem

• Look around you.
• Read the literature.
• Attend professional seminars.
• Seek advice from experts.
• Choose a topic that interests and motivates you.
• Choose a topic that others will find interesting and that adds value to the field.
End of Lecture 1

• What will your study look like? (20 min)

- Think of a research idea or problem that you want to study.
- Briefly write down the problem you want to study in sentence form
(one page).
The research cycle: identify the research problem → review the literature → state the purpose of the study → collect data → analyze and interpret the data → evaluate the data and write the report
• A research problem is an issue that arises, attracts attention, or motivates a study.
• It is the main consideration in deciding whether or not a study should be carried out.
• It gives focus to the study that might be conducted and to its implications.



• Interest and experience
• Theories in use: uncertainty about a theory and the wish to test it
• Replication of past studies: the same study on different subjects and sites, to strengthen earlier findings
• Conflicting findings from past studies
• A research problem must be practical: it can be studied.
• A research problem must be important: it is meaningful to investigate.
– Practical importance: it improves practice.
– Theoretical importance: it can contribute to the body of knowledge.
• Can the problem be studied?
– Do you have access to the research site?
– Do you have the time, resources and skills to carry out the study?
• Should the problem be studied?
– Does it add to knowledge?
– Does it contribute to practice?

1. Topic: the subject area.
2. Educational issue: a concern, a problem, something that needs a solution.
3. Evidence for the issue: evidence from the literature; evidence from practical experiences.
4. Deficiencies in the evidence: in this body of evidence, what is missing? What do we need to know more about?
5. What audiences gain if the deficiencies are addressed: how will addressing what we need to know help researchers, educators, policy makers, and individuals like those in the study?

From general to specific:
• Topic (general): Distance learning
• Problem: Lack of students
• Purpose statement: To study why students do not take up distance learning (PJJ) in community classes
• Research question (specific): Does the use of website technology in the classroom discourage students from taking up distance learning?
Use quantitative research if your research problem requires you to:
• Measure variables.
• Assess the effect and impact between independent and dependent variables.
• Test theories.
• Apply the results to a large number of people.

Use qualitative research if your research problem requires you to:
• Understand the views of the individuals you want to study.
• Assess a process over time.
• Build theory from the perspectives of the participants studied.
• Obtain detailed information about a few individuals or research sites.
• The research problem is the issue or concern that the study addresses.
• The research topic is the area to be studied.
• The purpose is the aim or objective of the study.
• The research questions are the questions the researcher seeks to answer.


• Research purpose. Intent (role in the study): sets the overall direction and focus of the study. Used in: quantitative and qualitative research.
• Research objectives. Intent: set the goals to be achieved. Used in: typically quantitative and qualitative research.
• Research questions. Intent: pose the questions to be answered. Used in: quantitative and qualitative research.
• Research hypotheses. Intent: make predictions about expected outcomes. Used in: quantitative research.

• What is a variable?
• What is a theory?
• What are the elements in this statement?

• Achievement
• Age
• Gender
• IQ
• Motivation
• Self-concept
Independent variable: a characteristic or attribute that influences an outcome or dependent variable.
– Treatment variables
– Measured variables
– Control variables
– Moderating variables
Dependent variable: a characteristic or attribute that is influenced by the independent variable.

Probable cause (X): independent variables → (Y): intervening variables → Effect (Z): dependent variables
Also shown: control variables and moderating variables.

Theory as a bridge between the independent and dependent variables:
independent variable → theory → dependent variable

• Describes in general terms the direction and aim of the study, expressed as its goal.
• Explains the main intent behind carrying out the study.
• Should be stated after the researcher has identified the problem statement.

Quantitative perspective:
The purpose of this study is to examine the relationship between the use of internet communication between teachers and parents in secondary schools in the Hulu Langat district and students' test achievement in secondary-school social studies.

Qualitative perspective:
The purpose of this study is to explore parents' views on internet communication between teachers and parents about their children in secondary schools in the Hulu Langat district.

• Built on the problem statement of the study.
• Contain verbs describing what is to be achieved.
• Found at the end of the "problem statement" section, after the background of the study, or in a separate section.

1. General objective:
indicates the intention to solve the problem presented in the problem statement.

Example: to examine the effects of the constructivist approach in teaching.

2. Specific objective:

Example: to examine the effect of the constructivist approach on student achievement in Science.

Note: If the purpose of the study is already a general statement (like a general objective), write only the research objectives.
Quantitative research

The objective statement clearly shows the relationship, difference, contribution, effect or interaction between the variables studied (dependent and independent).

Examples:
1. To identify/determine the relationship between motivation and student achievement.
2. To identify the difference in motivation between male and female students.

Qualitative research
The objective statement conveys exploration aimed at understanding.

Examples:
1. To explore how high-achieving students study.
2. To examine the problems principals face in managing schools.
• The research problem is posed in the form of questions that focus the investigation.
• Research questions are listed according to what has been stated in the purpose of the study.
• Research questions are usually built on the research objectives in order to explore the topic.

Quantitative (variables):
1. Does internet communication between parents and teachers affect students' performance in class?

Qualitative (concepts):
1. How is the experience of internet communication between parents and teachers used to assess children's performance?


1. Provides a tentative expectation or prediction about a phenomenon: it links the phenomenon and the solution of the problem to the body of knowledge, based on the facts obtained.
2. Provides a statement of relationship that can be tested directly.
3. Guides the research by standing in for the objectives. The hypothesis determines what the problem is, how data are collected (the research method), analysed and interpreted, and the basis for selecting the sample.
4. Provides a framework for reporting findings and conclusions.
• Can be stated simply, clearly and precisely.
Example: mathematics performance is higher among female students than among male students.
• States an expected relationship between variables (testable: can be measured and observed).
• Must be specific, not general (specific to what is to be measured). Specific objectives can therefore serve as a guide and indirectly make it easier to identify relationships, associations or differences between variables.
• Must be consistent with existing knowledge and theory, and must not contradict established theories and laws.
• Has explanatory power: logical, precise, clear and rational. Example: the constructivist approach can increase students' interest in science.
• Must be testable: testing the hypothesis makes it possible to establish whether or not the available evidence supports the tentative hypothesis.
4.2 Types of hypotheses

• Null hypothesis (Ho)
• Alternative hypothesis (Ha)
– Directional
– Non-directional








Ho: There is no significant difference in English performance between urban and rural students.

Ha: Urban students show higher English performance than rural students.

If English performance is shown to be higher in urban areas than in rural areas, the next step is to design suitable programmes to raise English performance among rural students and to encourage educational resources to be used to the fullest.

1. Because a directional hypothesis is a powerful claim, the researcher needs support from the literature before deciding to put one forward.
2. Without theoretical support, a directional hypothesis may be poorly grounded and lack authority, and may in turn produce an isolated theory that damages the body of knowledge of the discipline.
Alternative hypothesis (Ha)

Ho: There is no difference in achievement in engineering technology between male and female students at Sek. Men. Sri Indah, Ampang.

Interpreting the hypothesis

If the statistical test rejects the null hypothesis, the evidence indicates that, in the study population, achievement in engineering technology differs between the two gender groups.

Conversely, if the statistical test fails to reject the null hypothesis, the evidence is not sufficient to conclude that achievement in engineering technology differs between the two gender groups in the study population.
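The reject / fail-to-reject decision described above can be sketched with a small permutation test on the difference between two group means. This is only an illustration under stated assumptions: the scores are made-up numbers, the 0.05 significance level is an assumed convention, and the slides do not say which statistical test should be used.

```python
import random


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference between group means."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_perm):
        # Under Ho the group labels are interchangeable, so reshuffle them.
        rng.shuffle(pooled)
        a = pooled[:len(group_a)]
        b = pooled[len(group_a):]
        if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical achievement scores (illustration only, not real data).
male = [70, 72, 68, 75, 71, 69, 74, 73]
female = [60, 62, 58, 65, 61, 59, 64, 63]

ALPHA = 0.05  # assumed significance level
p = permutation_p_value(male, female)
decision = "reject Ho" if p < ALPHA else "fail to reject Ho"
print(f"p = {p:.4f}: {decision}")
```

A large p-value does not show that the groups are equal; it only means the data give no sufficient evidence of a difference.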
Example: quantitative study

No. 1
Research objective: To identify the relationship between motivation and student achievement.
Research question: Is there a relationship between motivation and student achievement?
Research hypothesis: Ho: There is no relationship between motivation and student achievement.

No. 2
Research objective: To identify the difference in motivation level between male and female students.
Research question: Is there a difference in motivation level between male and female students?
Research hypothesis: Ho: There is no difference in motivation level by gender.

Example: qualitative study

No. 1
Research objective: To explore the role of principals in managing schools.
Research question: What is the role of principals in managing schools?
Research hypothesis: None

No. 2
Research objective: To examine the effect of using the KBAT-Sains module on pupils' interest in Science.
Research question: What is the effect of using the KBAT-Sains module on pupils' interest in Science?
Research hypothesis: None



Literature Review
GGGA3232 Educational Research 1
Lecture topics
• Significance of the literature review
• Method and writing approach
• Writing references
Definition of a literature review – what?
• A literature review is a written summary of journal articles, books and other documents on a particular topic, or on several topics, related to your research study.
• It uses a systematic approach.
When should the literature review begin? – when?
• As early as possible, before starting a research project.
• The information gained from the literature review helps generate ideas for writing:
– the research problem,
– the research objectives,
– the research questions,
– the research methodology,
– the research sample.
Why carry out a literature review? – why?
• To identify research gaps.
• To learn about existing theories or research models.
• To learn which methodologies are commonly used. You may then adopt a methodology that has never or rarely been used, which is one way to produce an original study.
• To show how your study contributes to the existing literature.
Where can reference sources be found? – where?
• High-impact journals
• Books
– Conceptual and/or broad
– Major theories
– The "classics"
– Provide very good summaries
– Useful for understanding concepts, although some also publish compilations of empirical studies
• Articles
– Tend to be rather empirical
– Include research methods
– Latest developments in the field
– Find an existing literature review!
• Grey literature
Where can reference sources be found? – where?
Grey literature
• Alberani (1990):
– Non-conventional publications that are not popular and are sometimes ephemeral.
• Examples: reports, theses, conference proceedings, technical user guides and standards, non-commercial translations, technical and commercial documentation, and official documents not published commercially (usually government reports and documents).
Where can reference sources be found? – where?
• Choose the terms or technical terms commonly used in the field of study.
• Look at an article's keywords to identify the important terms.
• Example: search for "kajian tindakan" (action research) to find articles that carry out action research.
Where can reference sources be found? – where?
Search engines

EBSCO, ProQuest, ScienceDirect, etc.
• Specialized databases
• Very good search functions available

Google Scholar
• Searches across different databases
• Less detailed search functions available
• Links to university library
• Includes articles, books, conference papers (everything that can be found online!), so be cautious regarding quality!
Where can reference sources be found? – where?
Which articles should be chosen?

• Citations
– Heavily cited articles have a large impact because they are widely discussed in other researchers' articles.
– Drawback: older articles usually have more citations than recently published ones, so the knowledge may be outdated.
– Example: articles on Multimedia Learning (Mayer), much of which has since been updated in step with changes in technology.
Where can reference sources be found? – where?
Tips for finding articles
• Use relevant keywords.
• Keep the number of keywords small (do not use too many).
• Search by author name.
– Leading authors in the field
• Snowball technique

How is a literature review written? – How?

Reading order for an article: Title → Abstract → Conclusion / Summary → Whole article
How is a literature review written? – How?
• Start on the right footing by identifying:
– What is the problem or purpose of your study?
– What are your research questions?
– Which branches or themes of research are involved?
– What contribution is your study expected to make?
How is a literature review written? – How?
Coding articles (create your own database)
• Give each article codes/labels (easy sorting). Examples:
– Type of publication (conceptual, empirical, etc.)
– Theory used
– Variables / constructs studied
– Methodology / methods used
– Findings
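One way to keep such a personal article database is a small record type with one field per code. The sketch below is only an illustration: the class name `CodedArticle`, the helper functions and the three placeholder entries are hypothetical, not taken from the slides or from any real reference list.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CodedArticle:
    citation: str           # placeholder label, not a real reference
    publication_type: str   # e.g. "conceptual" or "empirical"
    theory: str
    variables: list = field(default_factory=list)
    method: str = ""
    finding: str = ""


# Hypothetical entries for illustration only.
articles = [
    CodedArticle("Article 1", "empirical", "Theory X",
                 ["motivation", "achievement"], "survey", "positive"),
    CodedArticle("Article 2", "conceptual", "Theory Y",
                 ["motivation"], "review", "n/a"),
    CodedArticle("Article 3", "empirical", "Theory X",
                 ["achievement"], "experiment", "negative"),
]


def group_by(records, attribute):
    """Sort citations into groups that share the same code."""
    groups = defaultdict(list)
    for record in records:
        groups[getattr(record, attribute)].append(record.citation)
    return dict(groups)


def studying(records, variable):
    """List citations of articles that examine a given variable."""
    return [r.citation for r in records if variable in r.variables]


print(group_by(articles, "publication_type"))
print(studying(articles, "motivation"))
```

A spreadsheet with the same columns would serve equally well; the point is that every article gets the same set of codes so it can be sorted and compared later.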
How is a literature review written? – How?
• The analysis must be critical.
• The discussion in the LITERATURE REVIEW should be divided by theme or category.
– For example, from general to specific categories, or organised by theme.
– General-to-specific example: start with a review of studies by international scholars before focusing on studies by local scholars.
– Theme-by-category example: positive vs. negative findings, or findings by study population (primary school, secondary school, higher education).
How is a literature review written? – How?
• Discuss the strengths and weaknesses of the concepts and methodologies of past studies.
• Show how the sources relate to one another and how past studies relate to your own study.
• Point out similarities and differences between studies.
• Identify unanswered questions that your study attempts to address.
• Avoid describing and summarising the sources one by one.
How is a literature review written? – How?
• The review can also be organised by research variable:
– effect of teaching on critical thinking
– learning motivation
– language attitude
– level of language proficiency
• However, make sure the variables discussed are also variables investigated in your own study.
How is a literature review written? – How?
• For a quantitative study, review the literature on:
– the independent variable
– the dependent variable
• For a qualitative study, review the literature on:
– the concept or phenomenon being studied
Recording references – How?
• Do not put it off: record each reference as soon as possible. If you are not using an app or software, photograph the source showing the author's name, year of publication, publisher, title, page numbers, and so on.
• EndNote: an example of easy-to-use reference software.
• Online tutorial, please refer to:

• UKM style:

• Other software: Mendeley, Zotero, JabRef

Recording references – How?
• Avoid "cited from" / "cited in" (dipetik dari). This arises when a writer lifts the original author's quotation from another author's work.
– It is better to read and quote from the original source, because the other author may have misinterpreted the original author's idea.
• Types of in-text quotation:
– Short quotation, example: According to Harwood (1998), "Teachers often faced difficulty implementing differentiated instruction in the classroom" (p.
– Long quotation: 40 words or more, example below:
• Harwood's (1998) study found the following:
– Teachers often faced difficulty implementing differentiated instruction in the classroom. This difficulty could be attributed to the fact that differentiating lessons is time-consuming. (p. 199)
• Alberani V, Pietrangeli PDC, Mazza AMR (1990). The use of grey
literature in health sciences: a preliminary survey. Bulletin of the
Medical Library Association, 78(4), 358-363.
• GL’99 Conference Program. Fourth International Conference on
Grey Literature: New Frontiers in Grey Literature.GreyNet, Grey
Literature Network Service. Washington D.C. USA, 4-5 October
Thank you
Educational Research 1

Who will be studied?

• Sampling is the process by which the researcher selects a group (of people, institutions, places or phenomena) to serve as study respondents that represent the larger group (of people, institutions, places or phenomena) of interest.
• Using an unsuitable sample reduces validity and reliability.
• Careful sampling planning will:
– Make data collection easier
– Reduce measurement error
– Save time and cost
• Information obtained from a sample is known as a statistic.


• Sampling lets the researcher make inferences, or generalise findings, to the whole study population without having to ask every person in the population.
• It saves cost.
• It saves time.
• It makes the research easier to manage.
Sample, Population, Representation

• Sample: the individuals included in the study.
• Population: the group of people the study is about (the findings will apply, or be generalised, to them).
• A good sample resembles the study population on the most important characteristics (for example age, gender, ethnic group, academic ability, language fluency and socioeconomic status).
• Representation matters a great deal, because we draw conclusions from findings on the sample and generalise them to the study population.
• An appropriate sampling procedure must be used.
• Population: the target group to which you want to generalise your findings.
• Sample: the subjects actually used in the study.
• Select a sample and carry out the study with it, then generalise the findings from the sample to the population.
Population and sample

Population:
- All high-school teachers in one city
- College students in all public universities (IPTA)
Sample:
- 100 high-school teachers in one city
- College students in one public university (IPTA)

Sampling questions
• How many individuals should be included in my study, that is, how large should my sample be?
• Who are the subjects/respondents in my study, and how is the sample selected?
• In QUAN studies, sampling needs to be decided early, because it affects the initial planning, the scheduling and the project timeline.
Types of sampling

Random sampling: every member of the population has a chance of being selected.
• Simple random sampling
• Systematic random sampling
• Stratified random sampling
• Cluster sampling

Non-random sampling: not every member of the population has a chance of being selected.
• Convenience sampling
• Purposive sampling
• Snowball sampling
Random sampling
• Every unit in the population has the same chance of being selected for the sample.
• Randomisation is used in the selection process.
• Bias is removed from the selection process.
• Also known as probability sampling.

Non-random sampling
• Not every unit in the population has a chance of being selected for the sample.
• Open to selection bias.
• The data selection method is unsuitable for most statistical methods.
• Also known as non-probability sampling.
PROBABILITY SAMPLING
A. Simple Random Sampling
• Every subject in the population is taken into account and each has an equal chance of being selected.
• The aim is to select a sample that can represent the population.
• This method requires a sampling frame.
• The usual procedure:
– Number every unit in the frame from 1 to N.
– Use a random number table or a random number generator to pick n distinct numbers between 1 and N.
Example of a random number table
97446 30328 05262 77371
15453 75591 60540 77137
69995 77086 55217 53721
69726 58696 27272 38148
23604 31948 16926 26360
13640 17233 58650 47819
90779 09199 51169 94892
71068 19459 32339 10124
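As an alternative to reading numbers off a printed table, a random number generator can pick the n distinct numbers directly. The sketch below assumes a hypothetical frame of 500 numbered units and a sample of 20; the function name and the fixed seed are illustrative choices, not part of the slides.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct unit numbers from 1..population_size."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no number repeats.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# Hypothetical frame: units numbered 1 to N = 500, sample of n = 20.
chosen = simple_random_sample(population_size=500, sample_size=20, seed=42)
print(chosen)
```

Fixing the seed makes the draw reproducible, which helps when the selection has to be documented in the methodology chapter.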
A. Simple Random Sampling
• Advantage: the findings can be generalised to the population.
• Disadvantages: it is time-consuming, and a complete list of all subjects in the population can be difficult or nearly impossible to obtain.
B. Systematic Random Sampling
• Subjects are selected from the population list starting from a random point.
• The population list must be in random order, for example a list of student names in alphabetical order (an ordering unrelated to the characteristic being studied).
• The sample is selected starting at a random point and then at the fixed sampling interval. It need not start with A or with number 1 on the list.
• The sampling interval is determined by the sample size and the population size.
B. Systematic Random Sampling

Example of selecting a sample:
• Form Four students in NS last year are numbered serially from 1 to 10,000 (N = 10,000).
• A sample of 100 is wanted (n = 100).
• Sampling interval: k = N/n = 10,000/100 = 100.
• The first sample element is chosen at random from the first 100 students. Suppose student number 45 is chosen.
• Sequence of sample elements: 45, 145, 245, 345, . . .
B. Systematic Random Sampling
• Advantage: easy to carry out if the population list is available.
• Disadvantage: systematic bias may occur.
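The worked example with N = 10,000 and n = 100 can be reproduced in a few lines. This is a minimal sketch: the function name is hypothetical, and it assumes N is an exact multiple of n so that the interval k is a whole number.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every k-th unit number, beginning at a random start in 1..k."""
    k = population_size // sample_size  # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(sample_size)]


# Values from the example: N = 10,000, n = 100, random start = 45.
sample = systematic_sample(10_000, 100, start=45)
print(sample[:4], "...", sample[-1])
```

If the list has a periodic pattern with the same period as k, every selected unit falls at the same point in the cycle, which is the systematic bias mentioned above.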
C. Stratified Random Sampling
• The population is divided into sub-populations / groups / strata.
• The sample is selected to represent every stratum.
• There are two ways of selecting:
– Proportionate stratified random sampling: the percentage of the sample taken from each stratum is proportional to that stratum's percentage in the population.
– Disproportionate stratified random sampling: the share of each stratum in the sample differs from its share in the population.
Stratified random sampling
• The population is divided into two or more groups, called strata, based on criteria such as location, achievement level, age or income, and a sub-sample is selected at random from each stratum.
Stratified random sampling (example)
• Stratum making up 0.66 of the population: 200 sampled
• Stratum making up 0.33 of the population: 100 sampled
• Total sample = 300
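Proportionate allocation like the 200 / 100 split above can be computed from the stratum sizes and then drawn at random within each stratum. The sketch assumes hypothetical strata of 6,000 and 3,000 members (roughly 0.66 and 0.33 of the population); the stratum names and function names are illustrative only.

```python
import random


def proportionate_allocation(strata_sizes, total_sample):
    """Share total_sample across strata in proportion to their sizes."""
    population = sum(strata_sizes.values())
    return {name: round(total_sample * size / population)
            for name, size in strata_sizes.items()}


def stratified_sample(strata_members, allocation, seed=None):
    """Draw a simple random sub-sample from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, allocation[name])
            for name, members in strata_members.items()}


# Hypothetical strata; members are just unit numbers.
strata_members = {
    "stratum_A": list(range(1, 6001)),
    "stratum_B": list(range(6001, 9001)),
}
sizes = {name: len(members) for name, members in strata_members.items()}
allocation = proportionate_allocation(sizes, total_sample=300)
sample = stratified_sample(strata_members, allocation, seed=1)
print(allocation)
```

With less convenient proportions, rounding can make the allocated sizes add up to slightly more or less than the intended total, so the totals should be checked.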

D. Cluster Random Sampling
• Identify groups (clusters) whose members are heterogeneous.
• Select clusters at random.
• All members of the randomly selected clusters become the study sample.
• Advantage: more practical and economical than other sampling methods.
• Disadvantage: the least reliable compared with other sampling methods.
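The cluster procedure above (pick whole groups at random, then take everyone in them) can be sketched as follows. The school names and member labels are hypothetical placeholders.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and include every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [member for name in chosen for member in clusters[name]]
    return chosen, members


# Hypothetical clusters: four schools with their pupils.
schools = {
    "School A": ["A1", "A2", "A3"],
    "School B": ["B1", "B2"],
    "School C": ["C1", "C2", "C3", "C4"],
    "School D": ["D1", "D2"],
}
chosen, members = cluster_sample(schools, n_clusters=2, seed=7)
print(chosen, members)
```

Unlike stratified sampling, randomisation happens only at the cluster level, so the final sample size depends on which clusters happen to be drawn.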
A. Convenience Sampling

• Subjects are selected based on their availability and willingness to volunteer, and on convenience to the researcher.
NON-PROBABILITY SAMPLING
B. Purposive Sampling
• Purposive sampling (also known as judgment, selective or subjective sampling) is a sampling technique in which the researcher relies on his or her own judgment when choosing members of the population to participate in the study.
• Subjects are selected based on specific criteria in line with the purpose of the study.
D. Snowball Sampling
• Snowball sampling (or chain sampling, chain-referral sampling, referral sampling) is a non-probability sampling technique where existing study subjects recruit future subjects from among their acquaintances.
How large should the sample be?

• Consider a few general guidelines:

1. Rule of thumb:
• In the survey-research literature: between 1% and 10% of the population, with a minimum of 100 respondents.
• The more rigorous the sampling procedure, the smaller the sample can be:
– Correlational studies: at least 30 respondents
– Comparative and experimental studies: at least 15 members in each group
– Factor analysis and other multivariate procedures: at least 100
• Determining the sample size from these concepts is fairly complicated (it uses specific formulas).
• Computer software is also available to help calculate a suitable sample size from specified factors.
• Another general approach is to base the sample size on the statistical procedure to be used.
SAMPLE SIZE
• Creswell (2005) suggests:
– About 15 participants per group in an experiment
– About 30 participants for a correlational study
– About 350 subjects for a survey study (the size varies depending on several factors)
• Some authors provide sample-size tables based on specific factors:
– See Cohen
– Krejcie & Morgan
– Lipsey
For survey research, if the population is fewer than 200 individuals, the entire population should be sampled. This would be considered census sampling. At a population of around 400, approximately 50% of the population should make up the sample, and populations over 1,000 require about 20% for an appropriate sample. For large populations of 5,000 or more, samples of 350 to 500 persons are often adequate.

For correlational studies, a minimum of 30 participants should be tested.

Experimental research studies generally require at least 30 participants per group.

These generalizations are based on the work of Krejcie and Morgan (1970), and their article should be consulted for more precise information about sample size.
Sample size determination table, Krejcie and Morgan (1970)
(N = population size, S = sample size; each row holds three N S pairs)
N S N S N S
10 10 220 140 1200 291
15 14 230 144 1300 297
20 19 240 148 1400 302
25 24 250 152 1500 306
30 28 260 155 1600 310
35 32 270 159 1700 313
40 36 280 162 1800 317
45 40 290 165 1900 320
50 44 300 169 2000 322
55 48 320 175 2200 327
60 52 340 181 2400 331
65 56 360 186 2600 335
70 59 380 191 2800 338
75 63 400 196 3000 341
80 66 420 201 3500 346
85 70 440 205 4000 351
90 73 460 210 4500 354
95 76 480 214 5000 357
100 80 500 217 6000 361
110 86 550 226 7000 364
120 92 600 234 8000 367
130 97 650 242 9000 368
140 103 700 248 10000 370
150 108 750 254 15000 375
160 113 800 260 20000 377
170 118 850 265 30000 379
180 123 900 269 40000 380
190 127 950 274 50000 381
200 132 1000 278 75000 382
210 136 1100 285 100000 384
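The table values can be approximated with the formula published by Krejcie and Morgan (1970), s = X²NP(1 − P) / (d²(N − 1) + X²P(1 − P)), using X² = 3.841 (chi-square for 1 degree of freedom at the 0.05 level), P = 0.5 and d = 0.05. The sketch below is a hedged illustration: rounding to the nearest whole number is an assumption, and a few entries at very large N may differ from the printed table by one or two.

```python
def krejcie_morgan(population_size, chi_sq=3.841, p=0.5, d=0.05):
    """Approximate sample size for a finite population of the given size."""
    n = population_size
    numerator = chi_sq * n * p * (1 - p)
    denominator = d ** 2 * (n - 1) + chi_sq * p * (1 - p)
    # Rounding to the nearest integer is an assumption of this sketch.
    return round(numerator / denominator)


for size in (100, 500, 1000, 5000):
    print(size, krejcie_morgan(size))
```

The required sample grows very slowly once the population passes a few thousand, which is why the last rows of the table change so little.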
2. Statistical consideration
• A basic requirement in quantitative research is to obtain a normal distribution.
• At least 30 respondents should be included (Hatch and Lazaraton, 1991).
• However, a smaller sample is acceptable if the data are analysed with non-parametric tests.
3. Komposisi Sampel
 Adakah terdapat sub-kumpulan yang berbeza atau mempunyai keperluan berbeza atau
akan bertingkah laku secara berbeza, contohnya lelaki vs perempuan, bahasa pertama
yang berbeza
 Perlu tentukan saiz sampel agar saiz minimum memenuhi keperluan sub-kumpulan
dengan jumlah ahli yang paling sedikit

4. Safety margin

• Make sure there is a sufficient margin to cope with unexpected situations, for example some respondents/subjects may withdraw (related to the mortality or attrition rate), or may fill in the questionnaire carelessly (so they have to be removed from the study sample)
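
One simple way to build in a safety margin is to inflate the required sample size by the expected dropout rate. This is a minimal sketch under that assumption; the function name and the 10% dropout figure in the usage note are hypothetical and not taken from the slides.

```python
import math


def inflate_for_attrition(required_n: int, expected_dropout: float) -> int:
    """Number of people to recruit so that about required_n remain.

    expected_dropout is the assumed proportion (0 <= rate < 1) of
    respondents who withdraw or whose forms must be discarded.
    """
    if not 0 <= expected_dropout < 1:
        raise ValueError("expected_dropout must be in [0, 1)")
    # Round first to absorb floating-point noise, then round up.
    return math.ceil(round(required_n / (1 - expected_dropout), 9))
```

For example, if 278 usable responses are needed and 10% attrition is assumed, `inflate_for_attrition(278, 0.10)` returns 309.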
Sampling error

• Sampling error is the difference or variation between the mean of a randomly drawn sample and the mean of a normally distributed population.
• Data from a non-random sample are not suitable for analysis by statistical methods.
• Sampling error occurs when the sample does not represent the population.
Non-sampling error

• Missing data, recording, data entry, and analysis errors
• Weak concepts, unclear definitions, and confusing questionnaires
• Response error occurs when respondents do not know, do not answer, or give answers that
Data Collection, Types of Data and
Instrument Construction (Questionnaire)

GA3232: Penyelidikan Penyelidikan I

Zolkepeli Haron 2020

A systematic process of assigning numbers to the quantity or quality of a trait or characteristic that is being measured.

Measurement is defined as the assignment of numbers to objects or events according to specific rules, using a scale that has been set in advance.
A systematic process of obtaining information, making judgements and, in turn, enabling a person to make
Important issues in gathering information:

• The information must be relevant and appropriate to the question
• Think about how the information will be collected, processed, analysed, interpreted and reported.
• The type of information to collect is decided after the focus of the study and the research questions have been set.
Two Types of Data
i. Quantitative data
ii. Qualitative data

• Quantitative data are obtained through a measurement process that requires measuring instruments such as tests, questionnaires and others.
• Qualitative data are obtained through observation, interviews, document analysis and other means of gathering complete information, so that the truth of what is studied can be established.
Problems in Social Science Measurement
• In the pure sciences, the aspects/objects measured are concrete, so measurement is easily done with a measuring instrument.
• In the social sciences, most of the aspects measured are abstract, such as attitude, motivation, interest, feelings, commitment, etc.
• Abstract things are very difficult to measure.
• The existence and magnitude of these aspects can only be inferred from behaviour.

• To what extent can the behaviour serve as an indicator of the aspect to be measured?
• To what extent is the inference accurate?
• The concept of a construct is the way to overcome this

I. Error arising from the raters

II. Error arising from the respondents themselves
III. Error arising from the situation
IV. Error arising from the measuring instrument
V. Error arising from administrative matters

Possible causes include:

a. Vagueness about the construct being measured.
b. Differing understandings
c. Being influenced by the quality of the respondent
Example: the halo effect
A candidate who communicates fluently influences the marks
d. Inter-rater error
Inconsistency; how the awarding of marks is standardised; interpretation of the candidate's traits, and so on.

Possible causes include:

a. The respondent does not have stable traits
for example: nervous, stammering
b. The respondent is restless
c. The respondent is unwell
d. Respondents who may produce a Hawthorne effect, for example: good at "acting" or at mustering effort to display the ability that is wanted.

Possible causes include:

a. An unpleasant interview setting (physical aspects)
b. An unsuitable emotional atmosphere
for example: unfriendly; intimidating; too formal
c. An unsuitable time

Possible causes include:

a. The questions do not measure what they are supposed to measure (not
b. The questions are unclear to respondents
for example: too long; too many aspects asked about; language that is hard to understand
c. The questions do not cover the aspects required
d. Use of the scale
for example: rating too "generously" or too "strictly" (generosity error and severity error); avoiding the extreme scale points (central tendency error)
e. Determination of grades and marks

Possible causes include:

a. The way forms are filled in and recorded
b. Social interaction error
for example: rater-respondent interaction
a. Understand the task/construct being measured.
b. Questioning skills (valid questions, short and precise questions, questions that cover the criteria being measured, questions at various levels)
c. Personal readiness (up-to-date information, comfort of the workplace, equipment and requirements, division/allocation of tasks, forms/instruments for recording observations)
d. Ensure the respondent is comfortable and ready to be interviewed (build rapport)
e. Observation and listening skills (the respondent's behaviour and speech, detecting strengths, weaknesses or doubts)
f. Skills in interpretation, inference and grading (measure according to the set criteria, use the correct scale, do not be swayed by the halo effect, make sure every criterion is measured)
Measurement scale: Properties

Nominal: Classifies objects, people and so on into discrete categories according to qualitative attributes. The most basic level of measurement.
Example: The variable gender is categorised as 1: male and 2: female. House numbers, telephone numbers, car registration numbers, etc.

Ordinal: Classifies objects, people and so on according to order of priority or rank.
Example: Level of academic qualification, perception.

Interval: Has order or rank and differences between intervals, but no absolute zero.
Example: Student marks, IQ, attitude, interest.

Ratio: Has order or rank, differences between intervals, and an absolute zero.
Example: Height, age, weight
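
The level of measurement limits which summary statistics are meaningful. The sketch below assumes a common textbook convention that is not stated on the slide: mode for nominal data, median for ordinal data, and mean for interval or ratio data. The function name and sample values are hypothetical.

```python
from statistics import mean, median, mode


def central_tendency(values, level):
    """Pick a summary statistic suited to the level of measurement."""
    if level == "nominal":
        return mode(values)      # categories only: most frequent code
    if level == "ordinal":
        return median(values)    # ranks: middle position is meaningful
    if level in ("interval", "ratio"):
        return mean(values)      # distances between values are meaningful
    raise ValueError(f"unknown level of measurement: {level}")


# Hypothetical data: gender codes (1 = male, 2 = female) and heights in cm.
print(central_tendency([1, 2, 2, 1, 2], "nominal"))
print(central_tendency([150, 160, 170], "ratio"))
```

Averaging the gender codes would produce a number, but it would not describe anything, which is why the nominal branch returns the most frequent category instead.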
Surveys often use dichotomous questions that ask for a Yes/No, True/False or
Agree/Disagree response.
There are a variety of ways to lay these questions out on a questionnaire:
We might measure occupation using a nominal question.
Here, the number next to each response has no meaning
except as a placeholder for that response. The choice of a
"2" for a lawyer and a "1" for a truck driver is arbitrary -
- from the numbering system used we can't infer that a
lawyer is "twice" something that a truck driver is.
Other examples of LOM questions:
Ordinal Rank in order of most important to least important
____ Salary
____ Benefits
____ Management Style
____ Advancement Opportunities

Likert Scale Your supervisor is fair

1 = Strongly Disagree  2 = Disagree  3 = Neutral  4 = Agree  5 = Strongly Agree

Decide how satisfied or dissatisfied you are with each characteristic of your
personal computer using the scale below. Circle the number that best describes
your feelings for each statement.

1 = Very Dissatisfied  2 = Dissatisfied  3 = Neither Satisfied nor Dissatisfied  4 = Satisfied  5 = Very Satisfied

My satisfaction with:
1. Initial price of the computer 1 2 3 4 5

2. What I paid for the computer 1 2 3 4 5

3. How quickly the computer performs calculations 1 2 3 4 5

4. How fast the computer runs programs 1 2 3 4 5

5. Helpfulness of the salesperson 1 2 3 4 5

6. How I was treated when I bought the computer 1 2 3 4 5
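
Responses to a multi-item scale like this are usually converted to numbers and combined. The sketch below is illustrative only: it assumes that items 1-2, 3-4 and 5-6 form three subscales (price, speed and service), a grouping inferred from the item wording rather than stated in the slides, and it scores each subscale as the mean of its items.

```python
from statistics import mean

# Assumed grouping of the six items into subscales (hypothetical).
SUBSCALES = {"price": (1, 2), "speed": (3, 4), "service": (5, 6)}


def score_subscales(responses):
    """Return the mean 1-5 rating per subscale for one respondent.

    responses maps item number -> circled rating (1 to 5).
    """
    for item, rating in responses.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"item {item}: rating {rating} is outside 1-5")
    return {name: mean(responses[i] for i in items)
            for name, items in SUBSCALES.items()}


# One hypothetical respondent.
print(score_subscales({1: 4, 2: 5, 3: 2, 4: 3, 5: 5, 6: 5}))
```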

Semantic Differential

Please check all that apply

___ I think my company cares about me
___ I think my company is moral
___ I think my supervisor is moral
___ I think my company reflects my core values

a. Planning the instrument
b. Constructing the instrument
c. Administering and testing the instrument
d. Analysing and refining the instrument
Most data are obtained from questionnaires and interviews.
The validity of the findings depends on the quality of the instrument.
• A good questionnaire is hard to design; a poor questionnaire is hard to analyse.

Some reasons why design is difficult:

• Every question must give a valid and reliable measure.
• The questions in the questionnaire must reflect what is intended in the study aims.
• Questions need to be grouped in a logical order to sustain the respondent's interest in answering through to the last question.

The aim is to obtain information that is:
Valid: measures what it is supposed to measure.
Reliable: how stable the respondents' answers are.
Unbiased
Able to discriminate


1. Write down the primary and secondary aims of your study.

2. Write down the concepts/information to be collected that relate to these aims.
3. Review the current literature to identify validated questionnaires that measure the same things as your specific area of interest.
4. Write a draft questionnaire.
5. Revise the draft.
6. Produce the final questionnaire.

Write the research problem and the primary and secondary aims, using one sentence per aim. Formulate a plan for the statistical analysis.

Be sure to specify the target population in the study aims.


Write a detailed list of the information to be collected and the concepts to be measured in the study. Are you trying to identify:
• Attitudes
• Needs
• Behaviour
• Demographics
• Some combination of the concepts above
Translate the concepts into measurable variables.
State the role of each variable in the statistical analysis:
• Predictor / independent
• Mediator / confounder
• Outcome / dependent

Review the current literature to identify related surveys and data collection instruments that measure concepts similar to the aims of your study.
This saves development time and allows comparison with other studies.
Proceed with caution if you use only a subset of an existing questionnaire, because this may change the meaning of the scores. Contact the authors of the questionnaire if you need further information.

Decide how the survey will be administered: face-to-face interview, telephone interview, self-completed questionnaire, or a computer-assisted approach.
Write more questions than will be included in the final draft.
Format the draft as if it were the final version, with appropriate blank space, to get an accurate estimate of its length. Long questionnaires get fewer responses.
Place the most important items in the first part of the questionnaire.
Move from one item to the next in a natural flow.

Question: How many cups of coffee or tea do you drink in a day?
Principle: Ask for an answer in only one dimension.
Solution: Separate the question into two –
 (1) How many cups of coffee do you drink during a typical day?
 (2) How many cups of tea do you drink during a typical day?

Question: What brand of computer do you own?
 (A) IBM PC
 (B) Apple
Principle: Avoid hidden assumptions. Make sure to accommodate all
possible answers.
Solution:
 (1) Make each response a separate dichotomous item
 Do you own an IBM PC? (Circle: Yes or No)
 Do you own an Apple computer? (Circle: Yes or No)
 (2) Add necessary response categories and allow for multiple responses.
 What brand of computer do you own? (Circle all that apply)
 Do not own computer
 Apple
 Other

Question: Have you had pain in the last week?
[ ] Never [ ] Seldom [ ] Often [ ] Very often

Principle: Make sure question and answer options match.

Solution: Reword either question or answer to match.
 How often have you had pain in the last week?
[ ] Never [ ] Seldom [ ] Often [ ] Very Often

Question: Where did you grow up?
 Country
 Farm
 City

Principle: Avoid questions having non-mutually exclusive answers.

Solution: Design the question with mutually exclusive options.
 Where did you grow up?
 House in the country
 Farm in the country
 City

Question: Are you against drug abuse? (Circle: Yes or No)
Principle: Write questions that will produce variability in the
Solution: Eliminate the question.

Question: Which one of the following do you think increases a
person’s chance of having a heart attack the most? (Check one.)
[ ] Smoking [ ] Being overweight [ ] Stress
Principle: Encourage the respondent to consider each possible
response to avoid the uncertainty of whether a missing item may
represent either an answer that does not apply or an overlooked item.
Solution: Which of the following increases the chance of having a
heart attack?
 Smoking: [ ] Yes [ ] No [ ] Don’t know
 Being overweight: [ ] Yes [ ] No [ ] Don’t know
 Stress: [ ] Yes [ ] No [ ] Don’t know

 (1) Do you currently have a life insurance policy? (Circle: Yes or No)
 If no, go to question 3.
 (2) How much is your annual life insurance premium?

Principle: Avoid branching as much as possible to avoid confusing

Solution: If possible, write as one question.
 How much did you spend last year for life insurance? (Write 0 if none).

Shorten the set of questions for the study. If a question does not
address one of your aims, discard it.
Refine the questions included and their wording by testing them with a
variety of respondents.
 Ensure the flow is natural.
 Verify that terms and concepts are familiar and easy to understand for your target
 Keep recall to a minimum and focus on the recent past.


Decide whether you will format the questionnaire yourself or use

computer-based programs for assistance:
 Adobe LiveCycle Designer 7.0
 GCRC assistance
At the top, clearly state:
 The purpose of the study
 How the data will be used
 Instructions on how to fill out the questionnaire
 Your policy on confidentiality
Include identifying data on each page of a multi-page, paper-based
questionnaire such as a respondent ID number in case the pages

Group questions concerning major subject areas together and
introduce them by heading or short descriptive statements.
Order questions in order to stimulate recall.
Order and format questions to ensure unbiased and balanced results.

Include white space to make answers clear and to help increase
response rate.
Space response scales widely enough so that it is easy to circle or
check the correct answer without the mark accidentally including the
answer above or below.
 Open-ended questions: the space for the response should be big enough to allow
respondents with large handwriting to write comfortably in the space.
 Closed-ended questions: line up answers vertically and precede them with boxes or
brackets to check, or by numbers to circle, rather than open blanks.

Use larger font size (e.g., 12) and high contrast (black on white).

When writing questions and assembling the final questionnaire, edit
with a view towards saliency: apparent relevance, importance, and
interest of the survey to the respondent
Consider either pre-notifying those in your sample or sending
reminders to those who received the survey (if self-administered).
Studies have shown that making contact with the sampled individuals
increases the response rate.
If possible, offer an incentive.

Understanding the characteristics of those who did not respond to the
survey is important to quantify what, if any, bias exists in the results.
To quantify the characteristics of the non-responders to postal surveys,
Moser and Kalton suggest tracking the length of time it takes for
surveys to be returned. Those who take the longest to return the
survey are most like the non-responders. This result may be situation-

You need plenty of time!
 Design your questionnaire from research hypotheses that have been carefully studied
and thought out.
 Discussing the research problem with colleagues and subject matter experts is critical to
developing good questions.
 Review, revise and test the questions on an iterative basis.
 Examine the questionnaire as a whole for flow and presentation.


Essential elements of questionnaire design and development

Janice Rattray PhD, MN, DipN, Cert Ed RGN, SCM
Senior Lecturer in Nursing, Postgraduate Student Advisor, School of Nursing and Midwifery, University of Dundee, Dundee, UK

Martyn C Jones PhD, C Psychol, RNMH Dip Ed, Dip NBS, ILTM
Senior Lecturer in Nursing, School of Nursing and Midwifery, University of Dundee, Dundee, UK

Submitted for publication: 7 April 2005

Accepted for publication: 20 April 2005

RATTRAY J & JONES MC (2007) Journal of Clinical Nursing 16, 234–243
Essential elements of questionnaire design and development

Correspondence:
Janice Rattray
School of Nursing and Midwifery
University of Dundee, Ninewells Hospital,
Dundee, DD1 9SY
Telephone: +44(0)1382 632304
E-mail:

Aims. The aims of this paper were (1) to raise awareness of the issues in questionnaire development and subsequent psychometric evaluation, and (2) to provide strategies to enable nurse researchers to design and develop their own measure and evaluate the quality of existing nursing measures.
Background. The number of questionnaires developed by nurses has increased in
recent years. While the rigour applied to the questionnaire development process may
be improving, we know that nurses are still not generally adept at the psychometric
evaluation of new measures. This paper explores the process by which a reliable and
valid questionnaire can be developed.
Methods. We critically evaluate the theoretical and methodological issues associated
with questionnaire design and development and present a series of heuristic
decision-making strategies at each stage of such development. The range of available
scales is presented and we discuss strategies to enable item generation and
development. The importance of stating a priori the number of factors expected in a
prototypic measure is emphasized. Issues of reliability and validity are explored
using item analysis and exploratory factor analysis and illustrated using examples
from recent nursing research literature.
Conclusion. Questionnaire design and development must be supported by a logical,
systematic and structured approach. To aid this process we present a framework
that supports this and suggest strategies to demonstrate the reliability and validity of
the new and developing measure.
Relevance to clinical practice. In developing the evidence base of nursing practice
using this method of data collection, it is vital that questionnaire design incorporates
preplanned methods to establish reliability and validity. Failure to develop a
questionnaire sufficiently may lead to difficulty interpreting results, and this may impact
upon clinical or educational practice. This paper presents a critical evaluation of
the questionnaire design and development process and demonstrates good practice
at each stage of this process.

Key words: nurses, nursing, psychometric evaluation, questionnaire design, scale


© 2007 Blackwell Publishing Ltd

doi: 10.1111/j.1365-2702.2006.01573.x
Issues in clinical nursing Elements of questionnaire design and development

The use of questionnaires as a method of data collection in health-care research both nationally and internationally has increased in recent years (Sitzia et al. 1997, Bakas & Champion 1999, Chen 1999, Jones & Johnston 1999, Jeffreys 2000, Waltz & Jenkins 2001, Siu 2002, Rattray et al. 2004). The increasing emphasis on evidence-based health care makes it even more important that nurses understand the theoretical issues associated with such methods. When interpreting results from questionnaires, the development process should be defined in sufficient detail and with sufficient rigour to enable a practitioner to make an informed decision about whether to implement findings. We use questionnaires to enable the collection of information in a standardized manner which, when gathered from a representative sample of a defined population, allows the inference of results to the wider population. This is important when we want to evaluate the effectiveness of care or treatment. While the rigour applied to the questionnaire development process may be improving, nurses are still neither generally adept nor confident at the psychometric evaluation of such measures (Jones & Johnston 1999). Central to the understanding of results derived from questionnaires are the issues of reliability and validity which underpin questionnaire development from item generation, the proposal of an a priori factor structure to subsequent psychometric analysis.

Whilst relevant texts may provide information about these issues, rarely is sufficient detail provided in a single source to guide the questionnaire development process. This paper provides a critical analysis of key methodological issues from item generation to planned psychometric evaluation and collates a series of heuristic decision-making strategies to assist practitioners to develop their own measure or evaluate the work of others. Two worked examples illustrate these strategies drawn from clinical practice and nurse education (Jones & Johnston 1999, Rattray et al. 2004). Issues of reliability and validity are explored using item analysis and exploratory factor analytic techniques.

What will the questionnaire measure?

Nurse researchers use questionnaires to measure knowledge, attitudes, emotion, cognition, intention or behaviour. This approach captures the self-reported observations of the individual and is commonly used to measure patient perceptions of many aspects of health care (see Table 1 for examples). When developing a questionnaire, items or questions are generated that require the respondent to respond to a series of questions or statements. Participant responses are then converted into numerical form and statistically analysed. These items must reliably operationalize the key concepts detailed within specific research questions and must, in turn, be relevant and acceptable to the target group. The main benefits of such a method of data collection are that questionnaires are usually relatively quick to complete, are relatively economical and are usually easy to analyse (Bowling 1997).

This approach to data generation is not without criticism. It assumes that the researcher and respondents share underlying assumptions about language and interpret statement wording in a similar manner. Closed questions which are commonly used may restrict the depth of participant response (Bowling 1997) and thus the quality of data collected may be diminished or incomplete. Questionnaire-based methods are, therefore, not the method of choice where little is known about a subject or topic area. In such an instance, qualitative methods may be more appropriate.

The range of scales available

There are a range of scales and response styles that may be used when developing a questionnaire. These produce different types or levels of data (see Table 1) and this will influence the analysis options. Therefore, when developing a new measure, it is important to be clear which scale and response format to use. Frequency scales may be used when it is important to establish how often a target behaviour or event has occurred, e.g. the Intensive Care Experience Questionnaire (Rattray et al. 2004). Thurstone scales are less common in nursing research. Such scales use empirical data derived from judges to ensure that attitudes or behaviours being measured are spaced along a continuum with equal weighting/spacing, e.g. Nottingham Health Profile (Hunt et al. 1985). Guttman scaling is a hierarchical scaling technique that ranks items such that individuals who agree with an item will also agree with items of a lower rank, e.g. Katz Index of Activities of Daily Living (Katz et al. 1963). Rasch scaling is a similar type of scale, e.g. De Jong Gierveld and Kamphuis (1985) and Kline (1993). Knowledge questionnaires may be helpful when evaluating the outcome of a patient education programme, e.g. Furze et al. (2001). They generally offer multiple choice or dichotomous yes/no response options.

Within research in nursing Likert-type or frequency scales are most commonly used. These scales use fixed choice response formats and are designed to measure attitudes or opinions (Bowling 1997, Burns & Grove 1997). These ordinal scales measure levels of agreement/disagreement. A Likert-type scale assumes that the strength/intensity of experience is linear, i.e. on a continuum from strongly agree

© 2007 Blackwell Publishing Ltd, Journal of Clinical Nursing, 16, 234–243
J Rattray and MC Jones

Table 1 Stages in questionnaire development: item generation and scale construction

Questionnaire development: What will the questionnaire measure?
Key issues: Knowledge; Attitude/beliefs/intention; Cognition; Emotion; Behaviour
Examples of measures: The York Angina Beliefs Questionnaire (Furze et al. 2001); Operationalising the Theory of Planned Behaviour (Conner & Sparks 1995); Illness Perception Questionnaire (Weinman et al. 1996); Anxiety, depression (Spielberger et al. 1983, Goldberg & Williams 1988); Functional Limitations Profile, FLIP (Patrick & Peach 1989)

Questionnaire development: What types of scale can be used?
Key issues: Frequency; Thurstone; Rasch; Guttman; Mokken; Likert type; Multiple choice
Examples of measures: ICEQ (Rattray et al. 2004); Nottingham Health Profile (Hunt et al. 1985); Loneliness scale (De Jong Gierveld & Kamphuis 1985); FLIP (Patrick & Peach 1989); Edinburgh Feeding Evaluation in Dementia (Watson & Deary 1996); SNSI (Jones & Johnston 1999); The York Angina Beliefs Questionnaire (Furze et al. 2001)

Questionnaire development: How do I generate items for my questionnaire?
Key issues: Ensure relevance of items? Wording issues. Which response format is best? Which types of question are possible? Free text options? Does your measure have subscales? Questionnaire layout
Examples of measures:
Check research questions, explore literature, experts, target population
Follow established guidelines (Oppenheim 1992, Bowling 1997). Discard poor items.
Consider and pilot response format (five-point, seven-point, visual analogue
In standardized measures most are closed, to allow combination of scores from large numbers of respondents.
May have some open, free text
Construct items that represent each different hypothesized domain
Carefully consider order of items

to strongly disagree, and makes the assumption that attitudes can be measured. Respondents may be offered a choice of five to seven or even nine precoded responses with the neutral point being neither agree nor disagree. There is no assumption made that equal intervals exist between the points on the scale; however, they can indicate the relative ordering of an individual’s response to an item. While this is perhaps too simplistic, until an alternative model is developed, it is a relatively easy and appropriate method to use (Oppenheim 1992). Some controversy exists as to whether a neutral point should be offered. If this option is removed, this forces the respondent to choose a response, which may lead to respondent irritation and increase non-response bias (Burns & Grove 1997).

It is acceptable to treat scores from this type of response format as interval data to allow the use of common parametric tests (Ferguson & Cox 1993, Polgar & Thomas 1995, Bowling 1997, Burns & Grove 1997). As with any data set, subsequent statistical analysis should be determined by the normality of distribution of the data and whether the data meets the underlying assumptions of the proposed statistical test.

It would be unusual to develop a questionnaire that relied upon a single-item response, and multi-item scales are generally used in preference to single-item scales to avoid bias, misinterpretation and reduce measurement error (Bowling 1997, Burns & Grove 1997). Such questionnaires have a number of subscales that ‘tap’ into the main construct being


measured. For example, the Short-Form 36 (Ware & Sherbourne 1992) measures health-related quality of life using 36 items representing eight health subscales.

Item generation, wording and order

The generation of items during questionnaire development requires considerable pilot work to refine wording and content. To assure face or content validity, items can be generated from a number of sources including consultation with experts in the field, proposed respondents and review of associated literature (Priest et al. 1995, Bowling 1997; see Table 1). In addition, a key strategy in item generation is to revisit the research questions frequently and to ensure that items reflect these and remain relevant (Oppenheim 1992, Bowling 1997). It is during this stage that the proposed subscales of a questionnaire are identified (Ferguson & Cox 1993) and to ensure that items are representative of these. The item and factor analysis stages of the questionnaire development process may then be used to establish if such items are indeed representative of the expected subscale or factor.

The type of question, language used and order of items may all bias response. Consideration should be given to the order in which items are presented, e.g. it is best to avoid presenting controversial or emotive items at the beginning of the questionnaire. To engage participants and prevent boredom, demographic and/or clinical data may be presented at the end. Certain questions should be avoided, e.g. those that lead or include double negatives or double-barreled questions (Bowling 1997). A mixture of both positively and negatively worded items may minimize the danger of acquiescent response bias, i.e. the tendency for respondents to agree with a statement, or respond in the same way to items.

To allow respondents to expand upon answers and provide more in-depth responses, free text response or open questions may be included. Respondents may welcome this opportunity. However, whilst this approach can provide the interviewer with rich data, such material can be difficult to analyse and interpret (Polgar & Thomas 1995). However, these problems may be outweighed by the benefits of including this option and can be especially useful in the early development of a questionnaire. Free text comments can inform future questionnaire development by identifying poorly constructed items or new items for future inclusion.

Piloting a questionnaire using item analysis (N ≤ 100)

It is important to ensure that sufficient pilot work is carried out during the development of a new measure. This will identify items that lack clarity or that may not be appropriate for, or discriminate between, respondents. Ideally, the questionnaire should be piloted on a smaller sample of intended respondents, but with a sample size sufficient to perform systematic appraisal of its performance. Item analysis is one way to pilot a questionnaire. This provides a range of simple heuristics on item retention or deletion, see Table 2. High endorsement of an option within a particular item suggests poor discriminatory power or the redundancy of an item that requires deletion (Priest et al. 1995). Alternatively, a Cronbach’s α < 0.70 may suggest that items in a questionnaire or subscale are poorly grouped. To identify specific items that do not add to the explanatory power of the questionnaire or subscale an item-total correlation cut-off of <0.3 can be used (Ferketich 1991, Kline 1993). However, it is important when revising the questionnaire to refer constantly to the original research questions that are being addressed and retain items that are thought to reflect the underlying theoretical domains of the questionnaire despite poor psychometric analysis. Problem items may also be identified because of high levels of non-response.

Demonstrating reliability

It is essential that the reliability of a developing questionnaire can be demonstrated. Reliability refers to the repeatability, stability or internal consistency of a questionnaire (Jack & Clarke 1998). One of the most common ways to demonstrate this uses the Cronbach’s α statistic. This statistic uses inter-item correlations to determine whether constituent items are measuring the same domain (Bowling 1997, Bryman & Cramer 1997, Jack & Clarke 1998). If the items show good internal consistency, Cronbach’s α should exceed 0.70 for a developing questionnaire or 0.80 for a more established questionnaire (Bowling 1997, Bryman & Cramer 1997). It is usual to report the Cronbach’s α statistic for the separate domains within a questionnaire rather than for the entire questionnaire.

Item-total correlations can also be used to assess internal consistency. If the items are measuring the same underlying concept then each item should correlate with the total score from the questionnaire or domain (Priest et al. 1995). This score can be biased, especially in small sample sizes, as the item itself is included in the total score (Kline 1993). Therefore, to reduce this bias, a corrected item-total correlation should be calculated. This removes the score from the item from the total score from the questionnaire or domain (Bowling 1997) prior to the correlation. Kline (1993) recommends deleting any questionnaire item with a corrected item-total correlation of <0.3. Item analysis using inter-item

 2007 Blackwell Publishing Ltd, Journal of Clinical Nursing, 16, 234–243 237
J Rattray and MC Jones

Table 2 Stages in questionnaire development: piloting the questionnaire: item analysis (Adapted from Rattray et al. 2004)

Columns: Questionnaire development; Key issues; Examples of decision aids

Piloting the questionnaire: item analysis
  Key issue: Spread of responses across options
    Decision aid: High endorsement of a single option is problematic (Priest et al. 1995). An item should be considered for removal if ≥80%, ≤20% of responses endorsed one response.
  Key issue: Initial psychometric analysis
    Decision aid: Items with an inter-item correlation of <0.3 or >0.7 should be considered for removal (Ferketich 1991). Items with a poor Cronbach's α, i.e. <0.7, should be considered for removal (Kline 1993).
  Key issue: Clarity and relevance of items
    Decision aid: Researcher's interpretation of patient comments. Alternatively, if respondents fail to complete an item it suggests that the item may lack clarity.
  Key issue: Items deemed theoretically important
    Decision aid: Items should be retained if they are deemed to be theoretically important even if they do not meet the above criteria.
  Key issue: Is your measure affected by social desirability bias?
    Decision aid: Explore the relationship between item and scale total with measure that captures this response tendency, e.g. Marlowe–Crowne Social Desirability Index (Crowne & Marlowe 1960)

Reliability
  Key issue: Internal consistency
    Decision aid: Corrected inter-item correlations (Ferketich 1991); item-total correlation (Ferketich 1991); Cronbach alpha (Kline 1993)
  Key issue: Test–retest
    Decision aid: Temporal stability of the measure (Johnson 2001)
  Key issue: Inter-observer
    Decision aid: Observational studies (e.g. Ager 1998, Ager et al. 2001)

Validity
  Key issue: Face or content
    Decision aid: Do the items sufficiently represent different hypothesized domains?
  Key issue: Concurrent or discriminant
    Decision aid: Do subscale scores correlate with existing, validated measures presented concurrently?
  Key issue: Predictive
    Decision aid: Do subscale scores predict hypothesis reports on existing, validated measures presented longitudinally?
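
The internal-consistency decision aids in Table 2 can be illustrated with a small calculation. The following is only a minimal sketch in plain Python: the pilot data are hypothetical, the function names are ours rather than anything from the paper, and the 0.3 cut-off is the Kline (1993) threshold for corrected item-total correlations quoted in the text.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var / total_var)


def corrected_item_total(rows):
    """Correlate each item with the total of the remaining items."""
    k = len(rows[0])
    result = []
    for i in range(k):
        item = [r[i] for r in rows]
        rest = [sum(r) - r[i] for r in rows]  # total score without item i
        result.append(pearson(item, rest))
    return result


# Hypothetical pilot data: 6 respondents x 4 Likert items (scored 1-5).
# The fourth item is negatively worded and has not been reverse-scored.
pilot = [
    [4, 5, 4, 2],
    [2, 2, 3, 4],
    [5, 4, 5, 1],
    [3, 3, 3, 3],
    [1, 2, 1, 5],
    [4, 4, 5, 2],
]
alpha = cronbach_alpha(pilot)
r_it = corrected_item_total(pilot)
# Indices of items below the 0.3 threshold (candidates for review).
flagged = [i for i, r in enumerate(r_it) if r < 0.3]
```

In this made-up data set the un-reversed fourth item is the one flagged. In practice a flagged item would be reviewed (for example, checked for reverse scoring or theoretical importance) before any decision to delete it.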

correlations will also identify those items that are too similar. High inter-item correlations (>0.8) suggest that these are indeed repetitions of each other (sometimes referred to as bloated specifics) and are in essence asking the same question (Ferketich 1991, Kline 1993).
Test–retest reliability can assess stability of a measure over time and this should be included in the process of any questionnaire development. This is of particular importance if the intended use of the measure is to assess change over time or responsiveness.

Demonstrating validity

Validity refers to whether a questionnaire is measuring what it purports to (Bryman & Cramer 1997). While this can be difficult to establish, demonstrating the validity of a developing measure is vital. There are several different types of validity (Polgar & Thomas 1995, Bowling 1997, Bryman & Cramer 1997). Content validity (or face validity) refers to expert opinion concerning whether the scale items represent the proposed domains or concepts the questionnaire is intended to measure. This is an initial step in establishing validity, but is not sufficient by itself. Convergent (or concurrent) and discriminant validity must also be demonstrated by correlating the measure with related and/or dissimilar measures (Bowling 1997). When developing a questionnaire it is, therefore, important to include, within the research design, additional established measures with proven validity against which to test the developing questionnaire. Construct validity relates to how well the items in the questionnaire represent the underlying conceptual structure. Factor analysis is one statistical technique that can be used to determine the constructs or domains within the developing measure. This approach can, therefore, contribute to establishing construct validity.

Further development: exploratory factor analysis (N > 100)

Following initial pilot work and item deletion, the questionnaire should be administered to a sample of sufficient size to

allow exploratory factor analytic techniques to be performed. Ferguson and Cox (1993) suggest that 100 respondents is the absolute minimum number to be able to undertake this analysis. However, others would suggest that this is insufficient and a rule of thumb would be five respondents per item (Bryman & Cramer 1997). This type of analysis must follow a predefined and systematic analytic sequence (Ferguson & Cox 1993).
Principal components analysis (PCA) explores the inter-relationship of variables. It provides a basis for the removal of redundant or unnecessary items in a developing measure (Anthony 1999) and can identify the associated underlying concepts, domains or subscales of a questionnaire (Oppenheim 1992, Ferguson & Cox 1993). The terms factor analysis and PCA are often used synonymously in this context. In practice, however, PCA is most commonly used. Rarely is a questionnaire uni-dimensional and PCA usually identifies the presence of one principal component that accounts for most of the variance and subsequent components that account for less and less.
In the initial PCA of an unrotated solution, most items should 'load', i.e. correlate with the first component. This can make interpretation of results difficult (Kline 1994), and to assist the interpretation of a factor solution, rotation of factors (components) is often performed. This should be a standard option on statistical packages, e.g. Statistical Package for the Social Sciences (SPSS Inc., Chicago, IL, USA). Factor rotation maximizes the loadings of variables with a strong association with a factor, and minimizes those with a weaker one (Oppenheim 1992) and often helps make sense of the proposed factor structure. Varimax rotation, which is an orthogonal rotation (i.e. one in which the factors do not correlate), is often used, particularly if the proposed factors are thought to be independent of each other (Ferguson & Cox 1993). However, oblimin rotation may be used when factors are thought to have some relationship, e.g. Jones and Johnston (1999). It is, therefore, vital to state a priori the number of factors you expect to emerge and to have decided which rotation method you will use ahead of any analysis.

Pre-analysis checks

Ferguson and Cox (1993) give a detailed account of the process of exploratory factor analysis and provide a set of heuristics for its three stages of pre-analysis checks, extraction and rotation (see Table 3 for the pre-analysis checks). These pre-analysis checks are necessary to ensure the proposed data set is appropriate for the method. The checks include determining the stability of the emerging factor structure, sampling requirements, item scaling, skewness and kurtosis of variables and the appropriateness of the correlation matrix.

Factor extraction

Two main methods are used to decide upon the number of emerging factors: Kaiser's criterion for those factors with an eigenvalue of >1, and the scree test. An eigenvalue is an estimate of variance explained by a factor in a data set (Ferguson & Cox 1993), and a value >1 indicates greater than average variance. A scree test is the graphic representation of this. Figure 1 shows the scree test that demonstrated the four-factor structure from the SNSI (Jones & Johnston 1999). The number of factors is identified from the break in the slope. If a straight line is fitted along the eigenvalue rubble, the number of domains within the questionnaire is revealed by the number of factors above the line. This latter method includes a degree of subjectivity in its interpretation.
With PCA, the removal of redundant items within a developing measure occurs within an iterative process. Agius et al. (1996) describe an iterative process of removing variables with general loadings (of 0.40 on more than one factor) and weak loadings (failing to load above 0.39 on any factor). This process is applied to the initial unrotated PCA before applying a varimax or oblimin rotation to interpret the structure of the solution. In the development of the SNSI, the first unrotated principal component revealed the loading of 41 items accounting for 24.9% of the variance in the correlation matrix. The scree plot suggested a four-factor solution for rotation. Four further iterations of this variable reduction process led to the final 22-item solution, accounting for 51.3% of the variance in the correlation matrix.
Two recent examples of questionnaire development are the Intensive Care Experience Questionnaire (Rattray et al. 2004) and the Student Nurse Stress Index (Jones & Johnston 1999) for use in clinical and educational contexts respectively. Both measures used the questionnaire development approach described in this paper (see Table 4). In particular, the suitability of the data set for this type of analysis was established following the range of pre-analysis checks in each case. The questionnaires were piloted using both item and exploratory factor analysis. The hypothesized factor structure was demonstrated in the ICEQ (Rattray et al. 2004) but not in the SNSI (Jones & Johnston 1999) in which a fourth factor emerged. This finding demonstrates the exploratory nature of this type of factor analytic technique and the need to confirm findings in an independent data set (Agius et al. 1996). Confirmation of the initial four-factor structure was achieved in an independent data set with the SNSI. Work is currently being undertaken for the ICEQ. Both questionnaires

Table 3 Stages in questionnaire development: factor analysis

Columns: Questionnaire development; Key issues; Pre-analysis checks (Ferguson & Cox 1993)

Further development: exploratory factor analysis
  Key issues:
    Principal components analysis (PCA): explores the inter-relationship of variables
    Provides a basis for the removal of redundant or unnecessary items (Anthony 1999)
    PCA is used to identify the underlying domains or factors within a measure.
    Prior to analysis, must propose an underlying theoretical structure
    Ensure that the data set is appropriate
    Must follow a predefined and systematic analytic sequence, e.g. Ferguson and Cox (1993)
  Pre-analysis checks:
    Stable factor structure
      Minimum number of participants: 100
      Minimum participant to variable ratio, N/p: 2:1–10:1
      Minimum variable to factor ratio, p/m: 2:1–6:1
      Minimum participant to factor ratio, N/m: 2:1–6:1
    Sampling
      Random sampling from a population.
    Item scaling
      Likert, Mokken and frequency scales are
    Normality of distribution/skewness and kurtosis
      Underlying assumption is of normal distribution. Values of skewness and kurtosis should be calculated for each variable, and values outwith accepted levels dealt with
    Appropriateness of the correlation matrix
      Kaiser–Meyer–Olkin: can the correlations between variables be accounted for by a smaller set of factors? Should be >0.5.
      Bartlett Test of Sphericity: based on the chi-squared test; a large and significant test used to indicate discoverable relationships

Further development: confirmatory factor analysis
  Key issues:
    Allows the further testing of the construct validity of the measure
  Pre-analysis checks:
    Confirmation of factor structure on an independent data set, using exploratory and confirmatory methods, see Agius et al. (1996), Jones and Johnston (1999).
    Same underlying assumptions as exploratory
    Confirmatory process uses single sample and multi-sample approaches
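
Some of the pre-analysis checks in Table 3 are simple arithmetic and can be sketched directly. The example below is illustrative only: the function names are hypothetical, the item responses used for the normality check are invented, and the moment-based skewness and excess-kurtosis formulas are one common definition (statistical packages may apply small-sample corrections). The sample figures are those reported for the SNSI in Table 4.

```python
def sample_ratios(n_participants, n_items, n_factors):
    """Ratios used in the stable-factor-structure checks of Table 3."""
    return {
        "N": n_participants,
        "N/p": n_participants / n_items,    # participants per variable
        "p/m": n_items / n_factors,         # variables per factor
        "N/m": n_participants / n_factors,  # participants per factor
    }


def skewness_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal curve)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# SNSI figures from Table 4: 320 students, 43 + 15 items, four factors.
ratios = sample_ratios(320, 43 + 15, 4)
meets_minimum_n = ratios["N"] >= 100      # Ferguson & Cox (1993) minimum
meets_five_per_item = ratios["N/p"] >= 5  # rule of thumb quoted in the text

# Hypothetical responses to a single item, for the normality check.
skew, kurt = skewness_kurtosis([1, 2, 2, 3, 3, 3, 4, 4, 5])
```

The Kaiser–Meyer–Olkin and Bartlett tests need the full correlation matrix and are usually taken from a statistical package rather than computed by hand, so they are left out of this sketch.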

[Scree plot: analysis with 22 S.N.S.I. items; horizontal axis: factor number]
Figure 1 Scree test from the SNSI (Jones & Johnston 1999).

demonstrated good reliability and concurrent validity. For further details of the domain structure of the ICEQ and SNSI, see original papers (Jones & Johnston 1999, Rattray et al. 2004). These papers provide a step-by-step account of the questionnaire development process in a level of detail that is not available in traditional textbooks. This will be of particular use to the nurse researcher or research-minded practitioner.

This paper emphasizes the need to adopt a logical, systematic and structured approach to questionnaire development. We have presented a framework that supports this type of approach and have illustrated the questionnaire development process using item analysis, factor analytic and related methods and have demonstrated strategies to demonstrate the reliability and validity of the new and developing measure. We have suggested the need to preplan each stage of the questionnaire development process and provide a series

Table 4 Development of the ICEQ (Rattray et al. 2004) and SNSI (Jones & Johnston 1999)

Purpose
  ICEQ: The rationale for this questionnaire was identified from literature. Patients had limited recall of the ICU experience, yet described it as being frightening and persecutory in nature. Reported perceptions of this experience have been linked to poorer emotional outcome. Previous research in this field was mainly qualitative and, therefore, a standardized questionnaire was developed
  SNSI: The main purpose of this measure was to develop a reliable and valid questionnaire to measure the sources of stress for student nurses. Previous research had demonstrated high levels of distress associated with training to be a student nurse (Jones & Johnston 1997). It was important to identify the sources of stress for students, to inform a stress management intervention (Jones & Johnston 2000)

Research questions
  ICEQ: Research questions were identified
  SNSI: A four-factor structure was hypothesized including academic load, clinical concerns and interface worries

Scale and response format
  ICEQ: Likert-type and frequency scales with a five-choice format. Three open questions included
  SNSI: Likert-type items with a five-choice format

Generation of items
  ICEQ: Items generated from experts, literature review and an underlying theoretical structure of five domains was proposed. Thirty-eight items generated, randomly placed throughout the measure, with a mix of positively and negatively worded items
  SNSI: An existing questionnaire with 43 items (Beck & Srivastava 1991). Fifteen additional items were generated from literature review and student feedback

Test and pilot of items
  ICEQ: Pilot work: 34 patients interviewed
  SNSI: Pilot work was with a large data set of 320

Amendments based on item analysis or related techniques
  ICEQ: Amendments made using criteria presented in Table 1. Eighteen items were removed, 11 were added leaving a 31-item questionnaire. Research questions revisited. Again the underlying theoretical structure of four domains was proposed
  SNSI: Item reduction carried out using exploratory factor analysis methods, rather than item analysis. Unrotated PCA. Weak items (failing to load above 0.39) and general items (loading at or above 0.40 on more than one factor in the unrotated solution) were deleted in an iterative

Principal components analysis
  ICEQ: Administered to 109 patients as part of a structured interview. Pre-analysis check ensured data were appropriate. Unrotated PCA. Varimax rotation. Factors with a loading of ≥0.4 on one factor only were retained. Items were reduced from 31 to 24. Four domains were identified
  SNSI: Forty-three plus 15 items were administered to 320 students. Pre-analysis check ensured data were appropriate. Oblimin rotation. Items were reduced to a 22 item simple oblique solution. Four subscales were identified: academic load, clinical concerns, interface worries and personal problems

Reliability
  ICEQ: Cronbach α statistic for each domain was ≥0.7
  SNSI: Cronbach α statistic for each domain was ≥0.73 (interface worries in an initial data set α 0.68)

Validity
  ICEQ: Concurrent validity established by correlating domain scores with scores from two measures with demonstrated validity, e.g. Hospital Anxiety and Depression Scale, Impact of Event Scale
  SNSI: Concurrent validity was shown by correlating SNSI subscale scores with GHQ 30 (continuously scored). Discriminant validity demonstrated with distressed students scoring higher on all SNSI

Confirmation on an independent data set
  ICEQ: Data is being gathered to confirm the four-factor structure of the ICEQ
  SNSI: Four-factor structure was confirmed on an independent data set (N = 195) using exploratory and confirmatory factor analytic techniques (Deary et al. 1993)

Revision of measure
  SNSI: A revised 49-item version of the SNSI is currently in development (Jones & Johnston 2003)
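
The extraction and item-reduction rules described for the SNSI (Kaiser's criterion, and the weak and general loading cut-offs of 0.39 and 0.40) can be expressed as a short sketch. Everything below is illustrative: the eigenvalues and loadings are invented, the function names are hypothetical, and taking the absolute value of each loading is our assumption. The sketch shows a single pass, whereas the procedure in the text re-runs the PCA after each round of deletions.

```python
def kaiser_count(eigenvalues):
    """Number of components with an eigenvalue greater than 1."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


def classify_item(loadings, general_cut=0.40, weak_cut=0.39):
    """Label one item from its loadings on each factor.

    'weak'    : fails to load above weak_cut on any factor
    'general' : loads at or above general_cut on more than one factor
    'keep'    : anything else
    """
    sizes = [abs(x) for x in loadings]  # assumption: sign is ignored
    if all(s <= weak_cut for s in sizes):
        return "weak"
    if sum(1 for s in sizes if s >= general_cut) > 1:
        return "general"
    return "keep"


# Hypothetical eigenvalues and unrotated loadings for five items.
eigenvalues = [2.1, 1.3, 0.8, 0.5, 0.3]
loadings = {
    "item1": [0.72, 0.10],
    "item2": [0.65, 0.48],  # loads on both factors
    "item3": [0.21, 0.30],  # loads on neither factor
    "item4": [0.05, 0.81],
    "item5": [0.44, 0.12],
}
n_factors = kaiser_count(eigenvalues)
labels = {name: classify_item(row) for name, row in loadings.items()}
retained = [name for name, label in labels.items() if label == "keep"]
```

With these made-up values two components pass Kaiser's criterion, and one general and one weak item would be dropped before the next iteration.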

of heuristic strategies to enable the nurse researcher to achieve this (Deary et al. 1993, Kline 1993, Agius et al.
While there has been an increase in the use of questionnaires within the nursing literature, few such measures have been developed using the full set of strategies used by Rattray et al. (2004) and Jones and Johnston (1999), summarized here. In developing the evidence base of nursing practice using this method of data collection, it is vital that the nurse researcher incorporates methods to establish the reliability and validity, particularly of new questionnaires. Failure to develop a questionnaire sufficiently may lead to difficulty interpreting results. For example, failure to demonstrate an expected correlation of a new measure with an established scale may arise because of limited variation in scores on a developing questionnaire and the subsequent suppression of correlations between scores on the two questionnaires. Alternatively, there may really be no reliable relationship between such variables. If a measure is poorly designed and has had insufficient psychometric evaluation, it may be difficult to judge between such competing explanations. In addition, it may not be possible to use the findings from an established measure, if that measure cannot be shown to be reliable in a particular sample.
If clinical or educational practice is to be enhanced or changed using findings derived from questionnaire-based methods, it is vital that the questionnaire has been sufficiently developed. This paper presents a critical evaluation of the questionnaire design and development process and demonstrates good practice at each stage of this process. This paper will enable the informed nurse researcher to plan the design and development of their own questionnaire, to evaluate the quality of existing nursing measures, and to inspire confidence in applying findings into practice.

Contributions

Study design: JR, MCJ; data analysis: JR, MCJ and manuscript preparation: JR, MCJ.

References

Ager A (1998) The British Institute of Learning Disabilities Life Experiences Checklist. BILD Publications, Kidderminster.
Ager A, Myers F, Kerr P, Myles S & Green A (2001) Moving home: social integration for adults with intellectual disabilities resettling into the community. Journal of Applied Intellectual Disabilities 14, 392–400.
Agius RM, Blenkin H, Deary IJ, Zealley HE & Wood RA (1996) Survey of perceived stress and work demands of consultant doctors. Occupational and Environmental Medicine 53, 217–224.
Anthony D (1999) Understanding Advanced Statistics: A Guide for Nurses and Health Care Researchers. Churchill Livingstone,
Bakas T & Champion V (1999) Development and psychometric testing of the Bakas caregiving outcomes scale. Nursing Research 48, 250–259.
Beck DL & Srivastava R (1991) Perceived level and sources of stress in baccalaureate nursing students. Journal of Nursing Education 30, 127–133.
Bowling A (1997) Research Methods in Health. Open University Press, Buckingham.
Bryman A & Cramer D (1997) Quantitative Data Analysis with SPSS for Windows. Routledge, London.
Burns N & Grove SK (1997) The Practice of Nursing Research Conduct, Critique, & Utilization. W.B. Saunders and Co.,
Chen M-L (1999) Validation of the structure of the perceived meanings of cancer pain inventory. Journal of Advanced Nursing 30, 344–351.
Conner M & Sparks P (1995) The theory of planned behaviour and health behaviours. In Predicting Health Behaviour (Conner M & Norman P eds). Open University Press, Buckingham, pp.
Crowne DP & Marlowe DA (1960) A new scale of social desirability independent of psychopathology. Journal of Consulting Psychology 24, 349–354.
De Jong Gierveld J & Kamphuis F (1985) The development of a Rasch-type loneliness scale. Applied Psychological Measurement 9,
Deary IJ, Hepburn DA, MacLeod KM & Frier BM (1993) Partitioning the symptoms of hypoglycaemia using multi-sample confirmatory factor analysis. Diabetologia 36, 771–777.
Ferguson E & Cox T (1993) Exploratory factor analysis: a user's guide. International Journal of Selection and Assessment 1, 84–94.
Ferketich S (1991) Focus on psychometrics: aspects of item analysis. Research in Nursing and Health 14, 165–168.
Furze G, Lewin RJP, Roebuck A, Thompson DR & Bull P (2001) Attributions and misconceptions in angina: an exploratory study. Journal of Health Psychology 6, 501–510.
Goldberg D & Williams P (1988) A Users Guide to the General Health Questionnaire. NFer-Nelson, Windsor.
Hunt S, McEwen J & McKenna SP (1985) Measuring health status: a new tool for clinicians and epidemiologists. Journal of the Royal College of General Practitioners 35, 185–188.
Jack B & Clarke A (1998) The purpose and use of questionnaires in research. Professional Nurse 14, 176–179.
Jeffreys MR (2000) Development and psychometric evaluation of the transcultural self-efficacy tool: a synthesis of findings. Journal of Transcultural Nursing 11, 127–136.
Johnson J (2001) Evaluation of learning according to objectives tool. In Measurement of Nursing Outcomes (Waltz C & Jenkins L eds). Springer Publishing Company, New York, pp. 216–223.
Jones MC & Johnston DW (1997) Distress, stress and coping in first-year student nurses. Journal of Advanced Nursing 26, 475–482.
Jones MC & Johnston DW (1999) The derivation of a brief student nurse stress index. Work and Stress 13, 162–181.
Jones MC & Johnston DW (2000) Evaluating the impact of a worksite stress management programme for distressed student nurses: a randomised controlled trial. Psychology and Health 15, 689–706.
Jones MC & Johnston DW (2003) Further Development of the SNSI. Paper presented at the Royal College of Nursing Annual International Research Conference, University of Manchester, April 2003.
Katz S, Ford A & Moskowitz R (1963) Studies of illness in the aged: the index of ADL. A standardised measure of biological and psychosocial function. Journal of American Medical Association 185, 914–919.
Kline P (1993) The Handbook of Psychological Testing. Routledge, London.
Kline P (1994) An Easy Guide to Factor Analysis. Routledge, London.
Oppenheim AN (1992) Questionnaire Design, Interviewing and Attitude Measurement. Pinter, London.
Patrick D & Peach H (eds) (1989) Disablement in the Community. Oxford University Press, Oxford.
Polgar S & Thomas S (1995) Introduction to Research in the Health Sciences. Churchill Livingstone, Melbourne.
Priest J, McColl BA, Thomas L & Bond S (1995) Developing and refining a new measurement tool. Nurse Researcher 2, 69–81.
Rattray JE, Johnston M & Wildsmith JAW (2004) The intensive care experience: development of the intensive care experience (ICE) questionnaire. Journal of Advanced Nursing 47, 64–73.
Sitzia J, Dikken C & Hughes J (1997) Psychometric evaluation of a questionnaire to document side-effects of chemotherapy. Journal of Advanced Nursing 25, 999–1007.
Siu O-L (2002) Predictors of job satisfaction and absenteeism in two samples of Hong Kong Nurses. Journal of Advanced Nursing 40, 218–229.
Spielberger C, Gorsuch R & Lushene R (1983) The State-Trait Inventory: Test Manual for Form Y. Consulting Psychologists Press, Palo Alto, CA.
Waltz C & Jenkins L (2001) Measurement of Nursing Outcomes: Volume 1: Measuring Nursing Performance in Practice, Education, and Research. Springer Publishing Company, New York.
Ware JE & Sherbourne CD (1992) The MOS 36-Item short-form health survey (SF-36): conceptual framework and item selection. Medical Care 30, 473–481.
Watson R & Deary I (1996) Is there a relationship between feeding difficulty and nursing interventions in elderly people with dementia? Nursing Times Research 1, 44–54.
Weinman J, Petrie K, Moss-Morris R & Horne R (1996) The illness perception questionnaire: a new measure for assessing the cognitive representation of illness. Psychology and Health 11, 431–445.

Survey Method

What Is the Survey Method?

• The survey method is a research procedure in which the researcher administers a survey to a sample or population in order to describe the attitudes, opinions, beliefs, behaviours or characteristics of the population.
• The survey method is a non-experimental study and is also a descriptive research method.

Survey Method

• Survey studies are used to measure variables related to a phenomenon without asking why those variables exist.
• The purpose of a survey is to collect data that describe the traits or characteristics of the respondents. The survey method is a specific way of collecting information about a population.
• Obtaining a sample that represents the population is therefore important in the survey method.

When to Use the Survey Method

• Assessing trends
• Attitudes, beliefs, opinions
• Evaluation

Key Characteristics of the Research

• A sample drawn from a population
• Data collection through questionnaires or interviews
• Construction of an instrument to collect the data
• Obtaining a large number of responses

Types of Survey Method

Survey research can be divided into two types according to the time span needed to collect the data: cross-sectional and longitudinal.
Cross-sectional survey: the researcher collects data only once from a single sample. For example, a study that sets out to identify students' level of anxiety about mathematics in March 2014. The researcher may also determine the relationship between mathematics anxiety and students' perseverance in mathematics.

Dr Effandi Zakaria

Types of Survey Method

Longitudinal:
• The researcher collects data repeatedly from a sample over a long period of time.
• The researcher then analyses the changes in the population and tries to explain or describe them.
• There are three types of long-term survey: a) trend studies, b) cohort studies, c) panel studies.

Longitudinal Survey

• The researcher studies the development of arithmetic skills among KBSR pupils by observing the same pupils from Year One until the end of Year Six.
• Because the subjects studied are the same, background factors such as individual ability, intelligence and attitude remain unchanged. Changes that occur over the period are usually due to the variables under study. The longer the study runs, the more trustworthy its findings.

Trend Study

• A trend study is a long-term survey in which the researcher repeatedly draws a sample from a large general population over a given period of time.
• The data from each sample are then analysed, and the results of one sample are compared with another to determine whether there is a trend or pattern of change, or stability, in the data.
• Researchers use trend studies to examine and compare changes in a population's beliefs, activities and level of awareness over time.
• A study of changes in a population over time.
• Each round collects data from different individuals.

Cohort Study

• A cohort study is a long-term survey in which the researcher repeatedly draws a sample from a small, specific population over a given period of time.
• Through this study the researcher examines the stability of, or changes in, the characteristics and behaviour of that small group (the cohort).
• A study of changes in a cohort over time.
• Each round collects data from different individuals within the same cohort.

Panel Study

• In a panel study the researcher collects and analyses data from all members of the same sample throughout the research, over a given period of time.
• A study of changes in individuals over time.
• Each round collects data from the same individuals.
• Attrition (loss of subjects) can become a serious problem in this kind of study, because the sample becomes smaller.

Advantages of the Survey Method

• Surveys are relatively inexpensive (especially self-administered ones).
• Surveys are useful for describing the characteristics of a large population.
• They can be administered from a remote location by post, email or telephone.
• Many variables can be analysed, yielding statistically significant results.
• Many questions can be asked about a given research topic.
• There is flexibility in deciding how the questions will be administered: interview, telephone, oral survey or electronic means.
• High reliability is usually easy to obtain.

Disadvantages of the Survey Method

• The researcher writes general questions, which may not suit many respondents.
• The researcher needs to ensure that a large proportion of respondents reply.
• Respondents sometimes find it hard to recall information or to answer controversial questions truthfully.
• The interviewer can influence respondents' answers through the way the questions are put.

Research Tool (Instrument)

• The research tool used in a survey study serves to obtain standardised information from all subjects

Research Tool (Instrument)

• The instruments most often used in survey studies are:
• Questionnaires
• Individual interviews
• Telephone interviews
• Mail (postal) surveys
• Internet-based surveys

Problems in Constructing Items/Questions

• Sentences that are too long
• Too many negatively worded questions
• Questions phrased in technical language
• Avoid questions that are confusing or unclear.
• Questions must suit the respondents' ability.

Pilot Study

• Test the items on a small group of individuals from the sample
• Obtain written feedback on the items/questions.
• Revise the items/questions in light of the written feedback
• Do not include pilot-study participants in the main study

Steps in Conducting a Survey Study

• First decide whether a survey design is the best one to use
• Define the research questions or hypotheses
• Identify the study population and sample
• Decide on the type of survey and the data collection procedure
• Construct or obtain an instrument
• Administer the instrument
• Analyse the data to answer the research questions or hypotheses
• Write the report

Survey Questions

• Open-ended
What has your experience at college been like?

• Closed-ended
Has your experience at college been satisfactory so far?

• Partially open-ended
Regarding your experience at college, which of the following factors do you find satisfactory?
Social life___
Food services___
Wi-Fi service___

Survey Items/Questions

• Likert-scale format

Example of a Likert Scale

I am very satisfied with my experience at college.
1 = Strongly Disagree   2 = Disagree   3 = Agree   4 = Strongly Agree
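
Responses to a four-point Likert item such as the one above are usually summarised by counting how many respondents chose each scale point. The sketch below is only an illustration: the function name `summarise_likert` and the ten responses are made up, and the English labels are translations of the slide's scale.

```python
from collections import Counter
from statistics import mean

# Four-point scale from the slide (labels translated; 1 = lowest agreement).
LIKERT_LABELS = {
    1: "Strongly Disagree",
    2: "Disagree",
    3: "Agree",
    4: "Strongly Agree",
}


def summarise_likert(responses):
    """Return the count per scale point and the mean score for one item."""
    invalid = [r for r in responses if r not in LIKERT_LABELS]
    if invalid:
        raise ValueError(f"responses outside the 1-4 scale: {invalid}")
    counts = Counter(responses)
    # Report every scale point, including those nobody chose.
    table = {label: counts.get(point, 0) for point, label in LIKERT_LABELS.items()}
    return table, mean(responses)


# Hypothetical answers from ten students (invented data).
responses = [4, 3, 3, 2, 4, 1, 3, 3, 2, 4]
table, average = summarise_likert(responses)
print(table)
print(average)
```

Whether a mean is meaningful for ordinal Likert data is a judgment call; the frequency table is the safer summary.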


Advantages of Closed Questions

• Easy and quick to answer.
• Answers can be compared easily.
• The answer categories can help respondents understand the question.
• Respondents are more likely to answer sensitive questions.
• Reduces confusion among respondents.

Weaknesses of Closed Questions

• They suggest ideas that the respondent may never have thought of.
• Respondents can answer even if they know nothing about the issue asked.
• Respondents become frustrated if the answer categories offered do not fit them.
• Differences between respondents' answers cannot be seen.
• They force respondents to give simple answers.
• They force respondents to make decisions that they could not or need not make in daily life.

Advantages of Open Questions

• Give respondents freedom in how they answer
• Respondents can give detailed answers
• Unexpected findings may emerge
• Give room for creativity and for explaining what the respondent thinks.

Weaknesses of Open Questions

• Answers that are unrelated or irrelevant
• Comparison and statistical analysis become difficult
• Take longer to answer.
• Take up more space


Issues to Consider

What is your research question?
Population: accessibility, literacy, language issues?
Sampling: is the available data representative? participation
Questions/items: type, length, difficulty?
Existing questionnaires?
Content: knowledge of?
Bias: honest responses?
Administration: cost, time, equipment?

• An in-depth investigation of one aspect of a social environment that involves people (Nasution,
• Focuses on a holistic description and gives an in-depth explanation of an event that has occurred.
• Takes place when the researcher explores a single phenomenon (the case) bounded by time and activity, and collects data using a variety of data-collection procedures. (Creswell, 1994:11)
• A case is defined as a phenomenon that occurs in a particular context with its own boundaries (a bounded system).
• Particular individuals or aspects within those boundaries are chosen as the focus of study in order to describe the phenomenon that constitutes the case.
• The case is therefore the unit of analysis in a case study.
• A case study is an empirical inquiry that investigates a contemporary phenomenon within its real-life context.
• Case studies are used to answer research questions that aim to explain "how" or "why" a phenomenon occurs.
• The choice of a case study also depends on the researcher's control over the event or phenomenon studied. If the event or phenomenon is hard for the researcher to control, a case study is more suitable.
• A case study is also suitable when the focus is a contemporary phenomenon in a real-life context (Yin, 1994).
• Instrumental case study
– Aims to understand an existing theory in greater depth.
– E.g.: traffic congestion: a case study in Kuala Lumpur

• Intrinsic case study
– Carried out to examine a phenomenon intrinsically (to gain a more detailed understanding).
– Seeks to find out what really lies behind an event.
– Not carried out to build theory.
– E.g.: traffic congestion in Kuala Lumpur

• Collective or multiple case study
– Carried out to draw conclusions or generalisations about a phenomenon or population based on links with other cases.
– E.g.: traffic congestion: a case study in Kuala Lumpur and Pulau Pinang
Intrinsic Case Study
Unusual case: study an intrinsic, unusual case.

Instrumental Case Study
Issue and case: study a case that provides insight into an issue or theme.

Multiple Instrumental Case Study (also called a Collective Case Study)
Issue and several cases: study several cases that provide insight into an issue (or theme).
• Observation
• Interviews
• Document analysis
• Case-study researchers use data from multiple sources in order to triangulate the data.

Triangulation means integrating different methods to obtain more than one form of data, so that qualitative findings, which are subjective in nature, become more convincing and the validity of the data is increased.
• A qualitative research approach.
• The study is carried out in its natural setting, or research site.
• Derived from the Greek word "ethnos", meaning people, nation or culture.
• A type of observation-based field study often used in sociology and
• Ethnography is a written account of a culture, covering its customs, beliefs and behaviour, based on information gathered through fieldwork.
• Ethnography is a descriptive study of a culture, subculture, institution or group within a society.

1. To understand the issue under study from the point of view of the group or culture.
2. Ethnographic research seeks to add knowledge about a culture or to identify patterns of social interaction.
3. To develop a holistic interpretation of a society or social institution.



• Non-participant / participant observation
• Interviews: open, structured, semi-structured
• Examination of records or documents
• Photographic documentation
• The researcher collects descriptive data in narrative and visual form.
• Data are collected to answer the question "What is happening in this setting?"
• These techniques are applied over a long period so that the researcher can describe, analyse and interpret the social setting in focus.
• Triangulation plays an important role in ethnographic research. (Triangulation is the use of multiple research methods, data-collection strategies and sources.)
• The researcher acts as a participant observer, an active participant observer, a privileged active participant observer, or a passive observer.
Case study

• It does not only depend on participant-observer data but mainly uses interviews.

Ethnography

• It may require certain periods of time in the 'field' and emphasizes details of observational
• The ethnographer may use an interview as an additional technique to capture the participant's whole perspective.
• The central difference between ethnography and case study lies in the study’s intention.
Ethnography is inward looking, aiming to uncover the tacit knowledge of culture

• Case study is outward looking, aiming to delineate the nature of phenomena through
detailed investigation of individual cases and their contexts.
• Ethnography is an art of describing a group or culture, case study is an in depth analysis of a
particular instance, event, individual, or a group

• Ethnography requires participant observation as a data collection method whereas it is not necessary in a case study.

• Case study is outward looking while ethnography is inward looking

• Ethnography takes a longer time than a case study.

• Non experimental or descriptive research methods

• Costly and time consuming
• In-depth studies
• Subjective biases from researcher
Is the Qu Puteh product effective at lightening your face?
Do you believe testimonials?
Skin-tone scale (dependent variable)
Face-whitening product (independent variable)

Research carried out to determine the effect of a treatment.
The researcher deliberately and systematically introduces a treatment into a phenomenon and observes the changes that occur in that phenomenon.

What is the effect of providing free milk on the fitness of primary school pupils?

Usually used to test hypotheses.

For example, testing a hypothesis that compares the effectiveness of traditional teaching with teaching that uses multimedia methods.

1. Dependent variable

The variable that is the effect of the treatment.
The dependent variable depends on the behaviour of, or changes in, other variables.

2. Independent variable

The variable that is manipulated, or whose effect is studied.

3. Extraneous / intervening / contaminating variable

A variable whose presence can influence or affect the dependent variable and also the independent variable.
In experimental research, the researcher must control extraneous variables so that they do not influence the dependent variable or the independent variable.
If X, then Y
If the program is given, then the outcome occurs

If not X, then not Y

If the program is not given, then the outcome does not occur

Understanding the validity of an experimental study is important because it enables us to control the effects of the independent variable and of extraneous variables.

There are two types of validity in experimental research:

1. Internal validity
2. External validity

INTERNAL VALIDITY

Refers to the extent to which the effect or change in the dependent variable is caused by the treatment (the independent variable) and not by extraneous variables.

Eight factors (types of extraneous variable) can affect the internal validity of an experimental study (Campbell & Stanley 1963).
1. History
Events that occur between the first and second measurement that are unrelated to the experiment but could affect the result.
Changes in the dependent variable arise not only from the independent variable but also from events, developments or educational experiences of the respondents.
E.g. The approaching SPM examination leads Form 5 students to revise more than Form 4 students.
A "Cintai Bahasa Kita" (Love Our Language) campaign leads respondents to use Bahasa Malaysia more diligently.
2. Maturation
Biological and psychological processes within the subject may change during the progress of the experiment which will affect their
The subject may perform better or worse on the posttest not because of the effect of X (treatment), but because they are older, more interested or less interested than when they took the pretest.

3. Pretesting procedures
The pretest may serve as a learning experience that will cause the subject to alter the responses on the posttest, whether or not X (treatment) is applied.

4. Measuring instrument
Changes in the testing instrument, human raters, or interviewers can affect the obtained measurements.
If the posttest is more difficult than the pretest, or a different person rates subjects on the rating scales, these factors rather than X (treatment) can cause the difference in the two scores.

5. Differential selection of subjects
Selection bias may be introduced as a result of differences in the selection of subjects for the comparison groups.
If the experimental group is exposed to X (treatment), say a method of teaching spelling, and both groups are tested afterward, the test results may reflect a pre-X difference in the two groups rather than the effect of X. Perhaps the experimental group could spell better than the control group before X was applied.

6. Experimental mortality
If a particular type of subject drops out of one group after the experiment is underway, this differential loss may affect the findings of the investigation.
Suppose that the subjects in the experimental group who received the lowest pretest scores drop out after taking the test. The remainder of the experimental group may show a greater gain on the posttest than the control group, not because of its exposure to X (treatment) but because the low-scoring subjects are missing.
7. Statistical regression
In some educational research, groups are selected on the basis of extreme scores. When this procedure is employed, the effect of what is called 'statistical regression' may be mistaken for the effect of X (treatment).
Statistical regression occurs in educational research due to extraneous factors unique to each experimental group. Regression means, simply, that the subjects scoring highest on a pre-test are likely to score relatively lower on a post-test; conversely, those scoring lowest on a pre-test are likely to score relatively higher on a post-test.

8. Interaction of selection with maturation, history, etc. (a reactive combination of factors)
When the experimental and control groups have the same T1 (pre-test) scores, some other difference between them, such as intelligence or motivation, rather than X (treatment), may cause one of them to get higher T2 (post-test) scores.
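
Statistical regression (factor 7 above) can be seen in a small simulation. The sketch below is an illustration under stated assumptions, not part of the original slides: each simulated subject has a fixed ability plus independent random noise on each test, no treatment is applied, and the function name, sample size and score distribution are all made up.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n=2000, fraction=0.1, seed=0):
    """Simulate a pre-test and a post-test with NO treatment in between."""
    rng = random.Random(seed)
    # Assumed model: stable ability plus independent test-day noise.
    ability = [rng.gauss(50, 10) for _ in range(n)]
    pre = [a + rng.gauss(0, 10) for a in ability]
    post = [a + rng.gauss(0, 10) for a in ability]

    # Select the extreme groups on the basis of pre-test scores only.
    k = int(n * fraction)
    ranked = sorted(range(n), key=lambda i: pre[i])
    lowest, highest = ranked[:k], ranked[-k:]

    return {
        "highest_pre": mean(pre[i] for i in highest),
        "highest_post": mean(post[i] for i in highest),
        "lowest_pre": mean(pre[i] for i in lowest),
        "lowest_post": mean(post[i] for i in lowest),
    }


result = simulate_regression_to_mean()
print(result)
```

In this model the highest pre-test scorers average lower on the post-test and the lowest scorers average higher, even though nothing was done to either group. A real study that selects extreme scorers could mistake this shift for a treatment effect.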
EXTERNAL VALIDITY

Refers to the extent to which inferences can be made about a population on the basis of the experimental findings (to what populations or settings can they be generalized).

Four factors affect external validity:
1. Reactive or interaction effects of pretesting
2. Interaction effects of selection bias
3. Reactive effects of experimental arrangements
4. Multiple treatment interference

Three main categories

There are three general types of pre-experimental design

1. One-Shot Case Study

Treatment   Posttest
X           O

2. One-Group Pretest-Posttest Design

Pretest   Treatment   Posttest
O1        X           O2

3. The Static-Group Comparison

Treatment   Posttest        Treatment   Posttest
X           O          or   X1          O
-           O               X2          O
There are three general types of true experimental design
(R = random assignment to groups)

1. Pretest-Posttest Control Group Design

Group         Pretest   Treatment   Posttest
Exp       R   O1        X           O2
Control   R   O1        -           O2

2. Solomon Four Group Design

Group           Pretest   Treatment   Posttest
Exp 1       R   O1        X           O2
Control 1   R   O1        -           O2
Exp 2       R   -         X           O2
Control 2   R   -         -           O2

3. Posttest Only Control Group Design

Group         Pretest   Treatment   Posttest
Exp       R   -         X           O
Control   R   -         -           O
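
The Pretest-Posttest Control Group Design above can be sketched in code. This is only an illustration: subjects are first assigned to groups at random, then the mean gain (O2 - O1) of the experimental group is compared with that of the control group. The function names and all scores are hypothetical, and a real analysis would also test whether the difference is statistically significant.

```python
import random
from statistics import mean


def randomly_assign(subjects, seed=0):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_gain(pretest, posttest):
    """Mean of O2 - O1 over the subjects of one group."""
    return mean(o2 - o1 for o1, o2 in zip(pretest, posttest))


# Random assignment of eight hypothetical subject IDs.
exp_group, ctl_group = randomly_assign(range(1, 9))

# Invented pretest (O1) and posttest (O2) scores for each group.
exp_pre, exp_post = [50, 55, 60, 45], [65, 70, 72, 61]
ctl_pre, ctl_post = [52, 54, 58, 47], [55, 56, 60, 50]

# Estimated treatment effect: extra gain of the experimental group.
effect = mean_gain(exp_pre, exp_post) - mean_gain(ctl_pre, ctl_post)
print(exp_group, ctl_group, effect)
```

Subtracting the control group's gain removes changes that would have happened without X, such as history, maturation and pretest effects, provided the groups were formed at random.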
Quasi-Experimental Research

• Uses existing (intact) groups as research subjects, without random assignment when forming the groups.
• E.g.: using the Form 4 Mawar class (treatment group) and the Form 4 Melor class (control group) without reshuffling the two classes.
1. Nonequivalent Control Group Design

Group     Pretest   Treatment   Posttest
Exp       O1        X           O2
Control   O1        -           O2

2. Time-Series Experimental Design

Pretest       Treatment   Posttest
O1 O2 O3 O4   X           O5 O6 O7 O8

3. Multiple Time-Series Design

Group     Pretest       Treatment   Posttest
Exp       O1 O2 O3 O4   X           O5 O6 O7
Control   O1 O2 O3 O4   -           O5 O6 O7
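
One simple way to read a time-series design is to compare the average of the observations before X with the average after X, and to check that change against a control series that received no treatment. The sketch below assumes four observations on each side of X; the function name and all scores are invented for illustration.

```python
from statistics import mean


def level_change(observations, n_before):
    """Mean of the observations after X minus the mean of those before X."""
    before = observations[:n_before]
    after = observations[n_before:]
    return mean(after) - mean(before)


# Hypothetical series: four observations, then X (or no X), then four more.
exp_series = [10, 11, 10, 12, 16, 17, 16, 18]   # treatment after the 4th
ctl_series = [10, 10, 11, 11, 11, 12, 11, 12]   # no treatment

exp_change = level_change(exp_series, n_before=4)
ctl_change = level_change(ctl_series, n_before=4)

# Change in the experimental series beyond what the control series shows.
net_change = exp_change - ctl_change
print(exp_change, ctl_change, net_change)
```

Several observations before X show whether scores were already rising, which a single pretest cannot reveal. The control series helps rule out outside events that happened at the same time as X.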
INTRODUCTION: Action Research

• Introduced in 1946 in studies of local community problems. (Lewin, 1946; Lesha, 2014)
• The view was that when the development of action research theory runs parallel with the needs of social change, both can be achieved together to solve social problems.

Adelman C. 1993. Kurt Lewin and the Origins of Action Research

INTRODUCTION: Definitions of Action Research

• A form of collective self-reflective inquiry undertaken by participants in a social situation in order to improve their understanding of social practices and of the situations in which those practices are carried out. (Kemmis & McTaggart, 1988)

• A systematic study by a group of practitioners of attempts to change and improve educational practice through their own practical actions and through reflection on the effects of those actions. (Ebbutt, 1985)

• An approach to improving the quality of education through change, by encouraging teachers to become more aware of their own practice, to be critical of that practice and to be prepared to change it. (McNiff, 1988)


• A study of a social situation in which the participants in that situation act as the researchers, with the intention of improving the quality of their own practice. (Somekh, 1989)

• Classroom research carried out by teachers so that they can explore issues that interest them. (Cochran-Smith & Lytle, 1990)

• Involves practitioners (teachers) trying to improve their teaching through cycles of planning, implementation, observation and reflection. (Kember & Gow, 1992)

KPM. 2008. Buku Manual Kajian Tindakan : Edisi Ketiga

Purposes of the Research

• To improve teaching practice
• To improve or increase understanding of that practice
• To improve the situation in which teaching takes place


INTRODUCTION: Characteristics of the Research

• Carried out by teachers/educators in the context of a teacher's daily work
• Carried out in groups, by forming a critical group
• The researcher is directly involved in improving the practice and in understanding it
• Makes extensive use of self-reflective activities or methods
• Carried out continuously, developing through a four-step self-reflective cycle
• Carried out systematically and rigorously


Research Models

• Action Research Cycle (Kemmis & McTaggart)
• Somekh Model
• McBride & Schostak Model

Action Research Cycle (Kemmis & McTaggart, 1988)

• Plan: plan an action/intervention to address the problem in focus
• Act: carry out the action to address the problem in focus
• Observe: collect and analyse data to evaluate the effectiveness of the action
• Reflect: examine and reflect on oneself to see the strengths and weaknesses of one's own teaching and learning (PdPc)

Continue to the next cycle (if necessary)

The research is carried out following this process.

Procedure:

1. The teacher reflects on a teaching and learning issue to be addressed.
2. The teacher prepares a suitable plan to overcome the problem faced.
3. The teacher carries out the plan and observes the progress of the action throughout the study.

Gelung Kajian Tindakan (Kemmis & McTaggart, 1988)

Types of Action Research

Practical Action Research

• Studies local practice
• Involves individual or team-based inquiry
• Focuses on teacher development and pupil learning
• Implements an action plan
• Produces teachers as researchers

Participatory Action Research

• Studies social issues that constrain individuals' lives
• Emphasises equal collaboration
• Focuses on changes that improve quality of life
• Produces "free" researchers (emancipated researchers)


Planning and Implementing Action Research

• Identify the research focus
• Identify the research objectives
• Plan and carry out the action
• Determine the target group


Reflection: Definition

• Reflection is the process of recalling experiences one has gone through, thinking about them, weighing them up and then evaluating those experiences.
• A systematic inquiry into one's own teaching and learning (PdP) practice in order to improve it and increase its effectiveness.
• Attitudes of reflective thinking (Dewey, 1909; 1933):
– Open-mindedness
– Wholeheartedness
– Responsibility


Reflection

• Evaluate one's own teaching and learning (PdP) practice and improve or enhance it
• Evaluate the suitability of the PdPc strategies and teaching aids (BBB) used
• Analyse the educational values that underlie teaching practice
• Continually examine and clarify the role of personal values related to education
• Explanation of pupils, classroom interaction and the process


Reflection: Methods

• Audio analysis: the teacher records their own PdPc session and analyses it.
• Video analysis: the teacher records a PdPc session, alone or with a colleague's help, and analyses it.
• Discussion: with colleagues about PdPc.
• Journal writing: sharing ideas and exchanging views.
• Document analysis: using recent articles related to the issues


Outcomes of reflection:

Outcome and explanation

New understanding
• The teacher understands how effective the PdPc was (successful or not).
• The teacher can take remedial action or strengthen teaching techniques.

Awareness
• The teacher can carry out environmental scanning to obtain feedback and build awareness of the local situation.

Realisation
• Always thinking about the best way to improve knowledge and skills.

Change in perception
• Changing one's perception to something more

Change in practice
• Learning about the teaching technique that is most


Timing of reflection:

Time and explanation

Early stage
• Analyse the issue
• Analyse one's own understanding of the issue
• Determine the issue
• Identify the intervention
• Make initial decisions about how the action (intervention) process will run

During the study
• Receive spontaneous comments and feedback
• Overcome obstacles with appropriate improvements

After the study
• Evaluate the effectiveness of the intervention
• Synthesise the changes and improvements in the teacher and pupils
• Re-evaluate the effects of the action taken and the new understanding gained
• Draw conclusions
• Plan follow-up action


Data collection:

• Questionnaire
• Analysis
• Test
• Observation


Thank you

Dr Tom Clark & Dr Liam Foster

Department of Sociological Studies | University of Sheffield

Describing and summarising

Contents
• Descriptive statistics
• Nominal and ordinal data
• Exploring associations in categorical data
• Cross tabulations
• Odds ratios
• Working with larger tables
• Working with interval variables
• Measures of central tendency
• Measures of dispersion
• Using interval data
• Rounding up

Once we've planned our research project, worked out the research aims and research questions, and designed our variables, we can now begin to think about analysis. There are two broad methods that we can use to analyse our data: descriptive statistics and inferential statistics. Descriptive statistics provide us with a range of techniques that allow us to summarise and describe data: they can tell us how much variation there is in our answers, for example, or where the central tendencies lie. On the other hand, inferential statistics allow us to infer answers from our data using hypotheses (we'll deal with this in more detail in workbook 4).

3.1 Descriptive statistics

This workbook will introduce you to a range of techniques that will help you to describe your data and help you to choose which techniques are appropriate for the data you have.

By the end of the book you should be able to:
• Calculate frequencies and proportions and correctly identify when to use them
• Understand the presentation of data including pie charts and bar charts and when to use them
• Calculate measures of central tendency
• Calculate measures of dispersion

3.2. Nominal and ordinal data: Frequencies, bar charts, proportions and pie charts

If you are working with nominal or ordinal data then you are working with categorical data. As a general rule of thumb, frequencies, proportions, bar charts and pie charts are particularly useful with this type of data. We shall deal with each in turn.

3.2.1. Frequency tables

Frequency tables allow you to identify the amount, or counts, associated with the given categories of a variable. They allow you to summarise the different answers of a variable clearly so you can answer questions like: How many divorced people are there in our sample? How many people agree with capital punishment? And, what are the most common qualifications workers have in a particular factory?

The frequency distribution is calculated by identifying the categories within a variable and counting the number of appearances they make across your sample. This allows us to see how our data is distributed across the answers to the variable. For example, if we were interested in the socio-economic status of 178 individuals in sample 'A' we could create a frequency chart in the following manner:

Table 3.1. Frequency of socio-economic class in sample 'A'

Occupation                   Frequency
Professional or Managerial   14
Intermediate                 29
Routine or Manual            135
Total                        178

From this table we are able to see that of the 178 people surveyed in sample 'A', the professional or managerial class is the least common value with 14 of the total people surveyed. Routine or Manual workers are most common with a total of 135.

Here's another example:

Table 3.2. Frequency of highest educational attainment in sample 'A'

Educational Level   Frequency
Degree+             35
Intermediate        23
A Levels            45
GCSE                30
Other               30
None                15
Total               178

Here we can see that the most common highest educational achievement in our sample is 'A-levels'; 'no educational achievement' is the least common. It's easy to see that frequency tables are useful for presenting the specific values of our variable simply and clearly.

3.2.2. Bar charts

Bar charts are also commonly used with categorical data and can be used to pictorially represent the values contained in frequency tables. Whilst the exact frequencies are difficult to determine from bar charts, they can be usefully used to visually summarise the distribution of categorical data. Look at Figure 3.1. Using the frequency of current marital status of those individuals in sample 'A' we can see how the heights of the nine bars help us to see how the frequency counts compare with one another. It is easy to see that 'married' is the most frequent group followed by living with a partner. The least frequent group is 'in a civil partnership'.

Figure 3.1. Frequency of current marital status in sample 'A'

Remember how we said that nominal data has no mid-points between the categories? You are either in one or the other – never between. Hence the categories are mutually exclusive and non-continuous. Similarly, remember how we highlighted that the scaling of ordinal data is often difficult to determine and as a result ordinal data should also not be treated as continuous? Look at the gaps between the bars – these gaps indicate that the data we are working with is not continuous. This is why we use bar-charts to describe nominal and ordinal data.

It's also worth remembering that when you are using bar charts, too many categories make the graph very difficult to read – we wouldn't want many more categories than the ones we currently have in the above example.

3.2.4. Proportions

Like frequencies, proportions are another method of summarizing nominal and ordinal data numerically. Whilst there are no hard and fast rules of when to use proportions and when to use frequencies, proportions are really useful if we want to compare two categories that have markedly different sample sizes as they summarise all data on a scale of 0-100. Proportions also tend to be used when we have large sample sizes or large values. Of course, what actually constitutes a large sample is relative to the population, but as a general rule of thumb, proportions are better to use than frequencies if we want to imply that our sample is moving towards being representative of that population. This is often difficult if our sample is small or the sampling strategy is problematic.

If we do decide that proportions are useful, then they are easy to work out - they are the same as the percentages. All we have to do is divide each particular value by the total number in the sample and multiply by 100.

In Table 3.3., for example, 14 divided by 178 is .0787. Multiply this answer by 100 and we have 7.87. Therefore 14 is 7.87% of 178 and the professional or managerial class make up 7.87% of the sample total.

Table 3.3. Frequency of socio-economic class in business 'A' with proportions

Occupation                   Frequency   Proportion
Professional or Managerial   14          7.87%
Intermediate                 29          16.29%
Routine or Manual            135         75.84%
Total                        178         100%

Task: Work out the proportions in the following table:

Table 3.4. Frequency of educational attainment in Business 'A' with proportions

Educational Level   Frequency   Proportion
Degree+             35
Intermediate        23
A Levels            45
GCSE                30
Other               30
None                15
Total               178

Table 3.5. Frequency of education attainment in Business 'A' with proportions and answers

Educational Level   Frequency   Proportion
Degree+             35          19.66%
Intermediate        23          12.92%
A Levels            45          25.28%
GCSE                30          16.85%
Other               30          16.85%
None                15          8.43%
Total               178         100%

3.2.5. A note about sampling

Although proportions are easy to work out, they need to be used with caution. Let's return to a previous example that we looked at in workbook 2.

57% of University of Sheffield students smoke less than 10 cigarettes a week.

How meaningful this claim actually is depends on the size of our sample and our sampling strategy. Let's say that there are approximately 15,000 students at the University of Sheffield. If we interviewed 100 students using a convenience sampling method, our sample is hardly likely to be representative of the student population as a whole. As a result, authoritatively claiming that '57% of University of Sheffield students smoke less than 10 cigarettes a week' would be a little dubious as our data is very limited and unlikely to be representative. We might, for instance, have been asking people outside of the Union building. Given restrictions on smoking indoors, people who smoke now have to go outside in order to smoke. If our interviewer was waiting at a place where smokers congregated, then we would be in danger of over-estimating the number of smokers in the population because our sample was biased. On the other hand, if we had a random sample of 100 students drawn from all facets of the University population where everyone had an equal chance of being involved in the study, then the conclusion would be more valid - but the sample would have to be completely random, and the sample size would still be a little small. Indeed, one of the problems of random sampling is that it often (re)produces natural variation around a mean. Even though the composition of gender in the UK is around 49% male and 51% female, a random sample would not necessarily have 49 men and 51 women in it. This is why stratified sampling is common in many national datasets. This strategy involves managing the random sample so that it conforms to pre-determined characteristics of the wider population. Even where multi-stage stratified sampling is used, many of the secondary datasets have samples that are large enough to use proportions with confidence as they have been developed to be representative of the population.

Of course, and remembering some of the lessons learnt in workbook 2, if we summarised the results in the following format, we'd be on much safer ground. Even then, a good case could be made for


using the frequencies if n (number of respondents) equalled 100.

57% of University of Sheffield students in the present sample (n=100) reported that they smoked fewer than 10 cigarettes a week.

3.2.6. Pie charts

When using proportions, a good way of visually representing the whole picture is using a pie chart. Pie charts help us to see that all the categories make up a “whole” variable. They also provide a sense of which attributes contain more counts. Look at this example from the data on socio-economic class.

Figure 3.2. Socio-economic class of sample ‘A’

Pie charts can be useful when you want the reader to notice that there are more people in one group than the others. It is easy to see how routine or manual workers are by far the largest group in Sample ‘A’, followed by the intermediate group, and then the professional or managerial group. Pie charts help an audience gain a sense of the distribution quickly.

There are, however, some limitations in conveying the relative magnitudes of the differences. Deciphering wedges in a pie chart is more difficult than comparing heights of bars. Angles are typically harder to compare than lengths, so pie charts are not always great for comparing different quantities - especially when they are similar. This is why you should provide the numbers too. Similarly, they tend to be used where the number of categories is small as too many slices can make the chart difficult to read.

Look at the next example using data from Table 3.5:

Figure 3.3. Highest educational achievement in sample ‘A’

Q Why might this pie-chart be problematic?

Unlike Figure 3.2, the problem here is that the relative sizes of the proportions are very similar. As a result, the chart is not adding anything visually beyond what had been given in the table previously. Indeed, it is worth noting that sometimes data is not so difficult to understand and it can often be easily described without the use of these charts. People often make the mistake of trying to include too many unnecessary charts and tables in projects and this can detract from the flow of their work. Although charts can be used as an alternative to tables in order to vary the presentation, try to only use charts to emphasize important points or to demonstrate something that isn’t necessarily clear from your original tables. Try to use them where there is a specific reason to do so.

3.2.7. Using descriptive statistics to look at the dimensions of a sample

Descriptive statistics allow us to summarise data and there are a number of very good reasons why you would employ descriptive statistics to analyse the distributions of nominal and ordinal variables.

However, descriptive techniques are particularly useful when we are exploring the distribution of demographic variables. This may include the variables we have mentioned such as marital status, educational achievement, and social class, but it could also include variables such as age, gender, ethnicity, and employment status. Depending on your project, this demographic information about your sample can be crucial in revealing whether your sample is roughly representative of your population or whether some groups are under-represented.

Let’s go back to Table 3.1:

Table 3.1. Frequency of socio-economic class in sample ‘A’

Occupation                    Frequency
Professional or Managerial    14
Intermediate                  29
Routine or Manual             135
Total                         178

Q Does this suggest that the sample is representative?

The answer, of course, is that it depends on the dimensions of the population. However, if we wanted to generalize the findings beyond sample ‘A’ to the population more generally, then our results may be affected by the large proportion of routine/manual workers in the sample. For instance, did the research take place in an area where there was a strong emphasis on routine and manual forms of employment or where there were limited opportunities to participate in other forms of employment? Without some additional context, the answers to these questions are not clear.

Q What about the data in Table 3.2?

Table 3.2. Frequency of highest educational attainment in sample ‘A’

Educational Level    Frequency
Degree+              35
Intermediate         23
A Levels             45
GCSE                 30
Other                30
None                 15
Total                178

Q Is this sample likely to be representative?

The distribution here is relatively even and there appears to be little cause for alarm. The surprising thing here, however, is that given the relatively high count of routine/manual workers in Sample ‘A’, the ‘Degree+’ seems to be a little high. This may be accounted for by the ‘Professional’ and ‘Intermediate’ categories, but this may be worth further investigation.

However, just because you can do something doesn’t mean that you should include it in your final report. Whilst you should conduct descriptive analyses on all your variables, as (another) rule of thumb, only report the results if they are interesting in some way: for instance, if your sample is likely to over-estimate or under-estimate a particular group, or if the results are counter-intuitive or unexpected in some way. There is also often some merit in reporting expected results, providing you have a good rationale for doing so. However, do try to avoid reporting the results of your descriptive analyses for the sake of it.

3.3. Exploring associations in categorical data

Frequencies and proportions are also crucial in another way as they form the foundations that help us to explore associations, relationships, and effects between variables. Indeed, descriptive techniques can be used to analyse particular variables of interest that emerge from our research rationales.

3.3.1. Cross tabulations

A cross tabulation (often abbreviated as cross tabs) is a way to present the joint distribution of two or more variables. Cross tabs are usually presented as a contingency table. Whereas a frequency distribution provides the distribution of one variable, a contingency table describes the distribution of two or more variables simultaneously.

In its most basic form, a 2 x 2 table, it is a measure of how many people in one group also feature in another. For instance, if we had collected information about the gender of gym members, our findings may look something like this:


Table 3.6. Frequency of membership by gender

Gender of gym member    Frequency
Male                    69
Female                  124
Total                   193

Let’s imagine that we also collected information about the motivations of people who attend the gym and arranged these into two categories: social (seeing friends etc) and physical (getting fit).

Table 3.7. Frequency of reasons for attending the gym

Reason for going to the gym    Frequency
Social                         98
Physical                       95
Total                          193

While these tables may be interesting in themselves, they do not tell us anything about the relationship between the two variables.

In order to find out how gender and reasons for attending are related, we need to cross-tabulate the data in the form of a contingency table.

To do this, all you need to do is count how many women ticked the social box and how many ticked the physical box.

If we do the same for men, we have a table with 2 categories on the horizontal axis, and 2 categories on the vertical axis.

This is what we mean by a 2 x 2 table. This is how it might look if we did so.

Table 3.8. Reason for visiting the gym by gender

                    Reasons for going to the gym
Gender      Social      Physical      Total
Male        25          44            69
Female      73          51            124
Total       98          95            193

By doing this we can now see that the reasons for visiting the gym tended to differ according to the gender of the visitor. It was not possible to see this from the frequency tables alone. Indeed, cross tabulations allow us to go deeper into the data and compare and contrast different groups and answers. They are a fundamental tool for quantitative social research.

3.3.2. Odds Ratios

So, you should now understand what is meant by a cross tabulation, and you should also understand the principles of working out proportions. Good, because when we combine these two techniques we can do something rather clever – we can utilize odds ratios to help us interpret our table and assess whether men were more likely to visit the gym for social or physical reasons, and how this compares with women. Let us return to our gender and gym membership example.

Odds ratios are a useful way of standardizing summary scores when we have a two by two table. They essentially summarise the odds of a particular event occurring for one group compared to that of another group. Odds ratios are particularly useful in situations where the total numbers in your particular groups are uneven as they provide a way to standardize scores so you are comparing like with like. In the next 3 sections, it is worth keeping Table 3.8 in sight as it will help you work out what is being compared with what and where all the numbers are coming from.

Measuring chance

Firstly, however, we need to know the difference between the chance of an event occurring, and the odds of it doing so. This is perhaps best seen through an example.

We can see from Table 3.8 that 25 men out of 69 went to the gym for social reasons. This is effectively the chance that men went to the gym for social reasons. The likelihood of men going to the gym for social reasons is 25 in 69 – or 36% (25/69 = .36).

This is, however, different from the odds.

Measuring odds

We can also see from the table that the odds of men going to the gym for social reasons, as opposed to physical ones, are 25 to 44. For every 25 men who went to the gym for social reasons, 44 did not. 25/44 = .57 - effectively .57 to 1 or 57%. This value is just under 6 to 10, and is the same as saying that for every 16 men, nearly 6 will go to the gym for social reasons whereas 10 will go for physical reasons. So, the chance of men going to the gym for social reasons is 36%. This is just under 4 in 10. The odds that men will go to the gym for social reasons, however, are 57% - or just under 6 to 10.

It is important to note the phrasing here. Under 4 in 10 refers to the number of men in total that will go to the gym for social reasons. The use of the number 10 is relatively arbitrary, but as a round(ish) number it gives a sense of the data at a glance.

On the other hand, the odds of nearly 6 to 10 refer to the number of men that will go to the gym for social reasons in comparison to those going for physical reasons. Again, 10 is used as a relatively accessible method of summarizing the data at a glance. The difference between in and to is subtle, but it is important in understanding the difference between chance and odds.

T Calculate the chance and odds that women will go to the gym for social reasons. Summarise the results.

The chance of women going to the gym for social reasons is 73 in 124 - or 59%. For every 10 women, nearly 6 will go to the gym for social reasons. The odds of women going to the gym for social reasons, however, are 73 to 51 - or 143%. For every 17 women, around 10 will go to the gym for social reasons.

So, it looks like men are less likely to go to the gym for social reasons, but can we tell by how much?
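The difference between chance ("in") and odds ("to") can be checked with a few lines of Python. This is only a minimal sketch using the counts from Table 3.8; the names `table_3_8`, `chance` and `odds` are made up for the illustration and are not part of the workbook. The final step divides the odds for women by the odds for men, which is the group-to-group comparison that an odds ratio makes.

```python
# Counts copied from Table 3.8 (reason for visiting the gym by gender).
table_3_8 = {
    "men": {"social": 25, "physical": 44},
    "women": {"social": 73, "physical": 51},
}


def chance(counts, event):
    """Cases with the event divided by everyone in the group ('in')."""
    return counts[event] / sum(counts.values())


def odds(counts, event):
    """Cases with the event divided by cases without it ('to')."""
    with_event = counts[event]
    without_event = sum(counts.values()) - with_event
    return with_event / without_event


for group, counts in table_3_8.items():
    print(f"{group}: chance {chance(counts, 'social'):.2f}, "
          f"odds {odds(counts, 'social'):.2f}")

# Odds ratio: the odds for women divided by the odds for men.
odds_ratio = odds(table_3_8["women"], "social") / odds(table_3_8["men"], "social")
print(f"odds ratio (women vs men): {odds_ratio:.2f}")
```

Working from the unrounded odds gives an odds ratio of roughly 2.52; dividing the rounded figures 1.43 and .57 by hand gives 2.51, so small differences of this kind are just rounding.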


Well, yes, this is exactly what an odds ratio measures. It can be calculated quite simply as a comparative measure of women going to the gym for social reasons as opposed to men that go for social reasons. Remember that for women the odds of going to the gym for social reasons rather than physical ones were 1.43. For men, however, the odds were .57.

To calculate the odds ratio all we have to do is divide the odds for women by the odds for men.

In this case, the odds ratio is 1.43 to .57, or 1.43/.57, which is 2.51. This means that women have two and a half times the odds of going to the gym for social reasons as men do.

However, this is not the same as saying that women are 2.5 times more likely to go to the gym for social reasons than men are. Indeed, the odds ratio is a measure that compares odds, not chance.

To work out this likelihood, we need to compare the chance between men and women directly. The chance for women is 59% (73 out of 124) and the chance for men is 36% (25 out of 69). If we divide 59 by 36 the answer is roughly 1.6, so women are around 1.6 times as likely to go to the gym for social reasons as men are. We can also look at who makes up the group that attends for social reasons: 25 of the 98 are men (25.5%) and 73 are women (74.5%). Effectively, for every 4 people that go to the gym for social reasons, 3 will be women, although this partly reflects the fact that there are more women than men in the sample.

We can summarise this quite effectively:

According to this sample, the chance of men going to the gym for social reasons is 36%. This is under 4 in 10. The odds of men going to the gym for social reasons rather than physical ones are 57% - or just under 6 to 10. However, whilst the chance of women going to the gym for social reasons is 59% - approximately 6 in 10 - the odds of women going to the gym for social reasons rather than physical ones are 143%. For every 17 women, around 10 will go to the gym for social reasons. This means that women have two and a half times the odds of going to the gym for social reasons that men have, and are around 1.6 times as likely to do so. Effectively, for every 4 people that go to the gym for social reasons, 3 will be women.

3.3.3. Working with larger tables

Let us suppose that I had the following research rationale:

Reports of racist abuse – be it physical or verbal – are still all too common in Britain. However, it remains to be seen how these reports are interpreted by particular ethnic groups and how they influence their fear of racist abuse. Using data from the British Crime Survey (2000), this study will explore the relationship between ethnicity and fear of racist abuse by examining the responses to the question ‘how worried are you about being physically attacked because of skin colour, ethnic origin or religion?’

Q Do you think this is a good rationale?

Well, as a sketch it is not bad. However, remember our guidelines for a good rationale? They suggest the following:

• Why is the project an interesting thing to do?
• What has been said in the area before?
• What has not been said in the area before and why is it important that this issue is addressed?
• What will this project do?

Whilst it does provide some context as to why this is an interesting topic to explore, it doesn’t do this with much force – and it doesn’t really attempt to link the problem to the literature. Hence, the knowledge gap that is being created is not really very convincing. Admittedly, the project is not that original in scope, but it could be improved with a little further background reading.

With cases like the murders of Stephen Lawrence and Lee Rigby – not to mention incident-packed marches by the English Defence League, as well as rises in reports of Islamophobia since 7/11 - race-related violence is often a headline-grabbing issue within the UK. Indeed, reports of racist abuse - be it physical or verbal - are still all too common in Britain.

Whilst fear of the outsider, and now the insider, is not new in the context of race relations (see Solomos, 2000), on-going debates concerning immigration have only served to focus popular debates on the dangerous ‘other’.

However, it remains to be seen how these reports are interpreted by particular ethnic groups and how they influence their fear of racist abuse. Using data from the British Crime Survey (2000), this study will explore the relationship between ethnicity and fear of racist abuse by examining the responses to the question ‘how worried are you about being physically attacked because of skin colour, ethnic origin or religion?’

Q We already have a good measure of ethnicity (Q2.4), but how might we measure ‘fear of racist abuse’?

Well, the Crime Survey for England and Wales has asked this question on an annual basis for a number of years. They use the following measure:

Q3.1. How worried are you about being subject to a physical attack because of your skin colour, ethnic origin or religion? Would you say you were?

A. Very worried
B. Fairly worried
C. Not very worried
D. Not at all worried
E. Not applicable

The following table summarises the frequency of the answers by ethnicity.

Table 3.9. Fear of racist abuse by ethnic group

                         Worried about being physically attacked because of skin colour, ethnic origin or religion
Ethnicity                Very worried    Fairly worried    Not very worried    Not at all worried    Not applicable    Total
White                    877             1170              4889                9482                  1911              18329
All black groups         68              61                88                  45                    1                 263
Indian                   89              61                64                  29                    0                 243
Pakistani/Bangladeshi    58              44                33                  25                    0                 160
Other groups             73              65                122                 87                    8                 355
Total                    1165            1401              5196                9668                  1920              19350


As you can see there is a lot of information contained in this table and at first glance it might appear to be difficult to make sense of. However, an effective use of descriptive statistics can help us to make sense of the data.

Q Can you analyse the table?

Our first point might be:

Overall, the vast majority of people are ‘not very worried’ or ‘not at all worried’ about being attacked because of their skin colour, ethnic origin or religion.

However, the first thing to note here is that although our sample is large (n = 19350) there are a lot more people in the ‘white’ category than all of the others put together and this is having a great deal of influence on the overall totals. Therefore it would not be entirely correct to assume that most people are not worried about being physically attacked because of skin colour, ethnic origin or religion, as the table demonstrates quite clearly that the fear of a racially motivated attack varies by ethnicity.

It would also be worth noting in the methodology sections of our report that although the British Crime Survey does go to great lengths to ensure that the people surveyed are as representative as is possible, the ‘all black’, ‘Indian’, ‘Pakistani/Bangladeshi’, and ‘Other groups’ categories are so small that there may be a concern that the sample will not generalize to overall populations. However, even though the sample sizes within groups are small, because the BCS does make an effort to systematically sample target populations, the results are likely to be indicative of general trends.

The differences in the number of people in the ethnic group categories also have another knock-on effect – they make the frequencies between the groups difficult to compare directly.

One way of overcoming this difficulty is to convert the frequencies to within group proportions, that is, working out the proportions of each cell within a particular category, such as ‘white’. This will help us to see the differences in the distribution of the answers between categories more easily. To do this we divide each cell on our scale (for example 877 in the ‘very worried’ cell) by the total number of respondents within a particular category (for the white category this is 18329), and multiply by 100 (4.78%). This will transform the data into percentages and make the categories easier to compare.

T Convert the table into ‘within group proportions’.

Table 3.10. Proportions of fear of racist abuse by ethnic category – within row percentages

                         Worried about being physically attacked because of skin colour, ethnic origin or religion
Ethnicity                Very worried    Fairly worried    Not very worried    Not at all worried    Not applicable    Total
White                    4.8%            6.4%              26.7%               51.7%                 10.4%             100%
All black groups         25.9%           23.2%             33.5%               17.1%                 0.4%              100%
Indian                   36.6%           25.1%             26.3%               11.9%                 0%                100%
Pakistani/Bangladeshi    36.3%           27.5%             20.6%               15.6%                 0%                100%
Other groups             20.6%           18.3%             34.4%               24.5%                 2.3%              100%
Total                    6%              7.2%              26.9%               50%                   9.9%              100%

Q Can you see how each category is now on a scale of 0-100? This makes the differences between groups much clearer.

It is worth noting that you can calculate the percentages of a table in different ways. In the above example, we have calculated them as within row percentages. The percentage is calculated according to the row total. However, we could also calculate the percentages by the column totals if we wanted to. Some tables will even calculate the percentages according to the table total. The general rule of thumb is to calculate percentages in the direction of the variable that you think is having an impact on a particular measure. In the above example we are interested in the differences between ethnic groups on a measure of fear of racist attack. We think, therefore, that ethnic group might have an impact on fear. Hence we calculate the percentages according to the groups of interest. In this case it is a within row percentage.

When we look at the within row percentages, we can go through each category in turn and describe the distribution of the data, pulling out the most interesting material. Remember, we don’t want to report everything, we just want to highlight the most noteworthy information that is relevant to our research aim and research questions.

Our second point might be something along the lines of:

White people are less likely to be worried about being physically attacked because of skin colour, ethnic origin or religion than those in the non-white groups.

We might want to represent this visually in the form of two pie charts. Of course, we could present pie charts of each and every category, but this would probably interrupt the flow of our analysis and would probably be unnecessary as it wouldn’t add information that isn’t already contained in our table. However, a pie chart demonstrating the difference between ‘white’ and ‘non-white’ would require us to add together the answers in our ‘all black’, ‘Indian’, ‘Pakistani/Bangladeshi’, and ‘other groups’ categories to create a new variable – ‘non-white’. This would be new and interesting material that isn’t contained in our original tables.

Figure 3.4. Fear of racist abuse: Non-white ethnic groups

Figure 3.5. Fear of racist abuse: White ethnic group
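The conversion to within row percentages, and the combined ‘non-white’ category needed for the two pie charts, can be sketched in Python. The frequencies are those in Table 3.9; the variable and function names (`table_3_9`, `within_row_percentages`, `non_white`) are assumptions made for this illustration only.

```python
# Frequencies from Table 3.9; each list runs from 'very worried' to 'not applicable'.
CATEGORIES = [
    "very worried",
    "fairly worried",
    "not very worried",
    "not at all worried",
    "not applicable",
]
table_3_9 = {
    "White": [877, 1170, 4889, 9482, 1911],
    "All black groups": [68, 61, 88, 45, 1],
    "Indian": [89, 61, 64, 29, 0],
    "Pakistani/Bangladeshi": [58, 44, 33, 25, 0],
    "Other groups": [73, 65, 122, 87, 8],
}


def within_row_percentages(row):
    """Divide each cell by its row total and multiply by 100."""
    row_total = sum(row)
    return [100 * cell / row_total for cell in row]


# Combined 'non-white' row: add the four non-white groups together cell by cell.
non_white_rows = [row for group, row in table_3_9.items() if group != "White"]
non_white = [sum(cells) for cells in zip(*non_white_rows)]

for label, row in [("White", table_3_9["White"]), ("Non-white", non_white)]:
    percentages = within_row_percentages(row)
    summary = ", ".join(
        f"{name}: {value:.1f}%" for name, value in zip(CATEGORIES, percentages)
    )
    print(f"{label} (n = {sum(row)}): {summary}")
```

Because each row is divided by its own total, the 1021 non-white respondents and the 18329 white respondents end up on the same 0-100 scale, which is what makes the two pie charts comparable.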


Using these pie charts, it’s easy to see that the ‘very worried’ and ‘fairly worried’ categories are much bigger in the non-white ethnic groups.

Although you may have found more, these are the other main points to note from our data.

Nearly 60% of the ‘other groups’ category are ‘not very worried’ or ‘not at all worried’ about being a victim of a physical attack. Although this is the highest proportion of any non-white ethnic group, it is still much lower than the white group. Nearly 40% of the ‘other group’ are still ‘fairly worried’ or ‘very worried’ about the prospect of an attack.

The ‘Pakistani/Bangladeshi’ group demonstrate the most concern about being physically attacked because of skin colour, ethnic origin or religion. Nearly 64% suggest they are ‘very worried’ or ‘fairly worried’ compared with just over 11% of the white group.

The ‘Indian’ group are also the least likely group to be ‘not at all worried’ with just under 12% suggesting they have no concern. Whilst over 15% of the Pakistani/Bangladeshi group fall into the same category, in comparison, over 51% of the white group suggest they are ‘not at all worried’.

Table 3.10. Fear of a racist attack by ethnicity - within row percentages

                         Worried about being physically attacked because of skin colour, ethnic origin or religion
Ethnicity                Very worried    Fairly worried    Not very worried    Not at all worried    Not applicable    Total
White                    4.8%            6.4%              26.7%               51.7%                 10.4%             100%
All black groups         25.9%           23.2%             33.5%               17.1%                 0.4%              100%
Indian                   36.6%           25.1%             26.3%               11.9%                 0%                100%
Pakistani/Bangladeshi    36.3%           27.5%             20.6%               15.6%                 0%                100%
Other groups             20.6%           18.3%             34.4%               24.5%                 2.3%              100%
Total                    6%              7.2%              26.9%               50%                   9.9%              100%

Once we’ve analysed the table, our work still isn’t over. We would still need to think about why the results are distributed in this fashion and account for our findings; we have to try to explain the patterns in our data and not just describe them. The data does not just speak for itself. Why are the Indian group least likely to be ‘not at all worried’? Why are the Pakistani/Bangladeshi group most likely to be ‘very worried’ or ‘fairly worried’? What is it about the ‘other groups’ category that makes them the most likely of the non-white categories to be ‘not very worried’ or ‘not at all worried’? The answers to these questions probably lie in the way the groups have been measured, the particular histories of the ethnic groups, and the identities, communities, and cultures of the particular groups. This would all have to be accounted for in your final analysis and you could find out much from both the methodological literature on the BCS and the more specific literature on ethnicity, racism, and the ‘fear of crime’.

Of course, we wouldn’t present our results like we have above, but our analysis would form the basis of our results sections. It might run something like this.

Descriptive analysis of Table 3.10 reveals that the vast majority of people are ‘not very worried’ or ‘not at all worried’ about being attacked because of their skin colour, ethnic origin or religion. However, this is mainly due to the large number of people in the ‘white’ category who show a marked tendency to either be ‘not at all worried’ or ‘not very worried’ about the prospect of being physically attacked due to their skin colour, ethnic origin or religion.

Over three quarters of the white group fall into these categories. This is not the case with the non-white groups and the fear of a racially motivated attack varies by ethnicity. Indeed, Figures 3.4 and 3.5 demonstrate quite clearly that white people are less likely to be worried about being physically attacked because of skin colour, ethnic origin or religion than non-whites.

The ‘Indian’ and ‘Pakistani/Bangladeshi’ groups are the ethnic groups that are most likely to demonstrate concern about being physically attacked. Over 36% in each group are ‘very worried’. Indeed, the ‘Indian’ group are also the least likely group to be ‘not at all worried’ with just under 12% suggesting they have no concern.

In comparison, over 51% of the white group suggest that they are ‘not at all worried’. Nearly 64% of those surveyed in the Pakistani/Bangladeshi group suggest that they are ‘very worried’ or ‘fairly worried’. This compares to just over 11% of the white group.

Nearly 60% of the ‘other groups’ category are ‘not very worried’ or ‘not at all worried’ about being the victim of a physical attack. Although this is the highest proportion demonstrated by any of the non-white ethnic groups, it is still much lower than the white group. Indeed, nearly 40% of the ‘other group’ are still ‘fairly worried’ or ‘very worried’ about the prospect of an attack.

3.4. Working with interval variables: Measures of central tendency and measures of dispersion

3.4.1. Measures of central tendency

Whenever you collect data at the interval level, you will not get the same ‘answer’ each time you measure the variable. If you wanted to record how many crimes are committed each week in Sheffield, the answers will vary from week to week. They might be similar, but they will not be exactly the same every week. There will always be variation in recorded crime from week to week and the ‘answers’ will be distributed across a range of scores. Measures of central tendency attempt to summarise the centre of these distributions of ‘answers’. The overall aim is to produce a figure which best represents a mid-point in the data. There are three different ways of doing this: the mean, median and mode.

The mean

The mean is the most familiar of all of the measures of central tendency and is sometimes referred to as the average. In order to calculate the mean it is necessary to add together all of the values in a batch and divide them by the number of values.

T Work out the mean for this distribution of age in a seminar class of 7 students.

18, 18, 18, 19, 20, 20, 20

The mean would be the total of the ages added together (133) divided by the total number of students (7) giving an answer of 19.

However, there are some weaknesses associated with the use of the mean. In particular it is disproportionately affected by extreme values in a distribution.

T Let us say that one of the 20 year olds in our seminar group was to change seminar groups and was replaced by a 56 year old student so the distribution was as follows:

18, 18, 18, 19, 20, 20, 56

Work out the mean.
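Once you have attempted the task by hand, the arithmetic can be checked with a short Python sketch. It uses `mean` from Python's standard `statistics` module; the list names are simply labels chosen for this example.

```python
from statistics import mean

# The two seminar-group age distributions from the tasks above.
original_ages = [18, 18, 18, 19, 20, 20, 20]
changed_ages = [18, 18, 18, 19, 20, 20, 56]  # one 20 year old replaced by a 56 year old

for label, ages in [("original", original_ages), ("changed", changed_ages)]:
    # Mean = sum of the values divided by the number of values.
    print(f"{label}: total {sum(ages)}, n = {len(ages)}, mean {mean(ages):.1f}")
```

Swapping a single value moves the total from 133 to 169, which is enough to shift the mean by more than five years even though six of the seven students are unchanged.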


The mean now becomes 24.1 (169 divided by 7). We can see how the introduction of one student considerably older than the rest has dragged the mean up notably, even though the other students are all under 21 – this one outlier has skewed the mean and it no longer describes the central tendency of the distribution very well.

We call a very high or very low value that distorts the mean an outlier. So in this case 56 is an outlier. We often refer to an outlier skewing the finding. Outliers can positively or negatively skew the data. Outliers positively skew the data if they pull the mean to the right (towards higher values), and they negatively skew the data if they pull it to the left (towards lower values).

The median

The median is the middle value in a distribution when the scores are ranked in order of size. Simply line up the scores in ascending order and take the middle value – if you have an even number of scores take the two numbers either side of the centre, add them and divide by two.

T Work out the median age for our original seminar group

18, 18, 18, 19, 20, 20, 20

The median age is 19. This is the value that splits the group into two with three scores above it and three scores below it.

T One extra student has now joined the group – he’s 21. Work out the median now.

The median would be between 19 and 20 and is easily calculated by adding 19 and 20, and dividing by two. This gives a median of 19.5.

Unlike the mean, the median is resistant to outliers or extreme values. When one of the 20 year olds was replaced by the 56 year old in our original example,

words it is a less precise measure of central tendency.

The mode

The mode is the easiest measure of central tendency to calculate. It is simply the value which occurs most in a distribution.

T In the example above when the 56 year old joined the class, our age distribution is:

18, 18, 18, 19, 20, 20, 56

What is the mode?

The most common age is 18 – this is the modal group.

T What is the modal group before the 56 year old joined the group?

18, 18, 18, 19, 20, 20, 20

In this instance, we can see that there are two modal values - 18 and 20 - this is an example of a bimodal distribution.

Again, the mode is resistant to outliers; however, it does not take into account all the other data - even less so than the median - and as a result it is not a sensitive measure of central tendency.

The mode is the only measure of central tendency that can be used with nominal data, and it can also be used with data at the ordinal level.

3.4.2. Measures of dispersion

In our example of the distribution of age in our seminar group we saw that both the mean and median of the 7 students was 19 (18, 18, 18, 19, 20, 20, 20). Let’s say we wanted to conduct some further re-

T Work out the mean and median.

So in order to work out the mean age of the family we would add the 7 ages together (2, 2, 8, 19, 20, 40, 42) and divide them by seven. The median is the fourth number in the series - 19. Interestingly the mean and median again turn out to be 19.

However, we can see here that these ages are far more spread out than the figures for the ages of students in the seminar groups. Therefore we need to do more than just measure central tendency if we want to describe the two datasets better. In other words, we need to measure the dispersion of the distribution in order to find out the spread of the data. Fortunately, there are a number of measures which can tell us more about how dispersed or spread out the values in a distribution are. The measures of dispersion we will introduce you to in this workbook are the range, the inter-quartile range, and the standard deviation. However it is worth noting that there are no appropriate measures of dispersion for nominal variables and when dealing with ordinal data we are restricted to range and inter-quartile range. The standard deviation is only calculated for interval/ratio variables.

The range

The range is calculated by subtracting the smallest value from the largest - it is that simple!

Let’s suppose that I wanted to examine the marks from my Research Methods module in more detail to see if the students had done better from one year to the next.

Consider the two sets of results:

Year one: 85 52 64 51 29 59 47 58 42 37 66 51 (mean 53.4 median 51.5)
Year two: 62 42 59 45 57 51 46 47 56 55 48 51 (mean 51.6 median 51)

Clearly the similarity of the mean and median for each year hides the fact that the range is very different.

Q What conclusions can you draw from this?

Well, this data is actually quite intriguing. Students who took the module the first time it ran tended to do either a little better or a little worse than those in year two. In fact, one did really well and one did a lot worse than average. When the course ran for the second time, all the marks were much more closely bunched around the mean. So whilst no-one did really well, no-one did really badly either.

Indeed, although the range can be a useful tool to help us describe our data, it is a rather crude measure of dispersion as it is totally dependent on the two most extreme values - hence it is rather susceptible to outliers. We need to treat the range with caution if these differ substantially from the rest of the data.

The inter-quartile range

The inter-quartile range is used to overcome the main flaw of the range by eliminating the most extreme scores in the distribution. Essentially, the IQR is the range of the middle half of the distribution. The following diagram depicts the idea of the inter-quartile range.

[Diagram: Inter-quartile range. The distribution is divided into quarters (1/4, 2/4, 3/4, 4/4) with Q1, Q2 and Q3 marking the boundaries between them.]

Statistically speaking, the IQR is the difference between the first (Q1) and third (Q3) quartile. Q1 lies at the midpoint between the first and second quarter of the distribution, and Q3 lies at the midpoint between the third and fourth quarters. Q2 is the median.
it gave us a distribution of: 18,18,18,19,20,20,56. T
search on the average age of the household that the
Here, despite the presence of an outlier the median student lived in before University. The first person Work out the range in each example. 1/4 2/4 3/4 4/4
would remain at 19 because it doesn’t take into ac- we chose to ask was one of the 20 year olds. They Year one: Maximum mark 85, minimum
count the values of the data. However, whilst this had four younger siblings aged 2, 2, 8 and 19 and a mark 29 – the range = 56 In some cases Q1 and Q3 will fall quite naturally on
does make it resistant to outliers, because it doesn’t mum and dad aged 40 and 42. the 25% and 75% percentile in your distribution. For
Year two: Maximum mark 62, minimum instance, if your distribution is 1, 2, 3, 4, 5, 6, 7, 8, 9,
take into account the actual values within the distri- mark 42 – the range = 20
bution it lacks the sensitivity of the mean – in other 10, 11, then Q1 is 3, Q2 is 6, and Q3 is 9.

15. 16.
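The median, mode and range described in this part of the workbook are quick to check in code. The following is a minimal sketch in Python using only the standard library; the helper name `value_range` and the variable names are illustrative assumptions rather than anything defined in the workbook.

```python
from statistics import median, multimode


def value_range(values):
    """Range: the largest value minus the smallest."""
    return max(values) - min(values)


# Ages of the original seminar group, and the same group after one
# 20 year old is replaced by a 56 year old (the outlier example).
ages = [18, 18, 18, 19, 20, 20, 20]
ages_with_outlier = [18, 18, 18, 19, 20, 20, 56]

print(median(ages))               # middle value of the ranked scores
print(median(ages + [21]))        # even number of scores: mean of the two middle values
print(median(ages_with_outlier))  # the outlier does not move the median
print(multimode(ages))            # every value tied for most common
print(multimode(ages_with_outlier))

# Research Methods marks for the two years.
year_one = [85, 52, 64, 51, 29, 59, 47, 58, 42, 37, 66, 51]
year_two = [62, 42, 59, 45, 57, 51, 46, 47, 56, 55, 48, 51]
print(value_range(year_one), value_range(year_two))
```

Run as written, this shows the median staying at 19 when the 56 year old joins, two modes (18 and 20) in the original group, and ranges of 56 and 20 for the two years.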

In other cases, it is a little more complicated - but not much. For instance in the distribution 1, 2, 3, 4, 5, 6, 7, 8, 9, Q2 is 5 with Q1 being the midpoint between 2 and 3, which is 2.5.

Let us return to our previous example.

Year one: 85 52 64 51 29 59 47 58 42 37 66 51
Year two: 62 42 59 45 57 51 46 47 56 55 48 51

To work out the inter-quartile range we first need to rank the scores in ascending order.

Year one: 29 37 42 47 51 51 52 58 59 64 66 85
Year two: 42 45 46 47 48 51 52 55 56 57 59 62

Now we need to separate them into four quarters - we have 12 data points so we simply divide 12 by 4. There will be 3 data points in each quarter.

Year one: 29 37 42 | 47 51 51 | 52 58 59 | 64 66 85
Year two: 42 45 46 | 47 48 51 | 52 55 56 | 57 59 62

In year one, it is possible to see that the median, or Q2, is the point between 51 and 52 - effectively 51.5. The mid-point between the first and second quarter - Q1 - is the average of 42 and 47.

So: Q1 = (47 + 42)/2 = 44.5

Task: Calculate the mid-point between the third and fourth quarter (Q3).

Q3 = (64 + 59)/2 = 61.5.

Now all that remains to be done is to subtract Q1 from Q3.

The IQR for year one is 17.

Task: Now calculate the IQR for year two.

The IQR for year two is 10 (56.5 - 46.5).

Now that we have eliminated the outliers in our distribution, it is possible to see that the central distributions for year one and year two are more similar than the range might initially suggest. Whilst the limited sample here is likely to make any conclusions problematic, interrogation of the IQR might suggest that part of the difference in the range of year one and year two can be accounted for by the presence of individual differences in year one where an excellent student scored particularly highly, and a weak one quite low. However, there is still a difference of 7 between the IQRs of year one and year two. As such, there remains some suggestion that students were more likely to be better and worse in year one than they were in year two.

Indeed, unlike the range, the inter-quartile range is a useful measure of dispersion as it has the excellent property of not being too sensitive to outlying data values. That is, it is a resistant measure. However, it doesn’t take all the values into account, therefore it does lack sensitivity, and, like the median, it does suffer from the disadvantage that its calculation involves sorting the data. This can be very time-consuming for large samples when a computer is not available to do the calculations.

Mean Absolute Deviation (MAD)

Like the range and the inter-quartile range, the mean absolute deviation gives a measure of the spread of our distribution. This time however, the mean, rather than the median, is used as the point of focus. In fact, the mean absolute deviation is actually the average distance from the average! Unlike the range and the IQR, however, it takes all of the data in the distribution into account, not just select parts of it. Of course, this can make it particularly sensitive to outliers, but it does also make it a much more precise measurement of dispersion too because we are no longer ignoring parts of the data. Fortunately, it is easy to calculate.

Before we start, we need to understand what is meant by an absolute value. Absolute values are those that are measured from a fixed point. This means that the absolute distance from that point is the issue of interest, not whether the value is above or below it. For instance, the absolute value of 7 is 7. The absolute value of -7 is also 7. In absolute terms, both 7 and -7 are 7 away from 0. Got that?

An even simpler way of saying this is that absolute values are effectively permission to ignore any minus signs. This is really useful when working out how much individual scores deviate from the mean as if we were to do this without using absolute values, the negatives would cancel out the positives and we’d always end up with 0. The absolute mean deviation, therefore, is a collective summary of how the scores in our distribution differ, absolutely, from the mean.

Effectively, all we need to do to work out the mean absolute deviation is to calculate the difference between each score and the mean, add those differences up, and divide by the number of scores.

Remember that our mean was 53.4 so the first thing we need to do is to work out how far each value is away from that mean.

So, for year one:

Table 3.11: Absolute deviation scores for year one

Value                 29    37    42    47    51    51    52    58    59    64    66    85
Mean                53.4  53.4  53.4  53.4  53.4  53.4  53.4  53.4  53.4  53.4  53.4  53.4
Absolute deviation  24.4  16.4  11.4   6.4   2.4   2.4   1.4   4.6   5.6  10.6  12.6  31.6

Now all that remains is to take the mean of the absolute differences - which means adding the values and dividing the result by the number of scores.

Task: Calculate the absolute mean deviation.

The answer is 10.82.

Task: Calculate the mean absolute deviation for year two.

Table 3.12: Absolute deviation scores for year two

Value                 42    45    46    47    48    51    52    55    56    57    59    62
Mean                51.6  51.6  51.6  51.6  51.6  51.6  51.6  51.6  51.6  51.6  51.6  51.6
Absolute deviation   9.6   6.6   5.6   4.6   3.6   0.6   0.4   3.4   4.4   5.4   7.4  10.4

The answer is 5.17.

So now we have our mean absolute deviation for each sample, we have to interpret them: data never speaks for itself. Simply by eyeballing the data we can quickly determine that the deviation for year one is more than double what it is for year two. This means that the data is much more spread in year one than it is in year two. Actually, this isn’t too surprising as we already know that the range is quite a bit higher than the inter-quartile range for year one. Comparison of the mean absolute deviation confirms our original conclusion: students were more likely to be better and worse in year one than they were in year two.
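The inter-quartile range and mean absolute deviation calculations worked through above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names are assumptions, and the quartile rule used (Q1 and Q3 as the medians of the lower and upper halves of the ranked scores, leaving the middle score out of both halves when the number of scores is odd) is one reading of the midpoint approach in the text. Statistical packages often use slightly different quartile rules and may give different answers.

```python
from statistics import mean, median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the midpoint approach.

    Q2 is the median. Q1 and Q3 are the medians of the lower and upper
    halves of the ranked scores; with an odd number of scores the middle
    score is left out of both halves.
    """
    ranked = sorted(values)
    half = len(ranked) // 2
    lower = ranked[:half]
    upper = ranked[-half:]
    return median(lower), median(ranked), median(upper)


def inter_quartile_range(values):
    """Q3 minus Q1: the range of the middle half of the distribution."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


def mean_absolute_deviation(values):
    """Average absolute distance of each score from the mean."""
    centre = mean(values)
    return sum(abs(v - centre) for v in values) / len(values)


year_one = [85, 52, 64, 51, 29, 59, 47, 58, 42, 37, 66, 51]
year_two = [62, 42, 59, 45, 57, 51, 46, 47, 56, 55, 48, 51]

for label, marks in (("Year one", year_one), ("Year two", year_two)):
    print(label, quartiles(marks), inter_quartile_range(marks),
          round(mean_absolute_deviation(marks), 2))
```

For year one this reproduces Q1 = 44.5, Q3 = 61.5, an IQR of 17 and a mean absolute deviation of 10.82. For year two the IQR is 10; the mean absolute deviation lands within a hundredth or two of the 5.17 quoted above, because the sketch works from the unrounded mean and the marks as first listed.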
The standard deviation (SD)

Like the range, the inter-quartile range, and the mean absolute deviation, the standard deviation gives a measure of how widely dispersed the values in a distribution are around the mean. Unlike the range and the IQR, however, it is sensitive to all of the data in the distribution, not just select parts of it. Of course, like the mean absolute deviation this makes it particularly sensitive to outliers, but it does also make it a much more precise measurement of dispersion too as we are no longer ignoring parts of the data.

The standard deviation is relatively easy to work out. You can do it in 6 simple steps.

1. Work out the mean
2. Calculate how far each individual value is from the mean
3. Square these values (this gets rid of the minus signs so the negative values don’t just cancel out the positive ones)
4. Add up these squared values
5. Divide this by the number of values minus 1
6. Take the square root of this answer

We will use the marks of year one to run through an example. Our mean was 53.4 (step 1) so the next thing we need to do is to work out how far each value is away from that mean (step 2).

Table 3.13: Deviation from mean for year one

Mark           85      52      64      51      29      59      47      58      42      37      66      51
Deviation    31.6    -1.4    10.6    -2.4   -24.4     5.6    -6.4     4.6   -11.4   -16.4    12.6    -2.4

3. Now we need to square each value.

Table 3.14: Deviation from mean squared for year one

Mark           85      52      64      51      29      59      47      58      42      37      66      51
Deviation    31.6    -1.4    10.6    -2.4   -24.4     5.6    -6.4     4.6   -11.4   -16.4    12.6    -2.4
Squared    998.56    1.96  112.36    5.76  595.36   31.36   40.96   21.16  129.96  268.96  158.76    5.76

4. Now we total these squared values.

2370.92

5. Divide this by the number of values in our sample minus 1 - which is 11.

215.54

6. And take the square root.

The standard deviation for year one is 14.68.

Task: Do the same for year two.

The standard deviation for year two is 6.21.

Again, this clearly indicates that the marks for year one were much more dispersed than they were in year two. The standard deviation is comparatively smaller in year two because the scores are more clustered around the mean. In essence, there is much less variation. This is despite the fact that the means initially appeared similar.

Some of you may now be thinking ‘what is the difference between the mean absolute deviation and the standard deviation?’ Well, part of the answer lies in something else you may have noticed. Whilst there is a little bit of difference between the SD and the MAD for year two, relatively speaking, the SD appears quite a lot higher than the MAD for year one. Why might this happen?

This difference occurs because of the ‘sum of squares’ operation we perform when calculating the SD. This has the effect of amplifying those values which are further away from the mean. Indeed, the SD effectively places more and more importance on those values that are further and further away from the mean. Indeed, if you inspect the squared values you can see the effect this operation has on the original values. Those that are comparatively close to the mean contribute less to the sum of squares than those that are further away. This means that the SD is more sensitive to larger deviations than the MAD – and making a comparison between the MAD and the SD allows you to summarise this effectively.

There are two questions that emerge from this: what is the point of the sum of squares operation, and is the MAD or SD better in reporting the variation around the mean?

The answer to the first question is still a point of debate amongst social statisticians. The SD is more commonly used, but this is, at least in part, due to historical convention. This relates to the fact that the algebra that underpins these techniques is easier to handle when the SD is used. This becomes quite important when using the SD for more complex statistical tests that use thousands of data points. However, some researchers have argued that the MAD should be used as it handles real-life samples and populations better than the SD. Indeed, whilst the SD might be better in ideal conditions, and the statistical evidence does suggest this is the case, in instances where there might be more chance of measurement error, amplifying this error in our calculations might not be a good idea [1].

As a result, it is probably best to present both the MAD and the SD in your findings. It is certainly a very good idea to calculate both measures as comparing the values can give a better idea of what is going on in the data – just as it has in this particular case.

[1] For a not too difficult to understand paper on this issue, you can read Stephen Gorard’s (2004) paper entitled ‘Revisiting a 90-year-old debate: the advantages of the mean deviation’. Presented at the British Educational Research Association’s Annual Conference, it is freely available on the internet.

Cohen’s d: Measuring effect sizes between two means

All of this discussion now leads us to an inevitable question: is there actually any meaningful difference between year one and year two and can we demonstrate this statistically?

Assessing difference is actually quite a complex problem. However, a man by the name of Jacob Cohen provided a simple and effective way of measuring effect size and the statistic that was named after him, Cohen’s d, is used as a measure to indicate the standard difference between two means. It is derived by taking the difference between the two means and dividing the answer by their weighted pooled standard deviation.

Helpfully, Cohen also provided a rule of thumb to interpret the result. Where Cohen’s d is equal to 1, we know that the means of the two groups differ by one standard deviation. On the other hand, a d of .5 would demonstrate that the means differ by half a standard deviation, and so on. Cohen suggested that where d is equal to 0.2, the effect size is small. Where d is equal to 0.5, we can consider the effect size to be medium, with anything over 0.8 being large. This means that if two groups’ means don’t differ by 0.2 standard deviations or more, the difference is quite weak.

Before we do that, however, we need to make sure that our distributions are not skewed as Cohen’s d is unreliable in cases where the distribution is skewed.

Assessing skewness

Briefly, skewness tells us about the lack of symmetry in our distribution. Taking a measure of skewness allows us to see whether our data is lop-sided on one side or the other. That is, it provides us with the means to assess whether the bulk of the distribution lies to the left or the right of the mean. A positive skew is said to occur if the bulk of the distribution lies to the left of the mean, with a longer tail stretching to the right, and a negative skew occurs if the bulk lies to the right of the mean. Generally speaking, if the mean, mode and median are around the same place, then the data is not considered to be skewed. However, working out the level of skewness by ‘eyeballing’ the data is often unreliable. Fortunately, it is quite easy to work out a standard measure of skewness with what is called a skewness co-efficient. Quite
simply, a skewness co-efficient ascribes a value to the level of skewness in our data that allows us to see whether our distribution is positively or negatively skewed.

A standard method of assessing skewness is by using Karl Pearson’s skewness coefficients - often abbreviated to SK. He devised two of these, one around the modal value, and one around the median. Both are easy to use.

His first SK coefficient is calculated by subtracting the mode of the distribution from the mean, and multiplying the answer by 3. This is then divided by the standard deviation.

So, for our year one data, this is (53.4 – 51), or 2.4, multiplied by 3, which is 7.2, divided by 14.68. The answer is .49.

Task: Calculate the skewness coefficient for year two.

51.6 – 51 is .6. Multiply that by 3 and we get 1.8. Divide this by the standard deviation - which is 6.21. The skewness coefficient is .29.

Generally speaking anything above +/- 1 is skewed. Between +/- 1 and +/- .5 is moderately skewed, but probably OK if it is closer to +/- .5. Anything under +/- .5 is relatively symmetric. Using this Pearson’s modal coefficient of skewness, we can see that our data is relatively symmetric.

However, some statisticians prefer to use the median as a comparator to the mean when making an assessment of skewness. Fortunately, this is really easy to work out too.

To calculate Pearson’s second median-based skewness coefficient, all we need to do is subtract the median from the mean, and divide by the standard deviation.

So, for year one, this is 53.4 – 51, which is 2.4, divided by 14.68. The answer is .16.

Task: Calculate Pearson’s second skewness coefficient for year two.

51.6 – 51 is .6. Divided by the standard deviation, 6.21, gives a skewness coefficient of .10.

The good thing about this coefficient is that it should always give an answer of between -1 and +1. Anything below -.2 is negatively skewed, and anything above .2 is positively skewed.

To report this, all we would need to do would be to state the following:

‘Using Pearson’s modal and median skewness coefficients, the distributions of year one and year two were not found to be skewed’.

Should you find that your data is skewed, don’t worry. We’ll be dealing with another statistical technique, the Mann-Whitney U test, in a later workbook. This test can be used to measure whether there is a difference between two distributions that are skewed.

Calculating Cohen’s d

So, now we know that our data is not skewed, we can go ahead and calculate Cohen’s d. Remember that Cohen’s d is derived from the absolute difference between the two means, divided by their weighted pooled standard deviation.

The first bit of this is easy - subtract the smaller mean from the larger one. The mean for year one was 53.4, and the mean for year two was 51.6. The difference is 1.8.

One quick and easy way of calculating the pooled standard deviation is to simply take the average standard deviation of the groups under investigation. In this case, it would be 14.68 + 6.21 - which is 20.89 - divided by two - which resolves to be 10.45.

To calculate d, all we need to then do is divide the difference by the pooled standard deviation.

1.8/10.45 is 0.17.

According to Cohen’s rule of thumb, we can see that our effect is relatively small. The problem with this method, however, is that the Cohen’s d statistic assumes that both sample standard deviations come from the same population, with the average of the standard deviations being a measured estimate of that population. In doing this, each of the two standard deviation scores is being treated as equal. Unfortunately, this logic breaks down when we are dealing with unequal sample sizes. In these cases, the relative weights of the standard deviation scores need to be taken into account.

For example, if we were to make an estimate of the population standard deviation in a case where sample 1 has 100 data points, and sample 2 only has 50, then a simple average would over-estimate the importance of sample 2 in the calculation of the population estimate. They are being treated on a 50:50 basis, where, in fact, they should probably be treated on a 100:50 basis. This means that our population estimate would be biased in favour of the standard deviation of sample 2. Hence, we need to find a way of taking sample size into account when working out Cohen’s d.

In experimental research designs, this problem can be avoided by using the standard deviation of the control group in that a relatively random sample should already be representative of the population. However, this becomes more complicated where we are dealing with real-life scenarios where the measurement of population estimates is not so clear cut. This is where the pooled standard deviation comes into play.

Calculating the pooled standard deviation is a little more complicated – as you can tell by this formula(!):

\[ s_{pooled} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \]

Fortunately it is not that scary when you know what you’re doing. n_1 is the sample size for the first case. n_2 is the sample size in the second case. s_1^2 is the squared standard deviation in the first case, and s_2^2 is the squared standard deviation from the second case. So:

\[ s_{pooled} = \sqrt{\frac{(12 - 1) \times 14.68^2 + (12 - 1) \times 6.21^2}{12 + 12 - 2}} \]

Now we need to square 14.68 – effectively multiplying it by itself (14.68 x 14.68). Doing the same for 6.21, we get:

\[ s_{pooled} = \sqrt{\frac{11 \times 215.50 + 11 \times 38.56}{22}} \]

So:

\[ s_{pooled} = \sqrt{\frac{2370.50 + 424.16}{22}} = \sqrt{127.03} \]

Which means that the pooled standard deviation is 11.27.

Now we have all the necessary detail to compute Cohen’s d for our means – we just need to divide the difference (1.8) by the pooled standard deviation (11.27). The answer is:

1.8 / 11.27 = 0.16

Using Cohen’s rule of thumb to interpret the result, we can conclude that the effect size between year one and year two is quite small.

It is worth noting, however, that Cohen himself recognised the dangers of interpreting effect sizes in such a rigid way. Whilst Cohen’s d does provide a measure of the standard difference between two means, what is a big or small effect will depend on the context of the phenomena under scrutiny. In many applied fields large and even medium effects are often difficult to obtain and effects that are smaller than Cohen’s original rule of thumb of .2 are often considered to be important. Indeed, the further one moves away from the experimental conditions of the more behavioural sciences to the often much more complicated world of social sciences where confounding variables cannot be controlled, Cohen’s rule of thumb may actually under-estimate the importance of otherwise apparently small or weak effects.

However, it is quite easy to see why, for example, Cohen’s d is quite small in our case. As Cohen’s d is a function of the difference of the means divided by the pooled standard deviation, we can see that in order for the difference between our two means to approach a medium effect size, we would need to see a difference of over 5.6 (11.27 x .5). For a large effect size we would need a difference of over 9 (11.27 x .8). Given that the ranges of our distributions are comparatively quite large - 56 and 20 respectively - it is not too surprising that a difference of 1.8 between the means is quite small. That is to say, there is actually quite a bit of variation around the mean of year one and year two, but given that variation, the means are actually quite close together. Cohen’s d is simply a numerical description of that observation.

3.5. Using interval data: ‘The myth of the lazy immigrant’

So, how do we use these techniques in practice? Actually, whilst these techniques are certainly relevant, much of the data you will come across in secondary datasets and questionnaires will be nominal
or ordinal. However, under some circumstances, you may wish to explore some interval data. Let us suppose that I want to conduct a research project on the economic conditions of first generation immigrants to the UK. Go to the ESDS website that we explored in Workbook 1, available here:

> Click on the + next to the ‘Teaching Dataset’ on the left hand side of the page
> Click on the + next to the ‘Labour Force Survey’
> Now click on the icon next to ‘Labour Force Survey, 2002: Teaching Dataset’
> Click on the + next to ‘Variable Description’

Explore these variables to see if you can come up with a method of exploring the economic conditions of first generation immigrants to the UK using the interval techniques you have just read about – write a research rationale (don’t forget to include research aims and questions).

You could have a working rationale that reads something like this:

The issue of immigration has received much attention from the popular media and politicians alike. Frequently portrayed as ‘scroungers’ or ‘a drain on the state’, recent immigrants are often seen as lazy ‘good-for-nothings’ only here for the generous state benefits. However, many immigrants are professionally qualified or willing workers looking to use their skills in a range of employment contexts. Using the ‘whether born outside of the UK’ question from the Labour Force Survey (2002), this study will attempt to explore preconceived ideas of the ‘lazy immigrant’ by examining the usual weekly hours of work for recent immigrants and those born within the UK as well as their self-reported gross hourly pay.

The following is a summary of the descriptive data from the ‘Total usual hours in main job per week (including overtime)’:

Table 3.15. Immigration by usual hours worked: Descriptive statistics

Whether born          n         Mean    Median   Mode   Range   Inter-quartile   Standard
outside the UK                                                  range            deviation
Yes                   3522*     37.93   40       40     97      13               13.77
No                    41429**   37.09   39       40     97      15               14.14

* missing cases = 2083
** missing cases = 16525
*** mean absolute deviation not available

Table 3.16. Immigration by gross hourly pay: Descriptive Statistics

Columns: n, Mean, Median, Mode, Range, Inter-quartile range, Standard deviation
[Table body not recovered in this copy]

* missing cases = 3386
** missing cases = 28763
*** mean absolute deviation not available

Our analysis might run something like this:

Task: Analyse the tables

Descriptive analysis of ‘Table 3.15. Immigration by usual hours worked’ reveals that on average those born outside the UK actually work nearly an hour longer than those born within the UK. Similarly, although the mode is the same for both groups (40 hours), the median value is also one hour more for those born outside the UK. Using Cohen’s d, however, this effect was found to be relatively weak (d = .06). The distribution of usual hours worked also demonstrates that the distribution of hours worked is slightly more spread for those born in the UK (IQ range=15; s.d.=14.14) than for those born outside the UK (IQ range=13; s.d.=13.77). Generally speaking, those born in the UK demonstrate more variation in weekly hours worked than those born outside the UK.

Similarly, investigation of ‘Table 3.16. Immigration by gross hourly pay’ reveals that those born outside the UK actually earn over £1 per hour more on average (£10.87) than those born within the UK (£9.61). Although Pearson’s modal coefficient for skewness suggests that the data is not symmetrical - a common facet of income distributions - Cohen’s d was found to be small (d = 0.18).

That said, compared to every pound earned by first generation immigrants in work, those born in the UK earn just over 88p. However, the distribution of gross hourly pay also demonstrates that there is more variation in the pay of those born outside the UK (IQ range=7.73; s.d.=7.94) than for those born inside the UK (IQ range=6.39; s.d.=6.91).

Therefore, a descriptive analysis of the data from the Labour Force Survey (2002) suggests that the difference in the number of hours worked between those born outside the UK and those born inside is relatively small – although that small difference does equate to nearly an hour more in favour of those born outside of the UK. Working immigrants do demonstrate a tendency to work longer hours on average. Similarly, although those born outside the UK show more variation in their hourly pay, they are likely to get paid slightly more on average than those born in the UK. These results, therefore, challenge the notion of a ‘lazy immigrant’. Indeed, in terms of those people who are working, it would appear that the differences in usual hours worked and gross hourly pay between first generation migrants and those born within the UK are not very big at all. There is certainly no evidence presented here that would support the notion that first generation immigrants are ‘lazy’ or a ‘drain on state resource’.

Question: This is certainly an interesting study, but can you offer any critique of this interpretation?

In addition to the usual concerns around self-reported measures, there are a few things that are worth noting about this particular dataset. Firstly, look at the range in Table 3.15. and Table 3.16. Some people are doing a lot more hours work per week than most. 97 hours a week is actually over 12 hours a day. Similarly, some people are earning a lot above the mean – particularly those who were born in the UK. In both cases this is making the range very large and this is likely to be inflating the mean and our data is skewed as a result. Whilst these may just be outliers, some further investigation may be necessary to see whether the distribution of the data is spread all the way up to these end points, and whether there are any differences in spread between the two groups. Indeed, if we were to read the literature around income distributions, we would discover that they are often skewed. This makes direct comparisons between the means difficult and other measures of difference based around the median may be preferable (see workbook 5).

Secondly, the position of the modes in the gross hourly pay category is some way below both the mean and the median in Table 3.16. This might be because people are estimating their gross hourly wage due to a lack of specific knowledge. The adult UK minimum wage was £4.20 in 2001 when the survey was actually carried out and the roundness of
the figure does give cause for concern. As a result it may not be the best measure of central tendency. Indeed, if we use Pearson’s second coefficient of skewness based on the median value, the data looks to be symmetrical. Of course, it might be that £5 per hour is the most common, but further investigation/comparison could be useful.

Related to this is the issue of the missing cases. Both Table 3.15. and Table 3.16. have large numbers of missing cases. With respect to Table 3.15., nearly half of the people in the survey have not given an answer either because they chose not to or answered ‘don’t know’ – or were simply out of work. If the people that haven’t given this information share some characteristics then it may threaten the reliability of the data and any conclusions we might draw from it. For instance, if those who didn’t answer are predominately in a lower social class, where pay is usually lower, then the measure is at risk of over-estimating gross mean pay. Conversely, if those people who didn’t offer an answer are predominately people on high salaries who don’t know their hourly pay then the results may under-estimate mean pay.

Furthermore, the data presented in the tables also give no information concerning those people not in work, or any details of those who are claiming state benefits. Currently our sample is only exploring the circumstances of those in employment. It also does not make any differentiation between those who are working part-time or full-time. We might also want to explore the association between socio-economic group and immigration, or even housing tenure. As a result, further investigation is necessary to examine the ‘myth of the lazy immigrant’ more fully.

In short, our investigation has produced more questions than we might have originally intended. Indeed, although quantitative research is often presented as if it were a very linear process, it is actually quite iterative and cyclical. Analysis often produces more questions and more analysis, which might lead to amending the original rationale, or creating a new one altogether. Rarely do we get ‘straight-forward’ answers to our questions and recognising the complications of ‘real-life’ data is a key point in the process of analysis. In some cases, we will be able to (re)interrogate our data to make our findings more robust; in other situations we will have to recognise the limits of our analysis and any accompanying interpretation in our findings.

3.6. Rounding up

You should now have a much better idea of some of the most useful techniques of descriptive analysis. Not only will these techniques help you to read quantitative research more effectively, they should also provide you with the understanding necessary to conduct quantitative projects of your own. Indeed, with a little bit of effort, a good understanding of the tools presented in this workbook will help you to interpret complex data more effectively. You should now be able to:

• Calculate frequencies and proportions and correctly identify when to use them
• Understand the presentation of data including pie charts and bar charts and implement them where appropriate
• Understand what is meant by the term ‘cross-tabulation’ and be able to describe the data within it
• Calculate measures of central tendency and correctly identify when to utilise them for descriptive analysis
• Calculate measures of dispersion and correctly identify when to use them so you can describe your data with greater insight

A good understanding of these techniques will also serve as a solid foundation for the next step we are going to make with respect to doing quantitative social research: inferential statistics. Indeed, many of the techniques you now have experience of using will help you to understand the mechanics of statistical tests that can help us investigate our data further, and ultimately allow us to answer our research aims.

This workbook by Tom Clark and Liam Foster
is licensed under a Creative Commons
Attribution Non Commercial - ShareAlike 4.0
International License.

Contains public sector information licensed under the Open Government Licence v2.0.
Crown Copyright.

Correlation and
Simple Regression
GA4112 2019
Objectives: Correlation

• Questions answered by correlation analysis
• Scatterplots
• Correlation examples
• Correlation coefficient
• Types of correlation
• Factors that affect correlation
• Significance testing

• Are two variables related?
• Do both increase together?
• E.g. skill and income
• Does one increase while the other decreases?
• E.g. body weight and number of steps per day
• How can we measure numerically the degree of a relationship?

• Also known as a dot chart or “scattergram”.
• Graphically depicts the relationship between two variables in two-dimensional space.
Inverse Relationship
[Figure: Scatterplot: Video Games and Test Score. x-axis: Average Hours of Video Games Per Week; y-axis: Exam Score]

• Does smoking raise blood pressure?
• Plot the number of cigarettes smoked per day against blood pressure.
• A fairly moderate relationship
• A positive relationship
• No relationship
[Figure: Smoking and BP. Trend in the relationship between smoking and blood pressure]

• Note relationship is moderate, but real.

• Why do we care about relationship?
• What would we conclude if there were no relationship?
• What if the relationship were near perfect?
• What if the relationship were negative?
The Data

Surprisingly, the U.S. is the first country on the list: the country with the highest consumption and highest mortality.

Country  Cigarettes  CHD
1        11          26
2        9           21
3        9           24
4        9           21
5        8           19
6        8           13
7        8           19
8        6           11
9        6           23
10       5           15
11       5           13
12       5           4
13       5           18
14       5           12
15       5           3
16       4           11
17       -           -
18       -           -
19       3           13
20       3           4
21       3           14
Scatterplot of Heart Disease

• CHD Mortality (Y axis)

• Why?
• Cigarette consumption (X axis)
• Why?
• What does each dot represent?
• Best fitting line included for clarity
[Figure: scatterplot with best-fitting line. x-axis: Cigarette Consumption per Adult per Day; y-axis: CHD Mortality per 10,000; the point {X = 6, Y = 11} is labelled]

What does the plot show?
• As smoking increases, heart disease problems also increase
• The relationship appears strong
• Not all data points lie on the line.
• This gives us “residuals” or “errors of prediction”
Correlation

• Co-relation
• The relationship between two variables
• Measured with a correlation coefficient
• A popular correlation coefficient: the Pearson correlation coefficient (Pearson Product-Moment Correlation)
Forms of Correlation
• Positive correlation
– As x increases, y also increases.
– An increase in the value of y is associated with an increase in the value of x.
• Negative correlation
– As x increases, y decreases.
– A decrease in the value of y is associated with an increase in the value of x.
• No correlation
– The value of y shows no tendency to be influenced by the value of x, whether x increases or decreases.
Correlation Coefficient (r)

• Measures the strength of a relationship
• Lies between -1 and 1
• The ± sign indicates the direction of the relationship.
• Based on the covariance
• Looks at the variance of two variables simultaneously rather than one at a time
Formulas for computing r

• Z-score method
$$ r = \frac{\sum z_x z_y}{N - 1} $$
• Computational method (raw data)
$$ r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{\left[ N \sum X^2 - (\sum X)^2 \right] \left[ N \sum Y^2 - (\sum Y)^2 \right]}} $$

Other correlation coefficients
• Spearman Rank-Order Correlation Coefficient (rsp)
• used with 2 ranked/ordinal variables
• uses the same Pearson formula

Attractiveness  Symmetry
3               2
4               6
1               1
2               3
5               4
6               5
rsp = 0.77
Point biserial correlation coefficient
• Point biserial correlation coefficient (rpb)
• used with one continuous scale and one dichotomous (nominal or ordinal) scale
• uses the same Pearson formula

Attractiveness Date?
3 0
4 0
1 1
2 1
5 1
6 0
rpb = -0.49

Phi coefficient
• Phi coefficient (Φ)
• used with two dichotomous scales.
• uses the same Pearson formula

Attractiveness  Date?
0               0
1               0
1               1
1               1
0               0
1               1
Φ = 0.71
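
The slides state that the Spearman, point biserial and phi coefficients all use the same Pearson formula. A minimal sketch of that idea, applying the raw-data formula to the three small tables above (the function and variable names are illustrative, not from the slides):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Raw-data Pearson formula:
    (N*SumXY - SumX*SumY) / sqrt([N*SumX2 - (SumX)^2][N*SumY2 - (SumY)^2])."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Spearman: both variables are ranks.
attractiveness_rank = [3, 4, 1, 2, 5, 6]
symmetry_rank = [2, 6, 1, 3, 4, 5]

# Point biserial: one scale, one dichotomy (0/1).
date = [0, 0, 1, 1, 1, 0]

# Phi: two dichotomies (0/1).
attractive_binary = [0, 1, 1, 1, 0, 1]
date_binary = [0, 0, 1, 1, 0, 1]

r_sp = pearson_r(attractiveness_rank, symmetry_rank)
r_pb = pearson_r(attractiveness_rank, date)
phi = pearson_r(attractive_binary, date_binary)

print(round(r_sp, 2), round(r_pb, 2), round(phi, 2))  # 0.77 -0.49 0.71
```

Only the coding of the variables changes between the three coefficients; the arithmetic is identical.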

Factors Affecting r
• Range restrictions
– Looking at only a small portion of the total scatter plot (looking at a smaller portion of the scores’ variability) decreases r.
– Reducing variability reduces r.
• Nonlinearity
– The Pearson r (and its relatives) measures the degree of linear relationship between two variables.
– If a strong non-linear relationship exists, r will provide a low, or at least inaccurate, measure of the true relationship.
Factors Affecting r
• Heterogeneous subsamples
• Everyday examples (e.g. height and weight using both men and women)
• Outliers
• Overestimate Correlation
• Underestimate Correlation


Testing Correlations

• So you have a correlation. Now what?
• In terms of magnitude, how big is big?
– Small correlations in large samples are “big.”
– Large correlations in small samples aren’t always “big.”
• It depends upon the magnitude of the correlation coefficient and the size of your sample.

Testing r

• Population parameter = ρ
• Null hypothesis H0: ρ = 0
• Test of linear independence
• What would a true null mean here?
• What would a false null mean here?
• Alternative hypothesis H1: ρ ≠ 0
• Two-tailed
Computer Printout
• Printout gives test of significance.

                               CIGARET   CHD
CIGARET  Pearson Correlation   1         .713**
         Sig. (2-tailed)       .         .000
         N                     21        21
CHD      Pearson Correlation   .713**    1
         Sig. (2-tailed)       .000      .
         N                     21        21
**. Correlation is significant at the 0.01 level (2-tailed).
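
The significance test behind this printout can be sketched by hand. The example below assumes the standard t statistic for testing ρ = 0, t = r√(N − 2) / √(1 − r²) with N − 2 degrees of freedom, which the slides do not spell out; the critical value is copied from a standard t table rather than computed.

```python
from math import sqrt

def t_for_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

r, n = 0.713, 21            # values read from the printout
df = n - 2
t = t_for_r(r, n)

# Two-tailed critical t for df = 19 at alpha = .01 (from a standard t table).
T_CRIT_01_DF19 = 2.861

print(f"t({df}) = {t:.2f}")
print("significant at the .01 level:", abs(t) > T_CRIT_01_DF19)
```

With N = 21 the observed t is well beyond the critical value, which agrees with the two asterisks on the printout.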
Simple Regression
What is regression?

• How can we predict a dependent variable from an independent variable?
• How does a change in an independent variable affect the dependent variable?
• Influence

Simple Linear Regression

• A technique for predicting a dependent variable from a single independent variable.
• Uses the relationship (e.g. correlation) between the two variables to help with prediction.

Linear Regression: Parts

• Y: the variable being predicted
• i.e. the dependent variable
• X: the variable used to predict
• i.e. the independent variable
• Ŷ: your prediction based on the model (also known as Y’)

Why Do We Care?
• We may want to make a prediction.
• More likely, we want to understand the relationship.
• How fast does CHD mortality rise with a one unit increase in smoking?

An Example

• Cigarettes and CHD Mortality again

• Data repeated on next slide
• We want to predict level of CHD mortality in a country averaging 10
cigarettes per day.

The Data

Based on the data we have, what would we predict the rate of CHD to be in a country that smoked 10 cigarettes on average?

First, we need to establish a prediction of CHD from smoking…

Country  Cigarettes  CHD
1        11          26
2        -           -
4        9           21
5        -           -
7        -           -
9        -           -
11       5           13
12       5           4
13       5           18
14       5           12
15       5           3
16       -           -
18       -           -
20       3           4
21       3           14


We predict a CHD rate of about 14 for a country that smokes 6 C/A/D…

[Figure: regression line. x-axis: Cigarette Consumption per Adult per Day; y-axis: CHD Mortality per 10,000]

Regression Line
• Formula
$$ \hat{Y} = bX + a $$
• Ŷ = the predicted value of Y (e.g. CHD mortality)
• X = the predictor variable (e.g. average cig./adult/country)

Regression Coefficients

• “Coefficients” are a and b

• b = slope
• Change in predicted Y for one unit change in X
• a = intercept
• Value of Ŷ when X = 0

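• Slope
$$ b = \frac{\mathrm{cov}_{XY}}{s_X^2} \quad \text{or} \quad b = r \left[ \frac{s_y}{s_x} \right] \quad \text{or} \quad b = \frac{N \sum XY - \sum X \sum Y}{\left[ N \sum X^2 - (\sum X)^2 \right]} $$
• Intercept
$$ a = \bar{Y} - b\bar{X} $$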
SPSS Printout

• The values we obtained are shown on printout.
• The intercept is the value in the B column labeled “constant”
• The slope is the value in the B column labeled by name of predictor

Making a Prediction

• Second, once we know the relationship we can predict

$$ \hat{Y} = bX + a = 2.042X + 2.367 $$
$$ \hat{Y} = 2.042 \times 10 + 2.367 = 22.787 $$
• We predict that 22.79 people per 10,000 will die of CHD in a country with an average of 10 C/A/D.
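
The slope, intercept and prediction steps can be sketched in a few lines. The toy data below are hypothetical (they lie exactly on Y = 2X + 1 and are not the CHD data, which is incomplete in this copy of the slides); only the coefficients 2.042 and 2.367 come from the slide.

```python
def fit_line(xs, ys):
    """Least-squares slope b and intercept a for Y-hat = b*X + a,
    using the raw-data slope formula and a = mean(Y) - b * mean(X)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return b, a

def predict(x, b, a):
    """Predicted value Y-hat for a given X."""
    return b * x + a

# Hypothetical toy data lying exactly on Y = 2X + 1.
toy_x = [1, 2, 3, 4]
toy_y = [3, 5, 7, 9]
b_toy, a_toy = fit_line(toy_x, toy_y)
print(b_toy, a_toy)  # 2.0 1.0

# Prediction for 10 cigarettes per adult per day, using the slide's coefficients.
y_hat = predict(10, b=2.042, a=2.367)
print(round(y_hat, 3))  # 22.787
```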

SPSS output

Model Summary

Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      .713a  .508      .482               4.81640
a. Predictors: (Constant), CIGARETT

ANOVA

Model          Sum of Squares  df  Mean Square  F       Sig.
1  Regression  454.482         1   454.482      19.592  .000a
   Residual    440.757         19  23.198
   Total       895.238         20
a. Predictors: (Constant), CIGARETT
b. Dependent Variable: CHD
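
The figures in the Model Summary can be recovered from the ANOVA table alone. The sketch below assumes the usual definitions (R² = SS regression / SS total, adjusted R² = 1 − (1 − R²)·df total / df residual, F = MS regression / MS residual, standard error of the estimate = √MS residual); these formulas are standard but are not stated on the slides.

```python
from math import sqrt

# Values copied from the ANOVA table.
ss_regression, df_regression = 454.482, 1
ss_residual, df_residual = 440.757, 19
ss_total, df_total = 895.238, 20

r_square = ss_regression / ss_total
adjusted_r_square = 1 - (1 - r_square) * df_total / df_residual
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual
std_error_estimate = sqrt(ms_residual)

print(round(r_square, 3), round(adjusted_r_square, 3))
print(round(f_ratio, 2), round(std_error_estimate, 3))
```

Note that √R² is about .713, the same value as the Pearson correlation reported earlier, as expected with a single predictor.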

Testing Slope and Intercept
• These are given in computer printout as a t test.

What is Multiple Regression?

• Predicts an outcome (dependent variable) from several independent variables simultaneously.
• Why is it important?
– Behaviour is rarely a function of just one variable; instead it is influenced by many variables. The idea is that we should obtain more accurate predicted scores if we use multiple variables to predict our outcome.
How to Perform a One-Way ANOVA in SPSS

Purpose of ANOVA

The ANOVA is a statistical technique which compares different sources of variance

within a data set. The purpose of the comparison is to determine if significant differences
exist between two or more groups.

Why ANOVA and not T-test?

1. Comparing three groups using t-tests would require that 3 t-tests be conducted.
Group 1 vs. Group 2, Group 1 vs. Group 3, and Group 2 vs. Group 3. This
increases the chances of making a type I error. Only a single ANOVA is required
to determine if there are differences between multiple groups.
2. The t-test does not make use of all of the available information from which the
samples were drawn. For example, in a comparison of Group 1 vs. Group 2, the
information from Group 3 is neglected. An ANOVA makes use of the entire data set.
3. It is much easier to perform a single ANOVA than it is to perform multiple t-tests.
This is especially true when a computer and statistical software program are used.
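
The first point can be illustrated numerically. This is a rough sketch that assumes the three t-tests are independent and each uses alpha = .05; real pairwise tests on shared groups are not fully independent, so the figure is only an approximation.

```python
from itertools import combinations

groups = ["Group 1", "Group 2", "Group 3"]
alpha = 0.05

# Every pairwise comparison that separate t-tests would require.
pairs = list(combinations(groups, 2))

# Chance of at least one type I error across all tests
# (simplifying assumption: the tests are independent).
familywise_error = 1 - (1 - alpha) ** len(pairs)

print(len(pairs), round(familywise_error, 4))  # 3 0.1426
```

So three tests at the .05 level carry roughly a 14% chance of at least one false rejection, rather than 5%.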

The Theory in Brief

Like the t-test, the ANOVA calculates the ratio of the actual difference to the difference
expected due to chance alone. This ratio is called the F ratio and it can be compared to
an F distribution, in the same manner as a t ratio is compared to a t distribution. For an F
ratio, the actual difference is the variance between groups, and the expected difference is
the variance within groups. Please read the ANOVA handout for more information.

Let’s Roll

Just like with the independent t-test, you'll need two columns of information. One column
should be whatever your dependent variable is (BMD in Figure 1 below), and the other
should be whatever you want to call your grouping variable (that is, your independent or
quasi-independent variable; this is Athlete in Figure 1). Notice that each score in the
BMD column is classified as being in group 1, group 2, or group 3; SPSS needs to know
which scores go with which group to be able to carry out the ANOVA.
Figure 1: Data View

How did I name the variables BMD and Athlete? There are two tabs at the bottom of the
Data Editor, one labeled Data View, and the other labeled Variable View, as shown in
Figure 1: You can toggle back and forth between the Data View (see Figure 1) and the
Variable View, which is illustrated in Figure 2:

Figure 2: Variable View

In the Name column, you can type whatever labels you wish for your variables. If you
don't type in labels, SPSS will use labels like VAR001 and VAR002 by default.

When viewing the results of the ANOVA it will be helpful to know what each Athlete
number represents. In our case, 1.00 is the Control group, 2.00 is the Swimmer group,
and 3.00 is the Weight Lifter group. While in the Variable View, click on the Values cell
of the Athlete variable to enter labels for each condition number. See Figure 3.

Figure 3: Name the Grouping Variable

To actually perform the ANOVA, you need to click on the Analyze menu, select
Compare Means and then One-Way ANOVA, as in Figure 4.

Figure 4: Starting the ANOVA

After this, a dialog box will appear. In this box, you'll be able to select a variable for the
"Dependent List" (this is what SPSS calls a dependent variable for this kind of analysis)
and a "Factor" (this is the independent variable). I've selected BMD for the Dependent
List and Athlete as the Factor, as shown in Figure 5.

Figure 5: Selecting the Test and Grouping Variables

Notice also that there's an Options button. You should click this, and then, in the new
dialog box that appears, check the Descriptives box (as illustrated in Figure 6a). This tells
SPSS to give you descriptive statistics for your groups (things like means and standard
deviations). You should also check the Homogeneity of variance test box and the Means
plot box.

Figure 6a: The Options Button

After you click Continue, select the Post Hoc button. A new window will
appear as shown in Figure 6b. Select Tukey, which is a test that will determine
specifically which groups are significantly different.

Figure 6b: The Post Hoc Button

After you click Continue and then OK, a new window will appear (called the SPSS
Viewer) with the results of the ANOVA. The important part of the output is shown in
Figure 7.

Figure 7: The results of the ANOVA


Descriptives

                                                        95% Confidence Interval for Mean
               N   Mean    Std. Deviation  Std. Error  Lower Bound  Upper Bound  Minimum  Maximum
Control        20  .9235   .19535          .04368      .8321        1.0149       .62      1.25
Swimmer        20  .9740   .29605          .06620      .8354        1.1126       .22      1.50
Weight Lifter  20  1.2100  .30253          .06765      1.0684       1.3516       .62      1.79
Total          60  1.0358  .29299          .03783      .9601        1.1115       .22      1.79

Test of Homogeneity of Variances

Statistic  df1  df2  Sig.
.974       2    57   .384

ANOVA

                Sum of Squares  df  Mean Square  F      Sig.
Between Groups  .936            2   .468         6.457  .003
Within Groups   4.129           57  .072
Total           5.065           59

There's a lot of useful information here. In the first box there are group statistics, which
provide the means and standard deviations of the groups.

The second box contains the results for the test of homogeneity of variance. Is the
variance within each group similar? The high significance value (.384) is good because it
means we do have homogeneity of variance. We would have to make adjustments to our
analysis if the significance approached .05.

In the third box are the results of the ANOVA, in a summary table that should look almost
exactly like Table 9.6 from the handout. You do not need to look up a critical value for F
to decide if you should reject the null hypothesis or not. Instead, just compare the "Sig."
value to alpha (which is usually .05, as you know). The decision rule is as follows: If the
significance value (which is usually labeled p in research reports) is less than alpha, reject
H0; if it's greater than alpha, do not reject H0. So, in this case, because the significance
value of .003 is less than alpha = .05, we reject the null hypothesis. We would report the
results of this ANOVA by saying something like, "There were significant differences
between the groups, F(2, 57) = 6.457, p = .003."
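
The F ratio in the summary table can be rebuilt from the group sizes, means and standard deviations in the first box. A minimal sketch, assuming the usual one-way ANOVA decomposition (between-groups SS from the group means, within-groups SS from the group variances); the variable names are illustrative.

```python
# (n, mean, standard deviation) for each group, copied from the descriptives box.
groups = {
    "Control": (20, 0.9235, 0.19535),
    "Swimmer": (20, 0.9740, 0.29605),
    "Weight Lifter": (20, 1.2100, 0.30253),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-groups: spread of the group means around the grand mean.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
# Within-groups: pooled spread of scores around their own group mean.
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_ratio = (ss_between / df_between) / (ss_within / df_within)

print(round(ss_between, 3), round(ss_within, 3))
print(f"F({df_between}, {df_within}) = {f_ratio:.3f}")
```

The result matches the SPSS table to rounding, which shows that the F ratio is simply the between-groups variance divided by the within-groups variance.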

Now that we know the groups are significantly different, it would be helpful to determine
specifically which groups are different from each other. For this we can review the Post
Hoc test (in this case Tukey). See Figure 8.

Figure 8: The results of the Tukey Post Hoc Tests

Post Hoc Tests

Multiple Comparisons

Dependent Variable: BMD

Tukey HSD

                              Mean Difference                     95% Confidence Interval
(I) Athlete    (J) Athlete    (I-J)            Std. Error  Sig.   Lower Bound  Upper Bound
Control        Swimmer        -.05050          .08511      .824   -.2553       .1543
               Weight Lifter  -.28650(*)       .08511      .004   -.4913       -.0817
Swimmer        Control        .05050           .08511      .824   -.1543       .2553
               Weight Lifter  -.23600(*)       .08511      .020   -.4408       -.0312
Weight Lifter  Control        .28650(*)        .08511      .004   .0817        .4913
               Swimmer        .23600(*)        .08511      .020   .0312        .4408
* The mean difference is significant at the .05 level.

From the table, we can see that the Swimmer group does not differ significantly from the
Control group, but the Weight Lifter group is significantly different from the Control group. We
can also see that the Weight Lifter group is significantly different from the
Swimmer group. Note that these significance values are slightly different from the ones
determined using the Scheffe Post Hoc test in Excel. However, the end result is the
same. Tukey is slightly more powerful (smaller p-values) than Scheffe.

The difference between groups is confirmed graphically by looking at the Plot of Means
shown below.

[Figure: Plot of Means. y-axis: Mean of BMD; x-axis categories: Control, Swimmer, Weight Lifter]



Can be solved with a one-sample t test

Non-parametric techniques available

• Chi-square test for goodness of fit
• Chi-square test for independence
• Mann-Whitney test
• Wilcoxon signed-rank test
• Kruskal-Wallis test


Learning Outcomes

The Chi-Square Statistic

Concepts to review
• Proportions
• Frequency distributions

Parametric and nonparametric statistical tests
• Hypothesis tests used thus far tested hypotheses about population parameters
• Parametric tests share several assumptions
– Normal distribution in the population
– Homogeneity of variance in the population
– Numerical score for each individual
• Nonparametric tests are needed when the research situation does not conform to the requirements of parametric tests.


Chi-Square and other nonparametric tests
• Do not state the hypotheses in terms of a specific population parameter
• Make few assumptions about the population
• Often termed distribution free tests
• Participants usually classified into categories
– Nominal or ordinal scales are used
– Data for nonparametric tests are frequencies

Chi-Square Test for Goodness of Fit
• Uses sample data to test hypotheses about the shape or proportions of a population distribution.
• Tests the fit of the proportions in the obtained sample with the hypothesized proportions of the population.

Null hypothesis for Goodness of Fit
• Specifies the proportion (or percentage) of the population in each category
• Rationale for null hypotheses:
– No preference among categories.
– No difference in one population from the proportions in another known population

Example: 50% of the customers who hang out at mamak stalls drink teh tarik!


Data for the Goodness of Fit Test
• In a sample of data, individuals in each category are counted.
• Observed Frequencies in each category are measured.
• Each individual is counted in one and only one category.

Expected frequencies in the Goodness of Fit Test
• Goodness of Fit test compares the Observed Frequencies of the data with the assumptions of the null hypothesis.
• Construct Expected Frequencies that are in perfect agreement with the null hypothesis.
• Expected Frequency is the frequency value that is predicted from H0 and the sample size.
– Ideal, hypothetical sample distribution

Chi-Square Statistics
• Notation
– χ² is the lower-case Greek letter Chi, squared
– f_o is the Observed Frequency
– f_e is the Expected Frequency
• Chi-Square Statistic
$$ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} $$

Chi-Square distribution
• Null hypothesis should be
– Retained if the discrepancy between the Observed and Expected values is small
– Rejected if the discrepancy between the Observed and Expected values is large
• Chi-Square distribution includes values for all possible random samples when H0 is true
– All chi-square values ≥ 0.
– When H0 is true, Chi-square values will be small


Degrees of freedom and Chi-Square
• Chi-square distribution is positively skewed
• Chi-square is a family of distributions
– Distributions determined by degrees of freedom
– Slightly different shape for each value of df
• Degrees of freedom for Goodness of Fit Test
– df = C - 1
– C is the number of categories

Chi-square distribution and the critical region
[Figure: Chi-square distributions for different values of df]

Critical region for a Chi-Square Test
• Significance level is determined.
• Critical value of chi-square is located in a table of critical values according to
– Value for degrees of freedom (df)
– Significance level chosen


Critical region for the Goodness of Fit test in the teh tarik example
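
The goodness of fit steps for the teh tarik example can be sketched as follows. The slides give the null hypothesis (50%) but no counts, so the observed frequencies below are hypothetical; the critical value 3.84 (df = 1, alpha = .05) is taken from a standard chi-square table.

```python
def chi_square_gof(observed, expected_proportions):
    """Chi-square goodness of fit: sum of (fo - fe)^2 / fe, with df = C - 1."""
    n = sum(observed)
    expected = [p * n for p in expected_proportions]
    chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    df = len(observed) - 1
    return chi2, df

# Hypothetical sample of 100 customers: 60 drink teh tarik, 40 do not.
observed = [60, 40]
# H0: 50% of customers drink teh tarik.
chi2, df = chi_square_gof(observed, [0.5, 0.5])

# Critical chi-square for df = 1 at alpha = .05 (from a standard table).
CHI2_CRIT_05_DF1 = 3.84
reject_h0 = chi2 > CHI2_CRIT_05_DF1

print(chi2, df, reject_h0)  # 4.0 1 True
```

With these made-up counts the statistic falls just inside the critical region, so H0 would be rejected; different counts could easily lead to the opposite decision.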


SPSS output: Goodness of fit test

H0: the percentage who drink teh tarik is 50%
H1: the percentage who drink teh tarik is not equal to 50%

Goodness of Fit and the Single-sample t Test
• Both tests use data from one sample to test a hypothesis about a single population
• Level of measurement determines test:
– Numerical scores (interval / ratio scale) make it appropriate to compute a mean and use a t-test
– Classification in nonnumerical categories (ordinal or nominal scale) makes it appropriate to compute proportions or percentages and carry out a chi-square test


Chi-Square Test for Independence
• Chi-Square Statistic can test for the existence of a relationship between two variables.
– Each individual classified on each variable
– Counts are presented in the cells of a matrix
– Research may be experimental or nonexperimental
• Frequency data from a sample is used to evaluate the relationship of two variables in the population.

Null hypothesis for Test of Independence
• Null hypothesis: two variables are independent
• Two versions
– Single population: No relationship between two variables in this population.
– Two separate populations: No difference between distribution of variable in the two populations (defined by a nominal variable)
• Variables are independent when there is no consistent predictable relationship between them.

Observed and expected frequencies
• Frequencies in the sample are the Observed frequencies for the test.
• Expected frequencies are based on the null hypothesis of same proportions in each category (population)
– Proportions of each row total to the cells in each column

Computing expected frequencies
• Frequencies computed by same method for each cell in the frequency distribution table
$$ f_e = \frac{f_c f_r}{n} $$
– f_c is frequency total for the column
– f_r is frequency total for the row
– n is the total sample size


Chi-Square Statistic for Test of Independence
• Same equation as the Chi-Square Test of Goodness of Fit
• Chi-Square Statistic
$$ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} $$
• Degrees of freedom df = (R-1)(C-1)
– R is the number of rows
– C is the number of columns

Research Question
Is there a significant relationship between choice of TV news viewing and location?

H0: There is no relationship between choice of TV news broadcast and location
H1: There is a relationship between choice of TV news broadcast and location

SPSS output: contingency table

SPSS output: chi-square test

What was obtained needs to be described descriptively.

Decision: Reject the null hypothesis; p < 0.05


4. Measuring effect size for Chi-Square Tests

Effect size:
• The Chi-square hypothesis test indicates that the difference did not occur by chance
• Does not indicate the size of the effect
• For a 2x2 matrix, the phi-coefficient Φ measures the strength of the relationship


Effect size in a larger matrix
• For a larger matrix, a modification of the phi-coefficient is used: Cramér’s V
$$ V = \sqrt{\frac{\chi^2}{n(df^*)}} $$
• df* is the smaller of (R-1) or (C-1)

5. Assumptions and restrictions for Chi-Square Tests
• Independence of observations
– Each observed frequency is generated by a different individual
• Size of expected frequencies
– Chi-square test should not be performed when the expected frequency of any cell is less than 5.
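
The test of independence and Cramér’s V can be put together in one short sketch. The 2x2 table below is hypothetical (the slides do not give the TV news counts); expected frequencies use f_e = f_c·f_r / n, and V uses df* = the smaller of (R-1) and (C-1). For a 2x2 table V reduces to the phi-coefficient.

```python
from math import sqrt

def chi_square_independence(table):
    """Return (chi-square, df, Cramer's V) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, f_o in enumerate(row):
            f_e = col_totals[j] * row_totals[i] / n   # expected frequency
            chi2 += (f_o - f_e) ** 2 / f_e
    n_rows, n_cols = len(table), len(table[0])
    df = (n_rows - 1) * (n_cols - 1)
    df_star = min(n_rows - 1, n_cols - 1)
    cramers_v = sqrt(chi2 / (n * df_star))
    return chi2, df, cramers_v

# Hypothetical counts: rows = location, columns = preferred news broadcast.
table = [
    [30, 20],
    [20, 30],
]
chi2, df, cramers_v = chi_square_independence(table)
print(chi2, df, round(cramers_v, 2))  # 4.0 1 0.2
```

Every expected frequency here is 25, so the rule that no expected frequency should fall below 5 is satisfied.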


6. Special applications for the Chi-Square Tests

• Chi-square and Pearson correlation both evaluate relationships between two variables.
• Type of data obtained determines which is the appropriate test to use.
• Chi-square is sometimes used instead of t-tests or ANOVA, when counts rather than means of categories are being compared.
• Chi-square can evaluate the significance.
• Parametric tests measure strength and effect size with greater

• A contingency table lets us compare a characteristic of a sample, e.g. level of religious fundamentalism, across groups or subsets of cases defined by a category such as sex.
• In SPSS a contingency table is called a ‘cross-tabulated table’.
• Each cell in the table represents one combination of characteristics of the relationship between the two variables.
• 29 men are fundamentalist; 42 women are fundamentalist.
• Although the number of women is larger, we cannot say that women are more likely to be fundamentalist, because the total number of women (146) differs from the total number of men (107).
• To answer the question “which is more likely”, we need to compare percentages.



Three kinds of percentage can be computed in a contingency table:
• percentage of the total for each cell
• percentage of the row total
• percentage of the column total

Each percentage gives different information and answers a different question.

• The percentage of total cases is computed by dividing the number in each cell (e.g. 29, 42, etc.) by the grand total (253).
– 11.5% of cases are male and fundamentalist.
– 16.6% of cases are female and fundamentalist.
– Tip: look at the label “% of Total.”
– Tip 2: the figure 100% appears only at the grand total at the bottom.

• Row percentages are computed by dividing each figure (29, 42) by the row total (71).
– 40.8% of fundamentalists are men; the rest are women.
– Tip: look at the label in the total section: each row sums to 100%.

• Column percentages are computed by dividing each cell (29, 36) by the column total (107).
– 27.1% of men are fundamentalist.
– 33.6% of men are moderate.
– Tip: the % is computed within sex.
– Tip: each column sums to 100% in the total row.
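
The three kinds of percentage can be checked with a short script. The cell counts 29, 42 and 36 and the totals 71, 107, 146 and 253 come from the slides; the remaining cells (42 liberal men, 65 moderate women, 39 liberal women) are reconstructed here from those totals and the quoted percentages, so treat them as assumptions.

```python
# Rows = level of fundamentalism, columns = sex.
table = {
    "fundamentalist": {"male": 29, "female": 42},
    "moderate":       {"male": 36, "female": 65},   # 65 is reconstructed
    "liberal":        {"male": 42, "female": 39},   # both reconstructed
}

row_totals = {level: sum(row.values()) for level, row in table.items()}
col_totals = {sex: sum(row[sex] for row in table.values())
              for sex in ("male", "female")}
grand_total = sum(row_totals.values())

def pct(part, whole):
    """Percentage rounded to one decimal place, as SPSS displays it."""
    return round(100 * part / whole, 1)

cell = table["fundamentalist"]["male"]
total_pct = pct(cell, grand_total)                  # % of Total
row_pct = pct(cell, row_totals["fundamentalist"])   # % within the row
col_pct = pct(cell, col_totals["male"])             # % within the column (sex)

print(total_pct, row_pct, col_pct)  # 11.5 40.8 27.1
```

The same cell (29 fundamentalist men) yields three different percentages because each one uses a different denominator.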


• We are interested in the conditional (contingency) percentages because they give information about the relationship between the two variables.
• My practice is to place the independent variable in the columns and the dependent variable in the rows, and to analyse by comparing column percentages.

• Based on the column percentages, we can make the following statements:
– Men lean more towards liberal (39.3%), while women lean more towards moderate.
– Men are more likely to be liberal (39.3%) than women are (26.7%).


Let's do the analysis using SPSS

The t-test in IBM SPSS Statistics

An Example: are invisible people mischievous?

In my SPSS book (Field, 2013) I imagine a future in which we have some cloaks of invisibility to
test out. As a psychologist (with his own slightly mischievous streak) I might be interested in the
effect that wearing a cloak of invisibility has on people’s tendency for mischief. I took 24
participants and placed them in an enclosed community. The community was riddled with hidden
cameras so that we could record mischievous acts. Half of them were given cloaks of invisibility:
they were told not to tell anyone else about their cloak and that they could wear it whenever they
liked. We measured how many mischievous acts they performed in a week. These data are in Table
1. The file Invisibility.sav shows how you should have entered the data: the variable Cloak
records whether or not a person was given a cloak (cloak = 1) or not (cloak = 0), and Mischief is
how many mischievous acts were performed.
Table 1: Data from Invisibility.sav

Participant Cloak Mischief

1 0 3
2 0 1
3 0 5
4 0 4
5 0 6
6 0 4
7 0 6
8 0 2
9 0 0
10 0 5
11 0 4
12 0 5
13 1 4
14 1 3
15 1 6
16 1 6
17 1 8
18 1 5
19 1 5
20 1 4
21 1 2
22 1 5
23 1 7
24 1 5

© Prof. Andy Field Page 1

1.1. The independent t-test using SPSS

The general procedure

Figure 1 shows the general process for performing a t-test: as with fitting any model, we start by
looking for the sources of bias. Having satisfied ourselves that assumptions are met and outliers
dealt with, we run the test. We can also consider using bootstrapping if any of the test assumptions
were not met. Finally, we compute an effect size.

[Flowchart: Explore data (check for outliers, normality, homogeneity etc. using boxplots, histograms, descriptive statistics) → Run the t-test (bootstrap if problems with the data) → Calculate an effect size]

Figure 1: The general process for performing a t-test

Compute the independent t-test

To run an independent t-test, we need to access the main dialog box by selecting
(see Figure 2). Once the dialog box is activated,
select the dependent variable from the list (click on Mischief) and transfer it to the box labelled
Test Variable(s) by dragging it or clicking on . If you want to carry out t-tests on several
dependent variables then you can select other dependent variables and transfer them to the
variables list. However, there are good reasons why it is not a good idea to carry out lots of tests.
Next, we need to select an independent variable (the grouping variable). In this case, we need to
select Cloak and then transfer it to the box labelled Grouping Variable. When your grouping
variable has been selected the button will become active and you should click on it to
activate the Define Groups dialog box. SPSS needs to know what numeric codes you assigned to
your two groups, and there is a space for you to type the codes. In this example, we coded our no
cloak group as 0 and our cloak group as 1, and so these are the codes that we type.
When you have defined the groups, click on to return to the main dialog box. If you click on
then another dialog box appears that gives you the chance to change the width of the
confidence interval that is calculated. The default setting is for a 95% confidence interval and this
is fine; however, if you want to be stricter about your analysis you could choose a 99% confidence
interval but you run a higher risk of failing to detect a genuine effect (a Type II error). To run the
analysis click on .
If we have potential bias in the data we can reduce its impact by using bootstrapping to generate
confidence intervals for the difference between means. We can select this option by clicking
in the main dialog box to access the bootstrap function. Select to
activate bootstrapping, and to get a 95% confidence interval click or


. For this analysis, let’s ask for a bias corrected (BCa) confidence
interval. Back in the main dialog box click on to run the analysis.

Figure 2: Dialog boxes for the independent-samples t-test

Output from the independent t-test (1)

The output from the independent t-test contains only three tables (two if you don’t opt for
bootstrapping). The first table (Output 1) provides summary statistics for the two experimental
conditions (if you don’t ask for bootstrapping this table will be a bit more straightforward). From
this table, we can see that both groups had 12 participants (row labelled N). The group who had
no cloak, on average, performed 3.75 mischievous acts with a standard deviation of 1.913. What’s
more, the standard error of that group is 0.552. The bootstrap SE estimate is .53, and the
bootstrapped confidence interval for the mean ranges from 2.29 to 4.58. Those who were given
an invisibility cloak performed, on average, 5 acts, with a standard deviation of 1.651 and a
standard error of 0.477. The bootstrap standard error is a bit lower at 0.46, and the confidence
interval for the mean ranges from 4.33 to 5.67. Note that the confidence intervals for the two
groups overlap, implying that they might be from the same population.
The second table of output (Output 2) contains the main test statistics. The first thing to notice is
that there are two rows containing values for the test statistics: one row is labelled Equal variances
assumed, while the other is labelled Equal variances not assumed. Parametric tests assume that the
variances in experimental groups are roughly equal. The rows of the table relate to whether or not
this assumption has been broken.

Output 1

We can use Levene’s test to see whether variances are different in different groups (although there
are problems with this test discussed in my book), and SPSS produces this test for us. Levene’s test
tests the hypothesis that the variances in the two groups are equal. Therefore, if Levene’s test is
significant at p ≤ .05, it suggests that the assumption of homogeneity of variances has been violated.
If, however, Levene’s test is non-significant (i.e., p > .05) then we can assume that the variances are
roughly equal and the assumption is tenable. For these data, Levene’s test is non-significant
(because p = .468, which is greater than .05) and so we should read the test statistics in the row
labelled Equal variances assumed. Had Levene’s test been significant, then we would have read the
test statistics from the row labelled Equal variances not assumed.

Output 2

Having established that the assumption of homogeneity of variances is met, we can look at the
t-test itself. We are told the mean difference ($\bar{X}_1 - \bar{X}_2 = 3.75 - 5 = -1.25$) and the standard error of
the sampling distribution of differences. The t-statistic is −1.71, which is assessed against the value
of t you might expect to get if there was no effect in the population when you have certain degrees
of freedom. For the independent t-test, degrees of freedom are calculated by adding the two sample
sizes and then subtracting the number of samples (df = N1 + N2 − 2 = 12 + 12 − 2 = 22). SPSS produces
the exact significance value of t, and typically we are interested in whether this value is less than
or greater than .05. In this case the two-tailed value of p is .101, which is greater than .05, and so
we would have to conclude that there was no significant difference between the means of these
two samples. In terms of the experiment, we can infer that having a cloak of invisibility did not
significantly affect the amount of mischief a person got up to.
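
The arithmetic behind the t-statistic can be retraced from the summary statistics alone. The sketch below is not what SPSS runs internally; it assumes equal variances (the row we read from), takes the means, standard deviations and group sizes quoted from Output 1, and approximates the two-tailed p-value by numerically integrating the t density with Simpson's rule. The function names `t_pdf` and `two_tailed_p` are made up for this illustration.

```python
import math

# Summary statistics quoted from Output 1 (no cloak vs. cloak).
n1, mean1, sd1 = 12, 3.75, 1.913
n2, mean2, sd2 = 12, 5.00, 1.651

# Pooled variance: each group's variance weighted by its degrees of freedom.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df

# Standard error of the difference between means, then the t-statistic.
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / se_diff


def t_pdf(x, dof):
    """Density of Student's t distribution with dof degrees of freedom."""
    coef = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coef * (1 + x * x / dof) ** (-(dof + 1) / 2)


def two_tailed_p(t, dof, steps=2000):
    """Two-tailed p-value: Simpson's rule on the t density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    area = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * area


p_value = two_tailed_p(t_stat, df)
print(f"t({df}) = {t_stat:.2f}, SE = {se_diff:.3f}, p = {p_value:.3f}")
```

Because the standard deviations are rounded to three decimals, the standard error comes out a hair below the .730 that SPSS reports, but t should still round to −1.71 on 22 degrees of freedom, with a two-tailed p just above .10.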

Output 3

Output 3 shows the results of the bootstrapping (if you selected it). You can see that the
bootstrapping procedure has been applied to re-estimate the standard error of the mean
difference (which is estimated as .726 rather than .730). Using this bootstrapped standard error,
confidence intervals for the difference between means are computed. The difference between
means was −1.25, and the confidence interval ranged from −2.606 to 0.043. The confidence
interval implies that the difference between means in the population could be negative, positive or
even zero (because the interval ranges from a negative value to a positive one). In other words, it’s
possible that the true difference between means is zero—no difference at all. Therefore, this
bootstrap confidence interval confirms our conclusion that having a cloak of invisibility seems not
to affect acts of mischief.
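
To make the idea of bootstrapping concrete, here is a minimal sketch of a percentile bootstrap for the difference between two means. It is deliberately simpler than the BCa interval requested in SPSS, so the limits will not match Output 3 exactly. The scores are an assumption: illustrative values chosen to reproduce the reported means (3.75 and 5) and standard deviations (1.913 and 1.651), not data given in this section. The function name `bootstrap_diff_ci` is likewise invented for the example.

```python
import random
import statistics

# ASSUMPTION: illustrative scores that reproduce the reported means and
# standard deviations; they are not taken from this section of the handout.
no_cloak = [3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5]
cloak = [4, 3, 6, 6, 8, 5, 5, 4, 2, 5, 7, 5]


def bootstrap_diff_ci(a, b, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap CI for mean(a) - mean(b), resampling each group."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resample_a = rng.choices(a, k=len(a))  # sample with replacement
        resample_b = rng.choices(b, k=len(b))
        diffs.append(statistics.mean(resample_a) - statistics.mean(resample_b))
    diffs.sort()
    tail = (1 - level) / 2
    lower = diffs[int(tail * n_boot)]
    upper = diffs[int((1 - tail) * n_boot) - 1]
    # The spread of the resampled differences is the bootstrap standard error.
    return lower, upper, statistics.stdev(diffs)


observed = statistics.mean(no_cloak) - statistics.mean(cloak)
lower, upper, boot_se = bootstrap_diff_ci(no_cloak, cloak)
print(f"difference = {observed:.2f}, bootstrap SE = {boot_se:.3f}, "
      f"95% CI [{lower:.2f}, {upper:.2f}]")
```

The exact limits depend on the random seed and the number of resamples, which is also why two SPSS runs of the same bootstrap analysis can differ slightly.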

Reporting the independent t-test (1)

You usually state the finding to which the test relates and then report the test statistic, its degrees
of freedom and the probability value of that test statistic. We could write:
✓ On average, participants given a cloak of invisibility engaged in more acts of mischief (M
= 5, SE = 0.48) than those not given a cloak (M = 3.75, SE = 0.55). This difference, −1.25,
BCa 95% CI [−2.606, 0.043], was not significant, t(22) = −1.71, p = .101; however, it did
represent a medium-sized effect, d = .65.
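
The handout does not show how d was obtained. One way to arrive at .65, offered here as an assumption rather than as the documented method, is to divide the mean difference by the standard deviation of the no-cloak (control) group:

```python
# Means and the no-cloak (control) standard deviation quoted from Output 1.
mean_cloak, mean_no_cloak = 5.0, 3.75
sd_no_cloak = 1.913

# ASSUMPTION: d standardises the mean difference by the control group's SD.
d = (mean_cloak - mean_no_cloak) / sd_no_cloak
print(f"d = {d:.2f}")
```

Standardising by a pooled standard deviation instead would give a slightly larger value, so it is worth stating which denominator was used when reporting d.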

1.2. Matched-samples t-test using SPSS

Entering Data
Let’s imagine that we had collected the cloak of invisibility data using a repeated measures design;
so, the data are identical to before. In this scenario we might have recorded everyone’s natural
level of mischievous acts in a week, then given them an invisibility cloak and counted the number
of mischievous acts in the next week.
The data would now be arranged differently in SPSS. Instead of having a coding variable, and a
single column with mischief scores in, we would arrange the data in two columns (one
representing the Cloak condition and one representing the No_Cloak condition). The data are in
Invisibility RM.sav if you had difficulty entering them into SPSS yourself.

Compute the paired-samples t-test

To conduct a paired-samples t-test, we need to access the main dialog box by selecting
Analyze > Compare Means > Paired-Samples T Test… (Figure 3). Once the dialog box is activated, you need
to select pairs of variables to be analysed. In this case we have only one pair (Cloak vs. No_Cloak).
To select a pair you should click on the first variable that you want to select (in this case No_Cloak),
then hold down the Ctrl key (Cmd on a Mac) and select the second (in this case Cloak). To transfer
these two variables to the box labelled Paired Variables click on the arrow button. (You can also select each
variable individually and transfer it by clicking on the arrow button, but selecting both variables as just described
is quicker.) If you want to carry out several t-tests then you can select another pair of variables,
transfer them to the variables list, then select another pair and so on. If you click on Options then
another dialog box appears that gives you the same options as for the independent t-test. Similarly
we can click on Bootstrap to access the bootstrap function. As with the independent t-test, select
Perform bootstrapping and Bias corrected accelerated (BCa). Back in the main dialog box click on OK
to run the analysis.

Figure 3: Main dialog box for paired-samples t-test

Output from the paired-samples t-test

The resulting output produces four tables (3 if you don’t select bootstrapping). Output 4 shows a
table of summary statistics for the two experimental conditions (if you don’t ask for bootstrapping
this table will be a bit more straightforward). For each condition we are told the mean, the number
of participants (N), the standard deviation and standard error. These values are the same as when
we treated the data as an independent design.
Output 4 also shows the Pearson correlation between the two conditions. When repeated
measures are used it is possible that the experimental conditions will correlate (because the data
in each condition come from the same people and so there could be some constancy in their
responses). SPSS provides the value of Pearson’s r and the two-tailed significance value. For these
data the experimental conditions yield a very large correlation coefficient, r = .806, which is highly
significant, p = .002 and has a bootstrap confidence interval that doesn’t include zero, BCa 95% CI
[.185, .965].
Output 5 shows us whether the difference between the means of the two conditions was significant.
First, the table tells us the mean difference between the mean scores of each condition: 3.75 − 5 =
−1.25. The table also reports the standard deviation of the differences between the means and,
more importantly, the standard error of the differences between participants’ scores in each
condition. The size of the test statistic, t, is compared against known values based on the degrees
of freedom. When the same participants have been used, the degrees of freedom are the sample
size minus 1 (df = N − 1 = 11). SPSS uses the degrees of freedom to calculate the exact probability
that a value of t at least as big as the one obtained could occur if there was no difference between
population means. This probability value is in the column labelled Sig. SPSS provides the two-tailed
probability, which is the one I recommend using. Typically, we are interested in whether this value
is less than or greater than .05, and because the value of p is less than .05 we can conclude that
there was a significant difference between the means of these two samples. In terms of the
experiment, we can infer that having a cloak of invisibility significantly affected the amount of
mischief a person got up to, t(11) = −3.80, p = .003.
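
It can look odd that the same scores give t = −1.71 as an independent design but t = −3.80 as a repeated-measures design. The sketch below rebuilds the paired t from numbers quoted in the text: the two standard deviations and the correlation r = .806 between conditions. It relies on the standard identity Var(X − Y) = Var(X) + Var(Y) − 2·r·SD(X)·SD(Y); the variable names are chosen for this illustration only.

```python
import math

n = 12
mean_no_cloak, sd_no_cloak = 3.75, 1.913
mean_cloak, sd_cloak = 5.00, 1.651
r = 0.806  # Pearson correlation between the two conditions (Output 4)

# Variance of the difference scores: the correlation term shrinks it.
var_diff = sd_no_cloak**2 + sd_cloak**2 - 2 * r * sd_no_cloak * sd_cloak

# Standard error of the mean difference, degrees of freedom, and t.
se_diff = math.sqrt(var_diff / n)
df = n - 1
t_stat = (mean_no_cloak - mean_cloak) / se_diff
print(f"t({df}) = {t_stat:.2f}, SE of mean difference = {se_diff:.3f}")
```

Because the inputs are rounded, the result lands within about 0.01 of the reported −3.80. The point is that the large positive correlation removes stable individual differences from the error term, so the standard error drops from roughly .73 to roughly .33 while the mean difference stays at −1.25.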

Output 4

Output 5

Finally, this output provides a 95% confidence interval for the mean difference. However, a more
robust confidence interval, estimated using bootstrapping, is produced in Output 6. Remember that
confidence intervals are constructed such that in 95% of samples the intervals contain the true value
of the mean difference. So, assuming that this sample’s confidence interval is one of the 95 out of
100 that contains the population value, we can say that the true mean difference lies between
−1.67 and −0.83. The importance of this interval is that it does not contain zero (both limits are
negative), which tells us that the true value of the mean difference is unlikely to be zero. In other
words, there is an effect in the population reflecting more mischievous acts performed when
someone is given an invisibility cloak.

Output 6

Reporting the paired-samples t-test

We can basically report the same information for the matched-samples t-test as for the independent t-test,
but obviously the confidence intervals, degrees of freedom and values of t and p have changed:
✓ On average, participants given a cloak of invisibility engaged in more acts of mischief (M
= 5, SE = 0.48) than those not given a cloak (M = 3.75, SE = 0.55). This difference, −1.25,
BCa 95% CI [−1.67, −0.83], was significant, t(11) = −3.80, p = .003, and represented a
medium-sized effect, d = .65.

Field, A. P. (2013). Discovering statistics using IBM SPSS Statistics: And sex and drugs and rock 'n'
roll (4th ed.). London: Sage.

Terms of Use
This handout contains material from:
Field, A. P. (2013). Discovering statistics using IBM SPSS Statistics: And sex and drugs and rock 'n'
roll (4th ed.). London: Sage.
This material is copyright Andy Field (2000-2016).
This document is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives
4.0 International License; basically, you can use it for teaching and non-profit activities but not
meddle with it without permission from the author.

Introduction to hypothesis testing

Zolkepeli 2019
Hypothesis testing
• Hypothesis testing is a method for testing a claim or hypothesis about a parameter in a
population, using data obtained from a sample.
• We test a hypothesis by determining how likely it is that the selected sample statistic
equals or comes close to the parameter.
Put simply...
• Hypothesis testing is a method for testing whether a claim or hypothesis about a
population is likely to be true.
Claims
• On average, Malaysian children watch 3 hours of TV a day.
• I can normally cycle 100 km in under 4 hours!!
• On average, female teachers have 5 children.
Claims!!
• Students from a science background will excel in the examination for this course!!!!
The 4 steps of hypothesis testing
1. State the hypotheses
2. Set the criteria for a decision (set the level of significance)
3. Compute the test statistic (MANUALLY OR WITH SOFTWARE)
4. Make a decision
Step 1: State the hypotheses
• The null hypothesis (H0) is a statement about a population parameter, such as the
population mean, that is assumed to be true.
• The null hypothesis is the starting point. We test whether the value stated in the null
hypothesis is likely to be true.
Cont.: State the hypotheses
• The alternative hypothesis (H1) is a statement that directly contradicts the null hypothesis;
it states that the parameter is less than, greater than, or not equal to the value stated in the
null hypothesis.
• The alternative hypothesis states what we think is wrong about the null hypothesis.
• The alternative hypothesis is the one the researcher expects.
Analogy: a court case
• When a defendant is tried, the jury presumes that the defendant is innocent.
• It is the prosecutor's responsibility to prove that the defendant is guilty.
• In research, the null hypothesis is assumed to be true before the study is conducted.
• The researcher conducts the study to show evidence that the statement does not hold
(reject the null hypothesis) or fails to do so (we retain the null hypothesis).
• The alternative hypothesis (H1) is also called the research hypothesis.
Let's learn to write hypotheses from a research question
• Problem statement: …one reason children are inactive and become overweight is that they
watch TV for too long.
• Research question: Is the mean number of hours spent watching TV among obese children
in Malaysia more than 3 hours a day?
• Hypotheses:
– Null hypothesis (H0): the mean time spent watching TV per day is 3 hours.
– H0: µ = 3 hours
– Alternative hypothesis (H1): the mean time spent watching TV per day is not equal to
3 hours.
– H1: µ ≠ 3 hours
Step 2: Set the criteria for a decision
• To set the criteria for a decision, we state the level of significance, α, for the test.
• This is like a jury deciding whether the evidence raises reasonable doubt.
• Likewise, in hypothesis testing we collect data from a sample, possibly drawn from the
population, to show that the null hypothesis is not true. That likelihood is what becomes the
criterion.
• In the social sciences, the likelihood, or level of significance, is set at 5%.
• When the probability of obtaining the sample mean is less than 5% if the null hypothesis is
true, we conclude that the sample we selected is very different from what the null hypothesis
predicts, and we reject the null hypothesis.
Level of significance, α
• The level of significance, α, is the criterion or “cut-off” for deciding whether to reject or
retain the null hypothesis.
• The level of significance α also determines the risk of making a Type I error.
• α is the probability of making a Type I error when the null hypothesis is true.
Critical region
• Identify the critical region. The critical region is the region of sample mean values that
are very unlikely to occur if the null hypothesis is true.
• In other words, there is a 5% probability that we reject mean = 3 when that mean is in
fact correct.

Step 3: Compute the test statistic
• The test statistic is a mathematical formula that allows the researcher to determine how
likely the sample outcome is if the null hypothesis is true.
• The value of the test statistic is used to make a decision about the null hypothesis.
• Use SPSS to obtain the value of the test statistic.
What happens is…
• From your study sample, you find that the mean TV viewing time is 4 hours a day.
• The statistical computation tells us how far, or how many standard deviations, our sample
mean is from the population mean.
• The larger the value of the test statistic, the further our sample mean is from the
population mean stated in the null hypothesis.
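
As a rough illustration of Step 3, the sketch below computes a one-sample z statistic for the TV example (null value 3 hours, sample mean 4 hours). The slides give neither the population standard deviation nor the sample size, so `sigma` and `n` here are hypothetical values chosen only to make the arithmetic visible.

```python
import math

mu_0 = 3.0         # population mean stated in the null hypothesis (hours per day)
sample_mean = 4.0  # sample mean from the slide
sigma = 1.5        # ASSUMPTION: population standard deviation (hypothetical)
n = 36             # ASSUMPTION: sample size (hypothetical)

# Standard error of the mean, then the distance of the sample mean from mu_0
# expressed in standard-error units.
standard_error = sigma / math.sqrt(n)
z = (sample_mean - mu_0) / standard_error

# Two-tailed p value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))
print(f"z = {z:.2f}, p = {p_value:.5f}")
```

With these made-up inputs the sample mean sits 4 standard errors above the null value, which is why the p value is tiny. A smaller sample or a larger standard deviation would shrink z and raise p.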
Step 4: Make a decision
• We use the value of the test statistic to make a decision about the null hypothesis.
• The decision is based on the probability (p) of obtaining the sample mean, given that the
value stated in the null hypothesis is true.
• If the probability (p) of obtaining the sample mean is less than 5% when the null hypothesis
is true, the decision is to reject the null hypothesis.
• If the probability (p) of obtaining the sample mean is greater than 5% when the null
hypothesis is true, the decision is to retain the null hypothesis.
The two decisions a researcher can make:
• 1. Reject the null hypothesis. The sample mean has a low probability of occurring when the
null hypothesis is true (p < 0.05).
• 2. Retain the null hypothesis. The sample mean has a high probability of occurring when the
null hypothesis is true (p > 0.05).
The p value
• The p value is the probability of obtaining a sample outcome, given that the value stated
in the null hypothesis is true.
• The p value is compared with the level of significance α when making a decision.
• The decision to reject or retain the null hypothesis is described in terms of
“significance”.
– When the p value is less than 0.05, we reach significance; the decision is to reject the
null hypothesis (p < 0.05).
– When the p value is greater than 0.05, we fail to reach significance; the decision is to
retain the null hypothesis (p > 0.05).
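
The decision rule in Step 4 is short enough to write out directly. This is only a sketch: the function name `decide` and the two p values are made up for illustration, and a p value exactly equal to α is treated as "retain" here, which is a convention rather than something the slides specify.

```python
def decide(p_value, alpha=0.05):
    """Return the decision about the null hypothesis for a given p value."""
    if p_value < alpha:
        return "reject H0 (significant)"
    # p equal to or greater than alpha: we fail to reach significance.
    return "retain H0 (not significant)"


# Two hypothetical p values, one on each side of the 5% criterion.
for p in (0.003, 0.20):
    print(p, "->", decide(p))
```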
Check your understanding!
1. State the 4 steps of hypothesis testing.
2. In hypothesis testing, the decision is to retain or reject which hypothesis: the null
hypothesis or the alternative hypothesis?
3. In social science research, the criterion or level of significance is usually set at what
probability value?
4. If the test statistic gives a p value of less than 0.05 (5%), what is the decision in this
hypothesis test?
5. If the null hypothesis is rejected, have we reached significance?
Errors in hypothesis testing
• Just because the sample mean (following treatment) differs from the population mean does
not necessarily mean that the treatment caused a change.
• Remember that there is usually some difference between a sample mean and the population
mean simply as a result of sampling error.

Errors in hypothesis testing (cont.)
• Because hypothesis testing relies on sample data, and sample data are not completely
reliable, there is always a risk that misleading data will cause the hypothesis test to reach
a wrong conclusion.
• Two types of error are possible.

Type I error
• A Type I error occurs when the sample data appear to show a treatment effect when, in
fact, there is none.
• In this case the researcher rejects the null hypothesis and falsely concludes that the
treatment has an effect.
• Type I errors are caused by unusual, unrepresentative samples. Just by chance the
researcher selects an extreme sample, with the result that the sample falls in the critical
region even though the treatment has no effect.
• The hypothesis test is structured so that Type I errors are very unlikely; specifically, the
probability of a Type I error is equal to the level of significance α.
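
The claim that the Type I error rate equals α can be checked with a small simulation. In the sketch below every sample is drawn from a population in which the null hypothesis really is true, so any rejection is a Type I error. The population mean, standard deviation, sample size and number of trials are all hypothetical choices, and the test used is a two-tailed z test with known σ.

```python
import math
import random


def type_one_error_rate(alpha=0.05, n=25, mu=80.0, sigma=10.0, trials=4000, seed=7):
    """Share of true-null samples that a two-tailed z test wrongly rejects."""
    rng = random.Random(seed)
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from the null population: no treatment effect exists.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        z = (sample_mean - mu) / standard_error
        p_value = math.erfc(abs(z) / math.sqrt(2))
        if p_value < alpha:
            rejections += 1  # rejecting a true null hypothesis: Type I error
    return rejections / trials


rate = type_one_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should hover around 0.05 and will move with the seed and the number of trials. Lowering `alpha` to 0.01 lowers the rate accordingly, at the cost of a higher risk of Type II errors.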

Type II error
• A Type II error occurs when the sample does not show an increase when, in fact, the
treatment does cause an increase.
• In this case, the researcher fails to reject the null hypothesis and falsely concludes that
the treatment has no effect.
• Type II errors are commonly caused by a very small treatment effect. Although the treatment
does have an effect, it is not large enough to show up in the research study.

Summary: Possible errors in the decision
Directional tests
• When the researcher predicts a specific direction for the treatment effect (an increase or
a decrease), it is permissible to incorporate the directional prediction into the hypothesis
test.
• The result is called a directional test or one-tailed test. A one-tailed test includes the
directional prediction in the statement of the hypotheses and in the location of the critical
region.

Directional tests (cont.)
• For example, if the population has a mean of μ = 80 and the treatment is predicted to
increase scores, then the null hypothesis would state that after treatment:
– H0: μ ≤ 80 (no increase)
• In this case, the entire critical region would be located in the right-hand tail of the
distribution, because large values of M would demonstrate that there is an increase and would
tend to reject the null hypothesis.
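
Placing the whole critical region in one tail changes the cut-off. The sketch below compares the z cut-offs for a one-tailed and a two-tailed test at α = .05, assuming the test statistic follows a standard normal distribution (an assumption; the slides do not name the test).

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed (directional): all of alpha sits in the right-hand tail.
one_tailed_cutoff = std_normal.inv_cdf(1 - alpha)

# Two-tailed (nondirectional): alpha is split equally between both tails.
two_tailed_cutoff = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-tailed cutoff: z > {one_tailed_cutoff:.3f}")
print(f"two-tailed cutoff: |z| > {two_tailed_cutoff:.3f}")
```

The one-tailed cut-off is lower, so a result in the predicted direction reaches significance more easily; a result in the opposite direction can never be declared significant by that test.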

Nondirectional tests
• A nondirectional test, or two-tailed test, is a hypothesis test in which the alternative
hypothesis is stated as not equal to (≠). The researcher is interested in any alternative to
the null hypothesis.
Based on the following questions, write the null hypothesis and the alternative hypothesis,
and state the direction of the test.
1. Is there a difference in mean salary after graduating from university by gender?

• H0: µ_male = µ_female

• H1: µ_male ≠ µ_female

• This is a two-tailed hypothesis test.

2. Is there a difference in achievement in the KP2 course examination results across three
teaching methods?
Estimating confidence intervals
• We have information about the population and, using the theory of sampling distributions,
we learn about the properties of samples.
• Sampling distributions also give us the foundation that allows us to take a sample and use
it to estimate a population parameter.
• A point estimate is a single number,
• How much uncertainty is associated with a point estimate of a population parameter?
• An interval estimate provides more information about a population characteristic than a
point estimate does. It provides a level of confidence for the estimate. Such interval
estimates are called confidence intervals.

[Figure: confidence interval diagram labelled with the lower limit, point estimate, upper limit, and width of the confidence interval]
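
The pieces of a confidence interval (lower limit, point estimate, upper limit, width) can be computed directly. This is a minimal sketch for a mean with a known population standard deviation. The function name `mean_confidence_interval` and the input values (sample mean 4.0, σ = 1.5, n = 36) are hypothetical and serve only as an illustration.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, level=0.95):
    """Return (lower limit, upper limit) of a z-based CI for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / math.sqrt(n)  # half the width of the interval
    return sample_mean - margin, sample_mean + margin


# ASSUMPTION: hypothetical point estimate, sigma and sample size.
lower, upper = mean_confidence_interval(sample_mean=4.0, sigma=1.5, n=36)
print(f"point estimate = 4.0, 95% CI [{lower:.2f}, {upper:.2f}], width = {upper - lower:.2f}")
```

Raising the confidence level widens the interval, and increasing the sample size narrows it.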

Extended version of Table 9.2: Coding in thematic analysis: a worked example of the early stages
Each block below gives a data extract (Data) followed by the codes applied to it (Codes).

Data:
Moderator: What do you think about the modern lifestyle and weight and obesity? Do you think that’s had a big effect?

Data:
Sally: I think it’s had a huge effect because I remember, say forty years ago, we had a lot more industry in this country, so people were actually what you might call working harder. I know we all work hard, but erm working more…
?: ((in overlap)) Physically.
Codes:
• Important factor influencing obesity
• Modern lifestyles are sedentary
• Lack of physical work nowadays
• Hard physical work is beneficial to avoid obesity
• Times have changed

Data:
Sally: ...physically harder. Erm and, you know, we didn’t all have cars. So like my Mum used to walk two or three miles to go to the train station to go another ten miles to work, you know, it was like there was a lot more impact. There was no bus for her so she had to walk. And nowadays we think ‘oh I can’t do that, can’t walk four miles to go and do that’ ((laughs)).
Codes:
• Lack of exercise
• Times have changed
• Choice and exercise (none in past)
• Physical activity was an integral part of life in the past
• Humans as naturally lazy
• Exercise as negative (chore and burden)
• Implicit ideal person is fit/physically active and thin

Data:
Rebecca: Take the car yeah.
Sally: Yeah exactly.
Rebecca: Yeah.
Codes:
• Humans as naturally lazy

Data:
Sally: Erm and I think over the decades as technology’s advanced, we’ve suddenly… our lifestyle has changed and it’s had an impact on society now. So we’ve got kids growing up who are going um ((pause)) who are growing up thinking ‘oh well if I just jump in the car’, you know, ‘Mum’ll take me to so and so’ or… and they’re not in that sort of exercise is a luxury. You go and you have to motivate yourself to go to the gym. Where at one time you didn’t have to go to a gym because you worked physically or whatever, and now we have to motivate and I’m not motivated. ((laughs))
Codes:
• Times have changed
• Negative impacts of technology
• Different lifestyles
• Kids learn bad habits from parents (laziness is learnt behaviour)
• Exercise as negative (inherently unpleasant)
• Exercise is not a natural desire or activity for humans (humans are naturally lazy)
• Exercise is a necessary evil of modern life
• Exercise isn’t part of everyday life
• Choice and exercise (we have it, but won’t do it)
• Exercise as exceptional and expensive
• Exercise requires something extra (willpower/discipline)
• Not being motivated is a common story (laughter suggests ideal person is motivated to exercise)

Data:
Judy: I think modern technology, like allows you to be lazy as well cos you don’t have to do things for yourself. You can get machines and stuff to do things for you.
?: Mm-hm.
Codes:
• Modern technology facilitates obesity
• Convenience of modern lifestyles is hard to
• Humans are naturally lazy

Data:
Anna: My friends that live up in Manchester and London, they find it’s actually erm easier to buy food on the way home, kind of like take-out and stuff or go out for a meal, than it is actually to go home and start cooking something if you like finish work at eight or nine o’clock at night or something.
?: Mm-hm.
Codes:
• Big cities = modern lifestyles
• Modern lifestyles = long working hours
• Time poor (money rich)
• Convenience (pre-prepared food)
• Home cooking as onerous (time, effort)

Data:
Anna: Erm which I guess fuels the fact that people essentially may not be eating food that’s, you know, healthy ((laughs)).
Codes:
• Forced into unhealthy eating by modern lifestyles
• Eating out/not cooking = unhealthy eating
• Categorisation of food (healthy/unhealthy; good/bad)

Data:
Judy: Yeah if people are working hard they want something quick which tends to be the unhealthy food rather than the healthy food.
?: Yeah.
?: Yeah.
Codes:
• Common-sense association of working hard & wanting ‘convenient’ (i.e., quick) food
• Categorisation of food (healthy/unhealthy; good/bad)
• Unhealthy food = quick/convenient; healthy food = slow/inconvenient

Data:
Carla: And then their children are growing up not know- having the faintest idea to even cook or prepare food. And also, like you said, the modern technology, it’s like MSN, kids live on it.
?: Yeah ((laughs)).
Codes:
• Modern lifestyle: bad for adults; bad for children
• Kids don’t know how to cook
• Children engage in sedentary ‘play’
• Children’s socialisation is inadequate
• Who is responsible (for children’s ‘not
• Implicit parent blaming
• Technology is unhealthy

Data:
Carla: It’s like we’ve got a trampoline outside. I have to drag them out by their hair to try and get them to get on it, you know. Sort of constantly just talking to seven different people. ‘I’m on MSN, I’ll be there in a minute.’ ((laughs)) You know. It’s not good.
Codes:
• Technology is unhealthy
• Parents are powerless in the modern world
• Kids have their own agency, they can’t be made to be ‘healthy’
• Kids today, what can you do?
• Technologies determine personhood

Data:
Karen: It’s finding the time as well to do the exercise and more than that it’s the money because it costs. Like I live in the middle of town and I could join this one particular gym that you don’t have a contract and you just pay about fifteen pound, twenty pound a month I think. And all the exercise classes they do free, but it’s just finding the money at the beginning or the end of the month and so I’ve taken to having like a run round the waterfront but I don’t like to do that by myself cos I don’t feel safe.
?: Yeah.
Codes:
• Commodification of exercise
• Exercise isn’t part of everyday life
• Exercise is expensive
• Time poor
• External constraints on ability to/likelihood of exercise
• Implicit ideal person is fit and thin
• Exercise that doesn’t cost (running) not a first choice
• Gendered safety: (women) feeling unsafe running alone

Data:
Karen: Which means I won’t go if someone else isn’t able to go. And it just kind of… it leads on like that. But it is motivation, it is kind of I want to do it and I’m going to go and do it now. Not I’m going to wait for someone and…
Codes:
• Exercise dependent on others
• Humans are naturally lazy (lack of motivation as natural state, treated as a self-evident truth that we are that way)
• Not exercising is easy
• Motivation trumps all obstacles to exercise (motivation also rare)

Data:
Sally: Yeah. I’ve actually joined a gym and do I go? No. Because if I had somebody to go with, I’d be motivated to do it. It’s very much… it’s nice to have somebody to actually join that sort of thing and actually help you.
Codes:
• Hard to motivate yourself to exercise (humans are naturally lazy)
• Not exercising is not shameful – a common story (humans are naturally lazy)
• Other people can facilitate exercise
• Need a motivator for exercise

Data:
Judy: I think... I think it’s easy once you get in the habit as well cos I joined a gym and then I used to go every day or two or three times every day after work. And then stopped working and then came to uni and then you just get.
?: Yeah.
Codes:
• Exercise easier if part of a regular routine (it becomes something you just do)
• Routines are easily disrupted
• Exercise as not natural activity for humans
• Natural human laziness (can be overcome)

Data:
Judy: You come out of the habit and it’s harder to get into the habit again.
Codes:
• Routines are hard to establish

© Virginia Braun & Victoria Clarke (2013) Successful qualitative research: A practical guide for
beginners. London: Sage. For use in teaching and learning only.
Extended version of Table 9.4: Examples of some extracts of data collated for three codes
Eating badly leads to obesity Children are being brought up in a way Humans are naturally lazy
which promotes obesity
Rebecca: I was like ten-ish, well all Sally: we’ve got kids growing up who are Sally: [...] And nowadays we
throughout childhood, and going um ((pause)) who are growing up think oh I can’t do that,
MacDonalds or food like that was thinking oh well if I just jump in the car, can’t walk four miles to go
quite… well we had it like once a you know, Mum’ll take me to so and so and do that ((laughs)).
month. And I don’t know if that was or…. Rebecca: Take the car yeah.
just the done thing about ten years Carla: And then their children are Sally: Yeah exactly.
ago, but we never ate stuff like that. growing up not know- having the Rebecca: Yeah.
And at the moment I’ve got a twelve faintest idea to even cook or prepare Sally: You go and you have
year old brother and he just eats food. And also, like you said, the to motivate yourself to go
what the hell he likes and he’s… We modern technology, it’s like MSN, kids to the gym. Where at one
do, admittedly, you know, take the live on it. time you didn’t have to go
Mick out of him a little bit because ?: Yeah ((laughs)). to a gym because you
he has put on weight. And he just Carla: It’s like we’ve got a trampoline worked physically or
comes home and he… bearing in outside. I have to drag them out by their whatever, and now we
mind that he probably has dinner at hair to try and get them to get on it, you have to motivate and I’m
like one o’clock. He comes home, know. Sort of constantly just talking to not motivated ((laughs)).
stuffs his face with some crisps and seven different people. ‘I’m on MSN, I’ll Judy: I think modern
Mum will make him a pizza and be there in a minute.’ ((laughs)) You technology, like allows
chips and we’re just… it’s just got… It know. It’s not good. you to be lazy as well ‘cos
is more like I’d say acceptable, Rebecca: I was like ten-ish, well all you don’t have to do
acceptable that erm but because it’s throughout childhood, and MacDonalds things for yourself. You
just the done thing now I think. The or food like that was quite… well we had can get machines and
kids are just… unfortunately kids are it like once a month. And I don’t know if stuff to do things for you.
just fed that rubbish. that was just the done thing about ten Rebecca: it’s [MacDonald’s]
Carla: I think there is more of an issue years ago, but we never ate stuff like so cheap as well, just so…
of what we eat and the crap that we that. And at the moment I’ve got a Like you can’t justify
eat and people not cooking and not twelve year old brother and he just eats making a… making a meal
using real food what the hell he likes and he’s… We do, for like two pounds when
Rebecca: when I was a kid we all used admittedly, you know, take the Mick out you can go and buy
to sit round the table, whereas now of him a little bit because he has put on something for the same
everyone just makes their own weight. And he just comes home and amount of money, you
meals and just sits in front of the TV he… bearing in mind that he probably know, you don’t have to
and it’s dangerous I think. has dinner at like one o’clock. He comes bother with the washing
Rebecca: I think at the end of the day home, stuffs his face with some crisps up. And so yeah I think that
they’re [food labelling around and Mum will make him a pizza and uh advertising’s got a lot to
‘healthiness’] sort of meaningless chips and we’re just… it’s just got… It is answer for really.
because if you fancy erm a cake with more like I’d say acceptable, acceptable Sally: I think there’s got to
seven thousand calories in, you’re that erm but because it’s just the done be some sort of push
just going to eat it aren’t you, like thing now I think. The kids are just… towards physical
despite whether or not it’s got low unfortunately kids are just fed that education in school.
fat levels. rubbish. Although they… we
Rebecca: then that’s the individual’s Anna: I think, you know, kids sit inside and obviously do P.E. and
choice if they want to eat because they play their computer games all the stuff, erm without… I
nobody asks them, even if you are a time. don’t want to get over
bit depressed, you have got that… into the nanny state type

© Virginia Braun & Victoria Clarke (2013) Successful qualitative research: A practical guide for
beginners. London: Sage. For use in teaching and learning only.
that mentality to think oh I’m not Rebecca: I think it starts at home really. thing ‘cos I hate that, erm
going to eat my fifth cream cake Like the Government can stick their but something to actually
today because that’s just a bit labels on and schools can not give kids motivate kids into
piggish. chips, but you spend… I think you spend exercise. Not making it a
Sally: Yeah ‘cos I remember seeing a most of your time at home and I think a chore. Making it fun.
programme about the uh what is it, lot of it is down to erm parents and how Judy: I think modern
Britain’s Fattest Man or something. you have dinner time at home. And technology, like allows
And I mean he just really pigged out. when I was a kid we all used to sit round you to be lazy as well ‘cos
Judy: But then if they don’t want to be the table, whereas now everyone just you don’t have to do
that fat they shouldn’t eat it. makes their own meals and just sits in things for yourself. You
front of the TV and it’s dangerous I can get machines and
think. stuff to do things for you.

The table illustrates how codes were clustered to produce candidate themes and subthemes.
Extended version of Table 10.1: Candidate themes showing selected associated codes
1. Human Nature 2. Modern Life
1.1. Sins and sinners 1.1.1. Deserving/ 1.2. Exercise is evil 2.1. Those 2.2. Modern life is rubbish 2.3. They don’t get no
undeserving halcyon days of 2.2.1. Technology education
obesity yore trumps all
‘Liking food’ as ‘Deserving’ and Choice and exercise ‘Dadadada’ – Cost as a bottom line that Children engage Adequate socialisation:
negative; associated ‘undeserving’ (none in past; now we common story – determines what you eat; in sedentary cooking needs to be learned
with overeating obesity: if in have it, but won’t do a past we all Time poor (money rich) ‘play’ (taught in home or school)
Convenience (pre- control, can judge it) recognise ‘Bad foods’ associated with Modern Children’s socialisation is
prepared food); them; if not, can’t Constraints and Choice and positive things in technology important (but inadequate)
convenience of Blaming (eat what supports for regular exercise (none in ads/marketing encourages/facili Irresponsible parenting:
modern lifestyles is he likes) and not exercise past; now we Advertising/ marketing of tates adults pander to children;
hard to resist blaming (he’s fed Exercise as negative: have it, but won’t junk food (to children) obesity/lack of don’t regulate children’s
Emotional bad food) boring (common do it) problematic exercise eating towards healthy
eating/’overeating’ Doesn’t take a lot story); chore and Different lifestyles Children engage in sedentary Negative impacts foods; feed them unhealthy
has no validity – not to cause obesity burden; inherently Past – no such ‘play’ of technology food
an ‘eating disorder’, Emotional eating is unpleasant; thing as Commodification of exercise She’s not Kids have an inherent desire
it’s just gluttony! still potentially inherently lacking fun ‘exercise’; Prepared food as cheap and responsible for to be able to cook, but
Home cooking as under control Exercise as self- physical activity therefore appealing... her children’s education system denies
onerous (time, effort); (some restraint indulgent an integral part Eating out/not cooking = behaviour: tries them this
cooking is a hassle should be applied; Exercise can be a of life unhealthy eating to promote good Kids learn bad habits from
Humans are naturally completely luxury/pleasure Times have Exercise isn’t part of behaviour but parents (laziness)
gluttonous: unless changed everyday life powerless in face

controlled will eat too unrestrained Exercise easier if part Times have External constraints on of technology Kids’ socialisation is
much/wrong foods; eating is bad) of a regular routine – changed (halcyon ability to/likelihood of and ‘modern life’ inadequate
BUT we should have External factors/life becomes something past): current exercise Technological Prep for home ec (Home
restraint events: obesity you just do cooking teaching Food as fuel – there’s no inherently Economics) is really
Humans are naturally impinges upon Exercise is needed inadequate – pleasure in creating or addictive (more demanding (lacking
lazy (lack of you (you have (because our modern incomplete eating it. The only thing to appealing than resources to just do it)
motivation as natural little control) lifestyles don’t process; think about is cost exercise as School as a key influence
state, treated as a Humans have a require us to do it) mealtimes Forced into unhealthy eating ‘leisure’ activity) Schools have responsibility to
self-evident truth that natural propensity Need a motivation to formerly by modern lifestyles Technology socialise children
we are that way; for obesity: a exercise (getting collective (family) Home cooking as onerous (modern) appropriately; including
exercise as unnatural; constant threat away from kids); experience; now (time, effort); cooking is a encourages teaching cooking, and
home cooking as you have to motivation trumps all individualised hassle obesity/lack of teaching PE well.
onerous – no actively work obstacles to exercise experience; Individualised eating exercise Socialisation (school PE
pleasure; unfit); can against (if you (motivation also rare) school used to (convenience?) Technology is teaching) as inadequate
be overcome become obese, Not exercising is easy; teach/socialise Irresponsible to cook if you unhealthy Socialisation deficit (not
Humans have a natural you’re too blame) exercise requires appropriately can buy a pre-prepared being taught home ec –
propensity for obesity: Mind over matter: effort (bother) Times have meal cheaply personal example)
a constant threat you control over our Other people can changed (the Junk food used to be a treat Socialisation failure; across
have to actively work natural urges to facilitate exercise halcyon past - Kids bombarded with food generations: parents don’t
against (if you become stuff our faces is The individual’s freedom and an ads/messages: hidden necessarily know how to
obese, you’re too vital psychological state active childhood; messages cook; young parents not
blame) Obesity: caused by key to exercise the hellish No ‘modern pantry’: The equipped to socialise their
Labradoring - pure laziness (psychologising/indivi present) home no longer contains children; current cooking
gluttony dualising story of the basics for cooking teaching inadequate;

Need top-down whether someone Times have Society is no longer safe: government intervention
interventions exercises) changed: junk children as perceived to be needed (to re-educate)
(unfortunately - Time for self as food not vulnerable – limits outside Socialisation is key: early
necessity rather than significant motivation everyday food in play as ‘child safety’ learning sets up later
desire) for exercise the past paramount attitudes and practices
Obesity: caused by Socialisation: being a good
laziness grandmother
Socialisation: home/parents
most important
Socialisation: schooling
should teach life/practical

The figure shows a visual map from the early stages of thematic analysis, with two overarching themes, five themes, two subthemes and the connections between them.

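[Thematic map. Overarching theme Human Nature: themes Sins and sinners (subtheme: Deserving/undeserving) and Exercise is evil. Overarching theme Modern Life: themes Those halcyon days of yore, Modern life is rubbish (subtheme: Technology trumps all) and They don’t get no education.]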

AISHE-J Volume 8, Number 3 (Autumn 2017) 3351

Doing a Thematic Analysis: A Practical, Step-by-Step Guide for Learning and Teaching Scholars. *

Moira Maguire & Brid Delahunt

Dundalk Institute of Technology.


Data analysis is central to credible qualitative research. Indeed the qualitative researcher is
often described as the research instrument insofar as his or her ability to understand, describe
and interpret experiences and perceptions is key to uncovering meaning in particular
circumstances and contexts. While much has been written about qualitative analysis from a
theoretical perspective we noticed that often novice, and even more experienced researchers,
grapple with the ‘how’ of qualitative analysis. Here we draw on Braun and Clarke’s (2006)
framework and apply it in a systematic manner to describe and explain the process of analysis
within the context of learning and teaching research. We illustrate the process using a worked
example based on (with permission) a short extract from a focus group interview, conducted
with undergraduate students.

Key words: Thematic analysis, qualitative methods.

We gratefully acknowledge the support of National Digital Learning Repository (NDLR) local
funding at DkIT in the initial development of this work.


1. Background.
Qualitative methods are widely used in learning and teaching research and scholarship
(Divan, Ludwig, Matthews, Motley & Tomljenovic-Berube, 2017). While the epistemologies
and theoretical assumptions can be unfamiliar and sometimes challenging to those from, for
example, science and engineering backgrounds (Rowland & Myatt, 2014), there is wide
appreciation of the value of these methods (e.g. Rosenthal, 2016). There are many, often
excellent, texts and resources on qualitative approaches; however, these tend to focus on
assumptions, design and data collection rather than the analysis process per se.

More and more it is recognised that clear guidance is needed on the practical aspects of how
to do qualitative analysis (Clarke & Braun, 2013). As Nowell, Norris, White and Moules (2017)
explain, the lack of focus on rigorous and relevant thematic analysis has implications in terms
of the credibility of the research process. This article offers a practical guide to doing a
thematic analysis using a worked example drawn from learning and teaching research. It is
based on a resource we developed to meet the needs of our own students and we have used
it successfully for a number of years. It was initially developed with local funding from the [Irish]
National Digital Learning Repository (NDLR) and then shared via the NDLR until this closed in
2014. In response to subsequent requests for access to it we decided to revise and develop
this as an article focused more specifically on the learning and teaching context. Following
Clarke & Braun’s (2013) recommendations, we use relevant primary data, include a worked
example and refer readers to examples of good practice.

2. Thematic Analysis.
Thematic analysis is the process of identifying patterns or themes within qualitative data.
Braun & Clarke (2006) suggest that it is the first qualitative method that should be learned as
it ‘provides core skills that will be useful for conducting many other kinds of analysis’ (p.78).
A further advantage, particularly from the perspective of learning and teaching, is that it is a
method rather than a methodology (Braun & Clarke 2006; Clarke & Braun, 2013). This means
that, unlike many qualitative methodologies, it is not tied to a particular epistemological or
theoretical perspective. This makes it a very flexible method, a considerable advantage given
the diversity of work in learning and teaching.

There are many different ways to approach thematic analysis (e.g. Alhojailan, 2012;
Boyatzis, 1998; Javadi & Zarea, 2016). However, this variety means there is also some
confusion about the nature of thematic analysis, including how it is distinct from a qualitative
content analysis [1] (Vaismoradi, Turunen & Bondas, 2013). In this example, we follow Braun &
Clarke’s (2006) 6-step framework. This is arguably the most influential approach, in the social
sciences at least, probably because it offers such a clear and usable framework for doing
thematic analysis.

The goal of a thematic analysis is to identify themes, i.e. patterns in the data that are important
or interesting, and use these themes to address the research question or say something about an
issue. This is much more than simply summarising the data; a good thematic analysis
interprets and makes sense of it. A common pitfall is to use the main interview questions as
the themes (Clarke & Braun, 2013). Typically, this reflects the fact that the data have been
summarised and organised, rather than analysed.

Braun & Clarke (2006) distinguish between two levels of themes: semantic and latent.
Semantic themes are identified ‘…within the explicit or surface meanings of the data and the analyst is not
looking for anything beyond what a participant has said or what has been written.’ (p.84). The
analysis in this worked example identifies themes at the semantic level and is representative
of much learning and teaching work. We hope you can see that analysis moves beyond
describing what is said to focus on interpreting and explaining it. In contrast, the latent level
looks beyond what has been said and ‘…starts to identify or examine the underlying ideas,
assumptions, and conceptualisations – and ideologies - that are theorised as shaping or
informing the semantic content of the data’ (p.84).

3. The Research Question And The Data.

The data used in this example is an extract from one of a series of 8 focus groups involving 40
undergraduate student volunteers. Each focus group lasted about 40 minutes and was
transcribed verbatim. The research explored the ways in which
students make sense of and use feedback. Discussions focused on what students thought
about the feedback they had received over the course of their studies: how they understood it;
the extent to which they engaged with it and if and how they used it. The study was ethically
approved by the Dundalk Institute of Technology School of Health and Science Ethics
Committee. All of those who participated in the focus group from which the extract is taken
also gave permission for the transcript extract to be used in this way.

[1] See O’Cathain & Thomas (2004) for a useful guide to using content analysis on responses to
open-ended survey questions.

The original research questions were realist ones – we were interested in students’ own
accounts of their experiences and points of view. This of course determined the interview
questions and management as well as the analysis. Braun & Clarke (2006) distinguish between a
top-down or theoretical thematic analysis, which is driven by the specific research question(s)
and/or the analyst’s focus, and a bottom-up or inductive one that is more driven by the data
itself. Our analysis was driven by the research question and was more top-down than
bottom-up. The worked example given is based on an extract (approx. 15 mins) from a single focus
group interview. Obviously this is a very limited data corpus so the analysis shown here is
necessarily quite basic and limited. Where appropriate we do make reference to our full
analysis; however, our aim was to create a clear and straightforward example that can be used
as an accessible guide to analysing qualitative data.

3.1 Getting started.

The extract: This is taken from a real focus-group (group-interview) that was conducted with
students as part of a study that explored student perspectives on academic feedback. The
extract covers about 15 minutes of the interview and is available in Appendix 1.

Research question: For the purposes of this exercise we will be working with a very broad,
straightforward research question: What are students’ perceptions of feedback?

3.2 Doing the analysis.

Braun & Clarke (2006) provide a six-phase guide which is a very useful framework for
conducting this kind of analysis (see Table 1). We recommend that you read this paper in
conjunction with our worked example. In our short example we move from one step to the
next; however, the phases are not necessarily linear. You may move forward and back
between them, perhaps many times, particularly if dealing with a lot of complex data.

Step 1: Become familiar with the data
Step 2: Generate initial codes
Step 3: Search for themes
Step 4: Review themes
Step 5: Define themes
Step 6: Write-up

Table 1: Braun & Clarke’s six-phase framework for doing a thematic analysis

3.3 Step 1: Become familiar with the data.

The first step in any qualitative analysis is reading, and re-reading the transcripts. The
interview extract that forms this example can be found in Appendix 1.

You should be very familiar with your entire body of data or data corpus (i.e. all the interviews
and any other data you may be using) before you go any further. At this stage, it is useful to
make notes and jot down early impressions. Below are some early, rough notes made on the
extract:

The students do seem to think that feedback is important but don’t always find it useful.
There’s a sense that the whole assessment process, including feedback, can be seen as
threatening and is not always understood. The students are very clear that they want very
specific feedback that tells them how to improve in a personalised way. They want to be able
to discuss their work on a one-to-one basis with lecturers, as this is more personal and also
private. The emotional impact of feedback is important.

3.4 Step 2: Generate initial codes.

In this phase we start to organise our data in a meaningful and systematic way. Coding
reduces lots of data into small chunks of meaning. There are different ways to code and the
method will be determined by your perspective and research questions.

We were concerned with addressing specific research questions and analysed the data with
this in mind – so this was a theoretical thematic analysis rather than an inductive one. Given
this, we coded each segment of data that was relevant to or captured something interesting
about our research question. We did not code every piece of text. However, if we had been
doing a more inductive analysis we might have used line-by-line coding to code every single
line. We used open coding; that means we did not have pre-set codes, but developed and
modified the codes as we worked through the coding process.

We had initial ideas about codes when we finished Step 1. For example, wanting to discuss
feedback on a one-to-one basis with tutors was an issue that kept coming up (in all the
interviews, not just this extract) and was very relevant to our research question. We discussed
these and developed some preliminary ideas about codes. Then each of us set about coding
a transcript separately. We worked through each transcript coding every segment of text that
seemed to be relevant to or specifically address our research question. When we finished we
compared our codes, discussed them and modified them before moving on to the rest of the
transcripts. As we worked through them we generated new codes and sometimes modified

existing ones. We did this by hand initially, working through hardcopies of the transcripts with
pens and highlighters. Qualitative data analysis software (e.g. ATLAS.ti, NVivo), if you have
access to it, can be very useful, particularly with large data sets. Other tools can also be
effective; for example, Bree & Gallagher (2016) explain how to use Microsoft Excel to code and
help identify themes. While it is very useful to have two (or more) people working on the
coding, it is not essential. In Appendix 2 you will find the extract with our codes in the margins.
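
The coding in this example was done by hand, but the bookkeeping behind open coding can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the `Segment` class, the `apply_code` helper and the placeholder segment texts are hypothetical and are not taken from the study. Only the code labels are borrowed from Table 2.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One chunk of transcript text; a segment may carry several codes."""
    speaker: str
    text: str
    codes: list = field(default_factory=list)


def apply_code(segment, code, codebook):
    """Open coding: attach a code to a segment, creating the code on first use."""
    if code in segment.codes:
        return  # this segment already carries the code
    segment.codes.append(code)
    codebook.setdefault(code, []).append(segment)


# Placeholder texts (hypothetical), standing in for real transcript extracts.
segments = [
    Segment("Student 1", "[placeholder: would rather talk feedback through in person]"),
    Segment("Student 2", "[placeholder: mostly looks at the mark]"),
]

codebook = {}  # no pre-set codes: the codebook grows as coding proceeds
apply_code(segments[0], "Want dialogue with L", codebook)
apply_code(segments[0], "Dialogue more personalised/individual", codebook)
apply_code(segments[1], "Fdbk focused on grade", codebook)
apply_code(segments[1], "Fdbk focused on grade", codebook)  # repeat is ignored

for code, coded in codebook.items():
    print(f"{code}: {len(coded)} segment(s)")
```

Because the codebook starts empty and is extended on first use, the sketch mirrors open coding rather than coding against a pre-set frame. Renaming or merging codes after discussion with a second coder would be an extra step that is not shown here.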

3.5 Step 3: Search for themes.

As defined earlier, a theme is a pattern that captures something significant or interesting about
the data and/or research question. As Braun & Clarke (2006) explain, there are no hard and
fast rules about what makes a theme. A theme is characterised by its significance. If you have
a very small data set (e.g. one short focus-group) there may be considerable overlap between
the coding stage and this stage of identifying preliminary themes.

In this case we examined the codes and some of them clearly fitted together into a theme. For
example, we had several codes that related to perceptions of good practice and what students
wanted from feedback. We collated these into an initial theme called The purpose of
feedback.

At the end of this step the codes had been organised into broader themes that seemed to say
something specific about this research question. Our themes were predominantly descriptive,
i.e. they described patterns in the data relevant to the research question. Table 2 shows all
the preliminary themes that are identified in Extract 1, along with the codes that are associated
with them. Most codes are associated with one theme, although some are associated with
more than one (these are highlighted in Table 2). In this example, all of the codes fit into one
or more themes but this is not always the case and you might use a ‘miscellaneous’ theme to
manage these codes at this point.
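
One way to keep track of which codes sit under which preliminary theme is a simple mapping, sketched below in Python. The helper name `themes_by_code` and the label "Hypothetical leftover code" are assumptions made for illustration. Only a handful of the codes from Table 2 are included, and the sketch is not the procedure used in the study.

```python
from collections import defaultdict

# A few preliminary themes with a small selection of their codes (labels as in Table 2).
themes = {
    "The purpose of feedback": [
        "Help to learn what you're doing wrong",
        "Distinguish purpose and use",
    ],
    "Reasons for using feedback (or not)": [
        "To improve grade",
        "Distinguish purpose and use",
    ],
    "Lecturers": [
        "Ask some Ls",
        "Lecturer variability in framing fdbk",
    ],
    "Emotional response to feedback": [
        "Like to get fdbk",
        "Lecturer variability in framing fdbk",
    ],
}

# All codes from Step 2. The last label is hypothetical and is added only to
# show what happens to a code that fits none of the themes.
all_codes = {code for codes in themes.values() for code in codes}
all_codes.add("Hypothetical leftover code")


def themes_by_code(themes):
    """Invert the mapping: for each code, list the themes it appears under."""
    index = defaultdict(list)
    for theme, codes in themes.items():
        for code in codes:
            index[code].append(theme)
    return dict(index)


index = themes_by_code(themes)

# Codes associated with more than one theme.
shared = {code: names for code, names in index.items() if len(names) > 1}

# Codes that fit no theme are parked under a 'miscellaneous' theme for now.
unassigned = sorted(all_codes - set(index))
if unassigned:
    themes["Miscellaneous"] = unassigned

print(shared)
print(themes["Miscellaneous"])
```

Listing the shared codes explicitly is useful in the next step, because heavy overlap between two themes is one sign that they may not be distinct.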

Theme: The purpose of feedback.
Codes: Help to learn what you’re doing wrong; Unable to judge whether question has been
answered; Unable to judge whether question interpreted properly; Distinguish purpose and use;
Improving grade; Improving structure.

Theme: Lecturers.
Codes: Ask some Ls; Some Ls more approachable; Some Ls give better advice; Reluctance to
admit difficulties to L; Fear of unspecified disadvantage; Unlikely to approach L to discuss fdbk;
Lecturer variability in framing fdbk; Unlikely to make a repeated attempt; Have discussed with
tutor; Example: Wrong frame of mind.

Theme: Reasons for using feedback (or not).
Codes: To improve grade; Limited feedback; Didn’t understand fdbk; Fdbk focused on grade;
Use to improve grade; Distinguish purpose and use; Unlikely to approach L to discuss fdbk;
Improving structure improves grade; Can’t separate grade and learning; New priorities take
precedence = forget about feedback.

Theme: How feedback is used (or not).
Codes: Read fdbk; Usually read fdbk; Refer to fdbk if doing same subject; Not sure fdbk is
used; Used fdbk to improve referencing; Example: using fdbk to improve referencing; Refer back
to example that ‘went right’; Forget about fdbk until next assignment; Fdbk applicable to similar
assignments; Fdbk on referencing widely applicable; Experience: fdbk focused on referencing;
Generic fdbk widely applicable.

Theme: Emotional response to feedback.
Codes: Like to get fdbk; Don’t want to get fdbk if haven’t done well; Reluctance to hear
criticism; Reluctance to hear criticism (even if constructive); Fear of possible criticism;
Experience: unrealistic fear of criticism; Fdbk taken personally initially; Fdbk has an emotional
impact; Difficult for L to predict impact; Student variability in response to fdbk; Want fdbk in L’s
office as emotional response difficult to manage in public; Wording doesn’t make much
difference; Lecturer variability in framing fdbk; Negative fdbk can be constructive; Negative fdbk
can be framed in a supportive way.

Theme: What students want from feedback.
Codes: Usable fdbk explains grade and how to improve; Want fdbk to explain grade; Example:
uninformative fdbk; Very specific guidance wanted; More fdbk wanted; Want dialogue with L;
Dialogue means more; Dialogue more personalised/individual; Dialogue more time consuming but
better; Want dedicated class for grades and fdbk; Compulsory fdbk class; Structured option to
get fdbk; Fdbk should be constructive; Fdbk should be about the work and not the person;
Experience: fdbk is about the work; Difficulties judging own work; Want fdbk to explain what
went right; Fdbk should focus on understanding; Improving understanding improves grade; Want
fdbk in L’s office as emotional response difficult to manage in public.

Table 2: Preliminary themes (* fdbk = feedback; L = lecturers)


3.6 Step 4: Review themes.

During this phase we review, modify and develop the preliminary themes that we identified in
Step 3. Do they make sense? At this point it is useful to gather together all the data that is
relevant to each theme. You can easily do this using the ‘cut and paste’ function in any word
processing package, by taking a scissors to your transcripts or using something like Microsoft
Excel (see Bree & Gallagher, 2016). Again, access to qualitative data analysis software can
make this process much quicker and easier, but it is not essential. Appendix 3 shows how the
data associated with each theme was identified in our worked example. The data associated
with each theme is colour-coded.

We read the data associated with each theme and considered whether the data really did
support it. The next step is to think about whether the themes work in the context of the entire
data set. In this example, the data set is one extract but usually you will have more than this
and will have to consider how the themes work both within a single interview and across all
the interviews.
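
For readers who prefer a script to scissors or cut and paste, gathering the data relevant to each theme can be sketched as below. This is a minimal illustration, not the method used in the study: the function name `collate_by_theme` and the placeholder segment texts are assumptions, and only a few code labels from Table 2 are used.

```python
def collate_by_theme(coded_segments, themes):
    """Return {theme: [text, ...]} holding every segment that carries
    at least one of the theme's codes."""
    collated = {theme: [] for theme in themes}
    for text, codes in coded_segments:
        for theme, theme_codes in themes.items():
            if set(codes) & set(theme_codes):
                collated[theme].append(text)
    return collated


themes = {
    "Emotional response to feedback": ["Like to get fdbk", "Fear of possible criticism"],
    "What students want from feedback": ["Want dialogue with L", "More fdbk wanted"],
}

# (segment text, codes) pairs. The texts are invented placeholders.
coded_segments = [
    ("[placeholder segment 1]", ["Like to get fdbk"]),
    ("[placeholder segment 2]", ["Fear of possible criticism", "Want dialogue with L"]),
    ("[placeholder segment 3]", ["More fdbk wanted"]),
]

collated = collate_by_theme(coded_segments, themes)
for theme, texts in collated.items():
    print(theme)
    for text in texts:
        print("   ", text)
```

A segment whose codes belong to two themes appears under both, which makes overlap between themes visible when the collated data are read through. Judging whether the data really support a theme remains an interpretive task that the script does not perform.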

Themes should be coherent and they should be distinct from each other. Things to think about
include:

• Do the themes make sense?

• Does the data support the themes?

• Am I trying to fit too much into a theme?

• If themes overlap, are they really separate themes?

• Are there themes within themes (subthemes)?

• Are there other themes within the data?

For example, we felt that the preliminary theme, Purpose of Feedback, did not really work as a
theme in this example. There is not much data to support it and it overlaps considerably with
Reasons for using feedback (or not). Some of the codes included here (‘Unable to judge
whether question has been answered/interpreted properly’) seem to relate to a separate issue
of student understanding of academic expectations and assessment criteria.

We felt that the Lecturers theme did not really work. This related to perceptions of lecturers
and interactions with them and we felt that it captured an aspect of the academic environment.
We created a new theme Academic Environment that had two subthemes: Understanding
Academic Expectations and Perceptions of Lecturers. To us, this seemed to better capture
what our participants were saying in this extract. See if you agree.

The themes Reasons for using feedback (or not) and How feedback is used (or not) did not
seem to be distinct enough (on the basis of the limited data here) to be considered two
separate themes. Rather, we felt they reflected different aspects of using feedback. We
combined these into a new theme Use of feedback, with two subthemes, Why? and How?
Again, see what you think.

When we reviewed the theme Emotional Response to Feedback we felt that there was at least
one distinct subtheme within this. Many of the codes related to perceptions of feedback as a
potential threat, particularly to self-esteem and we felt that this did capture something
important about the data. It is interesting that while the students’ own experiences were quite
positive the perception of feedback as potentially threatening remained.

So, to summarise, we made a number of changes at this stage:

• We eliminated the Purpose of Feedback theme,

• We created a new theme Academic Environment that had two subthemes:
Understanding Academic Expectations and Perceptions of Lecturers,

• We collapsed Purpose of Feedback, Why feedback is (not) used and How feedback is
(not) used into a new theme, Use of feedback,

• We identified Feedback as potentially threatening as a subtheme within the broader
theme Emotional Response to feedback.

These changes are shown in Table 3 below. It is also important to look at the themes with
respect to the entire data set. As we are just using a single extract for illustration we have not
considered this here, but see Braun & Clarke (2006, pp. 91-92) for further detail. Depending on
your research question, you might also be interested in the prevalence of themes, i.e. how
often they occur. Braun & Clarke (2006) discuss different ways in which this can be addressed.


Theme: Academic Context.
Subtheme: Academic expectations.
Unable to judge whether question has been answered; Unable to judge whether question
interpreted properly; Difficulties judging own work.
Subtheme: Perceptions of lecturers.
Ask some Ls; Some Ls more approachable; Some Ls give better advice; Reluctance to admit
difficulties to L; Fear of unspecified disadvantage; Unlikely to approach L to discuss fdbk;
Unlikely to make a repeated attempt; Have discussed with tutor; Example: Wrong frame of mind;
Lecturer variability in framing fdbk.

Theme: Use of feedback.
Subtheme: Reasons for using fdbk (or not).
Help to learn what you’re doing wrong; Improving grade; Improving structure; To improve grade;
Limited feedback; Didn’t understand fdbk; Fdbk focused on grade; Use to improve grade;
Distinguish purpose and use; Improving structure improves grade; Can’t separate grade and
learning; New priorities take precedence = forget about feedback.
Subtheme: How fdbk is used (or not).
Read fdbk/Usually read fdbk; Refer to fdbk if doing same subject; Not sure fdbk is used; Used
fdbk to improve referencing; Example: using fdbk to improve referencing; Refer back to example
that ‘went right’; Forget about fdbk until next assignment; Fdbk applicable to similar
assignments; Fdbk on referencing widely applicable; Experience: fdbk focused on referencing;
Generic fdbk widely applicable.

Theme: Emotional response to feedback.
Like to get fdbk; Difficult for L to predict impact; Student variability in response to fdbk.
Subtheme: Feedback potentially threatening.
Don’t want to get fdbk if haven’t done well; Reluctance to hear criticism; Reluctance to hear
criticism (even if constructive); Fear of possible criticism; Experience: fear of potential criticism;
Fdbk taken personally initially; Fdbk has an emotional impact; Want fdbk in L’s office as
emotional response difficult to manage in public; Wording doesn’t make much difference;
Negative fdbk can be constructive; Negative fdbk can be framed in a supportive way.

Theme: What students want from feedback.
Usable fdbk explains grade and how to improve; Example: uninformative fdbk; Very specific
guidance wanted; More fdbk wanted; Want dialogue with L; Dialogue means more; Dialogue more
personalised/individual; Dialogue more time consuming but better; Want dedicated class for
grades and fdbk; Compulsory fdbk class; Structured option to get fdbk; Fdbk should be
constructive; Fdbk should be about the work and not the person; Experience: fdbk is about the
work; Want fdbk to explain grade; Want fdbk to explain what went right; Fdbk should focus on
understanding; Improving understanding improves grade; Want fdbk in L’s office as emotional
response difficult to manage in public.

Table 3: Themes at end of Step 4
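
One way to inspect a set of themes like those in Table 3 is to hold them as a mapping from theme to codes and look for codes that have been filed under more than one theme. The sketch below is only illustrative: the dictionary layout and the function name `codes_in_multiple_themes` are our assumptions, and it uses a small subset of the codes from the table.

```python
from collections import defaultdict

# Illustrative subset of Table 3; the dictionary layout is an assumption.
themes = {
    "Academic context": [
        "Unable to judge whether question has been answered",
        "Some Ls more approachable",
    ],
    "Use of feedback": [
        "Use to improve grade",
        "Used fdbk to improve referencing",
    ],
    "Emotional response to feedback": [
        "Reluctance to hear criticism",
        "Want fdbk in L's office as emotional response difficult to manage in public",
    ],
    "What students want from feedback": [
        "Very specific guidance wanted",
        "Want fdbk in L's office as emotional response difficult to manage in public",
    ],
}


def codes_in_multiple_themes(theme_map):
    """Return {code: sorted list of themes} for codes filed under two or more themes."""
    seen = defaultdict(set)
    for theme, codes in theme_map.items():
        for code in codes:
            seen[code].add(theme)
    return {code: sorted(names) for code, names in seen.items() if len(names) > 1}


overlaps = codes_in_multiple_themes(themes)
for code, names in overlaps.items():
    print(f"{code!r} appears under: {', '.join(names)}")
```

A shared code is not necessarily a problem; here it simply flags a place where two themes touch and where the researcher may want to check that the themes remain distinct.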


3.7 Step 5: Define themes.

This is the final refinement of the themes and the aim is to ‘..identify the ‘essence’ of what
each theme is about’ (Braun & Clarke, 2006, p.92). What is the theme saying? If there are
subthemes, how do they interact and relate to the main theme? How do the themes relate to
each other? In this analysis, What students want from feedback is an overarching theme that
is rooted in the other themes. Figure 1 is a final thematic map that illustrates the relationships
between themes, and we have included the narrative for What students want from feedback below.

[Thematic map: What students want from feedback, linked to Emotional response (Potential threat), Academic Environment (Perceptions of Ls; Understanding expectations) and Use of feedback (Why? How?)]

Figure 1: Thematic map.

What students want from feedback.

Students are clear and consistent about what constitutes effective feedback and made
concrete suggestions about how current practices could be improved. What students want
from feedback is rooted in the following challenges: understanding assessment criteria, judging
their own work, needing more specific guidance and perceiving feedback as potentially threatening.
Students want feedback that both explains their grades and offers very specific guidance on
how to improve their work. They conceptualised these as inextricably linked as they felt that
improving understanding would have a positive impact on grades. Students identified that
they had difficulties not only in judging their own work but also in understanding how or why
the grade was awarded. They wanted feedback that would help them to evaluate their own work.

‘Actually if you had to tell me how I got a 60 or 67, how I got that grade, because I
know every time I'm due to get my result for an assignment, I kind of go ‘oh I did so
bad, I was expecting to get maybe 40 or 50’, and then you go in and you get in the
high 60s or 70s. It's like how did I get that? What am I doing right in this piece of work?’
(F1, lines 669-672).

Participants felt that they needed specific, concrete suggestions for improvement that they
could use in future work. They acknowledged that they received useful feedback on
referencing but that other feedback was not always specific enough to be usable.

‘The referencing thing I’ve tried to, that’s the only… that’s really the only feedback we
have gotten back, I have tried to improve, but everything else it’s just kind of been ‘well
done’, I don’t… hasn’t really told us much.’ (F1, lines 389-392).

Significantly, it emerged that students want opportunities for both verbal and written feedback
from lecturers. The main reason identified for wanting more formal verbal feedback is that it
facilitates dialogue on issues that may be difficult to capture on paper. Moreover, it seems that
verbal feedback enables more specific comments on strengths and limitations of submitted work.
It is also clear that verbal feedback is valued because it signals that lecturers are taking an
interest in individual students, which is perceived to ‘mean more’.

‘I think also the thing that, you know… the fact that someone has sat down and taken
the time to actually tell you this is probably, it gives you an incentive to do it (over-
speaking). It does mean a bit more’ (M1, lines 456-458).

For these participants, the ideal situation was to receive feedback on a one-to-one basis in the
lecturer’s office. Privacy is seen as important as students do find feedback potentially
threatening and are concerned about managing their reactions in public. For these students, it
was difficult to proactively access feedback, largely because the demands of new work limited
their capacity to focus on completed work. Given this, they wanted feedback sessions to be
formally scheduled.

3.8 Step 6: Writing-up.

Usually the end-point of research is some kind of report, often a journal article or dissertation.
Table 4 includes a range of examples of articles, broadly in the area of learning and teaching,
that we feel do a good job of reporting a thematic analysis.

Table 4: Some examples of articles reporting thematic analysis.


Gagnon, L.L. & Roberge, G. (2012). Dissecting the journey: Nursing student experiences
with collaboration during the group work process. Nurse Education Today, 32(8), 945-950.

Karlsen, M-M. W., Wallander; Gabrielsen, A.K., Falch, A.L. & Stubberud, D.G. (2017).
Intensive care nursing students’ perceptions of simulation for learning confirming
communication skills: A descriptive qualitative study. Intensive & Critical Care Nursing, 42,

Lehtomäki, E., Moate, J. & Posti-Ahokas, H. (2016). Global connectedness in higher
education: student voices on the value of cross-cultural learning dialogue. Studies in Higher
Education, 41(11), 2011-2027.

Poulos, A. & Mahony, M-J. (2008). Effectiveness of feedback: the students' perspectives.
Assessment & Evaluation in Higher Education, 33(2), 143-154.

4. Concluding Comments.
Analysing qualitative data can present challenges, not least for inexperienced researchers. In
order to make explicit the ‘how’ of analysis, we applied Braun and Clarke’s (2006) thematic
analysis framework to data drawn from learning and teaching research. We hope this has
helped to illustrate the work involved in getting from transcript(s) to themes. We hope that you
find their guidance as useful as we continue to do when conducting our own research.

5. References.
Alhojailan, M.I. (2012). Thematic Analysis: A critical review of its process and evaluation.
West East Journal of Social Sciences, 1(1), 39-47.

Boyatzis, R. E. (1998). Transforming qualitative information: thematic analysis and code
development. Sage.

Braun, V. & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in
Psychology, 3, 77-101.

Bree, R. & Gallagher, G. (2016). Using Microsoft Excel to code and thematically analyse
qualitative data: a simple, cost-effective approach. All Ireland Journal of Teaching and
Learning in Higher Education (AISHE-J), 8(2), 2811-28114.

Clarke, V. & Braun, V. (2013) Teaching thematic analysis: Overcoming challenges and
developing strategies for effective learning. The Psychologist, 26(2), 120-123.

Divan, A., Ludwig, L., Matthews, K., Motley, P. & Tomljenovic-Berube, A. (2017). A survey of
research approaches utilised in The Scholarship of Learning and Teaching publications.
Teaching & Learning Inquiry, [online] 5(2), 16.

Javadi, M. & Zarea, M. (2016). Understanding Thematic Analysis and its Pitfalls. Journal of
Client Care, 1(1), 33-39.

Nowell, L. S., Norris, J. M., White, D. E., & Moules, N. J. (2017). Thematic Analysis: Striving to
Meet the Trustworthiness Criteria. International Journal of Qualitative Methods, 16 (1), 1-13.

O’Cathain, A., & Thomas, K. J. (2004). “Any other comments?” Open questions on
questionnaires – a bane or a bonus to research? BMC Medical Research Methodology, 4, 25.

Rosenthal, M. (2016). Qualitative research methods: Why, when, and how to conduct
interviews and focus groups in pharmacy research. Currents in Pharmacy Teaching and
Learning, 8(4), 509-516.

Rowland, S.L. & Myatt, P.M. (2014). Getting started in the scholarship of teaching and
learning: a "how to" guide for science academics. Biochemistry & Molecular Biology
Education, 42(1), 6-14.

Vaismoradi, M., Turunen, H. & Bondas, T. (2013). Content analysis and thematic analysis:
Implications for conducting a qualitative descriptive study. Nursing and Health Sciences,
15(3), 398-405.
Qualitative Data Analysis

*Parts of these slides were taken from Creswell, Educational Research: Planning, Conducting, and Evaluating Quantitative and Qualitative Research, 5th Ed.
Qualitative data analysis involves the process of organizing and explaining data
Symbolic and in-depth reflection

WHAT IS IT?
Data are obtained from:
• Individual interview transcripts
• Focus group transcripts
• Field notes
• Documents (reports, newspapers, etc.)
• Transcripts of audio and video recordings from observation of particular activities
Data relate to the concepts, opinions, values and behaviour of people in a social context.
Each person differs in their experience and understanding of reality.

WHAT DO YOU NEED TO KNOW ABOUT QUALITATIVE DATA ANALYSIS?
• Social phenomena cannot be understood outside their own context.
• Qualitative research can be used to understand a phenomenon, producing a framework or theory.
• The process of understanding human behaviour is slow and non-linear.
• Unique findings can give insight into the phenomenon being studied → new ideas for further questions / studies.
• Analysis is a non-linear process.
• Continuous and progressive.
• In-depth interaction with the data.
• Data collection and analysis happen simultaneously.
• Levels of analysis vary.
1. Preparing and organizing the data for analysis
2. Exploring the data through coding
3. Using codes to develop description and themes
4. Representing the findings through narratives and visuals
5. Making an interpretation of the meaning of the findings
6. Conducting a validation of the accuracy of the findings

Creswell (2012)
[Diagram, read from the first step to the last; the steps are iterative and simultaneous]
1. The researcher collects data (a text file, such as fieldnotes, interview recordings, optically scanned material)
2. The researcher prepares data for analysis (transcribes fieldnotes)
3. The researcher reads through data (obtains general sense of material)
4. The researcher codes the data (locates text segments and assigns a code to label them)
5. Codes the text for description to be used in the research report / Codes the text for themes to be used in the research report
Creswell (2012)
Prepare the raw data obtained from data collection: interview transcripts, field notes, observation
videos, photographs and documents
Build a matrix table to make the data easier to organize
Sort the data by type
Keep copies of the data (backup)
Convert interviews into transcripts

Creswell (2012)
Explore the general meaning of the data to get an overall picture of the data
Read the field notes and interview transcripts repeatedly to reflect on the meaning of the data as a whole
Write notes in the margins of the interview transcripts / field notes
Consider whether more data need to be collected

Creswell (2012)
3. CODING THE DATA
• Assign code word
• One, two, or three words that describe what is being said
• Terms from the literature can be used
• When possible use a participant’s actual words
A process of organizing data into parts or segments based on the text (transcript), before building meaning
It involves:
• Reading all the data transcripts
• Starting with one transcript first
• Identifying the relevant segments in the text
• Creating categories for those segments
• Labelling the categories
Creswell (2012)
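
The coding steps above can be pictured as attaching a short label to each relevant excerpt and then gathering excerpts by label. The following is a minimal sketch, not Creswell's procedure itself: the `CodedSegment` class, the `group_by_code` helper and the two code labels are hypothetical, while the excerpts are taken verbatim from the F1 and M1 focus-group quotations in the feedback study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedSegment:
    source: str  # which transcript the segment came from
    text: str    # the relevant excerpt, copied verbatim
    code: str    # short label of one to three words (hypothetical here)


segments = [
    CodedSegment("F1", "if you had to tell me how I got a 60 or 67", "explain grade"),
    CodedSegment("F1", "What am I doing right in this piece of work?", "explain grade"),
    CodedSegment("M1", "It does mean a bit more", "dialogue means more"),
]


def group_by_code(coded):
    """Collect coded segments under their code label, keeping input order."""
    grouped = defaultdict(list)
    for seg in coded:
        grouped[seg.code].append(seg)
    return dict(grouped)


by_code = group_by_code(segments)
for code, segs in by_code.items():
    print(code, "->", [s.source for s in segs])
```

Keeping the verbatim excerpt next to its code makes it easy to return to the participant's actual words when writing up.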
Coding can be based on:
• Codes commonly used for the topic / discussion (obtained from the literature review or general knowledge)
• Codes that were not expected at the start of the research
• Codes that are new / rarely used and of interest to readers
• Codes that reflect a general theory in the research

Review the codes to reduce overlapping codes (duplicate ideas)
Reduce them to a manageable number (25-30 codes)
The resulting codes can be grouped into theme categories that reflect the major ideas

Creswell (2012)
3 coding processes:
Open coding: code openly to look for themes
Axial coding: try to build categories from the themes that were developed
Selective coding: select only the themes and categories that can provide meaning / interpretation / a story
The participant didn’t tell
a lot of people about his
TB (only those he had to).
For his family, their main
reaction was concern/
worry both for him and for
themselves. For his boss, he
was worried about the
other employees.

(Gibson 2003)
Initially read through data → Divide text into segments of information → Label segments of information with codes → Reduce overlap and redundancy of codes → Collapse codes into themes

Many pages of text → Many segments of text → 30–40 codes → Codes reduced to 20 → Reduce codes to 5–7 themes

Creswell (2012)
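
The last two stages of this funnel (reducing overlap, then collapsing codes into themes) can be sketched in a few lines. This is an illustration under stated assumptions: the two mappings and both function names are hypothetical, and the code labels are borrowed from the feedback study's Table 3 purely as sample input.

```python
# Hypothetical mappings for illustration only.
MERGES = {
    "Reluctance to hear criticism (even if constructive)": "Reluctance to hear criticism",
    "Experience: fear of potential criticism": "Fear of possible criticism",
}
CODE_TO_THEME = {
    "Reluctance to hear criticism": "Emotional response to feedback",
    "Fear of possible criticism": "Emotional response to feedback",
    "Want dialogue with L": "What students want from feedback",
    "More fdbk wanted": "What students want from feedback",
}


def reduce_overlap(codes, merges):
    """Replace near-duplicate codes with one canonical label, keeping first-seen order."""
    reduced = []
    for code in codes:
        canonical = merges.get(code, code)
        if canonical not in reduced:
            reduced.append(canonical)
    return reduced


def collapse_into_themes(codes, code_to_theme):
    """Group the reduced codes under the theme each one has been assigned to."""
    themes = {}
    for code in codes:
        themes.setdefault(code_to_theme[code], []).append(code)
    return themes


raw_codes = [
    "Reluctance to hear criticism",
    "Reluctance to hear criticism (even if constructive)",
    "Fear of possible criticism",
    "Experience: fear of potential criticism",
    "Want dialogue with L",
    "More fdbk wanted",
]
reduced = reduce_overlap(raw_codes, MERGES)
themes = collapse_into_themes(reduced, CODE_TO_THEME)
print(len(raw_codes), "codes ->", len(reduced), "codes ->", len(themes), "themes")
```

In practice the two mappings are the researcher's judgement written down; the code only applies them consistently.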
Comparison table: a table that compares and contrasts groups within the same theme
Demographic table: a table containing demographic information about the study participants
Figures/diagrams: diagrams or visuals showing the relationships between the themes that emerge.

Creswell (2012)
Interpret the data or explain the meaning of the data that have been [...]
Involves the researcher's creativity (tables, graphs, etc. can be included)
Comparison with theory / past literature
Must be supported by evidence
Suggestions for future research

Creswell (2012)
Validity can be established by:
examining the relationship between the data and the emerging themes
checking validity through evidence that supports the findings
triangulating the data with the perspectives of respondents and/or fellow researchers (member checking)

Creswell (2012)
Coding can be done manually
Coding can be done with the help of computer software (NVivo, ATLAS.ti)
More efficient for searching qualitative data
Well suited to large data sets (faster searching on predefined codes): transcripts
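
To give a feel for why software speeds up retrieval, here is a minimal plain-Python sketch of a keyword search over excerpts. It does not use or imitate the NVivo or ATLAS.ti interfaces; the `search` function and the data layout are assumptions for illustration, and the excerpts are verbatim fragments of the F1 and M1 quotations from the feedback study.

```python
# Each entry is (transcript id, verbatim excerpt).
excerpts = [
    ("F1", "What am I doing right in this piece of work?"),
    ("F1", "that's really the only feedback we have gotten back"),
    ("M1", "It does mean a bit more"),
]


def search(items, keyword):
    """Return (source, text) pairs whose text contains the keyword, ignoring case."""
    needle = keyword.lower()
    return [(source, text) for source, text in items if needle in text.lower()]


hits = search(excerpts, "feedback")
for source, text in hits:
    print(source, "|", text)
```

With a handful of excerpts this is trivial by hand; the benefit appears when the same lookup runs over many long transcripts.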
THEMATIC ANALYSIS
Thematic analysis is the process of identifying patterns or themes in data.
Braun & Clarke (2006) suggest that it is the first qualitative method that should be learned,
because it provides core skills that are useful for conducting many kinds of analysis.
Braun & Clarke (2006) provide a six-phase guide, which is a very useful framework for
conducting thematic analysis.
Step 1: Become familiar with the data
Step 2: Generate initial codes
Step 3: Search for themes
Step 4: Review themes
Step 5: Define themes
Step 6: Write-up

The first step in qualitative analysis is to read and re-read the transcripts: know your data!
Make notes to form an early impression.
You can try relating them to the research questions (if the study uses a deductive approach)
In this phase, we start to organize the data in a meaningful and systematic way.
Coding groups the data into small, meaningful chunks.
Transcripts can be coded at each segment of text that is relevant to the research questions
Compare, discuss and modify the codes before moving on to other parts of the text / other interview transcripts.
An iterative, repeated process
The resulting codes form themes
A theme is a pattern that captures something significant or meaningful about the data.
No specific rules in generating themes!
A theme is characterised by its significance for understanding the phenomenon / research
question, drawn from the literature review

Example of some codes relating to perceptions of good practice and what students want
from feedback.
The process of reviewing, modifying and developing the themes identified in Step 3.
Gather all the codes relevant to each theme
Themes should be clear and distinct from each other.
Things to think about include:
• Does the theme make sense?
• Does the data support the theme?
• If themes overlap:
• Are there subthemes within the main theme?
• Are there other themes within the data?

Example of some themes modified on the basis of the codes (the researcher's perspective)
The final refinement of the themes
The aim is to ‘..identify the ‘essence’ of what each theme is about’ (Braun & Clarke, 2006, p.92).
What is the theme saying?
If there are subthemes, how do they interact and relate to the main theme?
How do the themes relate to each other?
This phase involves the final write-up of the research: a report, journal article or dissertation.
(Gibson 2003)

Colour-code each theme / category

(Gibson 2003)
Rich and detailed description and interpretation of theme

Data quotation inserted
throughout and used as an
example to support your
If quotation were removed, it
still would make sense to

Excerpt / data quotations from the transcript

Analysis would not
make sense if
the data is closely
with the quote from the
What is DDR ?

Why DDR?

Type of DDR

DDR phases and methodology

What is DDR

• Design and development research

• A “... systematic study of design,
development and evaluation processes
with the aim of establishing an empirical
basis for the creation of instructional OR
non-instructional products and tools and
new or enhanced models that govern their
development” (Richey & Klein, 2008, p.
• Design, development, evaluation
What is DDR
• Systematic planning of learning
experiences that encompasses 5
phases: analysis, design,
development, implementation
and evaluation, with the purpose
of improving an individual’s
learning, experiences and
(Yusop, 3rd June 2020)
What is DDR

Also Known As:

• Design Experiments (Brown, 1992; Collins, 1992),
• Design Research (Edelson, 2002),
• Design-based Research (Design-based Research
Collective, 2003),
• Formative Research (Reigeluth & Frick, 1999),
• Developmental Research (Richey et al., 2004),
• Design and Developmental Research (Richey &
Klein, 2007)
Psychological and learning theory and research.
• the learner and the learning process, the learning and transfer context, and
instructional strategies.

Instructional theory and teaching-learning research.

• content structure and sequence, instructional strategies, media and delivery

Communication theory and message design research.

• media and delivery systems. When combined with the principles of information
processing and perception, guide page layout, screen design, graphics, and visual
TYPE OF DDR

TYPE 1: Product & Tool Research
• DDR projects – instructional / non instructional products / programs
• Specific project phases (ID phases)
• ADDIE
• DDR tools – tool for development & use
• Context specific

TYPE 2: Model Research
• Model Development – process of developing a model (e.g. ID model for medical education)
• Model validation – internally and externally validate a [...]
• Model use – study on the use of specific model and conditions that impact the use of the model
ID Models

• ADDIE model is a generic instructional design model
• Morrison, Ross, Kemp
• Isman Model
• 4CID model
• SAMR model

Phases needed in DDR – Type 1

Analysis phase
• Activity: needs analysis, performance analysis, literature review
• Data collection method: interview, focus group, survey, document
• Data analysis: thematic analysis, statistical analysis

Design phase
• Activity: choosing learning strategies, [...]
• Data collection method: focus group
• Data analysis: nominal group technique (NGT)

Development phase
• Activity/approach: prototyping, usability testing
• Data collection: evaluation form, questionnaire, observation check [...]
• Data analysis: statistical analysis, report on testing data

Implementation phase
• Approach/activity: instructor training, observation, identify learner’s existing knowledge
• Data collection: observation check list, questionnaire, pre-test (quasi-experiment)
• Data analysis: learning analytics, statistical analysis

Evaluation phase
• Approach/activity: formative evaluation (survey), summative evaluation
• Data collection: questionnaire, post-test (quasi)
• Data analysis: statistical analysis (ANOVA etc.)
So, how can we put this in research context?
Problem or issue to be solved / investigated



Technology integration

Product Evaluation Product

Due to pandemic – clinical rotation is limited
for final year students


Virtual Patients (learning
Product evaluation
Why do we need both learning theory
and ID model / theory?
Learning theory explains how people learn

ID model / theory explains how to help people learn and develop


Case Study Survey

Richey & Klein, 2008


Richey & Klein, 2008

Some tips
Problem Statements
• Focus on the problems or issues faced by the targeted audiences first;
not the solutions
• Propose the solutions (i.e. virtual patient) to overcome the
problems/issues identified
• Technology is just a tool to manage learning
• Pedagogy is the key

From Dr. Farah Dina DDR slide

Some tips
Research questions
• Should be written with the focus on the problems and the proposed
solutions; NOT by the phases of instructional design
• How would the theory XYZ be applied in the processes of designing
and developing the ABC module?
• How did the ABC online module assist learners’ creative skills?

From Dr. Farah Dina DDR slide

Some tips
• Try to diversify your research approaches
• We are usually comfortable using qualitative approaches, e.g. case
study → it is difficult to defend the model and generalize its
application to other ‘cases’
• Try other methods, e.g. mixed methods, quantitative

From Dr. Farah Dina DDR slide

Some tips
Data analyses
• For qualitative approaches:
• Include rich data and the coding processes
• Don’t just state ‘it has been coded’ or ‘we used thematic analyses’
• Tell the audience/readers how you did it → others can replicate the processes
• Triangulate your data → use several methods to explain a result, e.g. interview
& observation & student’s reflection
• Cite your ‘data’ in your writing, e.g. use participants’ excerpts/quotes

From Dr. Farah Dina DDR slide

Some tips
Reporting DDR
• Chapter 1 – Introduction
  • Problem statement
  • ROs
  • RQs
  • Operational definition
• Chapter 2 – LR
  • Key concepts
  • Theoretical framework
  • Conceptual framework
• Chapter 3 – DDR process / model development
• Chapter 4 – Methodology
  • Context
  • Participants
  • Data collection strategies
  • Pilot testing
• Chapter 5 – Data analysis and findings
• Chapter 6 – Discussion and conclusion
  • Implications to theories / ID model framework
  • Recommendation to other researcher / practitioner
Wrap up

What is DDR? Why DDR?

DDR Phases Tips

• van Merriënboer, J. J. G., Clark, R. E., & de Croock, M. B. M. (2002). Blueprints for complex learning: The 4C/ID-
model. Educational Technology Research and Development, 50(2), 39–64.

• Molenda, M. (2003). In search of the elusive ADDIE model. Performance Improvement, 42(5), 34–36.

• Richey, R. C., & Klein, J. D. (2014). Design and Development Research. In J. M. Spector, M. D. Merrill, J. Elen,
& M. J. Bishop (Eds.), Handbook of Research on Educational Communications and Technology (4th ed., pp.
Thank you
