Evaluation is the process of examining a program or process to determine what's working, what's not,
and why.
Evaluation determines the value of programs and acts as blueprints for judgment and improvement.
(Rossett & Sheldon, 2001)
Types of Evaluations in Instructional Design
Evaluations are normally divided into two broad categories: formative and summative.
Formative
A formative evaluation (sometimes referred to as internal) is a method for judging the worth of a
program while the program activities are forming (in progress). This part of the evaluation focuses on
the process.
Thus, formative evaluations are basically done on the fly. They permit the designers, learners, and
instructors to monitor how well the instructional goals and objectives are being met. Their main purpose
is to catch deficiencies so that the proper learning interventions can take place that allow the learners
to master the required skills and knowledge.
Formative evaluation is also useful in analyzing learning materials, student learning and achievements,
and teacher effectiveness.... Formative evaluation is primarily a building process which accumulates a
series of components of new materials, skills, and problems into an ultimate meaningful whole.
- Wally Guyot (1978)
Summative
A summative evaluation (sometimes referred to as external) is a method of judging the worth of a program
at the end of the program activities (summation). The focus is on the outcome.
All assessments can be summative (i.e., have the potential to serve a summative function), but only some
have the additional capability of serving formative functions.
- Scriven (1967)
The various instruments used to collect the data are questionnaires, surveys, interviews, observations,
and testing. The model or methodology used to gather the data should be a specified step-by-step
procedure. It should be carefully designed and executed to ensure the data is accurate and valid.
Questionnaires are the least expensive procedure for external evaluations and can be used to collect
large samples of graduate information. The questionnaires should be trialed (tested) before use to ensure
the recipients understand their operation the way the designer intended. When designing questionnaires,
keep in mind that the most important feature is the guidance given for completion. All instructions
should be clearly stated... let nothing be taken for granted.
History of the Two Evaluations
Scriven (1967) first suggested a distinction between formative evaluation and summative evaluation when
describing two major functions of evaluation. Formative evaluation was intended to foster development
and improvement within an ongoing activity (or person, product, program, etc.). Summative evaluation, in
contrast, is used to assess whether the results of the object being evaluated (program, intervention,
person, etc.) met the stated goals.
Scriven saw the need to distinguish the formative and summative roles of curriculum evaluation. While
Scriven preferred summative evaluations (performing a final evaluation of the project or person), he did
come to acknowledge Cronbach's merits of formative evaluation: part of the process of curriculum
development used to improve the course while it is still fluid (he believed it contributes more to the
improvement of education than evaluation used to appraise a product).
Later, Misanchuk (1978) delivered a paper on the need to tighten up the definitions in order to get more
accurate measurements. The one that seems to cause the greatest disagreement is the keeping of fluid
movements or changes strictly in the pre-release versions (before it hits the target population).
In Paul Saettler's (1990) history of instructional technology, he describes the two evaluations
(pp. 430-431) in the context of how they were used in developing Sesame Street and The Electric Company
by the Children's Television Workshop. CTW used formative evaluations for identifying and defining
program designs that could provide reliable predictors of learning for particular learners. They later
used summative evaluations to prove their efforts (to quite good effect, I might add). While Saettler
praises CTW for a significant landmark in the technology of instructional design, he warns that it is
still tentative and should be seen more as a point of departure than a fixed formula.
Saettler defines the two types of evaluations as: 1) formative is used to refine goals and evolve
strategies for achieving goals, while 2) summative is undertaken to test the validity of a theory or
determine the impact of an educational practice so that future efforts may be improved or modified.
Thus, using Misanchuk's defining terms will normally achieve more accurate measurements; however, the
cost is quite high as it is highly resource intensive, particularly with time, because of all the
pre-work that has to be performed in the design phase: create, trial, redo, trial, redo, trial, redo,
etc.; and all preferably without using the target population.
However, most organizations are demanding shorter design times. Thus the formative part is moved over to
other methods, such as rapid prototyping and using testing and evaluation methods to improve as one
moves on. This of course is not as accurate, but it is more appropriate to most organizations, as they
are not really that interested in accurate measurements of the content but rather in the end product:
skilled and knowledgeable workers.
Misanchuk's defining terms basically put all the water in a container for accurate measurements, while
the typical organization estimates the volume of water running in a stream.
Thus if you are a vendor, researcher, or need highly accurate measurements, you will probably define the
two evaluations in the same manner as Misanchuk. If you need to push the training/learning out faster
and are not all that worried about highly accurate measurements, then you define it closer to how most
organizations do and Saettler does with the CTW example.
published as an article, Techniques for Evaluating Training Programs, in a book Kirkpatrick edited,
Evaluating Training Programs (1975). However, it was not until his 1994 book, Evaluating Training
Programs, was published that the four levels became popular. Nowadays, his four levels remain a
cornerstone in the learning industry.
While most people refer to the four criteria for evaluating learning processes as levels, Kirkpatrick
never used that term; he normally called them steps (Craig, 1996). In addition, he did not call it a
model, but used words such as techniques for conducting the evaluation (Craig, 1996, p. 294).
The four steps of evaluation consist of:
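Step 1: Reaction - How well did the learners like the learning
process?
Step 2: Learning - What did they learn? (the extent to which the
learners gain knowledge and skills)
Step 3: Behavior - What changes in job performance resulted from
the learning process?
Step 4: Results - What are the tangible results of the learning
process?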
Kirkpatrick's concept is quite important as it makes an excellent planning, evaluating, and
troubleshooting tool, especially if we make some slight improvements as shown below.
This differs from Kirkpatrick (1996), who wrote that reaction was how well the learners liked a
particular learning process. However, the less relevant the learning package is to a learner, the more
effort has to be put into the design and presentation of the learning package. That is, if it is not
relevant to the learner, then the learning package has to hook the learner through slick design, humor,
games, etc. This is not to say that design, humor, or games are unimportant; however, their use in a
learning package should be to promote or aid the learning process rather than just make it fun. And if a
learning package is built of sound purpose and design, then it should support the learners in bridging a
performance gap. Hence, they should be motivated to learn; if not, something went dreadfully wrong
during the planning and design processes! If you find yourself having to hook the learners through slick
design, then you probably need to reevaluate the purpose of your learning processes.
This makes it both a planning and evaluation tool which can be used as a troubleshooting heuristic
(Chyung, 2008):
The advantage of a norm-referenced test is that it shows us how our student is doing relative to other
students across the country. A disadvantage is that such tests are standardized and do not show small
increments of gain. They are good to use for placement at the beginning and then again four or six
months later, or at the end of the year. This will show growth over the period of time.
Norm-referenced (also called standardized) and criterion-referenced tests, along with informal
observational evaluation, are useful for showing student growth over time. They aren't to be used for
grading, though they can be one element in a total grade. One must remember we can't expect great
growth, if any, over short periods of time, particularly as shown on a norm-referenced test.
The definition of retarded is "slowed." That means that the growth of our students is slowed, but in
most cases, for many things and for most students, not stopped. As a matter of fact, it is unlikely for
a "normal" population of students to show much growth even after a semester's time. These tests are not
intended for measuring small increments of gain.
Criterion-referenced tests are nice because we can see just what our student accomplished. So now, after
three months, s/he can recognize 35 more words, or maybe 65 more words than s/he could before. The
student can name all the colors. The student is now putting away toys where they belong where before
s/he either would not or could not.
Traditional Assessment
Authentic Assessment
Alternative Names for Authentic Assessment
Definitions
A form of assessment in which students are asked to perform real-world tasks that demonstrate meaningful
application of essential knowledge and skills.
- Jon Mueller
"...Engaging and worthy problems or questions of importance, in which students must use knowledge to
fashion performances effectively and creatively. The tasks are either replicas of or analogous to the
kinds of problems faced by adult citizens and consumers or professionals in the field."
- Grant Wiggins (Wiggins, 1993, p. 229).
"Performance assessments call upon the examinee to demonstrate specific skills and competencies, that is,
to apply the skills and knowledge they have mastered."
- Richard J. Stiggins (Stiggins, 1987, p. 34).
If I were a golf instructor and I taught the skills required to perform well, I would not assess my
students' performance by giving them a multiple-choice test. I would put them out on the golf course and
ask them to perform. Although this is obvious with athletic skills, it is also true for academic
subjects. We can teach students how to do math, do history and do science, not just know them. Then, to
assess what our students have learned, we can ask students to perform tasks that "replicate the
challenges" faced by those using mathematics, doing history or conducting scientific investigation.
Traditional                 Authentic
Selecting a Response        Performing a Task
Contrived                   Real-life
Recall/Recognition          Construction/Application
Teacher-structured          Student-structured
Indirect Evidence           Direct Evidence
Let me clarify the attributes by elaborating on each in the context of traditional and authentic
assessments:
Selecting a Response to Performing a Task: On traditional assessments, students are typically given
several choices (e.g., a, b, c or d; true or false; which of these match with those) and asked to select
the right answer. In contrast, authentic assessments ask students to demonstrate understanding by
performing a more complex task usually representative of more meaningful application.
Contrived to Real-life: It is not very often in life outside of school that we are asked to select from
four alternatives to indicate our proficiency at something. Tests offer these contrived means of
assessment to increase the number of times you can be asked to demonstrate proficiency in a short period
of time. More commonly in life, as in authentic assessments, we are asked to demonstrate proficiency by
doing something.
Recall/Recognition of Knowledge to Construction/Application of Knowledge: Well-designed traditional
assessments (i.e., tests and quizzes) can effectively determine whether or not students have acquired a
body of knowledge. Thus, as mentioned above, tests can serve as a nice complement to authentic
assessments in a teacher's assessment portfolio. Furthermore, we are often asked to recall or recognize
facts and ideas and propositions in life, so tests are somewhat authentic in that sense. However, the
demonstration of recall and recognition on tests is typically much less revealing about what we really
know and can do than when we are asked to construct a product or performance out of facts, ideas and
propositions. Authentic assessments often ask students to analyze, synthesize and apply what they have
learned in a substantial manner, and students create new meaning in the process as well.
Teacher-structured to Student-structured: When completing a traditional assessment, what a student can
and will demonstrate has been carefully structured by the person(s) who developed the test. A student's
attention will understandably be focused on and limited to what is on the test. In contrast, authentic
assessments allow more student choice and construction in determining what is presented as evidence of
proficiency. Even when students cannot choose their own topics or formats, there are usually multiple
acceptable routes towards constructing a product or performance. Obviously, assessments more carefully
controlled by the teachers offer advantages and disadvantages. Similarly, more student-structured tasks
have strengths and weaknesses that must be considered when choosing and designing an assessment.
Indirect Evidence to Direct Evidence: Even if a multiple-choice question asks a student to analyze or
apply facts to a new situation rather than just recall the facts, and the student selects the correct
answer, what do you now know about that student? Did that student get lucky and pick the right answer?
What thinking led the student to pick that answer? We really do not know. At best, we can make some
inferences about what that student might know and might be able to do with that knowledge. The evidence
is very indirect, particularly for claims of meaningful application in complex, real-world situations.
Authentic assessments, on the other hand, offer more direct evidence of application and construction of
knowledge. As in the golf example above, putting a golf student on the golf course to play provides much
more direct evidence of proficiency than giving the student a written test. Can a student effectively
critique the arguments someone else has presented (an important skill often required in the real world)?
Asking a student to write a critique should provide more direct evidence of that skill than asking the
student a series of multiple-choice, analytical questions about a passage, although both assessments may
be useful.
Teaching to the Test
These two different approaches to assessment also offer different advice about teaching to the test.
Under the TA model, teachers have been discouraged from teaching to the test. That is because a test
usually assesses a sample of students' knowledge and understanding and assumes that students'
performance on the sample is representative of their knowledge of all the relevant material. If teachers
focus primarily on the sample to be tested during instruction, then good performance on that sample does
not necessarily reflect knowledge of all the material. So, teachers hide the test so that the sample is
not known beforehand, and teachers are admonished not to teach to the test.
With AA, teachers are encouraged to teach to the test. Students need to learn how to perform well on
meaningful tasks. To aid students in that process, it is helpful to show them models of good (and not so
good) performance. Furthermore, the student benefits from seeing the task rubric ahead of time as well.
Is this "cheating"? Will students then just be able to mimic the work of others without truly
understanding what they are doing? Authentic assessments typically do not lend themselves to mimicry.
There is not one correct answer to copy. So, by knowing what good performance looks like, and by knowing
what specific characteristics make up good performance, students can better develop the skills and
understanding necessary to perform well on these tasks. (For further discussion of teaching to the test,
see Bushweller.)
Types of Tests
Norm-Referenced
Standardized tests compare students' performance to that of a norming or sample group who are in the
same grade or are of the same age. Students' performance is communicated in percentile ranks, grade
equivalent scores, normal curve equivalents, scaled scores, or stanine scores.
Examples: Iowa Tests; SAT; DRP; ACT
Criterion-Referenced
A student's performance is measured against a standard. One form of criterion-referenced assessment is
the benchmark, a description of a key task that students are expected to perform.
Examples: DIBELS; Chapter tests; Driver's License Test; FCAT (Florida Comprehensive Assessment Test)
Survey
Survey tests typically provide an overview of general comprehension and word knowledge.
Examples: Interest surveys; KWL; Learning Styles Inventory
Diagnostic Tools
Diagnostic tests assess a number of areas in greater depth.
Examples: Woodcock-Johnson; BRI; "The Fox in the Box"
Formal Tests
Formal tests may be standardized. They are designed to be given according to a standard set of
circumstances, they have time limits, and they have sets of directions which are to be followed exactly.
Examples: SAT; FCAT; ACT
Informal Tests
Informal tests generally do not have a set of standard directions. They have a great deal of flexibility
in how they are administered. They are constructed by teachers and have unknown validity and
reliability.
Examples: Review games; Quizzes
Static (Summative) Tests
Measure what the student has learned.
Examples: End-of-chapter tests; Final examinations; Standardized state tests
Dynamic (Formative) Tests
Measure the students' grasp of material that is currently being taught. Can also measure readiness.
Formative tests help guide and inform instruction and learning.
Examples: Quizzes; Homework; Portfolios
Lawrence Kohlberg's stages of moral development constitute an adaptation of a psychological theory
originally conceived by the Swiss psychologist Jean Piaget. Kohlberg began work on this topic while a
psychology graduate student at the University of Chicago[1] in 1958, and expanded and developed this
theory throughout his life.
The theory holds that moral reasoning, the basis for ethical behavior, has six identifiable
developmental stages, each more adequate at responding to moral dilemmas than its predecessor.[2]
Kohlberg followed the development of moral judgment far beyond the ages studied earlier by Piaget,[3]
who also claimed that logic and morality develop through constructive stages.[2] Expanding on Piaget's
work, Kohlberg determined that the process of moral development was principally concerned with justice,
and that it continued throughout the individual's lifetime,[4] a notion that spawned dialogue on the
philosophical implications of such research.[5][6]
The six stages of moral development are grouped into three levels: pre-conventional morality,
conventional morality, and post-conventional morality.
Stages
Kohlberg's six stages can be more generally grouped into three levels of two stages each:
pre-conventional, conventional and post-conventional.[7][8][9] Following Piaget's constructivist
requirements for a stage model, as described in his theory of cognitive development, it is extremely
rare to regress in stages, that is, to lose the use of higher stage abilities.[14][15] Stages cannot be
skipped; each provides a new and necessary perspective, more comprehensive and differentiated than its
predecessors but integrated with them.[14][15]
Level 1 (Pre-Conventional)
1. Obedience and punishment orientation
(How can I avoid punishment?)
2. Self-interest orientation
(What's in it for me?)
(Paying for a benefit)
Level 2 (Conventional)
3. Interpersonal accord and conformity
(Social norms)
(The good boy/girl attitude)
4. Authority and social-order maintaining orientation
(Law and order morality)
Level 3 (Post-Conventional)
5. Social contract orientation
6. Universal ethical principles
(Principled conscience)
The understanding gained in each stage is retained in later stages, but may be regarded by those in
later stages as simplistic, lacking in sufficient attention to detail.
Pre-conventional
The pre-conventional level of moral reasoning is especially common in children, although adults can also
exhibit this level of reasoning. Reasoners at this level judge the morality of an action by its direct
consequences. The pre-conventional level consists of the first and second stages of moral development,
and is solely concerned with the self in an egocentric manner. A child with pre-conventional morality
has not yet adopted or internalized society's conventions regarding what is right or wrong, but instead
focuses largely on external consequences that certain actions may bring.[7][8][9]
In Stage one (obedience and punishment driven), individuals focus on the direct consequences of their
actions on themselves. For example, an action is perceived as morally wrong because the perpetrator is
punished. "The last time I did that I got spanked so I will not do it again." The worse the punishment
for the act is, the more "bad" the act is perceived to be.[16] This can give rise to an inference that
even innocent victims are guilty in proportion to their suffering. It is "egocentric," lacking
recognition that others' points of view are different from one's own.[17] There is "deference to
superior power or prestige."[17]
An example of obedience and punishment driven morality would be a child refusing to do something because
it is wrong and because the consequences could result in punishment. For example, a child's classmate
dares the child to play hooky from school. The child would apply obedience and punishment driven
morality by refusing to play hooky because he would get punished. Another example of obedience and
punishment driven morality is when a child refuses to cheat on a test because the child would get
punished.
Stage two (self-interest driven) expresses the "what's in it for me" position, in which right behavior
is defined by whatever the individual believes to be in their best interest, but understood in a narrow
way which does not consider one's reputation or relationships to groups of people. Stage two reasoning
shows a limited interest in the needs of others, but only to a point where it might further the
individual's own interests. As a result, concern for others is not based on loyalty or intrinsic
respect, but rather a "You scratch my back, and I'll scratch yours" mentality.[2] The lack of a societal
perspective in the pre-conventional level is quite different from the social contract (stage five), as
all actions have the purpose of serving the individual's own needs or interests. For the stage two
theorist, the world's perspective is often seen as morally relative.
An example of self-interest driven morality is when a child is asked by his parents to do a chore. The
child asks "what's in it for me?" The parents would offer the child an incentive by giving the child an
allowance as payment for the chores. The child is motivated to do chores out of self-interest. Another
example of self-interest driven morality is when a child does their homework in exchange for better
grades and rewards from their parents.
Conventional
The conventional level of moral reasoning is typical of adolescents and adults. To reason in a
conventional way is to judge the morality of actions by comparing them to society's views and
expectations. The conventional level consists of the third and fourth stages of moral development.
Conventional morality is characterized by an acceptance of society's conventions concerning right and
wrong. At this level an individual obeys rules and follows society's norms even when there are no
consequences for obedience or disobedience. Adherence to rules and conventions is somewhat rigid,
however, and a rule's appropriateness or fairness is seldom questioned.[7][8][9]
In Stage three (good intentions as determined by social consensus), the self enters society by
conforming to social standards. Individuals are receptive to approval or disapproval from others as it
reflects society's views. They try to be a "good boy" or "good girl" to live up to these
expectations,[2] having learned that being regarded as good benefits the self. Stage three reasoning may
judge the morality of an action by evaluating its consequences in terms of a person's relationships,
which now begin to include things like respect, gratitude and the "golden rule". "I want to be liked and
thought well of; apparently, not being naughty makes people like me." Conforming to the rules for one's
social role is not yet fully understood. The intentions of actors play a more significant role in
reasoning at this stage; one may feel more forgiving if one thinks, "they mean well..."[2]
In Stage four (authority and social order obedience driven), it is important to obey laws, dictums and
social conventions because of their importance in maintaining a functioning society. Moral reasoning in
stage four is thus beyond the need for individual approval exhibited in stage three. A central ideal or
ideals often prescribe what is right and wrong. If one person violates a law, perhaps everyone would;
thus there is an obligation and a duty to uphold laws and rules. When someone does violate a law, it is
morally wrong; culpability is thus a significant factor in this stage as it separates the bad domains
from the good ones. Most active members of society remain at stage four, where morality is still
predominantly dictated by an outside force.[2]
Post-Conventional
The post-conventional level, also known as the principled level, is marked by a growing realization that
individuals are separate entities from society, and that the individual's own perspective may take
precedence over society's view; individuals may disobey rules inconsistent with their own principles.
Post-conventional moralists live by their own ethical principles, principles that typically include such
basic human rights as life, liberty, and justice. People who exhibit post-conventional morality view
rules as useful but changeable mechanisms; ideally, rules can maintain the general social order and
protect human rights. Rules are not absolute dictates that must be obeyed without question. Because
post-conventional individuals elevate their own moral evaluation of a situation over social conventions,
their behavior, especially at stage six, can be confused with that of those at the pre-conventional
level.
Some theorists have speculated that many people may never reach this level of abstract moral
reasoning.[7][8][9]
In Stage five (social contract driven), the world is viewed as holding different opinions, rights and
values. Such perspectives should be mutually respected as unique to each person or community. Laws are
regarded as social contracts rather than rigid edicts. Those that do not promote the general welfare
should be changed when necessary to meet "the greatest good for the greatest number of people."[8] This
is achieved through majority decision and inevitable compromise. Democratic government is ostensibly
based on stage five reasoning.
In Stage six (universal ethical principles driven), moral reasoning is based on abstract reasoning using
universal ethical principles. Laws are valid only insofar as they are grounded in justice, and a
commitment to justice carries with it an obligation to disobey unjust laws. Legal rights are
unnecessary, as social contracts are not essential for deontic moral action. Decisions are not reached
hypothetically in a conditional way but rather categorically in an absolute way, as in the philosophy of
Immanuel Kant.[18] This involves an individual imagining what they would do in another's shoes, if they
believed what that other person imagines to be true.[19] The resulting consensus is the action taken. In
this way action is never a means but always an end in itself; the individual acts because it is right,
and not because it avoids punishment, is in their best interest, expected, legal, or previously agreed
upon. Although Kohlberg insisted that stage six exists, he found it difficult to identify individuals
who consistently operated at that level.[15]
Montessori education is an educational approach developed by Italian physician and educator Maria
Montessori and characterized by an emphasis on independence, freedom within limits, and respect for a
child's natural psychological, physical, and social development. Although a range of practices exists
under the name "Montessori", the Association Montessori Internationale (AMI) and the American Montessori
Society (AMS) cite these elements as essential:[2][3]
- Mixed-age classrooms, with classrooms for children ages 2 or 3 to 6 years old by far the most common
- Student choice of activity from within a prescribed range of options
- Uninterrupted blocks of work time, ideally three hours
- A constructivist or "discovery" model, where students learn concepts from working with materials,
  rather than by direct instruction
- Specialized educational materials developed by Montessori and her collaborators
- Freedom of movement within the classroom
- A trained Montessori teacher
Understanding by Design, or UbD, is a tool utilized for educational planning focused on "teaching for
understanding" advocated by Jay McTighe and Grant Wiggins in their Understanding by Design (1998),
published by the Association for Supervision and Curriculum Development.[1][2] The emphasis of UbD is on
"backward design", the practice of looking at the outcomes in order to design curriculum units,
performance assessments, and classroom instruction.[3]
"Understanding by Design" and "UbD" are registered trademarks of the Association for Supervision and
Curriculum Development ("ASCD"). According to Wiggins, "The potential of UbD for curricular improvement
has struck a chord in American education. Over 250,000 educators own the book. Over 30,000 Handbooks are
in use. More than 150 University education classes use the book as a text."[1] As defined by Wiggins and
McTighe, Understanding by Design is a "framework for designing curriculum units, performance
assessments, and instruction that lead your students to deep understanding of the content you
teach."[4] UbD expands on "six facets of understanding", which include students being able to explain,
interpret, apply, have perspective, empathize, and have self-knowledge about a given topic.[5]
Understanding by Design relies on what Wiggins and McTighe call "backward design" (also known as
"backwards planning"). Teachers, according to UbD proponents, traditionally start curriculum planning
with activities and textbooks instead of identifying classroom learning goals and planning towards those
goals. In backward design, the teacher starts with classroom outcomes and then plans the curriculum,
choosing activities and materials that help determine student ability and foster student learning.[6]
The backward design approach is developed in three stages. Stage 1 starts with educators identifying the
desired results for their students by establishing the overall goal of the lessons using content
standards, common core or state standards. In addition, UbD's stage 1 defines "Students will understand
that..." and lists essential questions that will guide the learner to understanding. Stage 1 also
focuses on identifying "what students will know" and, most importantly, "what students will be able to
do".
C. Item Analysis
After you create your objective assessment items and give your test, how can
you be sure that the items are appropriate (not too difficult and not too easy)?
How will you know if the test effectively differentiates between students who
do well on the overall test and those who do not? An item analysis is a
valuable, yet relatively easy, procedure that teachers can use to answer both of
these questions.
To determine the difficulty level of test items, a measure called the Difficulty
Index is used. This measure asks teachers to calculate the proportion of students
who answered the test item accurately. By looking at each alternative (for
multiple choice), we can also find out if there are answer choices that should be
replaced. For example, let's say you gave a multiple-choice quiz and there were
four answer choices (A, B, C, and D). The following table illustrates how many
students selected each answer choice for Question #1 and #2.
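[Table: number of students selecting each answer choice (A, B, C, D) for Question #1 and
Question #2. Only part of the table survives: Question #1, 24* for the correct choice;
Question #2, 12* and 13.]
* Denotes correct answer.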
For Question #1, we can see that A was not a very good distractor; no one
selected that answer. We can also compute the difficulty of the item by dividing
the number of students who chose the correct answer (24) by the total number of
students (30). Using this formula, the difficulty of Question #1 (referred to
as p) is equal to 24/30 or .80. A rough "rule of thumb" is that if the item
difficulty is more than .75, it is an easy item; if the difficulty is below .25, it is a
difficult item. Given these parameters, this item could be regarded as moderately
easy, since most (80%) of the students got it correct. In contrast, Question #2 is much
more difficult (12/30 = .40). In fact, on Question #2, more students selected an
incorrect answer (B) than selected the correct answer (A). This item should be
carefully analyzed to ensure that B is an appropriate distractor.
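The difficulty calculation above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the label "moderate" for items between the two cutoffs is an assumption, since the rule of thumb in the text names only the "easy" and "difficult" ends.

```python
def difficulty_index(num_correct: int, num_students: int) -> float:
    """Return p, the proportion of students who answered the item correctly."""
    if num_students <= 0:
        raise ValueError("num_students must be positive")
    return num_correct / num_students


def classify_difficulty(p: float) -> str:
    """Apply the rough rule of thumb from the text (.75 and .25 cutoffs)."""
    if p > 0.75:
        return "easy"
    if p < 0.25:
        return "difficult"
    # The text does not name this middle band; "moderate" is an assumed label.
    return "moderate"


# Figures taken from the text: 30 students, 24 correct on #1, 12 correct on #2.
p1 = difficulty_index(24, 30)
p2 = difficulty_index(12, 30)
print(f"Question #1: p = {p1:.2f} ({classify_difficulty(p1)})")
print(f"Question #2: p = {p2:.2f} ({classify_difficulty(p2)})")
```

Keeping the cutoffs inside one small function makes it easy to adjust them if a different rule of thumb is preferred.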
Another measure, the Discrimination Index, refers to how well an assessment
differentiates between high and low scorers. In other words, you should be able
to expect that the high-performing students would select the correct answer for
each question more often than the low-performing students. If this is true, then
the assessment is said to have a positive discrimination index (between 0 and 1),
indicating that students who received a high total score chose the correct
answer for a specific item more often than the students who had a lower overall
score. If, however, you find that more of the low-performing students got a
specific item correct, then the item has a negative discrimination index
(between -1 and 0). Let's look at an example.
Table 2 displays the results of ten questions on a quiz. Note that the students are
arranged with the top overall scorers at the top of the table.
Student     Total Score (%)
Asif        90
Sam         90
Jill        80
Charlie     80
Sonya       70
Ruben       60
Clay        60
Kelley      50
Justin      50
Tonya       40
[The per-question columns (Questions 1 onward) of Table 2 did not survive in this copy.]
"1" indicates the answer was correct; "0" indicates it was incorrect.
Follow these steps to determine the Difficulty Index and the Discrimination
Index.
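As a rough illustration of both indices on a table like Table 2, here is a short Python sketch. Two things in it are assumptions rather than material from the source: the 1/0 response patterns are hypothetical (the per-question data is not available here), and the discrimination formula used, correct answers in the upper half minus correct answers in the lower half, divided by the size of one half, is one common way to compute the index and may differ from the steps the original page listed.

```python
from typing import Sequence


def difficulty_index(responses: Sequence[int]) -> float:
    """Proportion of students who got the item right (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


def discrimination_index(responses_ranked: Sequence[int]) -> float:
    """Upper-half minus lower-half correct counts, divided by the half size.

    responses_ranked must be ordered from the highest total scorer to the
    lowest, as in Table 2. With an odd number of students the middle
    student is left out of both halves.
    """
    half = len(responses_ranked) // 2
    if half == 0:
        raise ValueError("need at least two students")
    upper = responses_ranked[:half]
    lower = responses_ranked[-half:]
    return (sum(upper) - sum(lower)) / half


# Hypothetical responses for ten students, ranked top scorer first.
item_a = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]  # mostly answered correctly by high scorers
item_b = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]  # mostly answered correctly by low scorers

for name, item in (("item_a", item_a), ("item_b", item_b)):
    print(name, difficulty_index(item), discrimination_index(item))
```

With these made-up patterns, item_a comes out positive and item_b negative, matching the interpretation given in the text: a negative value flags an item that low scorers answered correctly more often than high scorers.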
A. Bloom's Taxonomy
Questions (items) on quizzes and exams can demand different levels of
thinking skills. For example, some questions might be simple
memorization of facts, and others might require the ability to
synthesize information from several sources to select or construct a
response. Benjamin Bloom created a hierarchy of cognitive skills
(called Bloom's taxonomy) that is often used to categorize the levels
of cognitive involvement (thinking skills) in educational settings. The
taxonomy provides a good structure to assist teachers in writing
objectives and assessments. It can be divided into two levels -- Level I
(the lower level) contains knowledge, comprehension and application;
Level II (the higher level) includes application, analysis, synthesis,
and evaluation.
Figure 1. Bloom's Taxonomy.
Bloom's taxonomy is also used to guide the development of standardized assessments. For example, in
Florida, about 65% of the questions on the statewide reading test (FCAT) are designed to measure Level II
thinking skills (application, analysis, synthesis, and evaluation). To prepare students for these
standardized tests, classroom assessments must also demand both Level I and II thinking skills.
Integrating higher-level skills into instruction and assessment increases the likelihood that students
will succeed on tests and become better problem solvers.
Sometimes objective tests (such as multiple choice) are criticized because the questions emphasize only
lower-level thinking skills (such as knowledge and comprehension). However, it is possible to address
higher-level thinking skills via objective assessments by including items that focus on genuine
understanding, such as "how" and "why" questions. Multiple-choice items that involve scenarios, case
studies, and analogies are also effective for requiring students to apply, analyze, synthesize, and
evaluate information.
Before you write the assessment items, you should create a blueprint that outlines the content areas and
the cognitive skills you are targeting. One way to do this is to list your instructional objectives,
along with the corresponding cognitive level. For example, the following table has four different
objectives and the corresponding levels of assessment (relative to Bloom's taxonomy). For each objective,
five assessment items will be written, some at Level I and some at Level II. This approach helps to
ensure that all objectives are covered and that several higher-level thinking skills are included in the
assessment.
Objective | Number of Items at Level I (Bloom's Taxonomy) | Number of Items at Level II (Bloom's Taxonomy)
After you have determined how many items you need for each level, you can begin writing the
assessments. There are several forms of selected response assessments, including multiple choice,
matching, and true/false. Regardless of the form you select, be sure the items are clearly worded at the
appropriate reading level and do not include unintentional clues. The validity of your test will suffer
tremendously if the students can't comprehend or read the questions! This section includes a few guidelines
for constructing objective assessment items, along with examples and non-examples.
Multiple Choice
Multiple choice questions consist of a stem (question or statement) with several answer choices: one
correct answer (the key) and several incorrect options (distractors).
Matching
Matching items consist of two lists of words, phrases, or images (often referred to as stems and responses).
Students review the list of stems and match each with a word, phrase, or image from the list of responses.
True/False
True/false questions can appear to be easier to write; however, it is difficult to write effective true/false
questions. Also, the reliability of T/F questions is not generally very high because of the high possibility of
guessing. In most cases, T/F questions are not recommended.
Test Topics
Step 9. Conduct the Item Analysis
Introduction
The item analysis is an important phase in the development of an exam program. In this phase statistical
methods are used to identify any test items that are not working well. If an item is too easy, too difficult,
failing to show a difference between skilled and unskilled examinees, or even scored incorrectly, an item
analysis will reveal it. The two most common statistics reported in an item analysis are the item difficulty,
which is a measure of the proportion of examinees who responded to an item correctly, and the item
discrimination, which is a measure of how well the item discriminates between examinees who are
knowledgeable in the content area and those who are not. An additional analysis that is often reported is
the distractor analysis. The distractor analysis provides a measure of how well each of the incorrect
options contributes to the quality of a multiple choice item. Once the item analysis information is available,
an item review is often conducted.
Item Discrimination Index
The item discrimination index is a measure of how well an item is able to distinguish between examinees
who are knowledgeable and those who are not, or between masters and non-masters. There are actually
several ways to compute an item discrimination, but one of the most common is the point-biserial
correlation. This statistic looks at the relationship between an examinee's performance on the given item
(correct or incorrect) and the examinee's score on the overall test. For an item that is highly discriminating,
in general the examinees who responded to the item correctly also did well on the test, while in general the
examinees who responded to the item incorrectly also tended to do poorly on the overall test.
The possible range of the discrimination index is -1.0 to 1.0; however, if an item has a discrimination
below 0.0, it suggests a problem. When an item is discriminating negatively, overall the most
knowledgeable examinees are getting the item wrong and the least knowledgeable examinees are getting
the item right. A negative discrimination index may indicate that the item is measuring something other
than what the rest of the test is measuring. More often, it is a sign that the item has been miskeyed.
When interpreting the value of a discrimination it is important to be aware that there is a relationship
between an item's difficulty index and its discrimination index. If an item has a very high (or very low)
p-value, the potential value of the discrimination index will be much less than if the item has a mid-range
p-value. In other words, if an item is either very easy or very hard, it is not likely to be very discriminating. A
typical CRT, with many high item p-values, may have most item discriminations in the range of 0.0 to 0.3.
A useful approach when reviewing a set of item discrimination indexes is to also view each item's p-value
at the same time. For example, if a given item has a discrimination index below .1, but the item's p-value is
greater than .9, you may interpret the item as being easy for almost the entire set of examinees, and
probably for that reason not providing much discrimination between high ability and low ability examinees.
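As a rough illustration of the two statistics described above, the sketch below computes an item's p-value and its point-biserial correlation with the total score. The function names and the eight examinees' data are hypothetical, and returning 0.0 when there is no variation is a convention chosen here, not part of the source.

```python
from statistics import mean, pstdev

def item_difficulty(item_scores):
    """Proportion of examinees who answered the item correctly (the p-value)."""
    return sum(item_scores) / len(item_scores)

def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a 0/1 item and the total test score."""
    p = item_difficulty(item_scores)
    sd = pstdev(total_scores)
    if p in (0, 1) or sd == 0:
        # Assumption: with no variation the correlation is undefined; report 0.0.
        return 0.0
    correct = [t for i, t in zip(item_scores, total_scores) if i == 1]
    incorrect = [t for i, t in zip(item_scores, total_scores) if i == 0]
    return (mean(correct) - mean(incorrect)) / sd * (p * (1 - p)) ** 0.5

# Hypothetical data: 1 = correct, 0 = incorrect, plus each examinee's total score.
item = [1, 1, 1, 0, 1, 0, 0, 1]
totals = [50, 50, 40, 30, 45, 20, 25, 40]

print(item_difficulty(item))                   # 0.625
print(round(point_biserial(item, totals), 2))  # 0.91

# Scoring the same item against the wrong key flips the sign of the index.
miskeyed = [1 - score for score in item]
print(round(point_biserial(miskeyed, totals), 2))  # -0.91
```

The last two lines mimic a miskeyed item: the strongest examinees now appear to get it wrong, so the discrimination turns negative.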
Distractor Analysis
One important element in the quality of a multiple choice item is the quality of the item's distractors.
However, neither the item difficulty nor the item discrimination index considers the performance of the
incorrect response options, or distractors. A distractor analysis addresses the performance of these incorrect
response options.
Just as the key, or correct response option, must be definitively correct, the distractors must be clearly
incorrect (or clearly not the "best" option). In addition to being clearly incorrect, the distractors must also
be plausible. That is, the distractors should seem likely or reasonable to an examinee who is not sufficiently
knowledgeable in the content area. If a distractor appears so unlikely that almost no examinee will select it,
it is not contributing to the performance of the item. In fact, the presence of one or more implausible
distractors in a multiple choice item can make the item artificially far easier than it ought to be.
In a simple approach to distractor analysis, the proportion of examinees who selected each of the response
options is examined. For the key, this proportion is equivalent to the item p-value, or difficulty. If the
proportions are summed across all of an item's response options they will add up to 1.0, or 100% of the
examinees' selections.
The proportion of examinees who select each of the distractors can be very informative. For example, it can
reveal an item miskey. Whenever the proportion of examinees who selected a distractor is greater than the
proportion of examinees who selected the key, the item should be examined to determine if it has been
miskeyed or double-keyed. A distractor analysis can also reveal an implausible distractor. In CRTs, where the
item p-values are typically high, the proportions of examinees selecting all the distractors are, as a result,
low. Nevertheless, if examinees consistently fail to select a given distractor, this may be evidence that the
distractor is implausible or simply too easy.
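The simple approach described above can be sketched as a tally of how often each option was chosen. The function name, the option labels, and the ten responses are hypothetical; the two flags only mirror the checks named in the text (a distractor more popular than the key, and a distractor nobody selects).

```python
from collections import Counter

def distractor_analysis(responses, key, options="ABCD"):
    """Return per-option proportions and two lists of options worth reviewing."""
    counts = Counter(responses)
    n = len(responses)
    proportions = {option: counts[option] / n for option in options}
    distractors = [option for option in options if option != key]
    # A distractor chosen more often than the key hints at a miskeyed item.
    more_popular_than_key = [d for d in distractors if proportions[d] > proportions[key]]
    # A distractor nobody chose may be implausible.
    never_chosen = [d for d in distractors if counts[d] == 0]
    return proportions, more_popular_than_key, never_chosen

# Hypothetical responses from ten examinees to one item keyed "A".
responses = list("AAAAAABBCC")
proportions, suspects, unused = distractor_analysis(responses, key="A")

print(proportions)  # {'A': 0.6, 'B': 0.2, 'C': 0.2, 'D': 0.0}
print(suspects)     # []
print(unused)       # ['D']
```

Here the proportion for the key (0.6) is the item's p-value, and option D would be a candidate for rewriting.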
Item Review
Once the item analysis data are available, it is useful to hold a meeting of test developers,
psychometricians, and subject matter experts. During this meeting the items can be reviewed using the
information provided by the item analysis statistics. Decisions can then be made about item changes that
are needed or even items that ought to be dropped from the exam. Any item that has been substantially
changed should be returned to the bank for pretesting before it is again used operationally. Once these
decisions have been made, the exams should be rescored, leaving out any items that were dropped and
using the correct key for any items that were found to have been miskeyed. This corrected scoring will be
used for the examinees' score reports.
Summary
In the item analysis phase of test development, statistical methods are used to identify potential item
problems. The statistical results should be used along with substantive attention to the item content to
determine if a problem exists and what should be done to correct it. Items that are functioning very poorly
should usually be removed from consideration and the exams rescored before the test results are released.
In other cases, items may still be usable, after modest changes are made to improve their performance on
future exams.
Mean
The arithmetic mean or average of a set of numbers is the expected
value. The mean is calculated by adding up all the values, and then
dividing that sum by the number of values.
For example, suppose a teacher has seven students and records the
following seven test scores for her class: 98, 96, 96, 84, 81, 81, and 73.
The average test score is
(98+96+96+84+81+81+73)/7 = 609/7 = 87.
If one more student entered her class and took the test, the expected
score would be an 87.
Median
The median is the middle value in a set of values. To find the median,
order the numbers from largest to smallest, and then choose the value
in the middle. For example, consider the following set of nine numbers:
10, 13, 4, 25, 8, 12, 9, 19, 18
If we arrange them in descending order, we get
25, 19, 18, 13, 12, 10, 9, 8, 4
The middle value is 12, so the median = 12. What if we have a set with
an even number of values? For example, consider the set
1, 2, 3, 4, 5, 6.
Both 3 and 4 are in the middle. In this case, we must take the average
of the two middle numbers. Since (3+4)/2 = 3.5, the median = 3.5.
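The procedure above can be sketched in a few lines and checked against Python's standard library. The helper name median_by_hand is made up for this illustration; it sorts in ascending order, which yields the same middle value as the descending order used in the text.

```python
from statistics import median

def median_by_hand(values):
    """Sort the values, then take the middle one (or average the two middle ones)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_set = [10, 13, 4, 25, 8, 12, 9, 19, 18]
even_set = [1, 2, 3, 4, 5, 6]

print(median_by_hand(odd_set), median(odd_set))    # 12 12
print(median_by_hand(even_set), median(even_set))  # 3.5 3.5
```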
Mode
The mode of a set is the value or values that occur most frequently.
There can be more than one mode in a set. If there is more than one
mode, you simply list all of the modes; you do not have to average
them. For example, consider the set
10, 10, 4, 8, 10, 8, 3, 9, 14
The number 10 occurs three times, and no other numbers occur as
frequently. Therefore, the mode = 10.
Now consider this set
10, 10, 4, 8, 10, 8, 3, 8, 14
Both 10 and 8 occur three times each, and no other numbers occur as
often. Therefore, the modes are 8 and 10.
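One way to find every mode at once, assuming Python 3.8 or later, is the standard library's statistics.multimode, which returns all of the most frequent values. The variable names below are illustrative.

```python
from statistics import multimode

one_mode = [10, 10, 4, 8, 10, 8, 3, 9, 14]
two_modes = [10, 10, 4, 8, 10, 8, 3, 8, 14]

# multimode lists every value tied for the highest frequency; sort for readability.
print(sorted(multimode(one_mode)))   # [10]
print(sorted(multimode(two_modes)))  # [8, 10]
```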
Range
The range of a set of numbers is the maximum distance between any
two values. In other words, it's the difference between the largest and
smallest values. Knowing the range gives you an idea of how close
together the data points are. For example, consider the set of test
scores
78, 88, 67, 90, 92, 83, 97
The highest test score is 97 and the lowest is 67, therefore the range is
97-67 = 30.
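The same calculation as a short sketch; the function name value_range is chosen here only to avoid shadowing Python's built-in range.

```python
def value_range(values):
    """Difference between the largest and smallest values."""
    return max(values) - min(values)

scores = [78, 88, 67, 90, 92, 83, 97]
print(value_range(scores))  # 30
```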
Standard Deviation
The standard deviation is another way to measure how close together
the elements are in a set of data. The s.d. is, roughly, the typical distance
between each data point and the mean (more precisely, the square root
of the average squared distance). Knowing the standard
deviation gives a more complete picture of the distribution of elements
in a data set. Suppose you have N data points and you label them X_1,
X_2, X_3, ..., X_N, and you call the mean $\bar{X}$. There are two formulas for
standard deviation depending on whether your data is a complete set,
or a sample taken from a larger set.
For example, suppose your data is all of the ACT scores of the students
in a small class. Then the standard deviation formula is
$$\text{s.d.} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2}$$
Suppose the scores are 15, 21, 21, 21, 25, 30, and 35. The mean of
this set is 24. The s.d. is
sqrt[((15-24)^2+(21-24)^2+(21-24)^2+(21-24)^2+(25-24)^2+(30-24)^2+(35-24)^2)/7]
= sqrt[266/7]
= sqrt[38]
= 6.16
If you take a random sample of ACT scores from a large school, the
standard deviation formula is
$$\text{s.d.} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2}$$
For example, suppose you select ten students at random from a high
school, and their ACT scores are 17, 20, 24, 25, 26, 26, 29, 29, 30 and
32. The average of this set is 25.8. The standard deviation is
sqrt[((17-25.8)^2+(20-25.8)^2+(24-25.8)^2+(25-25.8)^2+(26-25.8)^2
+(26-25.8)^2+(29-25.8)^2+(29-25.8)^2+(30-25.8)^2+(32-25.8)^2)/(10-1)]
= sqrt[(191.6)/(10-1)]
= sqrt[191.6/9]
= sqrt[21.2889]
= 4.61
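Both worked examples can be checked with Python's statistics module, where pstdev divides by N (a complete set) and stdev divides by N - 1 (a sample). The variable names are illustrative.

```python
from statistics import pstdev, stdev

class_scores = [15, 21, 21, 21, 25, 30, 35]               # the whole class
sample_scores = [17, 20, 24, 25, 26, 26, 29, 29, 30, 32]  # a random sample

# Complete set: divide the sum of squared deviations by N.
print(round(pstdev(class_scores), 2))  # 6.16

# Sample from a larger set: divide by N - 1.
print(round(stdev(sample_scores), 2))  # 4.61
```

Applying the wrong function to a sample understates the spread slightly, because dividing by N gives a smaller result than dividing by N - 1.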
Mean, Mode, Median, and Standard Deviation
The Mean and Mode
The sample mean is the average and is computed as the sum of all the observed outcomes from the sample
divided by the total number of events. We use $\bar{x}$ as the symbol for the sample mean. In math terms,
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
where n is the sample size and the $x_i$ correspond to the observed values.
Example
Suppose you randomly sampled six acres in the Desolation Wilderness for a non-indigenous weed and
came up with the following counts of this weed in this region:
34, 43, 81, 106, 106 and 115
We compute the sample mean by adding and dividing by the number of samples, 6.
(34 + 43 + 81 + 106 + 106 + 115) / 6 = 80.83
We can say that the sample mean of non-indigenous weed is 80.83.
The mode of a set of data is the number with the highest frequency. In the above example 106 is the mode,
since it occurs twice and the rest of the outcomes occur only once.
The population mean is the average of the entire population and is usually impossible to compute. We use
the Greek letter $\mu$ for the population mean.
Median and Trimmed Mean
One problem with using the mean is that it often does not depict the typical outcome. If there is one
outcome that is very far from the rest of the data, then the mean will be strongly affected by this outcome.
Such an outcome is called an outlier. An alternative measure is the median. The median is the middle
score. If we have an even number of events we take the average of the two middles. The median is better
for describing the typical value. It is often used for income and home prices.
Example
Suppose you randomly selected 10 house prices in the South Lake Tahoe area. You are interested in the
typical house price. In units of $100,000 the prices were
2.7, 2.9, 3.1, 3.4, 3.7, 4.1, 4.3, 4.7, 4.7, 40.8
If we computed the mean, we would say that the average house price is $744,000. Although this number is
true, it does not reflect the price for available housing in South Lake Tahoe. A closer look at the data
shows that the house valued at 40.8 x $100,000 = $4.08 million skews the data. Instead, we use the
median. Since there is an even number of outcomes, we take the average of the middle two
(3.7 + 4.1) / 2 = 3.9
The median house price is $390,000. This better reflects what house shoppers should expect to spend.
There is an alternative value that also is resistant to outliers. This is called the trimmed mean, which is the
mean after getting rid of the outliers or 5% on the top and 5% on the bottom. We can also use the trimmed
mean if we are concerned with outliers skewing the data; however, the median is used more often since
more people understand it.
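A minimal sketch of a trimmed mean, applied to the house prices above. The function name and the choice to round the number of dropped values down are assumptions. With only ten prices, trimming 5% from each end would remove less than one value, so the illustration trims 10% (one value) from each end instead.

```python
from statistics import mean, median

def trimmed_mean(values, proportion):
    """Mean after dropping the given proportion of values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # values to drop per end, rounded down
    if 2 * k >= len(ordered):
        raise ValueError("proportion is too large for this data set")
    return mean(ordered[k:len(ordered) - k])

# House prices in units of $100,000.
prices = [2.7, 2.9, 3.1, 3.4, 3.7, 4.1, 4.3, 4.7, 4.7, 40.8]

print(round(mean(prices), 2))                # 7.44
print(round(median(prices), 2))              # 3.9
print(round(trimmed_mean(prices, 0.10), 2))  # 3.86
```

Dropping the single extreme price brings the trimmed mean close to the median, which is the resistance to outliers the text describes.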
Example:
At a ski rental shop data was collected on the number of rentals on each of ten consecutive Saturdays:
44, 50, 38, 96, 42, 47, 40, 39, 46, 50.
To find the sample mean, add them and divide by 10:
(44 + 50 + 38 + 96 + 42 + 47 + 40 + 39 + 46 + 50) / 10 = 49.2
Notice that the mean value is not a value of the sample.
To find the median, first sort the data:
38, 39, 40, 42, 44, 46, 47, 50, 50, 96
Notice that there are two middle numbers, 44 and 46. To find the median we take the average of the two.
Median = (44 + 46) / 2 = 45
Notice also that the mean is larger than all but three of the data points. The mean is influenced by outliers
while the median is robust.
Variance, Standard Deviation and Coefficient of Variation
The mean, mode, median, and trimmed mean do a nice job in telling where the center of the data set is, but
often we are interested in more. For example, a pharmaceutical engineer develops a new drug that
regulates sugar in the blood. Suppose she finds out that the average sugar content after taking the
medication is the optimal level. This does not mean that the drug is effective. There is a possibility that
half of the patients have dangerously low sugar content while the other half have dangerously high content.
Instead of the drug being an effective regulator, it is a deadly poison. What the engineer needs is a
measure of how far the data is spread apart. This is what the variance and standard deviation do. First we
show the formulas for these measurements. Then we will go through the steps on how to use the formulas.
We define the variance to be
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$
and the standard deviation to be
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$
Variance and Standard Deviation: Step by Step
1. Calculate the mean, $\bar{x}$.
2. Write a table that subtracts the mean from each observed value.
3. Square each of the differences.
4. Add this column.
5. Divide by n - 1, where n is the number of items in the sample. This is the variance.
6. To get the standard deviation we take the square root of the variance.
Example
The owner of the Ches Tahoe restaurant is interested in how much people spend at the restaurant. He
examines 10 randomly selected receipts for parties of four and writes down the following data.
44, 50, 38, 96, 42, 47, 40, 39, 46, 50
He calculated the mean by adding and dividing by 10 to get
$\bar{x}$ = 49.2
Below is the table for getting the standard deviation:
x | x - 49.2 | (x - 49.2)^2
44 | -5.2 | 27.04
50 | 0.8 | 0.64
38 | -11.2 | 125.44
96 | 46.8 | 2190.24
42 | -7.2 | 51.84
47 | -2.2 | 4.84
40 | -9.2 | 84.64
39 | -10.2 | 104.04
46 | -3.2 | 10.24
50 | 0.8 | 0.64
Total | | 2599.6
Now
2599.6 / (10 - 1) = 288.8
Hence the variance is approximately 289 and the standard deviation is approximately the square root of 289,
which is 17.
Since the standard deviation can be thought of as measuring how far the data values lie from the mean, we
take the mean and move one standard deviation in either direction. The mean for this example was about
49.2 and the standard deviation was 17. We have:
49.2 - 17 = 32.2
and
49.2 + 17 = 66.2
What this means is that most of the patrons probably spend between $32.20 and $66.20.
The sample standard deviation will be denoted by s and the population standard deviation will be denoted
by the Greek letter $\sigma$.
The sample variance will be denoted by $s^2$ and the population variance will be denoted by $\sigma^2$.
The variance and standard deviation describe how spread out the data is. If the data all lies close to the
mean, then the standard deviation will be small, while if the data is spread out over a large range of values,
s will be large. Having outliers will increase the standard deviation.
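The six steps can be followed literally in code for the restaurant receipts. This is only a sketch with illustrative variable names; the final comparison against the standard library's variance function is a sanity check, not part of the original procedure.

```python
from statistics import variance

receipts = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]
n = len(receipts)

mean_bill = sum(receipts) / n                     # step 1: the mean
deviations = [x - mean_bill for x in receipts]    # step 2: subtract the mean
squared = [d ** 2 for d in deviations]            # step 3: square each difference
total = sum(squared)                              # step 4: add the column
sample_variance = total / (n - 1)                 # step 5: divide by n - 1
sample_sd = sample_variance ** 0.5                # step 6: take the square root

print(round(sample_variance))  # 289
print(round(sample_sd))        # 17

# Sanity check against the standard library's sample variance.
print(abs(sample_variance - variance(receipts)) < 1e-9)  # True
```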
One of the flaws involved with the standard deviation is that it depends on the units that are used. One
way of handling this difficulty is called the coefficient of variation, which is the standard deviation divided
by the mean times 100%
$$CV = \frac{s}{\bar{x}} \cdot 100\%$$
In the above example, it is
(17 / 49.2) x 100% = 34.6%
This tells us that the standard deviation of the restaurant bills is 34.6% of the mean.
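A short sketch of the coefficient of variation, using the sample standard deviation. The function name is illustrative. Note that the unrounded standard deviation (about 16.995) gives roughly 34.5%, while rounding s to 17 first, as in the text, gives 34.6%.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

receipts_dollars = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]
receipts_cents = [x * 100 for x in receipts_dollars]

print(round(coefficient_of_variation(receipts_dollars), 1))  # 34.5

# Changing the units rescales the mean and the standard deviation equally,
# so the coefficient of variation stays the same.
print(round(coefficient_of_variation(receipts_cents), 1))    # 34.5
```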
Chebyshev's Theorem
A mathematician named Chebyshev came up with bounds on how much of the data must lie close to the
mean. In particular, for any positive k, the proportion of the data that lies within k standard deviations of
the mean is at least
$$1 - \frac{1}{k^2}$$
For example, if k = 2 this number is
1 - 1/2^2 = .75
This tells us that at least 75% of the data lies within 2 standard deviations of the mean. In the above
example, we can say that at least 75% of the diners spent between
49.2 - 2(17) = 15.2
and
49.2 + 2(17) = 83.2
dollars.
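The bound can be compared with what actually happens in the restaurant data. This is a sketch with hypothetical function names, and it uses the sample standard deviation as the text does; the theorem itself is a statement about a distribution's own mean and standard deviation.

```python
from statistics import mean, stdev

def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 1 - 1 / k ** 2

def fraction_within(values, k):
    """Observed proportion of values within k standard deviations of the mean."""
    center, spread = mean(values), stdev(values)
    inside = [x for x in values if abs(x - center) <= k * spread]
    return len(inside) / len(values)

receipts = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]

print(chebyshev_bound(2))            # 0.75
print(fraction_within(receipts, 2))  # 0.9
```

Nine of the ten bills fall within two standard deviations, comfortably above the guaranteed 75%. The bound is only informative for k greater than 1, since for smaller k it is zero or negative.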
Skewed Data
Data can be "skewed", meaning it tends to have a long tail on one side or the other:
[Figure: three distributions labeled Negative Skew, No Skew, and Positive Skew]
Negative Skew
Why is it called negative skew? Because the long "tail" is on the negative side of the peak.
People sometimes say it is "skewed to the left" (the long tail is on the left hand side).
The mean is also on the left of the peak.
The Normal Distribution has No Skew
A Normal Distribution is not skewed.
It is perfectly symmetrical.
And the Mean is exactly at the peak.
Positive Skew
And positive skew is when the long tail is on the positive side of the peak, and some people say it is
"skewed to the right".
The mean is on the right of the peak value.
Example: Income Distribution
Here is some data I extracted from a recent Census.
As you can see it is positively skewed... in fact the tail continues way past $100,000.
Calculating Skewness
"Skewness" (the amount of skew) can be calculated; for example, you could use the SKEW() function in
Excel or OpenOffice Calc.
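For readers without a spreadsheet, the sketch below computes one common sample skewness, the adjusted Fisher-Pearson coefficient n / ((n - 1)(n - 2)) times the sum of cubed standardized deviations. To our understanding this is the formula Excel documents for SKEW(), but treat that match as an assumption; the function name and data sets are illustrative.

```python
from statistics import mean, stdev

def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (needs at least 3 values)."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three values are required")
    center, spread = mean(values), stdev(values)
    cubed = sum(((x - center) / spread) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

symmetric = [1, 2, 3, 4, 5]
long_right_tail = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]
long_left_tail = [-x for x in long_right_tail]

print(abs(sample_skewness(symmetric)) < 1e-12)  # True (no skew)
print(sample_skewness(long_right_tail) > 0)     # True (positive skew)
print(sample_skewness(long_left_tail) < 0)      # True (negative skew)
```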
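Normal Distribution
The normal distribution is a probability distribution that associates the normal random variable X with a
cumulative probability. The normal distribution is defined by the following equation:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
where $\mu$ is the mean and $\sigma$ is the standard deviation.
[Figure: two normal curves side by side]
The curve on the left is shorter and wider than the curve on the right, because the curve on the left has a
bigger standard deviation.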
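The effect of the standard deviation on the curve's height can be seen by evaluating the normal density at its peak. The sketch below writes out the standard density formula with mean mu and standard deviation sigma and compares it with statistics.NormalDist from the standard library (Python 3.8 or later); the particular values of mu and sigma are made up for illustration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

# Two curves with the same mean but different standard deviations.
wide_peak = normal_pdf(0, mu=0, sigma=2)
narrow_peak = normal_pdf(0, mu=0, sigma=1)

print(round(wide_peak, 4))    # 0.1995
print(round(narrow_peak, 4))  # 0.3989

# The hand-written formula agrees with the standard library.
print(abs(narrow_peak - NormalDist(0, 1).pdf(0)) < 1e-12)  # True
```

Doubling the standard deviation halves the peak height, which is why the curve with the bigger standard deviation looks shorter and wider.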
The Rorschach test (also known as the Rorschach inkblot test, the Rorschach technique, or simply the
inkblot test) is a psychological test in which subjects' perceptions of inkblots are recorded and then
analyzed using psychological interpretation, complex algorithms, or both. Some psychologists use this test
to examine a person's personality characteristics and emotional functioning. It has been employed to detect
underlying thought disorder, especially in cases where patients are reluctant to describe their thinking
processes openly.[4] The test is named after its creator, Swiss psychologist Hermann Rorschach.
In the 1960s, the Rorschach was the most widely used projective test.[5] In a national survey in the U.S., the
Rorschach was ranked eighth among psychological tests used in outpatient mental health facilities.[6] It is
the second most widely used test by members of the Society for Personality Assessment, and it is requested
by psychiatrists in 25% of forensic assessment cases,[6] usually in a battery of tests that often include the
MMPI-2 and the MCMI-III.[7] In surveys, the use of Rorschach ranges from a low of 20% by correctional
psychologists[8] to a high of 80% by clinical psychologists engaged in assessment services, and 80% of
psychology graduate programs surveyed teach it.[9]
Although the Exner Scoring System (developed since the 1960s) claims to have addressed and often refuted
many criticisms of the original testing system with an extensive body of research,[10] some researchers
continue to raise questions. The areas of dispute include the objectivity of testers, inter-rater reliability, the
verifiability and general validity of the test, bias of the test's pathology scales towards greater numbers of
responses, the limited number of psychological conditions which it accurately diagnoses, the inability to
replicate the test's norms, its use in court-ordered evaluations, and the proliferation of the ten inkblot
images, potentially invalidating the test for those who have been exposed to them.[11]
Existence of God
There are several main positions with regard to the existence of God that one might take:
Hinduism
Hinduism, a polytheistic religion and perhaps the oldest of the great
world religions, dates back about 6,000 years. Hinduism comprises so
many different beliefs and rituals that some sociologists have
suggested thinking of it as a grouping of interrelated religions.
Hinduism teaches the concept of reincarnation: the belief that all living organisms continue eternally in
cycles of birth, death, and rebirth. Similarly, Hinduism teaches the caste system, in which a person's
previous incarnations determine that person's hierarchical position in this life. Each caste comes with its
own set of responsibilities and duties, and how well a person executes these tasks in the current life
determines that person's position in the next incarnation.
Hindus acknowledge the existence of both male and female gods, but they believe that the ultimate divine
energy exists beyond these descriptions and categories. The divine soul is present and active in all living
things.
More than 600 million Hindus practice the religion worldwide, though most reside in India. Unlike
Moslems and Christians, Hindus do not usually proselytize (attempt to convert others to their religion).