Mohd Saidfudin Masodi
February 2010

Table of Content

1.0 Rasch Course Overview
2.0 Program Outline
    Rasch Analysis 2-Days Workshop Outline
3.0 Keynote:
    3.1 Modern Measurement Paradigm: Easier to read and better analysis using Rasch-based Approach
    3.2 Developing an Instrument Construct Made Simple
4. Workshop: WinSteps - A Practical Guide
    - Setting up Control File
    - Basic Rasch Analysis
        - Person-Item Distribution Map construct and interpretation
        - Partial Credit Rating Analysis: Item Validity
        - Person Response Validity Analysis
    - Computation of Probabilistic Likelihood Estimate
5. Appendices
    5.1 Sample Size and Item Calibration [or Person Measure] Stability
    5.2 What do Infit and Outfit, Mean-square and Standardized mean?
    5.3 Rating Scale Instrument Quality Criteria
    5.4 Example of Person Item Map re-distribution

Rasch Course Overview

The field of Measurement, Test and Evaluation is a very important sector in education. Despite its complexity, it needs to be given serious thought and consideration to ensure that the program outcomes are achieved. The new requirement to measure learning outcomes in Outcome Based Education (OBE) has put pressure on academicians, especially in engineering education, who have a great need to develop valid measures, e.g., of the quantity and quality of educational services and of the outcomes of the teaching and learning services rendered. Many academicians and researchers are frustrated when existing instruments are not well tailored to the task, since they then cannot expect sensitive, accurate, or valid findings.

This workshop presents the theory and the traditional approach used in qualitative research. It then provides an overview of "modern" measurement as practiced using Item Response Theory, with a focus on the Rasch measurement model. Rasch analysis provides the social sciences with the kind of measurement that characterizes measurement in the natural sciences.
Since item response theory focuses on the items and the persons rather than the test score, quantitative analysis is synthesised with qualitative issues in a way that is rare in social science. Ultimately, Rasch measurement can facilitate more efficient, reliable, and valid assessment while improving convenience to users. The demonstration is useful for anyone who wants to understand the role of modern measurement in education and related measurement research.

Keynote Speaker: Modern measurement paradigm: The Rasch Approach
Mohd. Saidfudin Masodi, IRCA (Lon) ISO 9001 QMS Lead Assessor
Program Outline

Attendees will learn:
• The Fundamentals of Measurement
• Introduction to the Rasch Measurement Model
• Why and how Rasch seeks to create linear, interval measures
• How to run Winsteps software: Rasch Analyses
• Interpretation of Rasch/Winsteps/Bond&Fox output

Agenda:

Day 1. Theory and Application

Please bring your own laptop computer, running Windows. Preferably participants bring their own data set (or we will provide one).

Morning
8:30am. Registration
9:00. PowerPoint presentation on the history of and introduction to the Rasch Model.
11:00. Break
11:15-12:30. Discussion of an application of Rasch Analysis in measurement, with interpretation of Rasch/Winsteps output.
12:30-2:15. Lunch

Afternoon
2:15. Guidance through a Rasch Analysis starting with an Excel file and using Winsteps software; more interpretation of output. Practice using Winsteps/Ministep software to analyze the "Readiness to Change" data that is included with the software.
3:45. Break
4:00-5:30. Participants will be given an Excel data file and will write their own control files and run them on the sample data, with hands-on help from facilitators.

Day 2. Hands-on Analysis, Practice, Discussion

Morning
8:30am. Preliminary briefing. The facilitators will work with each participant individually to write a Winsteps control file for their data.
10:00-12:30. When everyone has successfully run Winsteps on their data, participants will present their Winsteps results, and an instructor will help them to interpret Winsteps output. There will be lots of opportunity for questions and discussion.
12:30-2:15. Lunch & Dzuhur

Afternoon
2:15-5:30. In the afternoon we will continue student presentations, Q&A, and discussions. By the end of the day, all participants should have had the opportunity to present their findings and discuss them with the instructors and others.
At the end of the day, all participants should have a working familiarity with Rasch Measurement and Analysis concepts, running the Winsteps program, and interpreting Winsteps output.
5:30pm. End of Course.

3.1 Rasch Model: The Modern Measurement Paradigm

Dr. Azrilah Abdul Aziz
Mohd Saidfudin Masodi

For many decades, we assured ourselves that it is almost impossible to have quantitative data in the social sciences, and we deluded ourselves into settling for descriptive research only. Over the years, social scientists became aware that there should be more than descriptive findings, more than the typical reporting of medians and percentages. There should be a way of presenting findings more meaningfully. It would be more interesting to see clearly the inherent relationship between the human and the observable actions being assessed. There should be more than reporting association and correlation values; rather, the analysis should give a clearer picture of what is happening between the human and the observable actions, a sort of hierarchical relationship between the two. The Rasch measurement model has made it possible for social scientists to conduct calibrated measurement in which the human is the focus of attention.

In our day-to-day lives, we rely on standard measurement systems to measure and cut timber, buy lengths of cloth, assemble the correct amounts of ingredients to bake a cake, and administer appropriate doses of medicine to ailing relatives. Similarly, we can conduct educational research, treating and analysing data from surveys or psychological investigations in the same manner as we do with standard measurement systems.

The Rasch measurement model is a way to make sense of the world. Experience is continuous, but once we notice the experience, it becomes discrete. We sense happiness when receiving flowers during convocation.
When we distinguish between the different kinds of happiness (not very happy, happy, happier, and happiest), that moment of observation becomes discrete. Then we choose a dimension according to its utility for the happy sensations; for example a flower, a bouquet of flowers, a bouquet of flowers with chocolate, and a bouquet of flowers with chocolate and a little teddy bear. To make our observations more meaningful, we represent them with scores in the form of:

Not very happy, happy, happier, happiest

which we score as observations:

x = 1, 2, 3, 4

It is common to take the raw score of each sensation to indicate a sort of measurement for the dimension. However, raw scores are only indications of a possible measure. Raw scores cannot be the measures sought because, in their raw state, they have little inferential value. To develop metric meaning, the counts must be incorporated into a stochastic process which constructs inferential stability.

A survey item, for example, is answered by 0, 1, 2, or 3, each corresponding to a phrase ranging from "strongly disagree" to "strongly agree". Responses from the survey are ordinal variables; with them we can determine the median and percentile rank, and the relationship between two characteristics by means of Spearman's rank-order correlation. That is as far as we can go with ordinal data. It is by no means a measurement. The separation between the assigned rating values 0, 1, 2, or 3 is not of equal interval and therefore does not give a scale, as has been assumed all the while. Calling such a rating a "scale" has obscured the truth of measurement: it does not possess the characteristics of what is deemed to be a scale by the standard definition in physical science. Ignorance of the standard protocol has led us into a situation which warrants a review of measurement as it is normally perceived in human science.
Similarly, the practice of using raw counts may give the impression that they are interval measures of experience, but this is always an illusion. From our observations on the sensation of happiness, assigning the number labels 1, 2, 3, 4 to the options "not very happy", "happy", "happier", "happiest" does not make these numerals equally spaced measures. If the category labels are not equally spaced, then we cannot legitimately process them as interval data, which includes computing the mean and standard deviation.

Apart from that, missing data is a common issue in survey data. Missing data may result from oversight, non-compliance or incidental interference. Since the purpose of conducting research is to use existing information to make inferences about what is still unknown, missing data are of importance too. A useful measurement model for constructing inference from observation must be unaffected by missing data. The measurement model must also enable estimation of the precision of the inference, and provide detection and evaluation of discrepancies between observation and expectation. Hence raw counts cannot be relied upon to serve as measures.

Accuracy of a measure may be achieved through replication. When similar results occur repeatedly, we are confident that similar results will occur again in future. However, replication does not guarantee accuracy of measurement unless the testing instrument is operated for a specific purpose or target and, more importantly, each measurement is independent of the others. Irrespective of the person, a thermometer reading is always accepted independently. On the same premise, a measurement instrument thus developed must function equally for everyone. A survey or an examination on a subject matter should rightfully not be affected by the respondent who is taking it.
Thus, in order to construct inferences from observation, the measurement model must (a) produce linear measures, (b) overcome missing data, (c) give estimates of precision, (d) detect misfits or outliers, and (e) keep the parameters of the object being measured and of the measurement instrument separable, or independent of each other. Only Rasch measurement models solve these problems.

The Rasch measurement model, or Rasch model for short, helps in constructing a scale based on a set of survey items. Let's start by discussing how the Rasch measurement model works using simple outcomes. Back to our example of the happy sensation of receiving a bouquet of flowers on convocation day, assume a guy goes around giving a bouquet of flowers to a successful lady graduate. On the first attempt, there is a 50:50 chance that the lady graduate will like the gift. If the guy were to give a bouquet of flowers to 10 other lady graduates, he may receive six (6) likes against four (4) dislikes. This can be expressed as odds of 60:40. We may get other outcomes, perhaps of the order of 70:30 or, at the opposite end, 20:80. Now we can already develop a scale by transforming the frequency of responses into a probabilistic proportion, hence transforming ordinal into ratio data. Ratio data enable further investigation because they can be divided and multiplied. Interestingly, ratio data can express a response as a '1' or a '0', where '0' reflects the total absence of the quantity being measured.

This order of probabilistic turn of events can be represented in a line diagram as in Figure 1.
FIGURE 1. Probabilistic line diagram

Using a scientific calculator, the probabilistic odds can be converted into a series of numerical values. However, we notice that these numerical values are heavily clustered towards the leftmost end of the ruler when we attempt to position them on an equal interval scale. Equal spacing of the values is a prerequisite, being one of the criteria of a measurement scale. Clearly, converting the odds ratio into a numerical value does not yield a scale of equal interval. Therefore, it cannot behave as a ruler of measurement. This is depicted in Figure 2.

FIGURE 2. Numerical scale

In order to achieve an equal interval scale, we can introduce the logarithm of the odds value. Keeping the same odds ruler as in Figure 1, running from 0.01 to 100, we can create an equal interval separation between the log odds units on the line, hence a measurement ruler with the logit unit. This can be verified by computing that log 0.01 equals -2.0, log 0.1 equals -1, and log 1 equals 0. Figure 3 shows the newly established logit ruler as a scale with equal interval separation. It is just like looking at a thermometer with '0' as water turning to ice and 100 as its boiling point, whilst the negative extreme end is -273 °C, the point where all atoms of any element come to a standstill.

FIGURE 3. Log odds unit ruler

Now that we know how the Rasch measurement model helps in constructing a scale based on a set of survey items, we can continue our discussion of how Rasch establishes the hierarchical relationship between the human and the observable actions or items. This scale can now be the basis for describing human responses to any given event, as the thermometer does. In Rasch, we are interested in describing the turn of events: when a person measures 36.8 °C, he is said to be healthy.
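The log-odds ruler described above can be checked with a few lines of Python. This is only a sketch: the helper name `log_odds` is hypothetical, and it uses the base-10 logarithm because that is what the worked values in the text (log 0.01 = -2, log 0.1 = -1, log 1 = 0) imply. The Rasch logit is conventionally defined with the natural logarithm, which rescales the ruler but keeps the intervals equal.

```python
import math

# Odds marks on the ruler of Figure 1, from 0.01 to 100.
odds_ruler = [0.01, 0.1, 1, 10, 100]

# On the raw odds scale the gaps between neighbouring marks are very unequal.
odds_gaps = [round(b - a, 2) for a, b in zip(odds_ruler, odds_ruler[1:])]

# Taking log10 of each mark gives the log-odds ruler of Figure 3.
log_ruler = [round(math.log10(o), 6) for o in odds_ruler]
log_gaps = [round(b - a, 6) for a, b in zip(log_ruler, log_ruler[1:])]


def log_odds(likes, dislikes):
    """Base-10 log of the odds likes:dislikes (hypothetical helper)."""
    return math.log10(likes / dislikes)


print("gaps on the odds scale:    ", odds_gaps)
print("marks on the log-odds scale:", log_ruler)
print("gaps on the log-odds scale: ", log_gaps)
print("log-odds of a 60:40 split:  ", round(log_odds(60, 40), 3))
```

The unequal gaps on the odds scale become a constant gap of one unit after the log transformation, which is the property the logit ruler relies on.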
Likewise, a logit describes the person's measured attribute or ability. In Rasch, there is no need to describe the data. What is required is to test whether the data allow for measurement on a linear interval scale.

Let us take some responses arising from a survey, tabulated ready for analysis. Responses are keyed in by respondent, with each respondent's responses to all the items in the survey. For convenience of discussion, let's use "0" to indicate disagreement with an item and "1" to indicate agreement. For respondents 1 to N and items 1 to L, the survey response outcomes are depicted in Figure 4.

                          Items
Respondent   1  2  .  .  .  .  .  .  .  L   Raw score
    1        1  1  0  0  1  1  1  0  0  0       5
    .        1  0  1  0  1  0  1  0  0  1       5
    .        0  0  1  1  1  0  1  1  1  1       7
    .        0  0  0  1  1  1  0  0  0  1       4
    .        1  1  0  0  1  1  0  0  1  1       6
    N        1  0  1  0  1  0  1  0  1  0       5
FIGURE 4. Observation of responses from survey

While this raw data matrix is all the observation we have, it is of limited utility. Even though it contains everything we could observe, it does not help to predict what will happen in future. The raw score does not allow us to make any useful inference or draw any conclusion about the items or about the interactions between the items and the persons. It only gives an order of preference. The raw score is therefore ordinal. It lacks the depth to give any meaningful interpretation of the data obtained.

For example, if the survey is meant to differentiate the ability of persons, we cannot conclude which respondents are more able simply by looking at the raw score. If we were to accept the respondent with the highest raw score as more able than those with lower raw scores, it would still be difficult to distinguish respondents with the same raw score. We do not have any basis to differentiate them. We must build a useful expectation of whether the respondent will agree or succeed on the next item. To know about the ability of the respondent to agree, we need to know the degree of difficulty of endorsing the item which the respondent has to attempt.

The data matrix can be arranged so that the items are ordered from least to most difficult and the persons are ordered from least to most able, as in Figure 5. The higher up the table one goes, the more able the persons. The further right across the table one goes, the more difficult the items. This type of arrangement is termed a Scalogram or Guttman matrix.

FIGURE 5. Scalogram of responses from survey

From Figure 5, we can deduce that persons who are more able, i.e., those located towards the top of the table, have a greater likelihood of responding '1' to all the items. On the other hand, a less able person has a greater likelihood of responding '1' only to easy items and may find it difficult to respond to difficult items.
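The rearrangement into a scalogram can be sketched in Python using the 0/1 responses of Figure 4. This is an illustrative sketch, not the Winsteps procedure: the function name `scalogram` is hypothetical, persons are ranked by raw score and items by the number of endorsements they receive, and ties are left in their original order (an assumption, since the text does not say how ties are broken).

```python
# Responses from Figure 4: one row per respondent, one column per item.
responses = [
    [1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 1, 0, 0, 1],
    [0, 0, 1, 1, 1, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
]


def scalogram(matrix):
    """Reorder rows and columns into a Guttman-style scalogram.

    Returns (person_order, item_order, table) where the orders are
    zero-based indices into the original matrix.
    """
    person_scores = [sum(row) for row in matrix]
    item_scores = [sum(col) for col in zip(*matrix)]

    # Most able person first, so ability increases towards the top.
    person_order = sorted(range(len(matrix)), key=lambda p: -person_scores[p])
    # Most endorsed (easiest) item first, so difficulty increases to the right.
    item_order = sorted(range(len(item_scores)), key=lambda i: -item_scores[i])

    table = [[matrix[p][i] for i in item_order] for p in person_order]
    return person_order, item_order, table


person_order, item_order, table = scalogram(responses)
for p, row in zip(person_order, table):
    print(f"respondent {p + 1}: {row}  raw score = {sum(row)}")
```

Sorting changes only the layout, not the data: every respondent keeps the same raw score, but unexpected responses now stand out against the general top-left to bottom-right pattern.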
This will generate a pattern of observable responses. In general, we can expect the scalogram to have mostly "1" responses above the diagonal line and "0" responses below it. This general rule creates a pattern of responses which is easy to read and clear to understand, ready to be scrutinised and explored in depth.

This kind of response pattern is best described mathematically by a simple logistic curve, by virtue of its point of origin '0' and its maximum limit of '1'. This probabilistic curve is the foundation of Rasch analysis, in which the data are expected to fit the model. By contrast, traditional research is deterministic in nature: the best fit line is established from a set of historical data in an attempt to describe a past event. Clearly, Rasch has the distinct feature that it can predict an expected event. This is exhibited in Figure 6.

From this estimation too, we can infer that person A made a careless mistake by not responding "1" as expected on the easy item in Figure 6. Likewise, person D made a lucky guess by responding "1" on the difficult item when we expected a "0" instead. Now there is enough reason for us to conclude that person C is more able than person D even though they have similar raw scores. This is based on the number of difficult questions person C is able to answer, which is more than person D.

FIGURE 6. Pattern of responses

The rearrangement of data from least to most occurrences for both persons and items has yielded a predictive pattern of responses. It becomes easier to predict the response for any missing data, depicted as a blank (with a square box) in Figure 7. The prediction is based on the Maximum Likelihood Estimate (MLE) of an event, the observed over the expected event, as shown:
$$\text{Maximum Likelihood Estimate of an event} = \frac{\text{Observed of an event}}{\text{Expected of an event}} \tag{1}$$

From the scalogram, we can infer that person D is most likely less able than person C; thus person D is unlikely to answer the difficult items (the items towards the rightmost end). Therefore, by calculating the maximum likelihood estimate of the event, we can predict that person D would answer "0" at the missing box (third item from the right for person D). Similarly, person F is expected to respond "1" to the second item since, although person F is less able, the item is an easy one.

FIGURE 7. Prediction on response pattern

This pattern of responses gives the response validity of a person to an item. Subsequently, Rasch enables us to establish the construct validity of an item. An item is said to be valid when it is able to discriminate between a more able person and a less able person. Where a response departs from the expected pattern, two major possibilities arise.

The first possibility is that the person is a misfit for not meeting the expected outcome. Rasch focuses on this kind of misfitting response and attempts to find a reasoned argument for why the person does not fit the model. This contributes to the significant findings of a particular research. Contrary to Rasch, traditional statistical practice would normally have cleaned out this kind of misfitting data, when it is of utmost value in Rasch.

The second possibility is more critical, because the item construct is at stake. If the responses cannot discriminate between the able respondents and the less able ones, then there is a need to re-construct the question or possibly discard the item. At the extreme, we may need to re-construct the whole survey questionnaire or examination paper, as it is not measuring what we are supposed to measure. This is reflected in Rasch analysis as item reliability, hence instrument construct validity.
The fundamental difference between quantitative research as practised in the physical sciences and so-called qualitative research is thereby addressed. Consequently, the rating assumed at the beginning of the survey construct, or the test result grades, can be verified as to whether they show the expected pattern of response. This form of calibration is unique to Rasch, which can thereby comply with the physical science requirement to calibrate the scale of measurement. An instrument which is not calibrated is deemed to yield invalid data, rendering the whole research futile.

Response pattern prediction to overcome missing data is an essential feature in making a more accurate statistical analysis. This is one of the Rasch principles that allows a more accurate analysis to be carried out. While other models treat missing data as zero, the predictive power of Rasch supplies the maximum likelihood of an event for that particular matrix. By so doing, it treats the whole data set 'as though' it were complete, whereas other statistical methods would be short of data and therefore less accurate in computing basic statistics such as chi-square, let alone the mean, the standard deviation and the z-test thereof.

Now, let's move on to how Rasch enables the construction of the intangible ruler. The responses from the survey are still, in theory, ordinal or a continuum in nature. If these data were put on a scatter plot they would yield a sigmoidal curve (S-curve), which does not have the equal intervals that are a prerequisite of measurement, as shown in Figure 8.

FIGURE 8. The sigmoidal curve of responses

Without equal intervals, prediction of an event is almost impossible. To overcome the issue, a linear regression approach is applied to establish a straight line which fits the points as well as possible.

FIGURE 9. Linear regression on the responses

The line is then used to make the required predictions by interpolation or extrapolation as necessary, as shown in Figure 10. In obtaining the best fit line, however, there exist differences between the actual point y_i and the predicted point ŷ_i on the best fit line. The difference is referred to here as the error e.

FIGURE 10. Best fit line

Once we accept that there are always errors involved in the prediction model, the deterministic model of the equation is seen to be less reliable. This can be resolved by transforming it into a probabilistic model that includes the prediction error in the equation:

$$y = mx + c + e \tag{2}$$

The formulation of the probabilistic model of Rasch is based on this principle:

a person having a greater ability than another person should have the greater probability of solving any item of the type in question, and similarly, one item being more difficult than another means that for any person the probability of solving the second item is the greater one. (Rasch, 1960)

In summary, the probability of success depends on the difference between the ability of the person and the difficulty of the item. The Rasch Model incorporates an algorithm that expresses the probabilistic expectations of the performances of an item 'i' and a person 'n', mathematically expressed as:

$$P_{ni}(x_{ni} = 1 \mid \beta_n, \delta_i) = \frac{e^{\beta_n - \delta_i}}{1 + e^{\beta_n - \delta_i}} \tag{3}$$

where $P_{ni}(x_{ni} = 1 \mid \beta_n, \delta_i)$ is the probability of person n scoring a correct response (x = 1) on item i, given the person ability $\beta_n$ and the item difficulty $\delta_i$.

This can be further simplified by taking the logarithm of the odds of success, which reduces the whole expression to:

$$\ln\!\left(\frac{P_{ni}}{1 - P_{ni}}\right) = \beta_n - \delta_i \tag{4}$$

In words, the log-odds of success of an event is simply the ability of the person minus the difficulty of the item.
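Equations (3) and (4) can be checked numerically with a short Python sketch. The function names `rasch_probability` and `logit` are hypothetical, chosen for illustration only; abilities and difficulties are in logits, using the natural logarithm.

```python
import math


def rasch_probability(ability, difficulty):
    """Equation (3): probability that a person succeeds on an item.

    Both arguments are in logits. Written as 1 / (1 + e^-(b - d)),
    which is algebraically the same as e^(b - d) / (1 + e^(b - d)).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def logit(p):
    """Equation (4): natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


# A person whose ability equals the item difficulty has a 50:50 chance.
print(rasch_probability(0.0, 0.0))

# A person at 2.01 logits attempting an item at 1.63 logits
# (the first person and first item of the worked table in section 4.9).
p = rasch_probability(2.01, 1.63)
print(round(p, 2))

# Taking the log-odds of that probability recovers ability - difficulty.
print(round(logit(p), 2))
```

The round trip from measures to probability and back to log-odds shows why the difference between ability and difficulty, rather than either value alone, determines the expected response.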
Log-odds of success of an event = Ability of a person - Difficulty of an item

Summary

The Rasch measurement model helps us to understand a little of how we came to fall so short of our reasonable expectations for scientific measurement in the human sciences. The Rasch measurement model provides the closest general approximation of the measurement principle for the human sciences. It satisfies the five (5) principles of a measurement model: it provides a linear, equal-interval scale; overcomes missing data; estimates precision; detects misfits (that is, provides reliability); and is replicable. Thus, by complying with all the principles, more meaningful and accurate inferences can be made from the data. This is the core issue in measurement: meaningfulness.

There are three major aspects of meaningfulness to take into account in measurement. These have to do with the constancy of the unit, interpreting the size of differences in measures, and evaluating the coherence of the units and differences.

First, raw scores (counts of right answers or other events, sums of ratings, or rankings) do not stand for anything that adds up the way they do. Any given raw score unit can be 4-5 times larger than another, depending on where they fall in the range; hence they lack constant separation. Meaningful measurement demands a constant unit. Instrument scaling by Rasch methods provides it.

Second, meaningful measurement requires that we be able to say just what any quantitative amount of difference is supposed to represent. What does a difference between two measures stand for in terms of what is and isn't done at those two levels? Is the difference within the range of error, and so random? Is the difference many times more than the error, and so repeatedly reproducible and constant? Meaningful measurement demands that we be able to make reliable distinctions, and only the Rasch Model fulfils this requirement.

And finally, meaningful measurement demands that the items work together to measure the same thing.
If reliable distinctions can be made between measures, what is the one thing that all of the items tap into? If the data exhibit a consistency that is shared across items and across persons, what is the nature of that consistency? Meaningful measurement posits a model of what data must look like to be interpretable and coherent, and then it evaluates data in light of that model. Rasch has all these specific properties as a unique model of measurement.

4. WinSteps: A Practical Guide to Rasch Analysis

4.1 SETTING UP DATA (.prn FILE)
< NOTES >

4.2 SETTING UP DATA CONTROL FILES
< NOTES >

4.3 OUTPUT TABLES
< NOTES >

4.4 PERSON ITEM DISTRIBUTION MAP
< NOTES >

4.5 SUMMARY STATISTICS
< NOTES >

4.6 PARTIAL CREDIT / POLYTOMOUS RATINGS
< NOTES >

4.7 ITEM POLARITY
< NOTES >

4.8 ICC GRAPH
< NOTE >

4.9 CALCULATION OF PROBABILISTIC MAXIMUM LIKELIHOOD ESTIMATE

Person measures (logit):

Person    P2     P6     P4     P3     P5     P6     P1
Measure   2.01   1.54   1.4    1.15   0.4    0.4   -0.18

Probability of the expected response for each person, by item:

Item            Item measure   P(P2)  P(P6)  P(P4)  P(P3)  P(P5)  P(P6)  P(P1)   Avg item
                (logit)
Specialised         1.63       0.59   0.48   0.44   0.38   0.23   0.23   0.14    0.36
3E                  1.29       0.67   0.56   0.53   0.47   0.29   0.29   0.19    0.43
IT                  1.29       0.67   0.56   0.53   0.47   0.29   0.29   0.19    0.43
Research            0.96       0.74   0.64   0.61   0.55   0.36   0.36   0.24    0.50
Excellence          0.64       0.80   0.71   0.68   0.62   0.44   0.44   0.31    0.57
Filter              0.29       0.85   0.78   0.75   0.70   0.53   0.53   0.38    0.65
Consultative        0.29       0.85   0.78   0.75   0.70   0.53   0.53   0.38    0.65
Teamwrk             0.29       0.85   0.78   0.75   0.70   0.53   0.53   0.38    0.65
Netwrkng            0.29       0.85   0.78   0.75   0.70   0.53   0.53   0.38    0.65
Mentor              0.29       0.85   0.78   0.75   0.70   0.53   0.53   0.38    0.65
INA                -0.08       0.89   0.83   0.81   0.77   0.62   0.62   0.48    0.72
Mgmt               -0.08       0.89   0.83   0.81   0.77   0.62   0.62   0.48    0.72
Resilient          -0.08       0.89   0.83   0.81   0.77   0.62   0.62   0.48    0.72
Career             -0.08       0.89   0.83   0.81   0.77   0.62   0.62   0.48    0.72
Adaptability       -0.08       0.89   0.83   0.81   0.77   0.62   0.62   0.48    0.72
Leadershp          -0.08       0.89   0.83   0.81   0.77   0.62   0.62   0.48    0.72
Communication      -0.08       0.89   0.83   0.81   0.77   0.62   0.62   0.48    0.72
Info_use           -0.50       0.92   0.88   0.87   0.84   0.71   0.71   0.58    0.79
Product            -0.50       0.92   0.88   0.87   0.84   0.71   0.71   0.58    0.79
Improve            -0.50       0.92   0.88   0.87   0.84   0.71   0.71   0.58    0.79
Prioritized        -0.50       0.92   0.88   0.87   0.84   0.71   0.71   0.58    0.79
Opportunity        -1.04       0.95   0.93   0.92   0.90   0.81   0.81   0.70    0.86
Visionary          -1.85       0.98   0.97   0.96   0.95   0.90   0.90   0.84    0.93
Lifelong           -1.85       0.98   0.97   0.96   0.95   0.90   0.90   0.84    0.93
Content            -3.14       0.99   0.99   0.99   0.99   0.97   0.97   0.95    0.98

The probability of person n giving the expected response, x = 1, on item i, given person ability $\beta_n$ and item difficulty $\delta_i$, is expressed as:

$$P_{ni}(x_{ni} = 1 \mid \beta_n, \delta_i) = \frac{e^{\beta_n - \delta_i}}{1 + e^{\beta_n - \delta_i}}$$

4.10 PERSON MEASURE
< NOTE >

4.11 ITEM MEASURE
< NOTE >

INPUT: 25 Persons 20 Items MEASURED: 25 Persons 20 Items 2 CATS
---------------------------------------------------------------------------------
|ENTRY   TOTAL                 MODEL|   INFIT  |  OUTFIT  |PT-MEASURE |EXACT MATCH |
|NUMBER  SCORE  COUNT  MEASURE  S.E. |MNSQ  ZSTD|MNSQ  ZSTD|CORR.  EXP.|  OBS%  EXP%|
|------------------------------------+----------+----------+-----------+------------|
|    14     18     20     2.58   .78 |1.16    .5|5.94   2.6|A -.03  .26|  89.5  89.5|
|    19     12     20      .42   .52 |1.95   3.6|2.84   3.9|B -.19  .46|  36.8  70.9|
|    16     16     20     1.64   .61 |1.11    .4|2.72   2.0|C  .19  .35|  73.7  80.1|
|    13     13     20      .70   .53 |1.31   1.4|1.39   1.1|D  .25  .44|  63.2  71.1|
|    23     12     20      .42   .52 |1.13    .7|1.13    .5|E  .38  .46|  68.4  70.9|
|    22     10     20     -.11   .52 |1.09    .5|1.02    .2|F  .45  .50|  57.9  70.0|
|    24     11     20      .15   .52 |1.07    .4| .98    .0|G  .45  .48|  63.2  70.3|
|     2     10     20     -.11   .52 |1.07    .4|1.07    .3|H  .45  .50|  68.4  70.0|
|    17     13     20      .70   .53 |1.03    .2|1.07    .3|I  .41  .44|  73.7  71.1|
|     7     12     20      .42   .52 |1.00    .1| .87   -.3|J  .48  .46|  68.4  70.9|
|     1      9     20     -.38   .53 | .98    .0| .91   -.2|K  .53  .51|  63.2  70.5|
|    21      9     20     -.38   .53 | .96   -.1| .97    .0|L  .53  .51|  63.2  70.5|
|     3     13     20      .70   .53 | .91   -.3| .96    .0|M  .48  .44|  73.7  71.1|
|     6     13     20      .70   .53 | .95   -.2| .82   -.4|l  .49  .44|  73.7  71.1|
|     8     13     20      .70   .53 | .92   -.3| .81   -.4|k  .50  .44|  73.7  71.1|
|    11     10     20     -.11   .52 | .90   -.4| .81   -.6|j  .56  .50|  68.4  70.0|
|    12     10     20     -.11   .52 | .89   -.5| .83   -.5|i  .57  .50|  78.9  70.0|
|    20      8     20     -.66   .54 | .88   -.4| .84   -.4|h  .59  .52|  73.7  72.7|
|     9     15     20     1.30   .57 | .87   -.4| .67   -.5|g  .49  .39|  73.7  76.0|
|     4     15     20     1.30   .57 | .79   -.7| .58   -.7|f  .53  .39|  73.7  76.0|
|     5     14     20      .99   .55 | .77  -1.0| .68   -.7|e  .55  .42|  84.2  72.5|
|    15     15     20     1.30   .57 | .75  -1.0| .58   -.7|d  .55  .39|  84.2  76.0|
|    10      8     20     -.66   .54 | .74  -1.1| .73   -.7|c  .66  .52|  84.2  72.7|
|    25     15     20     1.30   .57 | .70  -1.2| .52   -.9|b  .58  .39|  84.2  76.0|
|    18     16     20     1.64   .61 | .66  -1.1| .45   -.9|a  .57  .35|  84.2  80.1|
|------------------------------------+----------+----------+-----------+------------|
| MEAN    12.4   20.0      .58   .55 | .98    .0|1.21    .1|           |  72.0  73.2|
| S.D.     2.7     .0      .79   .05 | .25   1.0|1.12   1.1|           |  10.8   4.5|
---------------------------------------------------------------------------------

5. Appendices

5.1 Sample Size and Item Calibration [or Person Measure] Stability
5.2 What do Infit and Outfit, Mean-square and Standardized mean?
    a. Rating Scale Instrument Quality Criteria
    b. Example of Person Item Map Re-construction

5.1 Sample Size and Item Calibration [or Person Measure] Stability

How big a sample is necessary to obtain usefully stable item calibrations? Or how long a test is necessary to obtain usefully stable person measure estimates? The Rasch model is blind to what is a person and what is an item, so the numbers are the same.

Each time we calibrate a set of items on different samples of similar examinees, we expect slightly different results. In principle, as the size of the samples increases, the differences become smaller. If each sample were only 2 or 3 examinees, results could be very unstable. If each sample were 2,000 or 3,000 examinees, results might be essentially identical, provided no other sources of error are in action. But large samples are expensive and time-consuming. What is the minimum sample to give useful item calibrations, that is, calibrations that we can expect to be similar enough to maintain a useful level of measurement stability?

Polytomies

The extra concern with polytomies is that you need at least 10 observations per category; see, for instance, Linacre J.M. (2002)
Understanding Rasch measurement: Optimizing rating scale category effectiveness. Journal of Applied Measurement 3:1, 85-106, or Linacre J.M. (1999) Investigating rating scale category utility. Journal of Outcome Measurement 3:2, 103-122. Otherwise the actual sample sizes could be smaller than with dichotomies because there is more information in each polytomous observation.

Person Measure Estimate Stability

The requirements are symmetric for the Rasch model, so you need as many items for a stable person measure as you need persons for a stable item measure. Consequently, 30 items administered to 30 persons (with reasonable targeting and fit) should produce statistically stable measures.

The first step is to clarify "similar enough." Just as no person has a height stable to within .01 or even .1 inches, no item has a difficulty stable to within .01 or even .1 logits. In fact, stability to within ±.3 logits is the best that can be expected for most variables. Lee (RMT 6:2 p.222-3) reports that in many applications one logit change corresponds to one grade level advance. So when an item calibration is stable within a logit, it will be targeted at a correct grade level.

For groups of items, Wright & Douglas (Best Test Design and Self-Tailored Testing, MESA Memo. 19, 1975) report that, when calibrations deviate in a random way from their optimal values, "as test length increases above 30 items, virtually no reasonable testing situation risks a measurement bias [for the examinees] large enough to notice." For even shorter tests, measures based on item calibration with random deviations up to 0.5 logits are "for all practical purposes free from bias."

Theoretically, the stability of an item calibration is its modelled standard error. For a sample of N examinees that is reasonably targeted at the items and that responds to the test as intended, average item p-values are in the range 0.5 to 0.87, so that modelled item standard errors are in the range 2/sqrt(N) < SE < 3/sqrt(N) (Wright & Stone, Best Test Design, p.136), i.e., 4/SE² < N < 9/SE². The lower end of the range applies when the sample is targeted on items with 40%-60% success rate, the higher end when the sample obtains success rates more extreme than 15% or 85% success. As a rule of thumb, at least 8 correct responses and 8 incorrect responses are needed for reasonable confidence that an item calibration is within 1 logit of a stable value.

What, then, is the sample size needed to have 99% confidence that no item calibration is more than 1 logit away from its stable value? A two-tailed 99% confidence interval is ±2.6 S.E. wide. For a ±1 logit interval, this S.E. is ±1/2.6 logits. This gives a minimum sample in the range 4*(2.6)² < N < 9*(2.6)², i.e., 27 < N < 61, depending on targeting. Thus, a sample of 50 well-targeted examinees is conservative for obtaining useful, stable estimates. 30 examinees is enough for well-designed pilot studies. The table below suggests other ranges.

Inflate these sample sizes by 10%-40% if there are major sources of unmodelled measurement disturbance, such as different testing conditions or alternative curricula. If much larger samples are conveniently available, divide them into smaller, homogeneous samples of males, females, young, old, etc. in order to check the stability of item calibrations in different measuring situations.

Item calibrations   Confidence   Minimum sample size range   Size for most
stable within                    (best to poor targeting)    purposes
± 1 logit           95%          16 - 36                     30
± 1 logit           99%          27 - 61                     50
± 1/2 logit         95%          64 - 144                    100
± 1/2 logit         99%          108 - 243                   150

John Michael Linacre

Explanatory notes:

1. "For a ±1 logit interval, this S.E. is ±1/2.6 logits."
An estimate's S.E. is the modelled standard deviation of the normal distribution of the observed estimate around its "true" value. Suppose we want to be 99% confident that the "true" item difficulty is within 1 logit of its reported estimate.
Then the estimate needs to have a standard error of 1.0 logits divided by 2.6, or less: 1/2.6 = 0.385 logits.

2. "This gives a minimum sample in the range 4*(2.6)² < N < 9*(2.6)²"
With optimum targeting of a dichotomous test, the modelled probability of each response is p = 0.5. Then the modelled binomial variance = 0.5*0.5 = the information in a response. Thus N perfectly targeted observations have information N * 0.5 * 0.5 = N/4. This means that the S.E. of an estimate produced by N perfectly targeted observations is S.E. = sqrt(4/N).

Similarly, for N extremely off-target observations (for a reasonable dichotomous test), p = 0.13 or p = 0.87. For these, the modelled binomial variance = 0.13*0.87 = the information in a response. N extremely off-target observations have information N * 0.13 * 0.87 = N/9 (approximately). This means that the S.E. of an estimate produced by N extremely off-target observations is S.E. = sqrt(9/N).

So, for N observations, the minimum S.E. is sqrt(4/N) and a reasonable maximum S.E. is sqrt(9/N). The minimum N needed to produce an S.E. of 0.385 logits or better therefore follows from sqrt(4/N) = 1/2.6 for the best case (lower limit) and sqrt(9/N) = 1/2.6 for the worst reasonable case (upper limit), i.e., 4*(2.6)² < N < 9*(2.6)² is the range of minimum values of N to produce the desired S.E. (or better).

Wright B & Panchapakesan N 1969. A procedure for sample-free item analysis. Educational & Psychological Measurement 29 1 23-48.
Wright B & Douglas G 1975. Best test design and self-tailored testing. MESA Memorandum No. 19. Department of Education, Univ. of Chicago.
Wright, B. D. & Douglas, G. A. Rasch item analysis by hand. Research Memorandum No. 21, Statistical Laboratory, Department of Education, University of Chicago, 1976.

Wright & Douglas (1976) "Rasch Item Analysis by Hand": "In other work we have found that when [test length] is greater than 20, random values of [item calibration] as high as 0.50 have negligible effects on measurement."

Wright & Douglas (1975) "Best Test Design and Self-Tailored Testing": "They allow the test designer to incur item discrepancies, that is item calibration errors, as large as 1.0. This may appear unnecessarily generous, since it permits use of an item of difficulty 2.0, say, when the design calls for 1.0, but it is offered as an upper limit because we found a large area of the test design domain to be exceptionally robust with respect to independent item discrepancies."

Wright & Stone (1979) "Best Test Design", p.98: "random uncertainty of less than .5 logits," referencing MESA Memo 19: Best Test and Self-Tailored Testing. Benjamin D. Wright & Graham A. Douglas, 1975. Also .3 logits in Solving Measurement Problems with the Rasch Model. Journal of Educational Measurement 14 (2) pp. 97-116, Summer 1977 (and MESA Memo 42).

Sample Size and Item Calibration Stability. Linacre JM. Rasch Measurement Transactions 1994 7:4 p.328

5.2 What do Infit and Outfit, Mean-square and Standardized mean?

These are all "fit" statistics. In a Rasch context they indicate how accurately or predictably data fit the model.

Infit means inlier-sensitive or information-weighted fit. This is more sensitive to the pattern of responses to items targeted on the person, and vice-versa. For example, infit reports overfit for Guttman patterns, underfit for alternative curricula or idiosyncratic clinical groups. These patterns can be hard to diagnose and remedy.

Outfit means outlier-sensitive fit. This is more sensitive to responses to items with difficulty far from a person, and vice-versa.
For example, outfit reports overfit for imputed responses, underfit for lucky guesses and careless mistakes. These are usually easy to diagnose and remedy.

Mean-square fit statistics show the size of the randomness, i.e., the amount of distortion of the measurement system. 1.0 is their expected value. Values less than 1.0 indicate observations are too predictable (redundancy, data overfit the model). Values greater than 1.0 indicate unpredictability (unmodeled noise, data underfit the model). Statistically, mean-squares are chi-square statistics divided by their degrees of freedom. Mean-squares are always positive.

Mean-square ranges encountered in practice have been reported at "Reasonable Mean-Square Fit Values".

In general, mean-squares near 1.0 indicate little distortion of the measurement system, regardless of the standardized value. Evaluate mean-squares high above 1.0 before mean-squares much below 1.0, because the average mean-square is usually forced to be near 1.0. Outfit problems are less of a threat to measurement than Infit ones, and are easier to manage. To evaluate the impact of any misfit, replace suspect responses with missing values and examine the resultant changes to the measures.

Standardized fit statistics (Zstd in some computer output) are t-tests of the hypothesis "Do the data fit the model (perfectly)?" These are reported as z-scores, i.e., unit normal deviates. They show the improbability of the data, i.e., its significance, if the data actually did fit the model. 0.0 is their expected value. Less than 0.0 indicates too predictable. More than 0.0 indicates lack of predictability. Standardized values can be positive or negative. For the relationship between mean-squares and standardized statistics see www.rasch.org/rmt/rmt171n.htm

Standardized fit statistics are usually obtained by converting the mean-square statistics to the normally-distributed z-standardized ones by means of the Wilson-Hilferty cube root transformation.
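The fit statistics described above can be illustrated for a single person on a dichotomous test. The Python sketch below is not taken from the handout or from Winsteps; it assumes the person measure and item difficulties are already known, uses the dichotomous Rasch probability, and applies the variance formulas for the mean-squares that are commonly given in the Rasch literature. The function names (`rasch_probability`, `wilson_hilferty`, `person_fit`) are hypothetical.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of success under the dichotomous Rasch model (logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def wilson_hilferty(mean_square, q):
    """Cube-root conversion of a mean-square to an approximate z-score.

    q is the modelled standard deviation of the mean-square (assumed > 0).
    """
    if q <= 0:
        raise ValueError("q must be positive")
    return (mean_square ** (1.0 / 3.0) - 1.0) * (3.0 / q) + q / 3.0


def person_fit(responses, theta, difficulties):
    """Infit/outfit mean-squares and standardized values for one person."""
    probs = [rasch_probability(theta, b) for b in difficulties]
    variances = [p * (1.0 - p) for p in probs]           # model variance W
    fourth = [w * (1.0 - 3.0 * w) for w in variances]    # 4th central moment C
    sq_resid = [(x - p) ** 2 for x, p in zip(responses, probs)]
    n = len(responses)
    total_var = sum(variances)

    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(sq_resid, variances)) / n
    # Infit: squared residuals weighted by their information (variance).
    infit = sum(sq_resid) / total_var

    # Modelled standard deviations of the two mean-squares.
    q_out = math.sqrt(max(
        sum(c / w ** 2 for c, w in zip(fourth, variances)) / n ** 2 - 1.0 / n,
        0.0))
    q_in = math.sqrt(max(
        sum(c - w ** 2 for c, w in zip(fourth, variances)), 0.0)) / total_var

    return {
        "infit_mnsq": infit,
        "infit_zstd": wilson_hilferty(infit, q_in),
        "outfit_mnsq": outfit,
        "outfit_zstd": wilson_hilferty(outfit, q_out),
    }


if __name__ == "__main__":
    items = [-2.0, -1.0, 0.0, 1.0, 2.0]
    # Guttman-like pattern: easy items right, hard items wrong (overfit).
    print(person_fit([1, 1, 1, 0, 0], 0.0, items))
    # Easiest item failed, hardest item passed (underfit, outlier-driven).
    print(person_fit([0, 1, 1, 0, 1], 0.0, items))
```

In this sketch the Guttman-like pattern yields mean-squares below 1.0, while the pattern with unexpected responses on the most off-target items pushes outfit above infit, which is the behaviour the text attributes to the two statistics. The standardization fails when every item is exactly on target, because the modelled standard deviation is then zero.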
Anchored runs: Anchor values may not exactly accord with the current data. To the extent that they don't, fit statistics can be misleading. Anchor values that are too central for the current data tend to make the data appear to fit too well. Anchor values that are too dispersed for the current data tend to make the data appear noisy.

Mean-square Value   Implication for Measurement
> 2.0               Distorts or degrades the measurement system. May be caused by only one or two observations.
1.5 - 2.0           Unproductive for construction of measurement, but not degrading.
0.5 - 1.5           Productive for measurement.
< 0.5               Less productive for measurement, but not degrading. May produce misleadingly high reliability and separation coefficients.

Standardized Value  Implication for Measurement
≥ 3                 Data very unexpected if they fit the model (perfectly), so they probably do not. But, with large sample size, substantive misfit may be small.
2.0 - 2.9           Data noticeably unpredictable.
-1.9 - 1.9          Data have reasonable predictability.
≤ -2                Data are too predictable. Other "dimensions" may be constraining the response patterns.

What do Infit and Outfit, Mean-square and Standardized mean? Linacre JM. Rasch Measurement Transactions 2002 16:2 p.878

5.3 Rating Scale Instrument Quality Criteria

Criterion                                    | Poor         | Fair      | Good     | Very Good  | Excellent
Targeting                                    | > 2 errors   | 1-2 errors| < 1 error| < .5 error | < .25 error
Item Model Fit Mean Square Range Extremes    | < .33 - > 3.0| .34 - 2.9 | .5 - 2.0 | .71 - 1.4  | .77 - 1.3
Person and Item Measurement Reliability      | < .67        | .67-.80   | .81-.90  | .91-.94    | > .94
Person and Item Strata Separated             | 2 or less    | 2-3       | 3-4      | 4-5        | > 5
Ceiling effect: % maximum extreme scores     | > 5%         | 2-5%      | 1-2%     | .5-1%      | < .5%
Floor effect: % minimum extreme scores       | > 5%         | 2-5%      | 1-2%     | .5-1%      | < .5%
Variance in data explained by measures       | < 50%        | 50-60%    | 60-70%   | 70-80%     | > 80%
Unexplained variance in contrasts 1-5 of PCA | > 15%        | 10-15%    | 5-10%    | 3-5%       | < 3%
of residuals                                 |              |           |          |            |

This Table has been developed by William P. Fisher, Jr. based on the Rasch literature and his many years of experience conducting Rasch analyses in different settings.

Rating Scale Instrument Quality Criteria. Fisher, W.P. Jr. Rasch Measurement Transactions, 2007, 21:1 p. 1095

[Person-item map, Winsteps TABLE 12.2, "Persons MAP OF Items": persons on the left from <more> to <less>, items on the right from <rare> to <frequent>, on a logit scale running from about +6 to -5. INPUT: 288 Persons, 40 Items; MEASURED: 288 Persons, 40 Items, 2 CATS.]

[Re-distributed person-item map, Winsteps TABLE 16.3, with the items regrouped under the column headings KNOWLEDGE, UNDERSTANDING and APPLICATION. INPUT: 288 Persons, 40 Items; MEASURED: 288 Persons, 40 Items, 2 CATS.]
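The sample-size ranges tabulated in the appendix on sample size and item calibration stability can be re-derived numerically. The sketch below is an illustration added for that purpose, not part of Linacre's note; it assumes the note's information bounds (N/4 for perfectly targeted and N/9 for extremely off-target dichotomous responses) and the rounded multipliers of 2.0 S.E. for 95% and 2.6 S.E. for 99% confidence that reproduce the tabulated values. The function name `sample_size_range` is hypothetical.

```python
def sample_size_range(half_width_logits, se_multiplier):
    """Minimum sample sizes (best targeting, worst reasonable targeting).

    half_width_logits: desired stability, e.g. 1.0 for "within 1 logit".
    se_multiplier: number of standard errors spanning the confidence
        interval half-width (assumed 2.0 for 95%, 2.6 for 99%).
    """
    target_se = half_width_logits / se_multiplier
    # S.E. = sqrt(4/N) with perfect targeting, so N = 4 / S.E.^2.
    best_case = 4.0 / target_se ** 2
    # S.E. = sqrt(9/N) with extreme off-targeting, so N = 9 / S.E.^2.
    worst_case = 9.0 / target_se ** 2
    return round(best_case), round(worst_case)


if __name__ == "__main__":
    for width, label, mult in [(1.0, "95%", 2.0), (1.0, "99%", 2.6),
                               (0.5, "95%", 2.0), (0.5, "99%", 2.6)]:
        low, high = sample_size_range(width, mult)
        print(f"within {width} logit, {label} confidence: N from {low} to {high}")
```

With these assumptions the four rows of the table are reproduced; using the more exact multipliers 1.96 and 2.576 would shift the bounds slightly.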
OUR CONSULTANTS:

MOHD SAIDFUDIN MASODI holds a degree in Architecture from Australia and is a Lead Assessor trained by Nigel Bauer Ltd., UK and RWTUV A.G.-International, Germany. He is also an AOTS, Japan recipient in EMS. With over 24 years of experience, he has managed a diverse range of projects on aspects of QA/QC for both private and public sectors, with specific interest in Institutions of Higher Learning. His papers on quality performance measurement based on Rasch Analysis have been accepted as proceedings in several conferences and published in refereed journals. His paper entitled "Clearer to read and easier to understand: Rasch Analysis in Learning Outcomes" won the Best Conference Paper award at ICEED KL. ... Planning and QA/QC for Quality Improvement. He is currently the Program Coordinator for the Exec. Dip. in Quality Management System at SPACE, Universiti Teknologi Malaysia. He was named SPACE UTM Best Visiting Lecturer in two consecutive years. For details, visit: http://www.linkedin.com/in/saidfudin

AZRILAH ABD AZIZ earned a Bachelor of Science in Computer Science and Mathematics degree from Kent State University, Kent, Ohio, US, and her masters in Management Information Systems from the International Islamic University Malaysia. Her Ph.D is on a competency measurement index of Information Professionals using the Rasch Model. Her papers on Rasch Measurement have been presented in international conferences and published in refereed journals, viz. WSEAS Transactions, IEEE Xplore and NAUN. She regularly conducts workshops and short courses, locally and abroad, in the areas of Performance Measurement, Strategic Planning based on Rasch Model Measurement, and Analysis using Winsteps and Bond&Fox. For further enquiries, she can be reached at azrilah@gmail.com