1 Objectives and expectations

Introduction: Some common misconceptions about language testing and resulting problems

The primary purpose of this book is to enable the reader to become competent in the design, development, and use of language tests. Over the years we have worked with a wide range of individuals: language teachers who want to be able to use tests as part of their classroom teaching, applied linguists interested in developing tests for use as research instruments, people who are involved in large-scale language testing programs, and graduate students in fields such as applied linguistics, English as a second/foreign language, bilingual education, and foreign language education. In virtually every group we have worked with, we have found misconceptions about the development and use of language tests, and unrealistic expectations about what language tests can do and what they should be like, that have prevented people from becoming competent in language testing. Furthermore, there is often a belief that language testers have some almost magical procedures and formulae for creating the 'best' test. These misconceptions and unrealistic expectations, and the mystique associated with language testing, constitute strong affective barriers to many people who want and need to be able to use language tests in their professional work. Breaking down this affective barrier by dispelling misconceptions, helping readers develop a sense of what can reasonably be expected of language tests, and demystifying language testing is thus an important part of this book.

Perhaps the best way to illustrate this is with an example from our own experience with language testing. We first started working together in language testing about 25 years ago, when we were in similar situations in which we needed to develop language tests for a particular purpose.
We were both involved in developing tests for use in placing students into an appropriate level or group in English as a foreign language (EFL) courses in tertiary-level institutions in Thailand, where, at that time, English was not the medium of instruction for education, and was not widely used in the society at large. Neither of us had had any formal training in either language testing or psychometrics, and we had come to the task with different backgrounds, one in theoretical linguistics and the other in English language and literature. On the other hand, we had both had experience in teaching [...] and considerable understanding of what was then known, in terms of theory and research, of second/foreign language [...]. We shared a common concern: to develop the 'best' test for our situations. We believed that there was a model language test and a set of straightforward procedures (a recipe, if you will) that we could follow to create a test that would be the best one for our purposes and situations.

What we did, essentially, was to model our tests on large-scale EFL tests that were widely used at that time, which included sections testing English grammar, vocabulary, reading, and listening comprehension. Following this model, we employed test development procedures that had been developed for psychological and educational tests to produce, rather mechanically, tests that both we and our colleagues believed were 'state-of-the-art' EFL tests, and hence the 'best' for our needs. We had started with the 'best' models and had used sophisticated statistical techniques in test development, so that our tests were definitely state-of-the-art at that time, but now, in retrospect, we wonder whether they were the best for those situations. Indeed, we wonder if there is a single 'best' test for any language testing situation.

In developing those tests, we believed that if we followed the model of a test that was widely recognized and used, it would automatically be useful for our particular needs.
These tests had been designed and developed by the experts in the field, who were assumed to know more than we did.

There were, however, several questions we did not ask. Were our situations different enough from the ones for which these large-scale tests were developed to make them inappropriate? Were our test takers like the ones who took those large-scale tests, or would the results of our tests be used to make the same kinds of decisions? We did not even ask whether the abilities tested in those tests were the ones we needed to test.

Given what was known (and not known) about the nature of language use, of language learning, and of language testing at that time, these were questions that simply never occurred. Language ability was viewed as a set of finite components (grammar, vocabulary, pronunciation, spelling) that were realized as four skills: listening, speaking, reading, and writing. If we taught or tested these, we were teaching or testing everything that was needed. Language learners were viewed as organisms who all learned language by essentially the same processes, stimulus and response, as described by behaviorist psychology. Finally, it was assumed that the processes involved in language learning were more or less the same for all learners, for all situations, and for all purposes. It is not surprising, then, that we believed that a single model would provide the best test for our particular test takers, for our particular uses, and for the areas of language ability that were of interest in our particular situation.

As it turned out, the two groups of test takers for whom we developed essentially the same kind of language test were quite different. One group consisted of first-year students entering a university in which very little of their academic course work would involve the use of English. Most of them would be required to take at least one English course as part of their degree requirements.
Though all of the students had had some exposure to English in their secondary school education, most had very little control of the language, and almost none of them had had any exposure to English outside of the EFL classroom. Few had ever spoken English with a native speaker or had any opportunity to use English for any non-instructional purpose. The other group consisted of university teachers, many of whom were quite senior, from many different universities, and representing a wide range of academic disciplines, who had been selected as recipients of scholarships to continue work on advanced degrees in countries where English is the medium of instruction. They were much more highly specialized in their knowledge of their disciplines than were the first-year university students, were considerably older, on average, and were more experienced.

The programs into which these test takers would be placed by means of the tests were also quite different. The program into which the university students would be placed consisted of four levels of non-intensive (five hours per week) English instruction during their first and second years of university work. The program focused primarily on enabling the students to read academic reference works written in English. Students were placed in courses at one of the four levels by general ability level and not according to their area of academic specialization. Most of the English classes were taught by teachers who had learned English as a foreign language, and much of the classroom instruction was carried out in the students' native language. The university teachers, on the other hand, would be placed in a ten-week intensive (40 hours per week) course at a national English language institute where they would be required to speak nothing but English between the hours of about eight until five every working day.
They would take classes in all four skills, but would be divided into groups according to broad classifications of their academic disciplines, such as agriculture, engineering and sciences, medical sciences, and economics. Unlike the university English program, the teachers in this program were all native speakers of English, and all classroom instruction was carried out in English. This program was thus much more intensive than that of the university students; the curriculum was focused on English for specific purposes and involved a great deal more actual use of English.

This example illustrates the most common misconception that we find among those who ask for advice about their specific testing needs. In our experience, many people believe, as we did, that there is an ideal of what a 'good' language test is, and want to know how to create tests on this ideal model for their own testing needs. Our answer is that there is no such thing as a 'good' or 'bad' test in the abstract, and that there is no such thing as the one 'best' test, even for a specific situation. To understand why this is so, we must consider some of the problems that result from this misconception.

If we assume that a single 'best' test exists, and we attempt either to use this test itself, or to use it as a model for developing a test of our own, we are likely to end up with a test that will be inappropriate for at least some of our test takers. In the example above, the test developed for the university students might have been appropriate for this group, in terms of the areas of language ability measured (grammar, vocabulary, and reading comprehension), and topical content, since this was quite general and not specific to any particular discipline.
The test developed for the university teachers, however, was probably not particularly appropriate for this group, since it did not include material related to the teachers' different disciplines or to the areas of ESP that were covered in the intensive course. This test was also of limited appropriateness because it did not include an assessment of students' ability to perform listening and speaking tasks, which was heavily emphasized in the intensive program.

Because of these limitations, the test for the teachers did not meet all of the needs of the test users (the director of and teachers in the intensive program). Specifically, teachers in the intensive course reported that students who were placed into levels on the basis of the test were quite homogeneous in terms of their reading, but that there were considerable differences among students within a given level in terms of their listening and speaking. These differences made it quite difficult for teachers to find and use listening and speaking activities that were appropriate for a given group. Teachers felt that the test should be able to accurately predict students' placement into the listening and speaking classes, as well as into the reading classes, and urged the test developer to remedy this situation.

In an attempt to address this problem, a dictation was added to the test as a way of assessing the students' ability to perform listening tasks. In this task the test takers listened to a passage presented with a tape recorder, and were required to write down exactly what they heard. This particular task was added largely because it had been used previously, and was considered to be a 'good' way to test listening. At the same time, the director of the intensive program agreed to group students homogeneously into listening and speaking groups on the basis of their scores on the dictation. This seemed to work well as a program modification, and teachers felt that it facilitated both their teaching and their students' learning.
It is not clear whether it was the dictation test or the program change (grouping students homogeneously into listening and speaking classes) that solved the problem with the listening and speaking classes. What is clear, however, is that adding the dictation test created another problem. Most of the listening tasks in the intensive course were interactive, conversational tasks [...], unlike the dictation task, which involved no interaction and required only written responses. Thus, although the students' scores on the dictation did perhaps provide some general information about the students' ability to listen to and understand spoken English, the task itself was quite different from the kinds of listening tasks the students would be engaged in in the intensive course, and both the teachers and the test takers frequently complained that the test was [...] and not relevant to what the students actually did in the intensive course.

The teachers were frustrated because even though the test seemed to be providing useful information for placing students into groups, the tasks that were included in the test bore little, if any, resemblance to their work. The teachers expected the test to provide completely accurate information for placement purposes and to include tasks that were very similar to those they incorporated into their teaching. The test developer was frustrated because he felt that he had done everything he could to make this the best possible test. He had used [...] types of tests that were widely used by language testers and had used standard test development procedures that were used and [...] by measurement specialists.

To summarize, Table 1 shows some of the misconceptions and resulting problems.

Table 1. Misconceptions and resulting problems (columns: Misconceptions | Resulting problems; the table entries are illegible in the source)
