Testing for language teachers
matters and for successful problem solving. In the chapters on validity
and reliability, simple statistical notions are presented in terms that it is
hoped everyone should be able to grasp. Appendix 1 deals in some
detail with the statistical analysis of test results. Even here, however, the
emphasis is on interpretation rather than on calculation. In fact, given
the computing power and statistics software that is readily available
these days, there is no real need for any calculation on the part of
language testers. They simply need to understand the output of the
computer programs which they (or others) use. Appendix 1 attempts to
develop this understanding and, just as important, show how valuable
statistical information can be in developing better tests.
Further reading
The collection of critical reviews of nearly 50 English language tests
(mostly British and American), edited by Alderson, Krahnke and
Stansfield (1987), reveals how well professional test writers are thought
to have solved their problems. A full understanding of the reviews will
depend to some degree on an assimilation of the content of Chapters 3,
4, and 5 of this book. Alderson and Buck (1993) and Alderson et al.
(1995) investigate the test development procedures of certain British
testing institutions.
1. ‘Abilities’ is not being used here in any technical sense. It refers simply to
what people can do in, or with, a language. It could, for example, include
the ability to converse fluently in a language, as well as the ability to recite
grammatical rules (if that is something which we are interested in measuring).
It does not, however, refer to language aptitude, the talent which
people have, in differing degrees, for learning languages. The measurement
of this talent, in order to predict how well or how quickly individuals will
learn a foreign language, is beyond the scope of this book. The interested
reader is referred to Pimsleur (1968), Carroll (1981), and Skehan (1986),
Sternberg (1995), MacWhinney (1995), Spolsky (1995), Mislevy (1995),
McLaughlin (1995).
Kinds of tests and testing
This chapter begins by considering the purposes for which language
testing is carried out. It goes on to make a number of distinctions:
between direct and indirect testing, between discrete point and integrative
testing, between norm-referenced and criterion-referenced testing,
and between objective and subjective testing. Finally there are notes on
computer adaptive testing and communicative language testing.
Tests can be categorised according to the types of information they
provide. This categorisation will prove useful both in deciding whether
an existing test is suitable for a particular purpose and in writing
appropriate new tests where these are necessary. The four types of test which
we will discuss in the following sections are: proficiency tests,
achievement tests, diagnostic tests, and placement tests.
Proficiency tests
Proficiency tests are designed to measure people's ability in a language,
regardless of any training they may have had in that language. The
content of a proficiency test, therefore, is not based on the content or
objectives of language courses that people taking the test may have
followed. Rather, it is based on a specification of what candidates have
to be able to do in the language in order to be considered proficient.
This raises the question of what we mean by the word ‘proficient’.
In the case of some proficiency tests, ‘proficient’ means having
sufficient command of the language for a particular purpose. An example of
this would be a test designed to discover whether someone can function
successfully as a United Nations translator. Another example would be
a test used to determine whether a student's English is good enough to
follow a course of study at a British university. Such a test may even
attempt to take into account the level and kind of English needed to
follow courses in particular subject areas. It might, for example, have
one form of the test for arts subjects, another for sciences, and so on.
Whatever the particular purpose to which the language is to be put, this
will be reflected in the specification of test content at an early stage of a
test's development.
There are other proficiency tests which, by contrast, do not have any
occupation or course of study in mind. For them the concept of proficiency
is more general. British examples of these would be the Cambridge First
Certificate in English examination (FCE) and the Cambridge Certificate
of Proficiency in English examination (CPE). The function of such tests
is to show whether candidates have reached a certain standard with
respect to a set of specified abilities. The examining bodies responsible
for such tests are independent of teaching institutions and so can be
relied on by potential employers, etc. to make fair comparisons between
candidates from different institutions and different countries. Though
there is no particular purpose in mind for the language, these general
proficiency tests should have detailed specifications saying just what it
is that successful candidates have demonstrated that they can do. Each
test should be seen to be based directly on these specifications. All users
of a test (teachers, students, employers, etc.) can then judge whether the
test is suitable for them, and can interpret test results. It is not enough
to have some vague notion of proficiency, however prestigious the
testing body concerned. The Cambridge examinations referred to above
are linked to levels in the ALTE (Association of Language Testers in
Europe) framework, which draws heavily on the work of the Council of
Europe (see Further Reading).
Despite differences between them of content and level of difficulty, all
proficiency tests have in common the fact that they are not based on
courses that candidates may have previously taken. On the other hand,
as we saw in Chapter 1, such tests may themselves exercise considerable
influence over the method and content of language courses. Their
backwash effect - for this is what it is - may be beneficial or harmful. In my
view, the effect of some widely used proficiency tests is more harmful
than beneficial. However, the teachers of students who take such tests,
and whose work suffers from a harmful backwash effect, may be able
to exercise more influence over the testing organisations concerned than
they realise. The supplementing of TOEFL with a writing test, referred
to in Chapter 1, is a case in point.
Achievement tests
Most teachers are unlikely to be responsible for proficiency tests. It is
much more probable that they will be involved in the preparation and
use of achievement tests. In contrast to proficiency tests, achievement
tests are directly related to language courses, their purpose being to
establish how successful individual students, groups of students, or the
courses themselves have been in achieving objectives. They are of two
kinds: final achievement tests and progress achievement tests.
Final achievement tests are those administered at the end of a course
of study. They may be written and administered by ministries of
education, official examining boards, or by members of teaching institutions.
Clearly the content of these tests must be related to the courses with
which they are concerned, but the nature of this relationship is a matter
of disagreement amongst language testers.
In the view of some testers, the content of a final achievement test
should be based directly on a detailed course syllabus or on the books
and other materials used. This has been referred to as the
syllabus-content approach. It has an obvious appeal, since the test only contains
what it is thought that the students have actually encountered, and thus
can be considered, in this respect at least, a fair test. The disadvantage
is that if the syllabus is badly designed, or the books and other materials
are badly chosen, the results of a test can be very misleading.
Successful performance on the test may not truly indicate successful
achievement of course objectives. For example, a course may have as an
objective the development of conversational ability, but the course itself
and the test may require students only to utter carefully prepared
statements about their home town, the weather, or whatever. Another
course may aim to develop a reading ability in German, but the test
may limit itself to the vocabulary the students are known to have met.
Yet another course is intended to prepare students for university study
in English, but the syllabus (and so the course and the test) may not
include listening (with note taking) to English delivered in lecture style
on topics of the kind that the students will have to deal with at
university. In each of these examples - all of them based on actual cases - test
results will fail to show what students have achieved in terms of course
objectives.
The alternative approach is to base the test content directly on the
objectives of the course. This has a number of advantages. First, it
compels course designers to be explicit about objectives. Secondly, it