
LANGUAGE ASSESSMENT QUARTERLY, 2(1), 77–81

Copyright © 2005, Lawrence Erlbaum Associates, Inc.

Traditional Approach
but a Worthy Choice
Testing for Language Teachers (2nd ed.). Arthur Hughes. Cambridge, England:
Cambridge University Press, 2003, xii + 251 pp., $24.00 (softcover).
Anne Lazaraton
University of Minnesota

The instructor of an introductory course in language assessment is fortunate to
have a number of textbook choices available. As a first-time teacher of such a
course at my institution, I was pleased to see that Arthur Hughes's Testing for Language Teachers had recently appeared as a completely revised edition. I had
used the first edition (1989) quite successfully just after it was published; students
were generally pleased with its tidy layout, its simple and concise explanations,
and its end-of-chapter activities and further readings. The only significant problem
with the book was its British orientation to U.K.-based language tests (ARELS,
FCE) and to vocabulary. As a result, I had high hopes for using the newer edition
with graduate students at the University of Minnesota this past semester.
Testing for Language Teachers (TLT) claims to be "the most practical guide for
teachers who want to have a better understanding of the role of testing in language
teaching" (back cover) and strives "to help language teachers write better tests" (p.
xi). It is hard to argue with these claims: The book deals with the desirable qualities
of tests and the testing of component language skills in an accessible and concise
manner. Although Hughes points out that the book is based on the testing of EFL,
readers who design assessments in other languages should have no trouble whatsoever applying the concepts covered to the development of other foreign language
tests.

Requests for reprints should be sent to Anne Lazaraton, Institute of Linguistics, English as a Second
Language and Slavic Languages and Literatures, University of Minnesota, 214 Nolte Center, 315
Pillsbury Drive SE, Minneapolis, MN 55455, USA. E-mail: lazaratn@tc.umn.edu

The content of TLT is much the same as that of the first edition. The first chapter deals
with teaching and testing, a pep talk of sorts for readers, who will learn not only
how to write better tests themselves but to enlighten other people involved in
testing processes and to put pressure on professional testers and examining
boards, to improve their tests (p. 5; italics in original). The concept of backwash is
brought front and center from page one, and is perhaps the most prevalent conceptual thread throughout the book.
After an overview of the book (chapter 2), "Kinds of Tests and Testing" (chapter 3) covers types of tests (proficiency, achievement, diagnostic, placement) as
well as approaches to test construction (direct and indirect, discrete-point and integrative, norm-referenced and criterion-referenced, and objective and subjective
testing). Though all of these descriptions are easy to understand and helpful, a significant shortcoming of the book emerges at this point: What are its language
teacher-readers supposed to do with this information? The end-of-chapter reader
activities are only minimally helpful in this regard. In chapter 3, readers are instructed to "consider a number of language tests with which you are familiar" (p.
23) and to answer questions about their characteristics. Two problems arise here.
First, some language teachers may not be very familiar with many (or any)
language tests; second, Hughes has not modeled this activity with an example. Unfortunately, most of the reader activities at the end of subsequent chapters are
equally unsatisfactory.
Chapter 4 delves into the concept of construct validity, which Hughes defines as
"the general, overarching notion of validity" (p. 26), empirical evidence for which
can be found in the subordinate forms of content and criterion-related validity. Although the discussion of this topic is lucid and not overly theoretical, readers may be unsure of how to translate some of this information into good classroom
testing practice.
Reliability is the topic of chapter 5 and is explained with reference to small sets
of invented data. The clear explanations of reliability coefficients, the standard error of measurement, and scorer reliability are followed by suggestions for making
tests more reliable and a (too) brief mention of the relationship between reliability
and validity. Given that the tension between these two fundamental aspects of test
usefulness is apparent in the differences between two large-scale tests (TOEFL
and IELTS) about which language teachers likely have some knowledge, a more
in-depth discussion of this topic (as it relates to these tests and, more to the point,
to teacher-made classroom assessments) would be welcome.
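
As a rough illustration of the reliability statistics named in this chapter, the Python sketch below estimates a split-half reliability coefficient, steps it up with the Spearman-Brown formula, and derives the standard error of measurement from it. The scores and the function names are invented for this illustration; they are not taken from Hughes's book or from its data.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_brown(r_half):
    """Estimate full-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * (1 - reliability) ** 0.5


# Hypothetical scores of six test takers on the odd and even items of one test.
odd_half = [10, 12, 15, 18, 20, 22]
even_half = [11, 11, 16, 17, 21, 20]

r_half = pearson(odd_half, even_half)
reliability = spearman_brown(r_half)
totals = [a + b for a, b in zip(odd_half, even_half)]
sem = standard_error_of_measurement(pstdev(totals), reliability)

print(f"half-test correlation: {r_half:.3f}")
print(f"estimated reliability: {reliability:.3f}")
print(f"standard error of measurement: {sem:.2f}")
```

A smaller standard error of measurement means that an individual's observed score is likely to lie closer to his or her true score, which is why the coefficient and the SEM are usually read together.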
The concept of backwash is returned to in chapter 6, which consists of a number
of suggestions for achieving it. Although these suggestions are clearly explained,
readers may want to know how they can tell if their classroom tests do or do not actually promote beneficial backwash. Also, this chapter is about as far as Hughes
goes in discussing practicality and test impact; issues of the ethical aspects of assessment are only mentioned as topics for further reading here and in chapter 1.
Chapter 7 presents an overview of stages of test development. Most useful is the
guidance in stating the testing problem and writing test specifications. The questions posed about the former and the samples given for the latter are very helpful in
showing readers how to conceptualize language assessment as a process composed
of a series of steps. Although not all of the specifications are applicable to every assessment situation, the framework itself provides a useful structure. On the other
hand, a mere three pages are devoted to test administration (chapter 16).
Common testing techniques are presented in chapter 8; the testing of component skills and overall language ability is covered in chapters 9–14. It is in these
chapters that the book excels. For each skill, Hughes states the testing problem,
then offers advice for creating test tasks, along with examples. The test specification
format presented in chapter 7 is employed in each skills chapter, and issues of task
sampling are discussed. For testing writing (chapter 9), the testing problem includes setting representative writing tasks, eliciting valid samples of writing, and
validly and reliably scoring these samples. Examples of different writing tasks are
presented, suggestions for achieving valid and reliable scoring are made, and sample writing assessment guidelines (e.g., TWE, ACTFL) are discussed. Hughes also
provides guidance for constructing a rating scale and training scorers. Similarly,
"Testing Oral Ability" (chapter 10) presents various techniques for eliciting
speech and for rating elicited samples. Although each of the chapters is quite comprehensive (chapter 12, "Testing Listening," perhaps the least so), it is inconvenient to deviate from the order in which the chapters appear (Writing, Oral Ability,
Reading, Listening, Grammar, and Vocabulary), because the reader is directed
back to earlier explanations or examples.
The chapter on "Tests for Young Learners" is new to this edition and presents a
general approach that promotes a close link between testing, assessment, and
teaching. It suggests immediate and positive feedback to test takers and encourages self-assessment as a regular part of the program. A brief discussion of the particular demands of this type of testing leads to a description of recommended testing techniques.
Appendix 1 consists of a somewhat expanded section (from the first edition) on
"The Statistical Analysis of Test Data." Hughes makes clear that he is teaching interpretation rather than calculation of test statistics (an important change from the
first edition). In fact, he presents only output from ETA, a program available
from the textbook Web site (http://uk.cambridge.org/elt/tflt). The dataset used as
an example consists of responses to a (hypothetical?) 100-item placement test,
taken by 186 candidates. A frequency table and a histogram of the data are displayed and explained; measures of central tendency and dispersion are reported.
Four reliability coefficients are given, but his explanations that analysis of variance
or the Spearman-Brown prophecy formula was used in computing these estimates
are uninterpretable for readers who do not know what these procedures entail; it remains unclear how each coefficient is different and which gives the best estimate. The standard error of measurement is also mentioned (and readers should refer back to the lucid discussion of this statistic in chapter 5).


The classical item analysis statistics for item facility and item discrimination
are then covered, along with distractor analysis. The chapter concludes with a brief
but understandable explanation of IRT. Although this appendix directs the reader
to the Web site for practice activities, these are very superficial (comment on a frequency distribution, calculate a mean, comment on item facility and discrimination indexes). An assessment course instructor could easily create better, more
comprehensive activities, using actual data that may be more meaningful. Furthermore, the relationship between item difficulty and item discrimination is not explained in a thorough manner. Appendix 2 contains a very brief explanation of item
banking; the reader is directed to the book's Web site for further information.
Finally, there is an up-to-date bibliography and a user-friendly index.
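
To make the item statistics discussed above concrete, here is a minimal Python sketch of item facility and of one common discrimination index (the upper-lower method, which compares the top and bottom thirds of test takers). The response data, the function names, and the choice of thirds are assumptions made for this illustration; they do not reproduce the ETA output or the dataset in the appendix.

```python
def item_facility(responses):
    """Proportion of test takers who answered the item correctly (1 = correct)."""
    return sum(responses) / len(responses)


def item_discrimination(item_responses, total_scores, fraction=1 / 3):
    """Upper-lower index: facility in the top group minus facility in the bottom group."""
    n = max(1, round(len(total_scores) * fraction))
    # Rank test takers from lowest to highest total score.
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    low = [item_responses[i] for i in order[:n]]
    high = [item_responses[i] for i in order[-n:]]
    return sum(high) / n - sum(low) / n


# Hypothetical data: nine test takers' total scores and their responses to one item.
totals = [95, 88, 80, 72, 65, 60, 50, 41, 30]
item = [1, 1, 1, 1, 0, 1, 0, 0, 0]

print("facility:", round(item_facility(item), 2))
print("discrimination:", item_discrimination(item, totals))

# An item that everyone answers correctly cannot separate strong from weak test takers.
too_easy = [1] * 9
print("facility:", item_facility(too_easy))
print("discrimination:", item_discrimination(too_easy, totals))
```

The last two lines hint at the relationship between the two statistics: an item with a facility of 1.0 (or 0.0) necessarily has a discrimination index of 0, so only items of intermediate difficulty can discriminate strongly.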
All things considered, Hughes presents what appears to be a very traditional approach to language testing (albeit one that is consistent and well-articulated).
Many of the tests mentioned are well-established, large-scale tests (TOEFL,
IELTS), whereas portfolios and systematic self-assessment, for example, are overlooked. Suggestions for scoring items for partial credit, grading summaries, and
critiquing in-class oral presentations (just the kinds of assessments in which language teachers are most likely to be engaged) are also missing. Furthermore,
Hughes makes no mention of current thinking (e.g., McNamara, 2003) about language assessment measures tapping social ability as much as psychological or
mental traits.
As a result, one begins to wonder for whom this book is really appropriate.
As one graduate student-language teacher puts it,
Sometimes I wish that he would either orient himself more to the perspective
of the teacher (rather than the professional tester) or, if he is going to take the
other tack, that he would explain the machinery of test analysis. I don't mind
having statistical stuff exiled to an appendix, but the appendix should tell
you what a z score is and what a correlation is, for example. I know people
don't do these things by hand any more, but it's hard to know what something really means unless you know how it was produced. Or perhaps I
should say that there is a segment of the readership for whom a definition
(with formulas) is the clearest and quickest explanation. (R. White, personal
communication, November 24, 2003)
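
The kind of definition this student has in mind can be quite short. The Python sketch below is one possible version, not material from the book: a z score expresses a raw score as its distance from the mean in standard deviation units, and the Pearson correlation between two sets of scores is the mean of the products of their paired z scores. The function names and the scores are invented for this illustration, and the code assumes the population standard deviation and scores that are not all identical.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Convert raw scores to z scores: (score - mean) / standard deviation."""
    m = mean(scores)
    sd = pstdev(scores)  # population SD; assumes the scores are not all equal
    return [(s - m) / sd for s in scores]


def correlation(x, y):
    """Pearson correlation as the mean product of paired z scores."""
    zx, zy = z_scores(x), z_scores(y)
    return mean(a * b for a, b in zip(zx, zy))


# Hypothetical scores of five students on a reading test and a listening test.
reading = [12, 15, 9, 18, 14]
listening = [10, 14, 8, 17, 15]

print([round(z, 2) for z in z_scores(reading)])
print(round(correlation(reading, listening), 3))
```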
Finally, on a note strictly about appearance, the physical layout of the book leaves
something to be desired. As another graduate student comments,
It isn't that easy to see the structure by scanning the headings because they
aren't differentiated enough (there are bold heads, followed by italic heads,
but they are the same size; they don't convey the hierarchy well enough).
And there are so many different structures: bold to italic in one situation,
smaller bold italic, all caps bold, numbered bold with italicized lowercase
roman numeral entries beneath them, bulleted lists, boxed bulleted lists. (K.
Hansen, personal communication, November 23, 2003)
Despite these shortcomings, overall, the organization of Testing for Language
Teachers is commendable and it constitutes a solid reference resource for language
teachers. Hughes's style is straightforward and clear, and readers will undoubtedly
appreciate the minimal amount of the jargon that often obscures meaning. Finally, I myself welcome a book with no obvious bells and whistles. Just as Davies (2003)
found it important to sound a note of skepticism about assessment tools and
technology dictating language test content and language ability constructs, it is refreshing to find a text that does not require highly sophisticated knowledge of language testing terminology, of statistical analysis software, or of statistical analysis
procedures to get through the book. For the language assessment course instructor
and for the language teacher-reader, Testing for Language Teachers is a worthy
choice.

REFERENCES
Davies, A. (2003). Three heresies of language testing research. Language Testing, 20,
355–368.
Hughes, A. (1989). Testing for language teachers. Cambridge, England: Cambridge University Press.
McNamara, T. (2003). Looking back, looking forward: Rethinking Bachman. Language
Testing, 20, 466–473.
