
Assessment Theoretical Framework

Lidia A. Escalona, Marcela P. Mellado
Departamento de Lenguas, Facultad de Educación, Universidad Católica de la Santísima Concepción, Concepción

In the last half century, second language acquisition (SLA) has become a matter of great interest for many scholars around the world. Numerous theories have been developed to explain how a second language is acquired and the cognitive processes this involves. These theories give second language teachers valuable hints about how to teach a second language effectively and how students acquire it. But how are teachers informed about the effect of their lessons on learners? Here an essential part of second language teaching comes into play: language assessment. Gottlieb (2006) defined assessment as "the systematic, iterative process of planning, collecting, analyzing, reporting, and using student data from a variety of sources over time" (p. 185). This subdiscipline has proved vital in teaching a foreign language, because it allows teachers to assess learners' progress in acquiring the language and helps them manage their lessons properly. Teachers must assess all the time in order to make decisions about students and about instruction. Brown (2004) says that "a good teacher never ceases to assess students, whether those assessments are incidental or intended" (p. 4). Although the term evokes testing situations, it is not limited to the use of tests but goes beyond it. Assessment gives hints about how students perceive the language and how much they have achieved, by means of different techniques such as tasks, questions and tests, among others.

The concept of Assessment tends to be misunderstood in educational settings: it is erroneously treated as a synonym of Testing and Evaluation, but they are not the same. Coombe, Folse, and Hubley (2007) make a clear distinction among these concepts. They state that Evaluation is the widest concept, which involves both Assessment and Testing. It focuses on aspects of the teaching-learning process inside and outside the classroom, such as teaching practice, course design, syllabus objectives, curricula and materials, among others, whilst Assessment focuses mainly on students' achievements and the language learning process. Testing is a subcategory of Assessment, which involves the procedures and instruments used to gather information about students' ability and achievements. Along the same lines, Brown (2004) makes a similar differentiation, but instead of using the concept Evaluation he talks about Teaching. For him, Teaching is the whole, and Assessment and Tests are within it. He metaphorically states that "Teaching sets up the practice games of language learning" (Brown, 2004, p. 5); that is, Teaching offers students opportunities to try out the language, take risks and receive feedback on their performance from the teacher. He also adds that during these practice activities, "teachers are indeed observing students' performance and making various evaluations of each learner ... and these observations feed into the way the teacher provides instruction to each student" (Brown, 2004, p. 5).

As mentioned above, educators assess all the time during the teaching-learning process, but how they assess depends on the purpose and on the moment at which the assessment is carried out. Brown (2004) presents two dichotomies: informal versus formal assessment, and formative versus summative assessment. Firstly, informal assessment may take place intentionally, through given tasks and activities, or incidentally, through unplanned comments and interpretations during teaching, whilst formal assessment is always intentional and makes use of procedures such as tests or other tasks specifically designed to assess student achievement. Secondly, formative assessment allows teachers to evaluate students' progress, making them aware of the effect of instruction on learners and of learners' strengths and weaknesses during the teaching-learning process. In contrast, summative assessment focuses on measuring the knowledge that students have mastered after an instructional period, and is usually applied at the end of a course. Results of summative assessments are recorded, because these assessments are usually used to show that course objectives have been accomplished.

Another distinction that Brown (2004) and Coombe et al. (2007) make is that between traditional and alternative assessment. Huerta-Macias (as cited in Coombe et al., 2007) declares that "Alternative assessment asks students to show what they can do; students are evaluated on what they integrate and produce rather than on what they are able to recall and reproduce" (p. xix). In contrast, traditional assessments do not require students to integrate the four skills in using language, but limit them to knowing about the language and developing skills separately and out of context. After analyzing all these categories, it might be said that informal, formative and alternative assessments are intrinsically related to each other, because they are significant for the process, whereas formal, summative and traditional assessments center more on results and are used to show achievement.

In assessing language formally, educators generally use certain instruments, such as tests, to measure students' knowledge and abilities. In order to succeed in assessing, teachers must keep in mind the purpose, the time, the test takers' background, the construct and the impact, among other factors, before administering a test. Choosing or designing a test is not an easy task, and for this reason some authors have provided guidance for carrying it out successfully. Bachman and Palmer (1996), Brown (2004), and Coombe et al. (2007) have provided principles or qualities for designing, developing and analyzing a good quality test. These principles are Usefulness, Validity, Reliability, Practicality, Washback, Authenticity, Transparency and Security. These three sources agree that practicality is a matter of convenience: available resources and time, and ease of administering and marking tests. They also agree that authenticity implies the use of authentic material which provides meaningful and contextualized tasks, allowing learners to use language in real-world situations close to their own reality. Bachman and Palmer (1996, p. 23) define authenticity as "the degree of correspondence of the characteristics of a given language test task to the features of a target language task".
Another essential principle mentioned by Brown (2004) and Coombe et al. (2007) is Washback, which refers to "the effects of testing on teaching and learning" (Coombe et al., 2007, p. xxiv). Washback plays an essential role in instruction, because it gives teachers information about learners, pointing to ways of improving the teaching-learning process, and at the same time it allows learners to realize their strengths and weaknesses.

According to these authors, two of the most essential qualities that tests must have are reliability and validity. On the one hand, the authors understand reliability as a matter of consistency and trustworthiness of test scores. Bachman and Palmer (1996) mention that when a test is not reliable it cannot provide trustworthy information about the student ability we want to measure. In addition, Brown (2004) says that a test is reliable when the results of a test administered on two different occasions to the same group of students are similar. He also describes four factors that might threaten the reliability of tests: Student-Related Reliability, which concerns learner-related issues; Rater Reliability, related to scoring criteria and to teachers' objectivity or subjectivity in the scoring process; Test Administration Reliability, which concerns the conditions under which the test is administered; and Test Reliability, linked to features of the test itself. Henning (as cited in Coombe et al., 2007) mentions just the first three of them, under other names: Fluctuations in the Learners, Fluctuations in Scoring and Fluctuations in Test Administration.

On the other hand, validity refers to the appropriateness of a test for measuring a given aspect. Gronlund (as cited in Brown, 2004) defines validity as "the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment" (p. 22). In addition, Coombe et al. (2007) say in simple words that validity means "test what you teach and how you teach it!" (p. xxii); that is, a test cannot measure something that has not been taught or presented before. Bachman and Palmer (1996) call it Construct Validity rather than Validity. They state that it pertains to the meaningfulness and appropriateness of the interpretations that we make on the basis of test scores. Both Brown (2004) and Coombe et al. (2007) subcategorize Validity. For Brown (2004), there are five types: Content-Related Evidence, which concerns the subject matter or content that must be measured; Construct-Related Evidence, which relates to the theoretical constructs and the domain of language to be assessed; Criterion-Related Validity, which concerns the specific objectives or level of performance expected to be achieved; Consequential Validity, which refers to the effects of a test on the learner, the teacher or society; and Face Validity, which mainly involves appearance and relevance from the learners' point of view, clarity of instructions and familiarity of tasks. According to Brown (2004), Washback falls within Consequential Validity, having its effect primarily on learners. Coombe et al. (2007) distinguish only three types of validity: Content Validity, Construct Validity and Face Validity. Davies et al. (as cited in Coombe et al., 2007) state that while reliability focuses on the empirical aspects of the measurement process, validity focuses on the theoretical aspects.
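Brown's (2004) description of reliability, similar results when the same test is administered on two occasions to the same group of students, can be illustrated numerically. The following is only a minimal sketch: the use of a Pearson correlation as the measure of similarity is an assumption made for illustration and is not taken from the authors cited, and the function name and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    # Sum of the products of each student's deviations from the two means.
    covariance = sum(
        (a - mean_first) * (b - mean_second) for a, b in zip(first, second)
    )
    spread_first = sqrt(sum((a - mean_first) ** 2 for a in first))
    spread_second = sqrt(sum((b - mean_second) ** 2 for b in second))
    if spread_first == 0 or spread_second == 0:
        raise ValueError("scores must vary within each administration")
    return covariance / (spread_first * spread_second)


# Hypothetical scores for the same five students on two occasions.
first_administration = [72, 85, 90, 64, 78]
second_administration = [75, 83, 92, 60, 80]

reliability = pearson_r(first_administration, second_administration)
print(f"test-retest correlation: {reliability:.2f}")
```

A value close to 1 would suggest that students were ranked in much the same way on both occasions; a value close to 0 would suggest that the scores are not consistent across administrations.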

To these five principles, Coombe et al. (2007) add two more: Transparency and Security. The first refers to "the availability of clear, accurate information to students about testing" (Coombe et al., 2007, p. xxv). This means that students must be informed about the test format, the time allowed to complete the test, the ideal score, the weighting of items and the grading criteria, among others. The other principle, Security, concerns test leaks: it is violated when test contents and answers are known to the test takers, as if they were in the public domain, before the test is administered. According to Coombe et al. (2007), security is part of the reliability and validity of tests. To avoid breaking this principle, teachers must be careful about repeating the same tests every year; parts of a test may be reused, but not the whole test.

Like Coombe et al. (2007), Bachman and Palmer (1996) introduce two new principles, Interactiveness and Impact. They define Interactiveness as "the extent and type of involvement of the test taker's individual characteristics in accomplishing a test task" (Bachman & Palmer, 1996, p. 25). In other words, it refers to how students are involved with the test, and to whether the language is appropriate to the level of the students' ability. It also draws on other areas, such as previous knowledge and learners' interests: "interactiveness of a given language test task can thus be characterized in terms of ways in which the test taker's areas of language knowledge, metacognitive strategies, topical knowledge, and affective schemata are engaged by the test task" (Bachman & Palmer, 1996, p. 25). The second principle is Impact, which refers to the effect of test results on individuals or society. According to the authors, Impact operates at two levels: "a micro level, in terms of the individuals who are affected by the particular test use, and a macro level, in terms of the educational system or society" (Bachman & Palmer, 1996, p. 30). For them, Washback falls within this principle, especially at the micro level.
Bachman and Palmer (1996) encompass all these principles (Reliability, Construct Validity, Authenticity, Interactiveness, Impact and Practicality) under the umbrella term Usefulness. They state that "the most important consideration in designing and developing a language test is the use for which it is intended, so the most important quality of a test is its usefulness" (Bachman & Palmer, 1996, p. 17); they add that developing a good quality test requires a balance among all those qualities.

To conclude, assessing language is an important part of teaching a second language, because it informs teachers about the effect of instruction on learners. In addition, after analyzing the authors' viewpoints on the principles that good quality tests must satisfy, it might be said that all these principles give guidelines for carrying out language assessment successfully, helping teachers to design and choose the most suitable test for their students in accordance with the purpose and context surrounding the assessment. Assessment must be coherent with the purpose the teacher wants to accomplish, including the appropriateness of the contents for students and the students' proficiency level.


Bachman, L. F., & Palmer, A. S. (1996). Language Testing in Practice: Designing and Developing Useful Language Tests. Oxford: Oxford University Press.
Brown, D. (2004). Language Assessment: Principles and Classroom Practices. United States of America: Longman.
Coombe, C., Folse, K., & Hubley, N. (2007). A Practical Guide to Assessing English Language Learners. United States of America: The University of Michigan Press.
Gottlieb, M. H. (2006). Assessing English Language Learners: Bridges from Language Proficiency to Academic Achievement. California, United States of America: Corwin Press.