

Université du Québec à Trois-Rivières
DME1010 Principles of language testing and evaluation
Professor Sonia El Euch

As future teachers, it is of the utmost importance to know how to build, administer and evaluate tests. This report will attempt to analyze the strengths and weaknesses of a test and, if needed, suggest improvements to the measurement instrument. The chosen exam was created by the Commission scolaire des Samares and adapted from a previous exam by the Ministère de l'Éducation, du Loisir et du Sport du Québec (MELS). The exam was acquired through an English teacher at Polyvalente des Chutes in Rawdon.


The title of the evaluation situation is "You are What You Wear", which was designed for students in the core program of secondary cycle one, year two. This evaluation situation was first used as an end-of-cycle summative assessment, administered in June 2011.

To begin the analysis, we will first discuss the exam's strong points. First, the chosen exam clearly indicates that it is intended as a summative assessment for students in the core program of secondary cycle one, year two. In addition, the theme chosen for this level is suitable and motivating for 13- and 14-year-old teenagers: at this age, they are very self-conscious about the image they project and can express their self-identity through clothing and style. All three competencies proposed by the MELS are incorporated in the exam and evaluated independently, although oral comprehension is not assessed. The test is composed of three different tasks that correspond directly to the MELS's three competencies (i.e., task 1 (T1) = Competency 1 (C1), task 2 (T2) = Competency 2 (C2), task 3 (T3) = Competency 3 (C3)). A logical sequence of tasks is presented to the students. Indeed, the exam begins with C1, since discussions provide an easy channel to activate prior knowledge and introduce a topic. It is also logical to end the evaluation situation with C3, since this competency requires the gathering of pertinent information for its completion.



Moreover, each task respects the progression of learning. For example, in task one, students are asked to share and discuss their opinions about clothes and their personal image, using proposed functional language appropriate for that level. Within these three tasks, different types of items are used. The test-takers are presented mostly with open-ended items, which require the student to express an opinion, agree or disagree, and engage in a response process, as required by the MELS. At the outset, in task 1, the types of items presented enable low-level learners to experience some success. Moreover, the level of difficulty of each item seems appropriate for the level: items are neither too easy nor too difficult, and the degree of difficulty is well balanced. In the final task, a restricted performance is expected of the test-taker. The tasks do not contain equal numbers of items, but their distribution (T1 = 10 items, T2 = 6 items, T3 = 1 item) reflects the weight attributed by the MELS to each competency (C1 = 40%, C2 = 30%, C3 = 30%). We cannot evaluate the distribution of multiple-choice items (MCI) or true-or-false items, since there are no such items in the test. Greater importance is given to C1, which again reflects the weights imposed by the MELS.

All instructions in the students' booklets are presented in the target language through the visual channel only; in our view, using both channels to provide task instructions is not necessary. Additional information is provided to the students regarding the material to be used during the exam. There is no mention of what not to use or bring, but we feel that the instructions given are explicit enough. Visually, the layout of the test offers adequate spacing between items. Each new task is presented on a new page, and individual items offer plenty of space for students to write their answers and to follow the task's logical sequence.

The evaluation situation is designed to be spread over three to four class periods; given the tasks that students are to complete, the length of the test seems adequate.


In order to assess the content validity of the evaluation instrument, we must have a clear understanding of the material taught in class. Since we do not have this information, we cannot make a judgment on this point. The content does not discriminate against students with different backgrounds: the topic is appropriate for ethnic minorities, for boys and girls, as well as for physically impaired students. As we have seen so far, the exam is well grounded in the MELS requirements. Each task comes with an evaluation grid that respects the evaluation criteria of the MELS program.

So far, we have given an overview of the general structure of the test. Now, we will analyze the test tasks in greater depth. Task 2 involves the evaluation of C2. According to the communicative language teaching approach, authentic material must be used throughout the exam. The exam presents authentic written texts, written by native speakers of English for native speakers of English; we came to this conclusion by cross-referencing the resources referenced by the test designer(s). To assess overall comprehension, students are asked to complete a series of supply-response items. No factual questions are asked throughout the exam. The syntactic structure of each item makes it easier than the actual text; the wording is simple and to the point. The oral interaction task, which is related to C1, is a discussion survey about one's clothing personality. The chosen topic enables every student to participate freely in the discussion by sharing his or her personal tastes and preferences. Clothing is an interesting topic for teenagers, thus promoting animated discussions and exchanges. In task three, the students are clearly instructed to write a friendly letter to one of the people presented through a series of images.



They are to explain what these people's clothing conveys as a message.

In the previous paragraphs, we discussed the various strengths of the evaluation tool. Weaknesses, however, also appear throughout the exam. To begin, the different language skills have not been given any weights. No weights are mentioned anywhere in the exam, so they are not made concrete for the students, even though the exam is an end-of-cycle assessment. Explanations of the procedures in task two are confusing and grouped within one task; for example, it is mentioned that the students should take notes, but it is not stipulated what exactly they should take notes on or what pertinent information they should pay attention to. Furthermore, C2 is only partially covered: the test does not include the reinvestment of oral text comprehension. For the purpose of this assignment, we have included an oral comprehension task from a different test, "The Mystery of Oak Island". In this test, the audio material for the oral text is, in our view, not authentic, since it seems to have been designed specifically for the examination; the speech rate does not sound natural. Not enough items are presented to provide a clear measurement of the students' overall understanding. Moreover, the task seems overly complicated, with many steps that include cutting out images for sequencing and finding the gist. Although the exam does include the reinvestment of written texts, this task does not correspond to the criteria for a reinvestment task. In the task presented as the Integration Activity, students are asked to write down two statements from the written texts that they believe to be untrue. This task does not call for overall comprehension, nor does it ask students to stipulate why they chose those statements. As mentioned previously, the evaluation situation is meant to last three or four periods; however, the time allotted per task is not mentioned in the instructions.

We also believe that task 1 does not fully engage students in the task. In our opinion, students might be tempted to simply read out their answers from the survey; spontaneity might be impeded, and discussions are likely to stall in such an activity. The final activity never mentions the expected length of the letter, whether in number of words or paragraphs. As for the evaluation grids, each task proposes a holistic scoring grid.


However, the scales' level descriptions do not appear reliable. The descriptions leave too much room for subjectivity and could be interpreted differently by different raters; they also depend on raters interpreting them consistently. This could result in both inter-rater and intra-rater unreliability.

In our view, the evaluation instrument presents a good number of qualities. Its main strength lies in the way it follows the MELS orientations, since it is competency-based. Nonetheless, there is still a major flaw in relation to C2. This competency must include the reinvestment of both oral and written texts in order to evaluate test-takers appropriately. Unfortunately, for an end-of-cycle assessment, this measuring tool does not respect the MELS requirements: C2 only includes the reinvestment of written texts, which is not adequate. The exam thus raises questions concerning C2, rendering construct validity debatable. As for content validity, we believe the exam does demonstrate it. We base this conclusion on the fact that the material being tested (functional language, explanatory letters, stating opinions) is in line with the progression of learning that should have been taught throughout the course curriculum. Unfortunately, there is a major issue concerning the scoring grids of the exam: the holistic grid descriptions are poorly elaborated.

As previously mentioned, the grids allow too much subjectivity and contain too many unclear elements, resulting in doubtful reliability.


For an end-of-cycle exam, evaluation grids must be precise in order to offer an unbiased judgment of the students' language abilities. We would not use the exam in a similar summative assessment context, but we would use it as a formative assessment tool. However, modifications would have to be made in order to use this measurement tool. First, C2 should be reorganized so as to incorporate comprehension of oral texts. Also, clearer instructions should be provided for how the task unfolds, and these instructions should be separated into distinct activities within the task. Finally, the reinvestment task should be modified to meet the criteria for a valid reinvestment task: we would ask the students to further justify their statement choices. We would also adapt the evaluation grid in order to achieve reliable scoring results.

In conclusion, this in-depth analysis of a measurement instrument has helped us develop a critical mindset in relation to testing in a communicative context. We were able to focus our attention on the basic principles of testing and on the important elements needed when testing in a communicative context. It also became clear to us that, notwithstanding a test's origins, teachers must carry out due diligence in order to assess its reliability and validity. This assignment has familiarized us with the testing and evaluation concepts used in an academic context. As teachers, we are called on to use and design tests. Testing is an important aspect of the profession, and we must acquire the competency required to accomplish such a task.
