
QUESTION 2 (a)

Many criteria must come together in the process of test construction. Arguably, the two most
important criteria to keep in mind when constructing a test are validity and reliability. These
two criteria matter above all else in ensuring that the test carries the weight it is meant to
carry.

Vincent (1999) states that validity may be defined as the soundness or appropriateness of a
test or instrument in measuring what it is designed to measure. In simple terms, the validity
of a test has to do with assessing the objectives that were taught to students. Thus, for a
test to be valid it must test concepts that were imparted to the students. Additionally, in
constructing a test a teacher must be able to identify the achievement that is being measured.
For example, a Social Studies test on the functions of the family should not measure a
student’s ability to read maps but rather the student’s ability to identify and explain the
functions of the family unit. To achieve this, many teachers create a Table of Specifications.
This table lists everything that was taught and how many items on the test will cover each
topic. In this way, the test is designed to measure what was taught.
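The idea behind a Table of Specifications can be sketched in a few lines of Python. This is only an illustration of one possible approach, in which items are shared out in proportion to teaching time. The function name, the topics, the lesson counts and the 20-item total are all hypothetical and are not taken from any actual syllabus or test.

```python
def table_of_specifications(lessons_per_topic, total_items):
    """Share out test items across topics in proportion to teaching time."""
    total_lessons = sum(lessons_per_topic.values())
    # Exact (possibly fractional) share of items for each topic.
    exact = {
        topic: total_items * lessons / total_lessons
        for topic, lessons in lessons_per_topic.items()
    }
    # Start from the whole-number part of each share.
    items = {topic: int(share) for topic, share in exact.items()}
    # Give any leftover items to the topics with the largest fractional parts,
    # so that the item counts add up to the intended test length.
    leftover = total_items - sum(items.values())
    by_remainder = sorted(exact, key=lambda t: exact[t] - items[t], reverse=True)
    for topic in by_remainder[:leftover]:
        items[topic] += 1
    return items


# Hypothetical topics and lesson counts, for illustration only.
lessons = {
    "Functions of the family": 6,
    "Family types": 4,
    "Changing roles in the family": 2,
}
spec = table_of_specifications(lessons, total_items=20)
for topic, count in spec.items():
    print(f"{topic}: {count} items")
```

A topic that received half of the teaching time receives half of the items, so no item falls outside what was taught. In practice a teacher may also weight topics by importance or cognitive level rather than by lesson count alone.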

The second essential element in test construction is reliability. Reliability may be defined
as the degree to which a test or measure produces the same score when applied in the same
circumstances (Nelson, 1997). It is believed that a reliable test should yield the same or
very similar results each time it is administered. Many theorists hold that longer tests are
more reliable than shorter ones, since fluctuations in performance during the test tend to
even out across a larger number of items. Reliability is also affected by factors such as
objectivity, the test-retest interval and variation in testing conditions. Tests are said to
be more reliable when they are scored objectively rather than subjectively and when the
interval between administrations is shorter. Lapses in test conditions, such as distractions
and noise, as well as misinterpretation or misunderstanding of test questions, will also
hamper the reliability of a test.
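The link between test length and reliability is commonly expressed with the Spearman-Brown prediction formula. It is not drawn from the sources cited in this answer and is offered only as an illustration. The symbols are introduced here: \rho is the reliability of the original test, k is the factor by which the test is lengthened with comparable items, and \rho^{*} is the predicted reliability of the lengthened test.

```latex
\rho^{*} = \frac{k\,\rho}{1 + (k - 1)\,\rho}

% Illustrative values only: doubling (k = 2) a test whose reliability is 0.60
\rho^{*} = \frac{2 \times 0.60}{1 + (2 - 1) \times 0.60} = \frac{1.20}{1.60} = 0.75
```

The prediction assumes that the added items are of the same kind and quality as the original ones, so simply adding more questions does not guarantee a more reliable test.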

In summary, test construction hinges primarily on these two criteria. The validity of any test
is conditioned by its reliability, since a test that gives inconsistent scores cannot be a
sound measure of what was taught. There is a symbiotic relationship between the two which must
not be overlooked.

QUESTION 2 (b)

In constructing an examination such as the December End of Term Social Studies examination for
the Form 5 2017/2018 year group (see attachment), one of the main driving forces, if not the
main one, was to ensure that the content covered was in keeping with the requirements of the
most recent Caribbean Secondary Education Certificate (CSEC) curriculum document (i.e. the
May/June 2010 edition), since this is the document that has guided all instruction from the
time this year group entered Form 4. This driving force is what fulfils the first requirement
of any test being constructed, namely that it must be valid.

This test is highly valid as it not only measures the objectives laid out in the curriculum
document indicated above but also measures only the topics that have been taught thus far. It
would be invalid, and highly unethical as well, to construct a test to be administered to
students who were not taught the material being tested. Additionally, as the teacher, I can
clearly identify the achievement that I am measuring. I want to assess whether students have
grasped the concepts covered in Section A through to Section B (i) of the syllabus. This is
done through a series of both multiple-choice and structured questions drawn from previous
CSEC examination papers.

The second measure of test construction is a test’s reliability. Reliability refers to the
degree to which a test or measure produces the same score when applied in the same
circumstances (Nelson, 1997). However, there are numerous factors which would have affected
the reliability of the aforementioned examination. One factor was the length of the test. The
test was quite lengthy with respect to the number of items, as there were 40 multiple-choice
questions and three structured questions to be done in 2 hours 15 minutes. While the number of
items would tend to make the test quite reliable, the time allotted could possibly negate that
reliability. The superior student would be more than able to complete the test in the allotted
time. The less capable student, however, may have difficulty completing the task in the time
allotted, or may complete the test but not with a passing grade. Another factor which may
affect the reliability of the test is the difficulty of the items. This test balances
straightforward recall questions with questions requiring more in-depth analysis, making it
reliable in structure. Both the difficulty of the items and the mark scheme for these
questions help to make this test quite reliable.

At the heart of a reliable test is its ability to be replicated with the same results. This
examination would not be considered reliable if it were to be administered again now. Firstly,
two months have passed since the test was administered and many of the students have had the
opportunity to review the topics tested in more depth. Additionally, when the test was first
administered, students were not sufficiently prepared for the examination because of
circumstances within the school context. The students were not sure if they would even have an
examination because of the health and safety concerns at the school. Generally, the period
prior to an examination would be used for revision of concepts taught throughout the term. On
this occasion, students had a far shorter revision period. Furthermore, even when the test was
administered the school climate was unsettled, and many students were unable to get back into
the routine of school, having been out of school for at least four school days. All these
factors would negatively affect the reliability of the test.

Perhaps assessments need to be more varied in nature, offering a more reliable measure that
tests students’ knowledge as well as their skill level across the various domains.
Additionally, even if the same test is used, the original test should be re-administered under
more appropriate and suitable test conditions, so that a better baseline comparison can be
made if and when students are retested.
