
Criteria of tests

The three most important criteria for any good test are validity, reliability and practicality.
Validity
Validity tells us whether a test measures what it claims to measure. If the goal of a test
is to measure one particular skill and the test measures other skills at the same time,
then it is not valid. For example, a reading test is not valid if answering the questions
depends on information that is not provided in the text, such as knowledge of American or
British culture that the students are not familiar with.
There are four types of validity: face validity, content validity, construct validity and
empirical (criterion-related) validity.
Face validity
A test has face validity if it looks as if it measures what it is supposed to measure.
Sometimes the teacher can be so involved in the test that he/she fails to look at the
individual test items objectively. It is recommended that the teacher show the test
he/she has constructed to colleagues and friends. Other people can discover whether there
are any ambiguities in the test.
Content validity
A test has content validity if it contains a representative sample of the language skills
and structures it is meant to test. That means that if the teacher wants to write a grammar
test, the test must be made of items testing knowledge of grammar, and it should include a
proper sample of the relevant structures. The relevant structures will depend upon the
purpose of the test. For example: we don't expect an achievement test for intermediate
learners to contain the same set of structures as one for advanced learners. To ensure
this, the test writers should draw up a table of test specifications, describing which
language skills and areas will be included in the test.
Construct validity
This type of validity is essential to any language test. Construct validity is the extent
to which we can interpret a given test score as an indicator of the ability(ies), or
construct(s), we want to measure (Assessing Grammar, James Purpura). Teachers should
therefore construct the test after researching testees' behavior and learning.
For example: a speed reading test based on a short comprehension passage is an inadequate
measure of reading ability.

Criterion-Related Validity
Criterion-related validity is the degree to which results on the test agree with those
provided by some independent and highly dependable assessment of the candidate's ability
(Arthur Hughes).
There are two types of criterion-related validity: concurrent criterion validity
and predictive validity.
1. Concurrent criterion validity is established when the test results are supported by
another measure of the same performance, obtained at about the same time as the test.
2. Predictive validity is the degree to which a test can predict candidates' future
performance. Example: a placement test.

Reliability
A test is reliable if it gives consistent results when it is administered to the same group
of candidates on different occasions. If the results differ widely, then it is not reliable.
Some factors that can influence the reliability of the test need to be considered:
- test instructions
- personal factors: such as motivation, illness, anxiety
- scoring the test
There are a couple of methods of measuring the reliability of a test. One method is to
administer the same test again after a certain time. Another method is to give a similar
test. That means the second test must be equivalent to the first one: it needs to be of the
same difficulty, length, rubric, etc. If the results are similar, then the test is reliable.
Also, if a test is written poorly, or if some questions are ambiguous, then the test is not
reliable.
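The comparison of two administrations described above is commonly expressed as a correlation between the two sets of scores. The following is only a minimal sketch of that idea: the function name pearson_r and the score lists are hypothetical and invented for illustration, and the notes themselves do not prescribe any particular statistic.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of the same length (at least 2)")
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    # Sum of the products of each candidate's deviations from the two means.
    covariance = sum(
        (a - mean_first) * (b - mean_second) for a, b in zip(first, second)
    )
    spread_first = sqrt(sum((a - mean_first) ** 2 for a in first))
    spread_second = sqrt(sum((b - mean_second) ** 2 for b in second))
    if spread_first == 0 or spread_second == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariance / (spread_first * spread_second)


# Hypothetical scores of the same five candidates on two occasions.
first_sitting = [55, 62, 70, 81, 90]
second_sitting = [57, 60, 72, 80, 91]

reliability = pearson_r(first_sitting, second_sitting)
print(round(reliability, 2))
```

A value close to 1 means the candidates are ranked in much the same way on both occasions, which is what a reliable test should show. A value close to 0 means the two sets of results have little in common.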
Reliability vs. validity
The two main criteria of any good test are validity and reliability.
The ideal test for both students and teachers is one that is both valid and reliable.
A test is not valid if it does not measure what it is supposed to measure, and a test whose
results are inconsistent cannot measure anything accurately. That means that a valid test
also needs to be reliable.
It is difficult to make a test that is reliable and valid at the same time.

For example: a multiple choice test can be made highly reliable, but its results are not
necessarily valid, since such a test may not measure one's actual language ability.
When making a test, one criterion is often maximized at the expense of the other.
Practicality
A test is practical if it is not expensive, is relatively easy to administer and is easy to
score. If a test is expensive, if a student needs five hours to complete it, or if the
teacher needs several hours to evaluate a test that took students only a few minutes to
complete, then the test is impractical.
Discrimination
Discrimination is the capacity of a test to discriminate among different candidates and to
reflect the differences in the performances of the individuals in the group. A test on which
almost all candidates score 70% clearly fails to discriminate between students.
Example: placement tests.
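One rough way to see whether a test discriminates is to look at how widely the scores are spread. The sketch below is an illustration only: the function name score_spread and both score lists are hypothetical, and using the range and standard deviation is an assumption made here, not a method given in the notes.

```python
from statistics import mean, pstdev


def score_spread(scores):
    """Summarise how widely a list of percentage scores is spread."""
    return {
        "mean": mean(scores),
        "range": max(scores) - min(scores),
        "std_dev": pstdev(scores),
    }


# Hypothetical results: almost everyone scores about 70%.
clustered = [70, 70, 71, 69, 70, 70]
# Hypothetical results: candidates are clearly separated.
spread_out = [35, 48, 60, 70, 82, 95]

print(score_spread(clustered))
print(score_spread(spread_out))
```

The first set of results has a very small range and standard deviation, so the test tells us almost nothing about the differences between candidates. The second set separates the candidates clearly.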
