
VALIDITY AND RELIABILITY OF
RESEARCH INSTRUMENT

JEFFERSON S. VALDEZ
MAEd Educ. Mgt.

Measuring instruments are used for gathering or collecting data and are important devices because the success or failure of a study rests on the data gathered.

The significance of any research paper can be put to waste if the instrumentation is questionable.
As a researcher, you are therefore cautioned to exercise extra care in designing the data collection procedures you will employ in your research, especially in choosing or constructing research instruments.

CRITERIA OF A GOOD INSTRUMENT

VALIDITY (truthfulness)
RELIABILITY (consistency and accuracy)

VALIDITY
The degree to which a test or measuring instrument measures what it intends, or purports, to measure.
It has to do with soundness: how effectively the test or questionnaire measures what it is designed to measure.
It deals with the relationship of the data obtained to the nature of the variable being studied.
The degree of validity is determined through indirect measures.

TYPES OF VALIDITY

1. Content validity
2. Concurrent validity
3. Criterion-related validity
4. Predictive validity
5. Construct validity

1. CONTENT VALIDITY
The degree to which an instrument measures an intended content area.
The degree to which the test represents the essence and topics it is designed to measure.
Unlike the other types of validity, it is reported in terms of non-numerical data.

Ways to achieve a high degree of content validity:

Documentary analysis or pre-survey
Development of a Table of Specifications (TOS)
Consultation with experts
Item writing

Documentary analysis or pre-survey
At this stage, the researcher familiarizes herself with the theoretical constructs directly related to the test. The review of related literature and studies provides comprehensive knowledge of the nature of the test criterion.
Focus on the tests used, their purposes, the areas covered, format, scaling techniques, etc.

A pre-survey may therefore start the development phase of the instrument that you are constructing.

Developing an Evaluation Scale for a School System
Administering a two-item questionnaire to a small group of respondents can generate areas and items for the test you will construct. In developing an evaluation scale for a school system, for instance, the researcher can ask a group of teachers and/or school heads these two questions:

1. What conditions in your school enhance children's learning?
2. What conditions in your school reduce pupils' learning?

The lists of conditions generated can become a rich source for constructing the test items.

Development of a Table of Specifications (TOS)
A detailed TOS includes areas or concepts, objectives, the number of items, and the percentage or proportion of items in each area.
It is advisable to make a 50 to 100% allowance in the construction of items.

Consultation with experts
The researcher's competence in judging the instrument is limited. At this point, it is advisable to consult the thesis adviser or authorities who have the expertise to judge the representativeness or relevance of the entries in the TOS.

Item writing
At this stage, the writer decides what type of items to construct: the type of instrument, scoring techniques, etc.
The quality of the test items therefore depends to a considerable extent on the researcher's ability to produce ideas, translate them into items, and satisfy the TOS.

2. CONCURRENT VALIDITY
The degree to which the test agrees or correlates with a criterion set up as an acceptable measure. The criterion is always available at the time of testing.
Correspondence of one measure of a phenomenon with another measure of the same construct, administered at the same time.
Two tools are used to measure the same concept, and a correlation analysis is performed. The tool already demonstrated to be valid is the gold standard with which the other measure is correlated.

Example:
A researcher wishes to validate a Biology achievement test he has constructed. He administers the test to a group of Biology students. The results are correlated with those of an acceptable Biology test that has previously been proven valid. If the correlation is high, the Biology test he has constructed is valid.

3. CRITERION-RELATED VALIDITY
It is characterized by prediction in relation to an outside criterion: the measuring instrument is checked, either now or in the future, against some outcome or measure.
The difficulty usually met in this type of validity is in selecting or judging which criterion should be used to validate the measure at hand.

Example:
If the criterion set for professionalism in nursing is belonging to nursing organizations and reading nursing journals, then couldn't we count memberships and subscriptions to come up with a professionalism score?

4. PREDICTIVE VALIDITY
It is determined by showing how well predictions made from the test are confirmed by evidence gathered at some subsequent time.
The criterion measure in this type of validity is important because the subjects' outcome is predicted.
It is the ability of one measure to predict another, future measure of the same concept.

Example:
The researcher wants to estimate how well a student may be able to do in graduate school courses on the basis of how well he did on tests taken in undergraduate courses.
The criterion measure against which the test scores are validated becomes available only after a long interval.

Example:
If IQ predicts SAT scores and SAT scores predict QPA, then shouldn't IQ predict QPA (we could skip the SAT for admission decisions)?
If scores on a parenthood-readiness scale indicate levels of integrity, trust, intimacy, and identity, couldn't this test be used to predict successful achievement of the developmental tasks of adulthood?

5. CONSTRUCT VALIDITY
Sometimes called concept validity.
It is the extent to which a test measures a theoretical construct or trait, involving constructs such as understanding, appreciation, and interpretation of data.
The main concern lies in the property being measured, or the meaning of the test, rather than the test itself.
Examples are intelligence and mechanical aptitude tests.

Example:
A researcher wishes to establish the validity of an IQ (Intelligence Quotient) test using the Wechsler Adult Intelligence Scale (WAIS). He hypothesizes that students with high IQ also have high achievement, and those with low IQ, low achievement. He therefore administers both the WAIS and an achievement test to two groups of students, with high and low IQ respectively. If the results agree with the hypothesis, the test is valid.

RELIABILITY
A reliable test is dependable, self-consistent, and stable.
It is concerned with the consistency of responses from moment to moment.
The instrument yields the same results over repeated measures and subjects.

Four methods of estimating the reliability of a research instrument:

Test-retest method
Parallel-forms method
Split-half method
Internal consistency method

1. TEST-RETEST METHOD
The research instrument is administered twice to the same group of subjects, and the correlation coefficient between the two sets of scores is determined.
The limitations of this method are:
1. When the time interval is short, the subjects may recall their previous responses, and this tends to make the correlation high.
2. When the time interval is long, such factors as unlearning and forgetting, among others, may occur and may result in a low correlation.

3. Regardless of the time interval separating the two administrations, varying conditions such as noise, temperature, lighting, and other factors may affect the correlation coefficient of the research instrument.
The Spearman rank correlation coefficient, or Spearman rho, is a statistic used to measure the relationship between paired ranks assigned to individual scores on two variables.

Formula:

rs = 1 - (6ΣD²) / (N³ - N)

Where:
rs = Spearman rho
ΣD² = sum of the squared differences between paired ranks
N = total number of cases
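The formula can be applied directly in code. A minimal Python sketch (the test-retest scores are hypothetical, and ties are handled with the usual average-rank convention):

```python
def ranks(values):
    """Assign ranks (1 = largest value); ties share the average rank."""
    ordered = sorted(values, reverse=True)
    return [
        sum(i + 1 for i, w in enumerate(ordered) if w == v) / ordered.count(v)
        for v in values
    ]

def spearman_rho(x, y):
    """rs = 1 - 6*sum(D^2) / (N^3 - N), D = difference of paired ranks."""
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    n = len(x)
    return 1 - 6 * d2 / (n ** 3 - n)

# Hypothetical test and retest scores of five subjects.
first  = [10, 20, 30, 40, 50]
second = [12, 25, 33, 44, 48]
print(spearman_rho(first, second))  # → 1.0 (the ranks agree perfectly)
```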

Interpretation of Correlation Coefficient Value

Correlation value (r)   Interpretation
0.00 - 0.20             Negligible
0.21 - 0.40             Low / slight
0.41 - 0.70             Marked / moderate
0.71 - 0.90             High
0.91 - 0.99             Very high
1.00                    Perfect

2. PARALLEL-FORMS METHOD
Two forms of the test are administered to the same group of subjects, and the paired observations are correlated.
The two forms must be constructed so that the content, type of item, difficulty, instructions for administration, and so on are similar but not identical.
The correlation between the scores obtained on the paired observations of the two forms represents the reliability coefficient of the test.
If the correlation coefficient (r) obtained is high, the research instrument is reliable.
The higher the reliability coefficient, the lower the error variance (0.70 or higher is considered acceptable).
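The parallel-forms check itself is a single correlation of the paired scores. A minimal Python sketch (the scores of the group on Form A and Form B are hypothetical):

```python
import math

def pearson_r(x, y):
    """Pearson correlation between paired observations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(
        sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
    )

# Hypothetical scores of the same subjects on the two parallel forms.
form_a = [14, 18, 11, 20, 16, 13]
form_b = [15, 17, 12, 19, 17, 12]

r = pearson_r(form_a, form_b)
print(r >= 0.70)  # is the reliability coefficient acceptable?
```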

Example:
The item "Convert 7,000 grams to kilograms" in Form A is parallel to "Convert 7 kilograms to grams" in Form B. The two forms should also have approximately the same average and variability of scores.
Form A: I am able to tell my partner how I feel.
Form B: My partner tries to understand my feelings.

Assessment of Depression
Version A: During the past 4 weeks, I have felt downhearted:
1 - every day, 2 - some days, 3 - never
Version B: During the past 4 weeks, I have felt downhearted:
1 - never, 2 - some days, 3 - every day

Assessment of Loneliness
Version A: How often in the past month have you felt alone in the world?
1 - every day
2 - some days
3 - occasionally
4 - never
Version B: During the past 4 weeks, how often have you felt a sense of loneliness?
1 - all of the time
2 - sometimes
3 - from time to time
4 - never

Equivalent or Non-equivalent Rewording?
Version 1: When your boss blames you for something you did not do, how often do you stick up for yourself?
1 - always
2 - sometimes
3 - never
Version 2: When presented with difficult professional situations in which a superior censures you for an act for which you are not responsible, how frequently do you respond in an assertive way?
1 - always
2 - sometimes
3 - never

3. SPLIT-HALF METHOD
The test is administered only once, but the test items are divided into two halves. The common procedure is to divide the test into odd and even items.
The two halves of the test must be similar but not identical in content, number of items, difficulty, means, and standard deviations.
Each student obtains two scores: one on the odd items and the other on the even items.
The result is a reliability coefficient for a half test; the Spearman-Brown formula then converts it to the reliability of the whole test.

Formula:

rwt = 2(rht) / (1 + rht)

Where:
rwt = reliability of the whole test
rht = reliability of the half test
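The whole procedure can be sketched in a few lines of Python (the item-score matrix is hypothetical): each examinee's items are split into odd- and even-numbered halves, the half scores are correlated, and the Spearman-Brown formula corrects the result to whole-test length.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between paired observations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(
        sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
    )

def spearman_brown(r_ht):
    """rwt = 2*rht / (1 + rht): whole-test reliability from a half test."""
    return 2 * r_ht / (1 + r_ht)

def split_half_reliability(item_scores):
    """item_scores: one row of item scores per examinee."""
    odd  = [sum(row[0::2]) for row in item_scores]  # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))

print(spearman_brown(0.6))  # ≈ 0.75: a half-test r of .60 implies .75
```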


4. INTERNAL CONSISTENCY METHOD
This method is used with psychological tests that consist of dichotomously scored items: the examinee either passes or fails each item.
A rating of 1 (one) is assigned for a pass and 0 (zero) for a failure.
Reliability is estimated with the Kuder-Richardson Formula 20 (KR-20), which measures the internal consistency or homogeneity of the measuring instrument.

Formula:

rxx = [N / (N - 1)] x [(SD² - Σpiqi) / SD²]

Where:
N = number of items
SD² = variance of the scores on the test
piqi = product of the proportions passing and failing item i
pi = proportion of individuals passing item i
qi = proportion of individuals failing item i (qi = 1 - pi)
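A minimal Python sketch of the computation (the 0/1 response matrix is hypothetical; as a convention, the population variance of the total scores is used here, which keeps it consistent with the p·q item variances):

```python
def kr20(responses):
    """KR-20: rxx = [N/(N-1)] * (SD^2 - sum(pi*qi)) / SD^2.

    responses: one row per examinee, each entry 1 (pass) or 0 (fail).
    """
    n_items = len(responses[0])
    n_cases = len(responses)
    totals = [sum(row) for row in responses]
    mean = sum(totals) / n_cases
    # Population variance of the examinees' total scores.
    sd2 = sum((t - mean) ** 2 for t in totals) / n_cases
    pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_cases  # proportion passing item i
        pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (sd2 - pq) / sd2

# Hypothetical responses: two examinees pass both items, two fail both,
# so the items behave perfectly consistently.
perfectly_consistent = [[1, 1], [1, 1], [0, 0], [0, 0]]
print(kr20(perfectly_consistent))  # → 1.0
```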

