/ (540) 933-6210 / FAX 933-6523 / 11-22-2004 / 10:40

A Five-Dimensional Framework for Authentic Assessment

Judith T. M. Gulikers
Theo J. Bastiaens
Paul A. Kirschner

Authenticity is an important element of new modes of assessment. The problem is that what authentic assessment really is, is unspecified. In this article, we first review the literature on authenticity of assessments and present a five-dimensional framework for designing authentic assessments with professional practice as the starting point. Then, we present the results of a qualitative study to determine if the framework is complete, and what the relative importance of the five dimensions is in the perceptions of students and teachers of a vocational college for nursing. We discuss implications for the framework, along with important issues that need to be considered when designing authentic assessments.

ETR&D, Vol. 52, No. 3, 2004, pp. 67–86 ISSN 1042–1629

It is widely acknowledged that in order to meet the goals of education, a constructive alignment between instruction, learning, and assessment (ILA) is necessary (Biggs, 1996). Traditional frontal classroom instruction for learning facts, assessed through short-answer or multiple-choice tests, is an example of such an alignment. The ILA practices in this kind of education can be characterized as instructional approach: knowledge transmission; learning approach: rote memorization; and assessment procedure: standardized testing (Birenbaum, 2003). This approach to assessment is also known as the testing culture (Birenbaum & Dochy, 1996) and consists primarily of decontextualized, psychometrically designed items in a choice-response format to test for knowledge and low-level cognitive skill acquisition. The tests are primarily used in a summative way to differentiate between students and rank them according to their achievement.

However, the alignment compatible with present-day educational goals has changed over the years. Current educational goals focus more on the development of competent students and future employees than on simple knowledge acquisition. The ILA practices that characterize these goals are instructional approach: focused on learning and competence development; learning approach: reflective-active knowledge construction; and assessment procedure: contextualized, interpretative, and performance assessment (Birenbaum, 2003). Here, the goal of assessment is the acquisition of higher-order thinking processes and competencies instead of factual knowledge and basic skills. The function of the assessment changes from being summative to also serving a formative goal of promoting and enhancing student learning. This view requires alternative assessments because standardized, multiple-choice tests are not suitable for this (Birenbaum & Dochy, 1996; Segers, Dochy, & Cascallar, 2003). Birenbaum and Dochy (1996) characterized alternative assessments as follows: Students have a responsibility for their own learning; they reflect, collaborate, and conduct a continuous dialogue with the teacher. Assessment involves interesting real-life or authentic tasks and contexts as well as multiple assessment moments and methods to reach a profile score for determining student learning or development. Increasing the authenticity of an assessment is expected to have a positive influence on student learning and motivation (e.g., Herrington & Herrington, 1998). Authenticity, however, is only a vaguely described dimension of assessment, because it is thought to be a familiar and generally known concept that needs no explicit defining (Petraglia, 1998). This article focuses on defining authenticity in competency-based assessment, without ignoring the importance of other characteristics of alternative assessment.

Based on an extensive literature study, a theoretical framework consisting of five dimensions of assessment that can vary in their degree of authenticity is presented. After the description of this framework, the results of a qualitative study are discussed. This study explored whether the framework is a complete description of authenticity or is missing important elements, and what the relative importance of the dimensions is in the perceptions of students and teachers at a nursing college.

The Importance of Authentic Competency-Based Assessment

The two most important reasons for using authentic competency-based assessments are (a) their construct validity and (b) their impact on student learning, also called consequential validity (Gielen, Dochy, & Dierick, 2003). Construct validity of an assessment is related to whether an assessment measures what it is supposed to measure. With respect to competency assessment this means that (a) tasks must appropriately reflect the competency that needs to be assessed, (b) the content of an assessment involves authentic tasks that represent real-life problems of the knowledge domain assessed, and (c) the thinking processes that experts use to solve the problem in real life are also required by the assessment task (Gielen et al., 2003). Based on these criteria, authentic competency-based assessments have a higher construct validity for measuring competencies than so-called objective or traditional tests have.

Consequential validity describes the intended and unintended effects of assessment on instruction or teaching (Biggs, 1996) and student learning (Dochy & McDowell, 1998). As stated, Biggs's (1996) theory of constructive alignment stresses that effective education requires instruction, learning, and assessment to be compatible. If students perceive a mismatch between the messages of the instruction and the assessment, a positive impact on student learning is unlikely (Segers, Dierick, & Dochy, 2001). This impact of assessment on instruction and on student learning is corroborated by researchers such as Frederiksen (1984, "The Real Test Bias"), Prodromou (1995, "Backwash Effect"), Gibbs (1992, "Tail Wags the Dog"), and Sambell and McDowell (1998, "Hidden Curriculum"). Frederiksen and Prodromou implied that tests have a strong influence on what is taught, because teachers teach to the test, even though the test might focus on things the teacher does not find most important. Gibbs emphasized that student learning is largely dependent on the assessment and on student perceptions of the assessment requirements. Sambell and McDowell held that the effects of instruction and assessment on learning are largely based on teacher and student perceptions of the curriculum, which can deviate from the actual intentions of the curriculum. All four ideas support the proposition that learning and assessment are two sides of the same coin, and that they strongly influence each other. To change student learning in the direction of competency development, authentic competency-based instruction aligned to authentic competency-based assessment is needed.
Defining Authentic Assessment

The question is thus, What is authenticity? Different researchers have different opinions about authenticity. Some see authentic assessment as a synonym for performance assessment (Hart, 1994; Torrance, 1995), while others argue that authentic assessment puts a special emphasis on the realistic value of the task and the context (Herrington & Herrington, 1998). Reeves and Okey (1996) pointed out that the crucial difference between performance assessment and authentic assessment is the degree of fidelity of the task and the conditions under which the performance would normally occur. Authentic assessment focuses on high fidelity, whereas this is not as important an issue in performance assessment. These distinctions between performance and authentic assessment indicate that every authentic assessment is performance assessment, but not vice versa (Meyer, 1992).

Savery and Duffy (1995) defined authenticity of an assessment as the similarity between the cognitive demands (the thinking required) of the assessment and the cognitive demands in the criterion situation on which the assessment is based. A criterion situation reflects or simulates a real-life situation that could confront students in their internship or future professional life. Darling-Hammond and Snyder (2000) argued that dealing only with the thinking required is too narrow. In their view, students need to develop competencies because real life demands the ability to integrate and coordinate knowledge, skills, and attitudes, and the capacity to apply them in new situations (Van Merriënboer, 1997). Birenbaum (1996) further specified the competency concept by emphasizing that students need to develop not only cognitive competencies such as problem solving and critical thinking, but also meta-cognitive competencies such as reflection, and social competencies such as communication and collaboration.

The definition of authentic assessment used in this study is: an assessment requiring students to use the same competencies, or combinations of knowledge, skills, and attitudes, that they need to apply in the criterion situation in professional life. The level of authenticity of an assessment is thus defined by its degree of resemblance to the criterion situation. This idea is extended and specified by the theoretical framework, which describes how an assessment can resemble a criterion situation along a number of dimensions.

Complicating matters is the fact that authenticity is subjective (Honebein, Duffy, & Fishman, 1993; Huang, 2002; Petraglia, 1998) and is dependent on perceptions. This implies that what students perceive as authentic is not necessarily the same as what teachers and assessment developers see as authentic. If these perceptions do indeed differ, then the fact that teachers usually develop authentic assessments according to their own view causes a problem: Although we may do our best to develop authentic assessments, this may all be for nothing if the learner does not perceive them as such. This process, known as preauthentication (Huang, 2002; Petraglia, 1998), can be interpreted either as meaning that it is impossible to design an authentic assessment, or as meaning that it is very important to carefully examine the experiences of the users of the authentic assessments before designing authentic assessments (Nicaise, Gibney, & Crane, 2000). We chose the latter interpretation.

This discussion about authentic assessment and validity shows that:

1. In light of the constructive alignment theory (Biggs, 1996), authentic assessment should be aligned to authentic instruction in order to positively influence student learning.
2. Authentic assessment requires students to demonstrate relevant competencies through a significant, meaningful, and worthwhile accomplishment (Resnick, 1987; Wiggins, 1993).
3. Authenticity is subjective, which makes student perceptions important for authentic assessment to influence learning.

These three elements led to the following general framework (Figure 1) for the place of authentic assessment in educational practices.

The concept of authentic achievement, as we use it here, requires a note of explanation. This article deals with authentic assessment in general, regardless of the level or field of endeavor. This does not mean that we dismiss the concept of authentic academic achievement (Newmann,
1997), but rather that we see it as a specific subset within a specific field of endeavor, namely becoming an academic. In this we concur with Brown, Collins, and Duguid (1989), who also saw authentic achievement as more than authentic academic achievement.

Figure 1. General framework.

The following section discusses five dimensions (a theoretical framework) that can vary in their degree of authenticity in determining the authenticity of an assessment. The purpose of this framework is to shed light on the concept of assessment authenticity and to provide guidelines for implementing authenticity elements into competency-based assessment.

FRAMEWORK FOR AUTHENTIC ASSESSMENT

To define authentic assessment, we carried out a review of literature on authentic assessment, on authenticity and assessment in general, and on student perceptions of (authentic) assessment elements. Five dimensions of authentic assessment were distinguished: (a) the assessment task, (b) the physical context, (c) the social context, (d) the assessment result or form, and (e) the assessment criteria. These dimensions can vary in their level of authenticity (i.e., they are continuums). It is a misconception to think that something is either authentic or not authentic (Cronin, 1993; Newmann & Wehlage, 1993), because the degree of authenticity is not solely a characteristic of the assessment chosen; it needs to be defined in relation to the criterion situation derived from professional practice. For example, carrying out an assessment in a team is authentic only if the chosen assessment task is also carried out in a team in real life. The main point of the framework is that each of the five dimensions can resemble the criterion situation to a varying degree, thereby increasing or decreasing the authenticity of the assessment.

Because authentic assessment should be aligned to authentic instruction (Biggs, 1996; Van Merriënboer, 1997), the five dimensions of a framework for authentic assessment are also applicable to authentic instruction. Even though the focus of this article is on authentic assessment, an interpretation of the five dimensions for authentic instruction is included in this article to show how the same dimensions can be used to create an alignment between authentic instruction and authentic assessment. The dimensions and the underlying elements of authentic instruction are presented in Figure 2; Figure 3 does the same for authentic assessment.

Figure 2. Five-dimensional model for authentic instruction.

Figure 3. Five-dimensional model for authentic assessment.

As the figures show, learning and assessment tasks are a lot alike. This is logical, because the learning task stimulates students to develop the competencies that professionals have and the assessment task asks students to demonstrate these same competencies without additional support (Van Merriënboer, 1997). Schnitzer (1993) stressed that for authentic assessment to be effective, students need the opportunity to practice with the form of assessment before it is used as an assessment. This implies that the learning task must resemble the assessment task, only with different underlying goals. Learning tasks are for learning, and assessment tasks are for evaluating student levels of learning in order to improve (formative), or in order to make decisions (summative). These models show how a five-dimensional framework can deal with a (conceptual) alignment between authentic instruction and assessment. The interpretation and validation of the five dimensions for authentic assessment will be further explained and examined in the rest of this article.

An Argumentation for the Five Dimensions of Authentic Assessment

As stated, there is confusion and there exist many differences of opinion about what authenticity of assessment really is, and which assessment elements are important for authenticity. To try to bring some clarity to this situation, the literature was reviewed to explicate the different ideas about authenticity. Many subconcepts and synonyms came to light, which were conceptually analyzed and divided into categories, resulting in five main aspects of authenticity. The notion of authenticity as a continuum (Newmann & Wehlage, 1993) resulted in a conceptualization of these five aspects as dimensions that can vary in their degree of authenticity.

Task. An authentic task is a problem task that confronts students with activities that are also carried out in professional practice. The fact that an authentic task is crucial for an authentic assessment is undisputed (Herrington & Herrington, 1998; Newmann, 1997; Wiggins, 1993), but different researchers stress different elements of an authentic task. Our framework defines an authentic task as a task that resembles the criterion task with respect to the integration of knowledge, skills, and attitudes, its complexity, and its ownership (see Kirschner, Martens, & Strijbos, 2004). Furthermore, the users of the assessment task should perceive the task, including the above elements, as representative, relevant, and meaningful.

An authentic assessment requires students to integrate knowledge, skills, and attitudes as professionals do (Van Merriënboer, 1997). Furthermore, the assessment task should resemble the complexity of the criterion task (Petraglia, 1998; Uhlenbeck, 2002). This does not mean that every assessment task should be very complex. Even though most authentic problems are complex, involving multidisciplinarity, ill-structuredness, and having multiple possible solutions (Herrington & Herrington, 1998; Kirschner, 2002; Wiggins, 1993), real-life problems can also be simple, well structured with one correct answer, and requiring only one discipline (Cronin, 1993). The same need for resemblance holds for ownership of the task and of the process of developing a solution. Ownership for students in the assessment task should resemble the ownership for professionals in the criterion task. Savery and Duffy (1995) argued that giving students ownership of the task and the process to develop a solution is crucial for engaging students in authentic learning and problem solving. On the other hand, in real life, assignments are often imposed by employers, and professionals often use standard tools and procedures to solve a problem, both decreasing the amount of ownership for the employee. Therefore, the theoretical framework argues that in order to make students competent in dealing with professional
problems, the assessment task should resemble the complexity and ownership levels of the real-life criterion situation.

Up to this point, task authenticity appears to be a fairly objective dimension. This objectivity is confounded by Sambell, McDowell, and Brown (1997), who showed that it is crucial that students perceive a task as relevant, that (a) they see the link to a situation in the real world or working situation; or (b) they regard it as a valuable transferable skill. McDowell (1995) also stressed that students should see a link between the assessment task and their personal interests before they perceive the task as meaningful. Clearly, perceived relevance or meaningfulness will differ from student to student and will possibly even change as students become more experienced.

Physical context. Where we are, often if not always, determines how we do something, and often the real place is dirtier (literally and figuratively) than safe learning environments. Think, for example, of an assessment for auto mechanics for the military. The capacity of a soldier to find the problem in a nonfunctioning jeep can be assessed in a clean garage, with all the conceivably needed equipment available, but a future physical environment may possibly involve a war zone, inclement weather conditions, less space, and less equipment. Even though the task itself is authentic, it can be questioned whether assessing students in a clean and safe environment really assesses their ability to wisely use their competencies in real-life situations.

The physical context of an authentic assessment should reflect the way knowledge, skills, and attitudes will be used in professional practice (Brown et al., 1989; Herrington & Oliver, 2000). Fidelity is often used in the context of computer simulations, where it describes how closely a simulation imitates reality (Alessi, 1988). Authentic assessment often deals with high-fidelity contexts. The presentation of material and the amount of detail presented in the context are important aspects of the degree of fidelity. Likewise, an important element of the authenticity of the physical context is that the number and kinds of resources available (Segers, Dochy, & De Corte, 1999), which mostly contain relevant as well as irrelevant information (Herrington & Oliver), should resemble the resources available in the criterion situation. For example, Resnick (1987) argued that most school tests involve memory work, while out-of-school activities are often intimately engaged with tools and resources (calculators, tables, standards), making such school tests less authentic. Segers et al. (1999) argued that it would be inauthentic to deprive students of resources, because professionals do rely on resources. Another important characteristic crucial for providing an authentic physical context is the time students are given to perform the assessment task (Wiggins, 1989). Tests are normally administered in a restricted period of time, for example two hours, completely devoted to the test. In real life, professional activities often involve more time scattered over days or, on the contrary, require fast and immediate reaction in a split second. Wiggins (1989) said that an authentic assessment should not rely on unrealistic and arbitrary time constraints. In sum, the level of authenticity of the physical context is defined by the resemblance of these elements to the criterion situation.

Social context. Not only the physical context, but also the social context, influences the authenticity of the assessment. In real life, working together is often the rule rather than the exception, and Resnick (1987) emphasized that learning and performing out of school mostly takes place in a social system. Therefore, a model for authentic assessment should consider social processes that are present in real-life contexts. What is really important in an authentic assessment is that the social processes of the assessment resemble the social processes in an equivalent situation in reality. At this point, this framework disagrees with literature on authentic assessment that defines collaboration as a characteristic of authenticity (e.g., Herrington & Herrington, 1998). Our framework argues that if the real situation demands collaboration, the assessment should also involve collaboration, but if the situation is normally handled individually, the assessment should be individual. When the assessment requires collaboration, processes such as social interaction, positive
interdependency and individual accountability need to be taken into account (Slavin, 1989). When, however, the assessment is individual, the social context should stimulate some kind of competition between learners.

Assessment result or form. An assessment involves an assessment assignment (in a certain physical and social context) that leads to an assessment result, which is then evaluated against certain assessment criteria (Moerkerke, Doorten, & de Roode, 1999). The assessment result is related to the kind and amount of output of the assessment task, independent of the content of the assessment. In the framework, an authentic result or form is characterized by four elements. It should be (a) a quality product or performance that students can be asked to produce in real life (Wiggins, 1989). This product or performance should be (b) a demonstration that permits making valid inferences about the underlying competencies (Darling-Hammond & Snyder, 2000). Since the demonstration of relevant competencies is often not possible in one single test, an authentic assessment should involve (c) a full array of tasks and multiple indicators of learning in order to come to fair conclusions (Darling-Hammond & Snyder, 2000). Uhlenbeck (2002) showed that a combination of different assessment methods adequately covered the whole range of professional teaching behavior. Finally, students should (d) present their work to other people, either orally or in written form, because it is important that they defend their work to ensure that their apparent mastery is genuine (Wiggins, 1989).

Criteria and standards. Criteria are those characteristics of the assessment result that are valued; standards are the level of performance expected from various grades and ages of students (Arter & Spandel, 1992). Setting criteria and making them explicit and transparent to learners beforehand is important in authentic assessment, because this guides learning (Sluijsmans, 2002) and, after all, in real life, employees usually know on what criteria their performances will be judged. This implies that authentic assessment requires criterion-referenced judgment. Moreover, some criteria should be related to a realistic outcome, explicating characteristics or requirements of the product, performance, or solutions that students need to create. Furthermore, criteria and standards should concern the development of relevant professional competencies and should be based on criteria used in the real-life situation (Darling-Hammond & Snyder, 2000).

Besides basing the criteria on the criterion situation in real life, criteria of an authentic assessment can also be based on the interpretation of the other four dimensions of the framework. For example, if the physical context determines that an authentic assessment of a competency requires five hours, a criterion should be that students need to produce the assessment result within five hours. On the other hand, criteria based on professional practice can also guide the interpretation of the other four dimensions of authentic assessment. In other words, the framework argues for a reciprocal relationship between the criterion dimension and the other four dimensions.

Some Considerations

What does all of this mean when teachers or instructional designers try to develop authentic assessments? What do they need to consider? The first consideration deals with predictive validity. If the educational goal of developing competent employees is pursued, then increasing the authenticity of an assessment will be valuable. More authenticity is likely to increase the predictive validity of the assessment because of the resemblance between the assessment and real professional practice. However, one should not throw the baby out with the bath water. Objective tests are still very useful for certain purposes, such as high-stakes summative assessments of individual achievement, where predicting student ability to function competently in future professional practice is not the purpose.

Another consideration in designing authentic assessment is that we should not lose sight of the educational level of the learners. Lower-level learners may not be able to deal with the authenticity of a real, complex, professional situation. If they are forced to do this, it may result in cognitive overload and, in turn, have a negative impact on learning (Sweller, Van Merriënboer, & Paas, 1998). As a result, a criterion situation will often need to be an abstraction of real professional practice in order to be attainable for students at a certain educational level. The question that immediately comes to mind in this context is: How do you create an authentic assessment for students who are not prepared to function as beginning professionals? The answer is that the authenticity of an assessment should be defined by its degree of resemblance to the criterion situation (i.e., an abstraction from professional practice) and not necessarily to real professional practice. Van Merriënboer (1997) argued that an abstraction of real professional practice (i.e., the criterion situation) can still be authentic as long as the abstracted situation requires students to perform the whole competency as an integrated whole of constituent competencies. The abstraction results from simplifying contextual factors that complicate the performance of the whole competency.

A third consideration also sheds light on the question stated in the previous sections, namely the subjectivity of authenticity. The perception of what authenticity is may change as a result of educational level, personal interest, age, or amount of practical experience with professional practice (Honebein et al., 1993). This implies that the five dimensions that are argued in the framework for authentic assessment are not absolute but, rather, variable. It is possible that assessing professional competence of students in their final year of study, when they have often served internships and have a better idea of professional practice, requires more authenticity of the physical context than when assessing first-year students, who usually have little practical experience. Designers must take changing student perspectives into account when designing authentic assessment.

The qualitative study described in the rest of this article has two main goals. First, it explores whether our five-dimensional framework completely describes authenticity or whether important elements may be missing. Second, it explores the relative importance of the five dimensions. A subgoal of this study was to explore if the perception of (the importance of) the authenticity dimensions differed between students and teachers and between students with different amounts of practical and educational experience. The differences and similarities along a limited number of dimensions can give insight into what is crucial for defining and designing authentic assessments.

METHOD

Students and teachers from a nursing college took part in this study. One session of the study involved only teachers, one session involved sophomore students (second year), and one session involved senior students (fourth year). The student groups could be further divided into a group of students studying nursing in a vocational training program (VTP), where they are primarily in school and make use of short internships, and a group that studied nursing in a block release program (BRP), where learning and working are integrated on an almost daily basis. This resulted in five groups of participants: (a) 8 sophomore VTP students (M age = 18.5 years), (b) 8 sophomore BRP students (M age = 20.9 years), (c) 8 senior VTP students (M age = 19.7 years), (d) 4 senior BRP students (M age = 31.4 years), and (e) 11 teachers (M age = 42.8 years). The number of participants per session was limited because of the practical possibilities of the group support system used in this study.

Materials

An electronic group support system (GSS) at the Open University of the Netherlands was used as a research tool. A GSS is a computer-based information processing system designed to facilitate group decision making. It is centered on group productivity through idea generation, preference, and opinion exchange of people involved in a common task in a shared environment. The GSS allows collaborative and individual activities such as brainstorming, idea generation, sorting, rating, and clustering via computer
communication. To prevent participants (especially students) from feeling inhibited in expressing their ideas and opinions, the GSS was a good option because it is completely anonymous. Furthermore, it was a practical and valuable method because it made it possible to collect a lot of information in a structured way in a short period of time.

To examine the relative importance of the five dimensions, four case descriptions of assessments that varied in their amount of authenticity based on the five dimensions of the model were designed. They described competencies from the nursing competency profile, which were validated by two employees of the nursing college. To check the influence of the GSS session itself on the perceptions of the authenticity of the cases, the descriptions were used in a pre- and a posttest. To do this, a second set of different but comparable case descriptions was designed, which resulted in two sets of four cases. Cases A and E were completely authentic except for the task; Cases B and F were completely authentic except for the physical context; Cases C and G were completely authentic except for the result or form; and Cases D and H were completely authentic (see Appendix for a full description of a completely authentic case description).

All participants had access to a GSS computer. During a two-hour session, participants carried out both individual and collaborative activities.

At the beginning and end of the GSS session, participants were presented with four case descriptions (ABCD or EFGH). In six paired comparisons (4 × 3/2), they chose the case that they considered to be a more authentic assessment. This activity was meant to determine the relative importance of the different dimensions of authentic assessment in the eyes of the different groups of participants. A second underlying purpose of this activity was to bring participants into a specific frame of reference for the rest of the session, and to focus their thinking toward authenticity of assessment instead of assessment in general.

A distinction was made between VTP students and BRP students; it was possible that because of the differences in their studies, they would have different perceptions of what determines authenticity. VTP students, BRP students, and teachers were randomly divided into two halves, one that received Cases ABCD in the pretest and EFGH in the posttest, and one that received the cases in the reverse order.

After the initial rating of the case descriptions, the participants were apprised of the purpose of the study. In order to create a specific frame of mind, a very general description was given of the term authenticity (i.e., true to life). The GSS part of the study consisted of four activities. The first activity required the participants to enter into the system their own statements that described authenticity of an assessment. This was a free brainstorm, and participants were encouraged to generate as many statements as possible. Statements were anonymously entered into the GSS, where it was also possible to respond to statements made by others. After this electronic brainstorm, the contributions were discussed in order to clarify them. This was recorded for later use and analysis.

The second activity required respondents to specify (voting is a feature of a GSS) the 10 most important statements for designing authentic assessments that were generated during the brainstorm. The purpose of this activity was to determine which elements the participants perceived as being especially important for authentic assessment. After completing these two activities, a prototype five-dimensional framework for authentic assessment was presented as a framework for assessing professional behavior. The five dimensions were explained to the participants in an attempt to create mutual understanding about the meaning of the dimensions. The five dimensions were characterized as

1. Task: What do you have to do?
2. Physical context: Where do you have to do it?
3. Social context: With whom do you have to do it?
4. Result or form: What has to come out of it? What is the result of your efforts?
5. Criteria: How does what you have done have to be evaluated or judged?

The third and fourth activities consisted of

78 ETR&D, Vol. 52, No. 3

paired comparisons to determine the relative importance of the dimensions. Activity three consisted of 10 paired comparisons of the five dimensions (5 × 4/2). In each pair, participants had to choose the dimension of the framework that they perceived to be more important for authentic assessment. The fourth activity was the same as the activity at the beginning of the experiment: The participants were again required to carry out paired comparisons of case descriptions that varied in their amount of authenticity according to the five-dimensional framework. Each group received the counterbalanced set of case descriptions to those compared at the beginning of the session.

A characteristic of the GSS is that the answers, statements, choices, and so forth, of each individual participant are anonymous. This means that scores per participant were not available, which precluded the possibility of carrying out statistical tests. On the other hand, the anonymity inhibited socially desirable answering behavior, and has been shown to stimulate response in idea generation and increase the reliability of answers. The data, thus, were qualitatively analyzed. The tapes of the discussions were transcribed. Both discussion statements and the statements keyed in during the brainstorms were analyzed to discern which of the five dimensions of the framework they fit. Statements that did not fit were classified as other.

The paired comparison data of the five dimensions, that is, the number of times that a dimension in the paired comparisons was rated as more important than another dimension, were tallied per participant group. The absolute scores were then translated into rankings. The paired comparisons of the case descriptions were analyzed in the same way.

RESULTS

In general, the task, the result or form, and the criteria were rated as most important for the authenticity of the assessment. The social context was clearly considered to be least important for authenticity, and the importance of the physical context was strongly discussed.

The Relative Importance of the Five Dimensions: Paired Comparisons

The paired comparisons of the dimensions and of the case descriptions gave insight into the relative importance of the five dimensions for designing authentic assessments. The comparisons of the dimensions resulted in five rankings (sophomore VTP students, sophomore BRP students, teachers, senior VTP students, and senior BRP students) from 1 to 5. The paired comparisons of the case descriptions were analyzed for the same groups, but were measured in pre- and posttests, which resulted in ten rankings from 1 to 4.
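
The tallying step described in the analysis (wins per dimension counted per group, then translated into rankings) can be illustrated with a minimal Python sketch. Several things here are assumptions and not taken from the article: the function names, the invented example choices, and in particular the tie rule. The sketch gives tied dimensions the mean of the positions they occupy, which is consistent with half-ranks such as 3.5 and 4.5 in the tables, but the article does not state how ties were handled.

```python
from itertools import combinations

DIMENSIONS = ["task", "physical context", "social context",
              "result or form", "criteria"]


def all_pairs(items):
    """Every unordered pair of items: n * (n - 1) / 2 comparisons."""
    return list(combinations(items, 2))


def tally_wins(choices, items):
    """Count how often each item was chosen as the more important one."""
    wins = {item: 0 for item in items}
    for chosen in choices:
        wins[chosen] += 1
    return wins


def to_rankings(wins):
    """Rank by descending win count; tied items share the mean position.

    The tie rule is an assumption made for this sketch.
    """
    ordered = sorted(wins, key=wins.get, reverse=True)
    ranks = {}
    start = 0
    while start < len(ordered):
        end = start
        while (end + 1 < len(ordered)
               and wins[ordered[end + 1]] == wins[ordered[start]]):
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of the 1-based positions
        for item in ordered[start:end + 1]:
            ranks[item] = shared
        start = end + 1
    return ranks


# Hypothetical pooled choices for one group (invented, not study data).
example_choices = (["result or form"] * 11 + ["task"] * 9 + ["criteria"] * 6
                   + ["physical context"] * 2 + ["social context"] * 2)
example_wins = tally_wins(example_choices, DIMENSIONS)
example_ranks = to_rankings(example_wins)

print(len(all_pairs(DIMENSIONS)))  # number of comparisons for 5 dimensions
print(example_ranks)
```

Because the GSS recorded choices anonymously, only group-level tallies of this kind are available, which is why the sketch works on pooled choices and not on per-participant scores.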

Table 1 Ranking of dimensions by group.

Group                     Task   Physical context   Social context   Result or form   Criteria
Sophomore VTP students    2.0    4.5                4.5              1.0              3.0
Sophomore BRP students    1.0    3.5                5.0              3.5              2.0
Teachers                  1.0    4.0                5.0              2.0              3.0
Senior VTP students       2.0    5.0                3.5              3.5              1.0
Senior BRP students       2.0    4.0                5.0              1.0              3.0
Total                     8.0    21.0               23.0             11.0             12.0

Note. 1 = most important, 5 = least important
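
As a quick consistency check on Table 1, the Total row can be recomputed as the column sums of the group rankings, and the overall ordering of the dimensions follows from sorting those totals. The sketch below is only an illustration; the variable names are hypothetical, and the values are copied from the table.

```python
DIMENSIONS = ["Task", "Physical context", "Social context",
              "Result or form", "Criteria"]

# Rankings per group, copied from Table 1 (1 = most important).
TABLE_1 = {
    "Sophomore VTP students": [2.0, 4.5, 4.5, 1.0, 3.0],
    "Sophomore BRP students": [1.0, 3.5, 5.0, 3.5, 2.0],
    "Teachers":               [1.0, 4.0, 5.0, 2.0, 3.0],
    "Senior VTP students":    [2.0, 5.0, 3.5, 3.5, 1.0],
    "Senior BRP students":    [2.0, 4.0, 5.0, 1.0, 3.0],
}

# Column sums reproduce the Total row.
totals = {
    dim: sum(row[i] for row in TABLE_1.values())
    for i, dim in enumerate(DIMENSIONS)
}

# A lower total means the dimension was ranked as more important overall.
overall_order = sorted(totals, key=totals.get)

# Each row ranks five items, so it should sum to 1 + 2 + 3 + 4 + 5 = 15.
row_sums = {group: sum(row) for group, row in TABLE_1.items()}

print(totals)
print(overall_order)
```

The resulting order (task, result or form, criteria, physical context, social context) matches the summary given in the text.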



Table 1 shows rankings per group of the five dimensions, based on their perceived importance in providing authenticity to an assessment (1 = most important, 5 = least important). It shows that all groups perceived the task as important (score 1 or 2), and that all groups except the senior VTP students (score 3.5) perceived the social context as the least important. Furthermore, the result or form and criterion dimensions received more than average importance, whereas all groups perceived the physical context as relatively unimportant (score about 4). In short, independent of the group (see totals in Table 1), the task was perceived as most important, followed by the result or form and criterion dimensions; the physical context and especially the social context lagged far behind.

The results of the paired comparisons of the case descriptions, in pre- and posttests, also gave insight into the relative importance of the dimensions. Table 2 shows rankings per group of the four case descriptions. A 1 meant that this case was perceived as the most authentic case description and a 4 referred to the least authentic case description. An important finding, for the framework, was that the case that described a completely authentic assessment based on the presence of all five dimensions was perceived as most authentic (score 1) by all, except the senior BRP students on the posttest (score 2.5). The other three kinds of cases showed an interesting pattern. The case that was authentic except for the task received mostly a score of 2, which meant that this case was perceived as relatively authentic, which in turn meant that the task (which was not authentic in this case) was not perceived as very important in designing an authentic assessment. This is contrary to the findings of the paired comparisons of the dimensions, in which the task was perceived as very important in providing authenticity to an assessment. Finally, the participant groups disagreed about the authenticity of the remaining two kinds of cases. All sophomore students ranked the case that was authentic except for the result as 4, meaning that they perceived this case to be the least authentic. In other words, they perceived the result or form dimension as most important for designing an authentic assessment. Teachers, on the other hand, ranked the case that was authentic except for physical context as the least authentic case (score 4), which meant that teachers perceived physical context to be most important in designing an authentic assessment. Senior students did not appear to differentiate, meaning

Table 2 Ranking of case descriptions by group.

                            All authentic   All authentic      All authentic
                            except for      except for the     except for the    All
Group and test              the task        physical context   result or form    authentic
Sophomore VTP, pretest      2.0             3.0                4.0               1.0
Sophomore BRP, pretest      2.0             3.0                4.0               1.0
Sophomore VTP, posttest     3.0             2.0                4.0               1.0
Sophomore BRP, posttest     2.0             3.0                4.0               1.0
Teachers, pretest           3.0             4.0                2.0               1.0
Teachers, posttest          2.0             4.0                3.0               1.0
Senior VTP, pretest         2.0             3.5                3.5               1.0
Senior BRP, pretest         2.0             3.5                3.5               1.0
Senior VTP, posttest        2.0             3.5                3.5               1.0
Senior BRP, posttest        1.0             4.0                2.5               2.5

Note. 1 = most authentic, 4 = least authentic


that they perceived the cases with no authentic physical context or with no authentic result or form as equally inauthentic (score 3.5). In sum, the findings of the paired comparisons of the case descriptions indicated that when all of the dimensions in the framework are present in a case, the case was unequivocally seen as most authentic. In addition, there appear to be contradictory results with respect to task authenticity compared to the results of the paired comparisons of the dimensions. Finally, when evaluating assessment cases, teachers and students appear to differ with respect to the importance of the authenticity of physical context versus result authenticity.

Completeness and Relative Importance: What Do Participants Say?

Table 3 shows that all dimensions received attention in the brainstorm and discussion sessions. Furthermore, these results corroborated the earlier findings, in that social context received the least attention in all groups. Besides the five dimensions, almost all subelements of the dimensions, described in the framework, were reviewed.

Based on the number of statements and the ratios of the statements compared to each other, as shown in Table 3, sophomores place primary interest on task, followed by physical context. Seniors and teachers place equal emphasis on task and result. Teachers differ from all students, regardless of level, with respect to the emphasis on physical context. Teachers devoted a lot of time to discussing the required fidelity level of the physical context in an effective authentic assessment. Especially emphasized was the question of whether the physical context should be real professional practice or a simulation in school.

A closer look at the content of the brainstorm statements gave the impression that teachers and seniors agreed more with each other and with the idea of the framework than with the sophomore students, especially when it came to the task and result or form dimensions. Teachers and seniors agreed with the framework that an authentic task required an integration of professional knowledge, skills, and attitudes, and they acknowledged that the task should resemble real-life complexity. On the other hand, sophomore students were preoccupied with knowledge testing, had problems picturing the idea of integrated testing, and were primarily concerned with making assessment easier and clearer (e.g., “assignments should be less vague, not more than one answer should be possible”) instead of simulating real-world complexity. In the result or form dimension, teachers and seniors agreed that more assessment moments and methods should be combined for a fairer and more authentic picture of students’ professional competence. Sophomores did not discuss the result or form dimension much; they only mentioned that reshaping current tests in the form of cases would make them more realistic. In other words, every kind of assessment could be made more authentic by adding realistic information.

A specification of the other statements (see Table 4) showed, first, that all groups made statements emphasizing the alignment between instruction and assessment, and between school and real-life practice. This is in agreement with the theoretical ideas behind the framework for authentic assessment. Second, Table 4 shows that issues concerning the assessor of an authentic assessment, and organizational or pre-

Table 3 Number of statements per dimension of each group.

Group                 Task   Physical context   Social context   Result or form   Criteria   Other
Sophomore students    24     19                 6                7                13         45
Senior students       34     21                 9                36               12         26
Teachers              16     39                 5                19               21         56
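
The comparison of statement ratios in Table 3 can be made concrete with a small sketch that converts each group's counts into shares of its statements across the five dimensions. Leaving out the Other column is a choice made for this illustration, not something the article specifies, and the names used are hypothetical.

```python
DIMENSIONS = ["Task", "Physical context", "Social context",
              "Result or form", "Criteria"]

# Statement counts per dimension, copied from Table 3 (Other column omitted).
TABLE_3 = {
    "Sophomore students": [24, 19, 6, 7, 13],
    "Senior students":    [34, 21, 9, 36, 12],
    "Teachers":           [16, 39, 5, 19, 21],
}


def shares(counts):
    """Fraction of a group's dimension statements that fall in each dimension."""
    total = sum(counts)
    return [count / total for count in counts]


# Dimension with the most and the fewest statements per group.
most_discussed = {
    group: DIMENSIONS[counts.index(max(counts))]
    for group, counts in TABLE_3.items()
}
least_discussed = {
    group: DIMENSIONS[counts.index(min(counts))]
    for group, counts in TABLE_3.items()
}

for group, counts in TABLE_3.items():
    parts = ", ".join(
        f"{dim}: {share:.0%}" for dim, share in zip(DIMENSIONS, shares(counts))
    )
    print(f"{group}: {parts}")
```

In every group the social context has the fewest statements, and teachers are the only group whose most-discussed dimension is the physical context.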


Table 4 Variables in the other category, per group.

Variable                                               Sophomore students   Senior students   Teachers
General statements applicable to all five dimensions   6                    1                 2
Instruction                                            28                   7                 5
Alignment of instruction and assessment                2                    3                 3
Alignment of school and practice                       6                    3                 3
Assessor                                               3                    3                 6
Organization or preconditions                          –                    –                 7
Influence on the learning process                      –                    –                 4
Not defined or nonsense                                –                    9                 26

conditional issues, should be taken into account in a framework for authentic assessment. Issues related to the assessor dealt with the realization that people from professional practice should be involved in defining and using criteria and standards. Organizational issues involved statements about conditions that should be met before authentic assessment can be implemented in school. For example, teachers talked about placing students in professional practice sooner and more often for the purpose of assessing them in this professional context. Finally, Table 4 shows that sophomores took the opportunity to talk and complain about the instruction. Although instruction was not being evaluated (i.e., it was about assessment), 28 statements dealt with what was taught and not with what was assessed. Seniors were more focused. Teacher statements were spread over the different other variables, and the 26 statements in the not defined category consisted mostly of jokes or questions they asked each other.

CONCLUSION

Overall, the five-dimensional framework gave a good description of what dimensions and elements should be taken into account in an authentic assessment; the participants discussed all dimensions and almost all elements described in the framework. However, elements concerning the assessor and organization issues should be added to complete the framework, as these elements turned out to be important to all participant groups.

A combination of the results of the GSS activities led to the conclusion that task, result or form, and criteria were perceived as very important for authentic assessment. Physical context was most important in the eyes of teachers. Social context was perceived as the least important dimension.

Furthermore, not all groups perceived the dimensions and elements in the same way. Teachers and seniors mostly agreed with each other and with the theoretical framework; however, sophomores often deviated from the other groups. There were no differences between VTP and BRP students.

To reiterate: The two questions with which we began were (a) Is the framework complete? (b) Do students differ from teachers with respect to what they perceive as important for authenticity? Both of these questions shed light on possible guidelines for designing authentic assessments.

The answer to Question 1 appears to be yes. The five dimensions appear to adequately define authenticity, as demonstrated by both the brainstorming and the high ranking of those cases that were authentic on all five dimensions. The adequacy of the framework is corroborated by the finding that during the brainstorming, most subelements of the dimensions as described by the framework were seen as important when designing authentic assessment. The paired comparisons showed some subtle differ-

ences in the importance of the five dimensions for providing authenticity. While the task, the result or form, and the criterion dimensions turned out to be very important for authenticity, the physical context and especially the social context dimensions were perceived as less important. Social context is unequivocally perceived as the least important dimension of authenticity. All groups stressed the need for individual testing, although both students and teachers stressed that most nursing activities in real life are collaborative. Teachers explained that “assessing in groups is a soft spot, we just don’t know how to assess students together, because at the end we want to be sure that every individual student is competent.” It should not be concluded, based on these findings, that social context is not important for authentic assessment, but if choices have to be made in designing an authentic assessment, social context is probably the first dimension to leave out.

The findings on importance of task are sometimes contradictory. Although the brainstorming and the paired comparisons of the dimensions show that task was perceived as very important by all, the paired comparisons of the cases made task seem less important. It is possible, thus, that although the respondents consider task (as an abstracted concept) to be most important, they are not able to identify (i.e., they do not perceive) an authentic task. A possible explanation for this is that the all-authentic-except-for-the-task case resembles current assessment practices. Because previous experiences are found to strongly influence perceptions (Birenbaum, 2003), the familiarity of these cases may have influenced the paired comparisons of the cases. If this is the case, the paired comparisons of the five dimensions were probably a more objective measure of the importance of the five dimensions.

Finally, it might be the case that assessor-related issues would complete the framework. This could be done by adding a sixth dimension called “the assessor,” or by adding the issues concerning who should use and develop authentic criteria and standards as subelements to the criterion dimension.

With respect to Question 2, concerning the differences between students and teachers in their perception of authenticity, some interesting findings came to light. Most differences were found between sophomores and teachers, while seniors agreed with teachers more often. Moreover, the perceptions of teachers and seniors agreed more with the ideas of the theoretical framework. Possibly, the perceptions of older students have changed during their college career as a result of having had experience with professional practice; the perceptions of sophomores, who have less practical experience, seemed to be based primarily on their previous experiences with assessment, which explained the focus on knowledge and in-school testing. In other words, it appears that sophomore students have different conceptions and possibly misconceptions of real professional practice and, thus, authenticity of assessment.

Furthermore, the brainstorming and the paired comparisons of the case descriptions showed differences between teachers and students in the perception of physical context. Teachers focused on the importance of increasing the authenticity of physical context by placing the assessment in professional practice, whereas students, especially sophomores, mostly focused on in-school testing with, for example, simulated patients and realistic equipment.

Finally, all groups agreed on the relative unimportance of the social context and on the importance of using criteria that resemble the criteria used in real professional practice. Teachers and students agree that, at this point, the criteria used in school differ too much from criteria used in professional institutes, and that school criteria are often unknown or misinterpreted by assessors at the professional institutes.

Future Implications

The findings of the study allow for some critical questions and guidelines concerning the design of authentic assessment. First, student perceptions should be considered in designing effective authentic assessments. The qualitative results of this study showed that students, especially at the beginning of their study and with little practical experience, have different conceptions (possibly misconceptions) of what authenticity


means than do older, more experienced students and teachers. For authentic assessment to work, two options need to be considered: either (a) the assessment should meet the expectations of the sophomores, for example, by sticking to explicit knowledge testing in the name of authentic assessment, which is likely to confirm unwanted learning behavior; or (b) explicit attention should be given to changing student perceptions and, thereby, opening the possibilities to change their learning behavior toward professional competency development, when implementing authentic assessment.

Second, we might be able to save precious time and money in the design, development and implementation of authentic assessment with respect to the physical context and the creation of social contexts. Research should examine if assessing students in a real professional context has additional value for students, or if assessing in an (electronic) simulation in school is authentic enough as long as students are confronted with an authentic task, result or form, and criteria. Simulation in school, virtual or not, is probably easier and less expensive to implement, and, therefore, warrants careful consideration.

The exploratory nature of this study, without the possibility of quantitative statistical analyses owing to the nature of the GSS, makes firm conclusions impossible. However, the electronic GSS efficiently delivered a lot of qualitative data in a short period of time. What the data of this study do show is that authenticity is definitely a multifaceted concept, and that a number of the facets (dimensions) appear to be of more importance than others. This can have far-reaching implications for educational design.

The actual effectiveness of this framework for designing authentic assessments, however, should be examined by evaluating the influences of different kinds and levels of authenticity of assessment on student learning and motivation. Because implementing authenticity elements in assessment requires a lot of time, money, and energy (Martens, Bastiaens, & Gulikers, 2002), research should examine which elements of the framework are crucial for affecting student learning in the direction of the development of professional competencies.

Finally, as stated at the beginning of this article, authenticity is only one of the elements and quality criteria of competency-based (alternative) assessment (Birenbaum & Dochy, 1996; Dierick, Dochy, & Van de Watering, 2001). Making decisions about implementing authentic elements in an assessment should be considered in the broader context of quality criteria for assessment (i.e., reliability or generalizability), and in the context of other assessment goals (i.e., timeliness, affordability, and accountability). However, a thorough discussion of these other assessment goals and criteria is beyond the scope of this article.

The argumentation of the theoretical framework and the qualitative study gave some interesting impulses to further theoretical and practical research concerning authentic assessments and student perceptions. The focus on a vocational college is especially interesting, because most assessment research is done in higher education. All participants in this study agreed that instruction and assessment in school should be aligned with each other, and that developing education that focuses on the development of competencies and takes professional practice as a starting point requires assessments that are also competency based and based on professional practice. In other words, it requires authentic assessment.

Judith T. M. Gulikers, Theo J. Bastiaens, and Paul A. Kirschner are with the Educational Technology Expertise Center at the Open University of the Netherlands, P.O. Box 2960, 6401 DL Heerlen, The Netherlands.
The authors would like to thank Marijke Bijnens for her help in organizing the participation of teachers and students in the GSS. They would also like to thank Dr. Robert Schuwer for his assistance in setting up and carrying out the GSS sessions.

REFERENCES

Alessi, S. M. (1988). Fidelity in the design of instructional simulations. Journal of Computer-Based Instruction, 15(2), 40–47.
Arter, J. A., & Spandel, V. (1992). An NCME instructional module on: Using portfolio of student work in instruction and assessment. Educational Measurement: Issues and Practice, 11(1), 36–45.
Biggs, J. (1996). Enhancing teaching through construc-

tive alignment. Higher Education, 32, 347–364.
Birenbaum, M. (1996). Assessment 2000: Towards a pluralistic approach to assessment. In M. Birenbaum & F. J. R. C. Dochy (Eds.), Alternatives in assessment of achievements, learning processes and prior knowledge (pp. 3–29). Boston, MA: Kluwer Academic Publishers.
Birenbaum, M. (2003). New insights into learning and teaching and their implications for assessment. In M. Segers, F. Dochy, & E. Cascallar (Eds.), Optimising new modes of assessment: In search of quality and standards (pp. 13–36). Dordrecht, The Netherlands: Kluwer Academic Publishers.
Birenbaum, M., & Dochy, F. J. R. C. (1996). Alternatives in assessment of achievements, learning processes and prior knowledge. Boston, MA: Kluwer Academic Publishers.
Brown, J. S., Collins, A., & Duguid, P. (1989). Situated cognition and the culture of learning. Educational Researcher, 18(1), 32–42.
Cronin, J. F. (1993). Four misconceptions about authentic learning. Educational Leadership, 50(7), 78–80.
Darling-Hammond, L., & Snyder, J. (2000). Authentic assessment in teaching in context. Teaching and Teacher Education, 16, 523–545.
Dierick, S., Dochy, F., & Van de Watering, G. (2001). Assessment in het hoger onderwijs: Over de implicaties van nieuwe toetsvormen voor de edumetrie [Assessment in higher education: About the implications of new assessment forms for edumetrics]. Tijdschrift voor Hoger Onderwijs, 19(1), 2–18.
Dochy, F. J. R. C., & McDowell, L. (1998). Assessment as a tool for learning. Studies in Educational Evaluation, 23(4), 279–298.
Frederiksen, N. (1984). The real test bias, influences of testing and teaching on learning. American Psychologist, 39(3), 193–202.
Gibbs, G. (1992). Improving the quality of student learning. Bristol, UK: Technical and Educational Services.
Gielen, S., Dochy, F., & Dierick, S. (2003). The influence of assessment on learning. In M. Segers, F. Dochy, & E. Cascallar (Eds.), Optimising new modes of assessment: In search of quality and standards (pp. 37–54). Dordrecht, The Netherlands: Kluwer Academic Publishers.
Hart, D. (1994). Authentic assessment: A handbook for education. Menlo Park, CA: Addison-Wesley Publishing Company.
Herrington, J., & Herrington, A. (1998). Authentic assessment and multimedia: How university students respond to a model of authentic assessment. Higher Educational Research & Development, 17(3), 305–322.
Herrington, J., & Oliver, R. (2000). An instructional design framework for authentic learning environments. Educational Technology Research and Development, 48(3), 23–48.
Honebein, P. C., Duffy, T. M., & Fishman, B. J. (1993). Constructivism and the design of learning environments: Context and authentic activities for learning. In T. M. Duffy, J. Lowyck, & D. H. Jonassen (Eds.), Designing environments for constructive learning (pp. 88–108). Berlin: Springer-Verlag.
Huang, H. M. (2002). Towards constructivism for adult learners in online learning environments. British Journal of Educational Technology, 33, 27–37.
Kirschner, P. A. (2002). Three worlds of CSCL: Can we support CSCL? Heerlen: Open University of the Netherlands.
Kirschner, P. A., Martens, R. L., & Strijbos, J. W. (2004). CSCL in higher education? A framework for designing multiple collaborative environments. In P. Dillenbourg (Series Ed.) & J. W. Strijbos, P. A. Kirschner, & R. L. Martens (Vol. Eds.), Computer-supported collaborative learning: Vol. 3. What we know about CSCL: And implementing it in higher education (pp. 3–30). Boston, MA: Kluwer Academic Publishers.
Martens, R., Bastiaens, Th., & Gulikers, J. (2002). Leren met computergebaseerde authentieke taken: motivatie, gedrag en resultaten van studenten [Learning with computer-based authentic tasks: student motivation, behavior and results]. Pedagogische Studiën, 79(6), 469–482.
McDowell, L. (1995). The impact of innovative assessment on student learning. Innovations in Education and Training International, 32(4), 302–313.
Meyer, C. (1992). What’s the difference between authentic and performance assessment? Educational Leadership, 49(8), 39–40.
Moerkerke, G., Doorten, M., & de Roode, F. A. (1999). Constructie van toetsen voor competentiegericht curricula [Construction of assessments for competency-based curricula] (OTEC report 1999/W02). Heerlen, The Netherlands: Open Universiteit Nederland, Educational Technology Expertise Center.
Newmann, F. M. (1997). Authentic assessment in social studies: Standards and examples. In G. D. Phye (Ed.), Handbook of classroom assessment: Learning, achievement, and adjustment (pp. 359–380). San Diego, CA: Academic Press.
Newmann, F. M., & Wehlage, G. G. (1993). Five standards for authentic instruction. Educational Leadership, 50(7), 8–12.
Nicaise, M., Gibney, T., & Crane, M. (2000). Toward an understanding of authentic learning: Student perceptions of an authentic classroom. Journal of Science Education and Technology, 9, 79–94.
Petraglia, J. (1998). Reality by design: The rhetoric and technology of authenticity in education. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Prodromou, L. (1995). The backwash effect: From testing to teaching. ELT Journal, 49(1), 13–25.
Reeves, T. C., & Okey, J. R. (1996). Alternative assessment for constructivist learning environments. In B. G. Wilson (Ed.), Constructivist learning environments: Case studies in instructional design (pp. 191–202). Englewood Cliffs, NJ: Educational Technology
Publications. Research, 2, 191–213.

Resnick, L. B. (1987). Learning in school and out. Edu- Slavin, R. E. (1989). Research on cooperative learning:
cational Leadership, 16(9), 13–20. An international perspective. Journal of Educational
Sambell, K., & McDowell, L. (1998). The construction Research, 33, 231–243.
of the hidden curriculum: Messages and meanings
Sluijsmans, D. (2002). Student involvement in assessment:
in the assessment of student learning. Assessment and the training of peer assessment skills. Unpublished doc-
Evaluation in Higher Education, 23(4), 391–402.
toral dissertation, Open University of the Nether-
Sambell, K., McDowell, L., & Brown, S. (1997). But is it lands, Heerlen, The Netherlands.
fair?: An exploratory study of student perceptions of
the consequential validity of assessments. Studies in Sweller, J., Van Merriënboer, J. J. G., & Paas, F. (1998).
Educational Evaluation, 23(4), 349–371. Cognitive architecture and instructional design.
Savery, J., & Duffy, T. (1995). Problem based learning: Educational Psychology Review, 10(3), 251–296.
An instructional model and its constructivist frame- Torrance, H. (1995). Evaluating authentic assessment.
work. Educational Technology, 35, 31–38. Buckingham, UK: Open University Press.
Schnitzer, S. (1993). Designing and authentic assess- Uhlenbeck, A. (2002). The development of an assessment
ment. Educational Leadership, 50(7), 32–35. procedure for beginning teachers of English as a foreign
Segers, M., Dierick, S., & Dochy, F. (2001). Quality language. Unpublished doctoral dissertation, Uni-
standards for new modes of assessment. An explor- versity of Leiden, Leiden, The Netherlands.
atory study of the consequential validity of the
Van Merriënboer, J. J. G. (1997). Training complex cogni-
OverAll test. European Journal of Psychology of Educa-
tive skills: A four-component instructional design model
tion, 16(4), 569–586.
for technical training. Englewood Cliffs, NJ: Educa-
Segers, M., Dochy, F., & Cascallar, E. (2003). Optimising
tional Technology Publications
new modes of assessment: In search of qualities and stan-
dards. Dordrecht, The Netherlands: Kluwer Aca- Wiggins, G. (1989). Teaching to the (authentic) test.
demic Publishers. Educational Leadership, 46(7), 41–47.
Segers, M., Dochy, F., & De Corte, E. (1999). Assess- Wiggins, G. P. (1993). Assessing student performance:
ment practices and students’ knowledge profiles in Exploring the purpose and limits of testing. San Fran-
a problem-based curriculum. Learning Environments cisco, CA: Jossey-Bass/Pfeiffer.

See Appendix, overleaf