November 2008
Students were given research reports, budgets, and other documents to help draft their answers, and they were expected to demonstrate proficiency in subjects like reading and math as well as mastery of broader and more sophisticated skills like evaluating and analyzing information and thinking creatively about how to apply information to real-world problems.

Not many public school students take assessments like the CWRA. Instead, most students take tests that are primarily multiple-choice measures of lower-level skills in reading and math, such as the ability to recall or restate facts from reading passages and to handle arithmetic-based questions in math. These types of tests are useful for meeting the proficiency goals of the federal No Child Left Behind Act (NCLB) and state accountability systems. But leaders in business, government, and higher education are increasingly emphatic in saying that such tests don’t do enough. The intellectual demands of 21st century work, today’s leaders say, require assessments that measure more advanced skills, 21st century skills. Today, they say, college students, workers, and citizens must be able to solve multifaceted problems by thinking creatively and generating original ideas from multiple sources of information, and tests must measure students’ capacity to do such work.

While many policymakers, including Secretary of Education Margaret Spellings, have emphasized the need for schools to, first and foremost, teach the basics, learning science (an interdisciplinary field that includes cognitive science, educational psychology, information science, and neuroscience) suggests that the best learning occurs when basic skills are taught in combination with complex thinking skills. Decades of research reveal that there is, in fact, no reason to separate the acquisition of core content and basic skills like reading and computation from more advanced analytical and thinking skills, even in the earliest grades.

But standing in the way of incorporating 21st century skills into teaching and learning are widespread concerns about measurement. The cost, time demands, and difficulty in scoring tests of these less easily quantified skills have slowed the adoption of such tests, as have concerns among civil rights advocates that these tests would erode progress toward ensuring common standards of learning for all students. Collectively, these concerns derailed efforts in the late 1990s to move toward the use of performance-based assessments such as portfolios, exhibitions, and projects.

New assessments like the CWRA, however, illustrate that the skills that really matter for the 21st century (the ability to think creatively and to evaluate and analyze information) can be measured accurately and in a common and comparable way. These emergent models also demonstrate the potential to measure these complex thinking skills at the same time that we measure a student’s mastery of core content or basic skills and knowledge. There is, then, no need for more tests to measure advanced skills. Rather, there is a need for better tests that measure more of the skills students need to succeed today.

Unmet Challenges

The idea that schools should focus on more than just the basics is not new. A century ago, leaders of the progressive education movement, spearheaded by American philosopher and educator John Dewey,
and acquire and analyze information.2

More recently, the New Commission on the Skills of the American Workforce (a group of business leaders, governors, school chancellors, and former secretaries of labor and education) released a sequel to its 1990 report on the nation’s educational and economic challenges. The message of the 2006 report, Tough Choices or Tough Times, is clear: Basic skills are necessary but not sufficient.

[Figure: chart not reproduced. Vertical axis: Percentile Change, from −8 to 4; horizontal axis: 1969 to 1999.]
Source: Frank Levy and Richard Murnane, The New Division of Labor: How Computers Are Creating the Next Job Market (Princeton, NJ: Princeton University Press, 2004).
The commission’s report describes how new technology and global competition have changed the game for American workers. Students need a strong foundation of basic skills, the commission asserts, but that alone is no longer enough for economic and job security. “It is a world in which comfort with ideas and abstractions is the passport to a good job, in which creativity and innovation are the key to the good life, in which high levels of education, a very different kind of education than most of us have had, are going to be the only security there is.”3

This new reality applies to all children in the United States, not just an elite class of students. Nearly every segment of the workforce now requires employees to know how to do more than simple procedures; employers look for workers who can recognize what kind of information matters, why it matters, and how it connects and applies to other information.

Richard Murnane and Frank Levy, both economists and professors at Harvard and MIT, respectively, have been researching and writing about workforce skills for more than a decade. They agree that basic skills, once in high demand for workers, are no longer what matter most. There are fewer tasks requiring only routine skills, they explain, and they are often done by computers.4 (See Figure 1.)

Concerns that the United States is losing its global competitive edge are heightened by the nation’s performance on the most recent international tests. The Programme for International Student Assessment (PISA) and the Trends in International Mathematics and Science Study (TIMSS), two of the largest education surveys in the world, measure how well early adolescent students (PISA tests 15-year-olds and TIMSS tests the rough equivalent of eighth-graders) are faring in their abilities to problem-solve in math and science.5 TIMSS found U.S. eighth-graders to be above-average performers among participating nations and found substantial improvement in performance, particularly in science, from the 1999 to 2003 tests. But the PISA, designed to test students’ application of math and science to real-world scenarios, found U.S. students to be among the worst performers. (See Table 1.) Taken together, these results reveal that U.S. students may be performing well in their mastery of instructional material but that this performance is not carrying over to the application of material to real-world problems.

The ‘Must Have’ Skills

It is an emphasis on what students can do with knowledge, rather than what units of knowledge they have, that best describes the essence of 21st century skills.

But that core notion is often lost in the welter of terms used to describe 21st century skills and in the many
*Survey of published goals and outcomes of after-school and out-of-school programs, including 21st Century Community Learning Centers grantees,
Spring, 2007.
**The Afterschool Alliance estimates the total funding of the after-school industry to be $3.5 billion. 21st CCLC represents the largest pot of funding for
after- and out-of-school programming.
†Interview with Heather Weiss, July 2008.
The CWRA is intended as a tool for school improvement, not necessarily to measure individual student gains. But those who use it affirm its value as an essential metric

*Association of American Colleges and Universities (AAC&U), Greater Expectations: A New Vision for Learning as a Nation Goes to College, 2002.
**U.S. Department of Education, Washington, D.C., 2006. http://www.ed.gov/about/bdscomm/list/hiedfuture/index.html.
***Personal interview with Richard Shavelson, March, 2008.
†Interview and correspondence with Robert Sternberg, February 2008.
‡Correspondence with Tufts University Admissions Office, March, 2008.

PowerSource measures advanced skills in the context of measuring content proficiency, which means it can demonstrate student learning for specific subject matter while also testing students’ development of higher-order skills. Students are asked, for example, to apply algebraic principles as well as explain why they chose the principles. PowerSource was designed to measure a broader set of outcomes by focusing on a handful of big ideas rather than a heap of discrete facts.18

The United Kingdom recently developed an innovative national assessment that aligns with its existing national standards. The Key Stage 3 (ages 12–13) Information Communications Technology (ICT) Literacy Assessment, created by the British government’s Qualifications and Curriculum Authority, measures a set of technical skills as well as a student’s ability to use those skills to solve a set of complex problems. Students are provided a toolkit of applications to use to complete tasks that measure learning skills such as “finding things out,” “developing ideas,” and “exchanging and sharing information.” Student actions are tracked and mapped against expected abilities for that level of education and test results provide both national scores for students and detailed feedback about student performance that can be used to inform teaching and learning at the classroom level.19

The closest thing in the United States is the 2009 NAEP Science Assessment, administered by the federally funded National Assessment of Educational Progress (NAEP). The test will for the first time measure not just students’ knowledge of science principles, but also
The testing industry, keenly aware that the call for 21st century skills means more demand for tests that measure these skills, has also been working hard to develop assessments that measure more than the basics.

The Educational Testing Service (ETS), a private nonprofit organization and one of the world’s largest test developers, has several initiatives underway to measure a broader set of skills, including critical thinking, communication, and a host of socio-emotional skills like adaptability and agreeableness. Researchers at the organization’s Center for New Constructs are studying how to assess skills that are not measured by the SAT, ACT, and other traditional standardized tests. Pilot projects of more than 20 individual assessments

Using people to grade a wide range of open-ended and performance-based assessments of 21st century skills raises concerns about the reliability of results. People may be able to assess in more depth and with more nuance than a computer program, but human scorers inevitably introduce a level of subjectivity into the assessment process. So-called “inter-rater reliability” is a challenge; no matter how clear scoring standards are, it’s difficult to expect human scorers to grade tests with perfect consistency. Training and monitoring scorers can be time-consuming and costly, but it can help. The IB Diploma Programme, for instance, has nearly 5,000 test examiners worldwide. The program ensures a high level of consistency among its examiners, most of whom are experienced Diploma Programme teachers, by providing
Tough Choices or Tough Times (Washington, DC: National Center on Education and the Economy, 2006).

4 Richard Murnane and Frank Levy, The New Division of Labor: How Computers Are Creating the Next Job Market (Princeton, NJ: Princeton University Press, 2004).

(TIMSS) 2003; The Programme for International Student Assessment (PISA) 2003.

Literacy in the Digital Age (Los Angeles, CA: The Metiri Group, 2003).

Partnership for 21st Century Skills, July 23, 2007).

8 Definition and Selection of Key Competencies (Paris, France: OECD, May 27, 2005).

People Learn, 2000; OECD, Innovation in the Knowledge Economy: Implications for Education and Learning, 2004; and R. Kozma, Technology, Innovation, and Educational Change: A Global Perspective (Eugene, OR: International Society for Technology in Education, 2003).

10 Benjamin S. Bloom and David R. Krathwohl, Taxonomy of Educational Objectives: The Classification of Educational Goals, by a committee of college and university examiners. Handbook I: Cognitive Domain (New York: Longman, Green, 1956). Also D. Krathwohl, B.S. Bloom, and B.B. Masia, Taxonomy of Educational Objectives: The Classification of Educational Goals. Handbook II: Affective Domain (New York: David McKay Co., Inc., 1964).

11 L.W. Anderson and D.R. Krathwohl, eds., A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom’s Taxonomy of Educational Objectives (New York: Allyn-Bacon Longman, 2001).

18 Eva Baker, Moving to the Next Generation System Design: Integrating Cognition, Assessment, and Learning (Los Angeles, CA: National Center for Research on Evaluation, Standards, and Student Testing (CRESST), University of California, February, 2007).

Agency, United Kingdom, available online at http://www.naa.org.uk/naa_18908.aspx.

Programme Assessment Principles and Practice.” (United Kingdom: Antony Rowe Ltd., September 2004).

in psychology and measurement. It is, however, a misnomer for describing these skills, as many are actually cognitive in nature.

22 Interview with Richard Roberts, March 2008.

23 “Title I: Characteristics of tests will influence expenses; information sharing may help states realize efficiencies.” Report to Congressional requesters. (Washington, DC: Government Accounting Office, May, 2003).

24 Thomas Toch, Margins of Error: The Testing Industry in the No Child Left Behind Era (Washington, DC: Education Sector, 2006).

25 Linda Jacobson, “Kentucky Lawmakers Take Aim at State Tests,” Education Week, March, 2008.

26 A bill was submitted during the Spring 2008 session that called for the replacement of the KY CATS system with a norm-referenced test and an augmented test. It did not move out of committee. According to the state’s office of accountability and assessment, the state has established a task force on assessment to consider future changes to state tests.

27 “The Technical Advisory Committee on Faking on