
Learning and Individual Differences 20 (2010) 327–336


Learning and Individual Differences


journal homepage: www.elsevier.com/locate/lindif

Assessment of gifted students for identification purposes: New techniques for a new millennium
Robert J. Sternberg
School of Arts and Sciences, Tufts University, 3rd Floor, Ballou Hall, Medford, MA 02155, USA

Article info

Article history:
Received 4 March 2009
Received in revised form 30 July 2009
Accepted 20 August 2009

Keywords:
Intelligence
Creativity
Wisdom
Analytical intelligence
Practical intelligence

Abstract

The augmented theory of successful intelligence [Sternberg, R. J. (2003b). Wisdom, intelligence, and creativity synthesized. New York: Cambridge University Press] postulates that intelligence comprises creative skills in generating novel ideas; analytical skills in discerning whether they are good ideas; practical skills in implementing the ideas and persuading others of their worth; and wisdom-based skills in employing one's creative, analytical, and practical skills for a common good. The article summarizes three projects designed to identify gifts. In the Rainbow Project, my colleagues and I found that it was possible substantially to increase prediction of first-year university academic performance and simultaneously reduce ethnic-group differences on the predictive test, relative to a standardized test used for admissions in the United States. In the Kaleidoscope Project, my colleagues and I found that students admitted for expanded skills performed as well as did other students, without the ethnic-group differences typically obtained in such measures. In the Aurora Project, Elena Grigorenko, Mei Tan, and their colleagues are seeking to identify giftedness in students at the upper elementary grades. All three projects show that it is possible to apply the augmented theory of successful intelligence in ways that enhance gifted identification.
© 2009 Elsevier Inc. All rights reserved.

For roughly 100 years, since the work of Alfred Binet and Theodore Simon, testing to identify children's abilities has changed relatively little (see Binet & Simon, 1916). If any other technology had stayed about the same for 100 years, people would be amazed. Imagine if we had only telegraphs operated by Morse code, primitive telephones, no televisions, no computers, and no serious electrical appliances. That is a world hard to imagine. It is the world in which we live in the field of testing the abilities of the gifted.

It would not be fair to say that there have been no new developments. Joseph Renzulli, Howard Gardner, and others (see Kaufman & Sternberg, 2008; Sternberg & Davidson, 2005) have proposed new models of identification that have been used to identify gifted children in ways that go beyond conventional IQ. But the tests used to measure IQ have not changed much. They still measure the same basic construct of so-called “general ability” that Charles Spearman identified early in the twentieth century (Spearman, 1927). Our efforts have been addressed toward developing new kinds of tests to assess intelligence in broader ways than has been possible in the past. This article describes three of our efforts.

The framework my colleagues and I have used is one called the augmented theory of successful intelligence (Sternberg, 1997, 1999b, 2005b), or WICS (which stands for Wisdom, Intelligence, Creativity, Synthesized). The basic idea is that people in almost any walk of life need (a) creativity to generate new and exciting ideas, (b) analytical intelligence to evaluate whether their and others' ideas are good ideas, (c) practical intelligence to execute their ideas and to persuade others of their value, and (d) wisdom in order to ensure that their abilities are being used for some kind of common good that balances their own interests with other people's and institutional interests over the short and long terms (Sternberg, 2003b). According to the theory, these abilities are modifiable, in some degree, rather than fixed (Dweck, 1999; Sternberg, 1999a, 2003a; Sternberg & Grigorenko, 2007).

Wisdom was added to the original theory of successful intelligence, which comprised only the analytical, creative, and practical abilities, on the basis of the notion that some people may be academically and even practically intelligent, but unwise, as in the case of corporate scandals such as those that have surrounded Enron, Worldcom, and Arthur Andersen (three US companies that failed as a result of serious ethical scandals), and in the case of numerous political scandals as well. The perpetrators were smart, well-educated, and unethical. The conception of wisdom used here is that of the balance theory of wisdom (Sternberg, 1998b), according to which wisdom is the application of intelligence, creativity, and knowledge for the common good, by balancing intrapersonal, interpersonal, and extra-personal interests, over the long and short terms, through the infusion of positive ethical values.

In the augmented theory of successful intelligence, abilities and achievement are viewed as being on a continuum. Abilities are largely achieved (Sternberg, 1998a, 1999a). Thus, ability and achievement tests are also on a continuum, measuring similar constructs that differ primarily in terms of when the skills and knowledge they measure were acquired.

E-mail address: robert.sternberg@tufts.edu.

1041-6080/$ – see front matter © 2009 Elsevier Inc. All rights reserved.
doi:10.1016/j.lindif.2009.08.003

This framework suggests that conventional tests of abilities, dating back to Binet and Simon (1916) and Spearman (1927), are not fully adequate because they so heavily emphasize analytical (as well as memory-based) abilities to the near or total exclusion of creative and practical abilities. Such tests predict a large variety of performances (Herrnstein & Murray, 1994; Jensen, 1998; Schmidt & Hunter, 1998), but perhaps not at the highest level that can be achieved.

Many other theories also claim that there are abilities beyond general intelligence (e.g., Ceci, 1996; Gardner, 1983, 2006; Guilford, 1967; Thurstone, 1938). Even within theories that postulate general intelligence, a widely accepted view is that abilities are hierarchically differentiated (e.g., Carroll, 1993; Cattell, 1971; Vernon, 1971; see essays in Sternberg & Grigorenko, 2002). So a view of broad measures of intelligence fits with many theories. Where the theories differ somewhat is in exactly which abilities are measured, in what kinds of abilities are considered meritorious, and in how important the abilities are considered to be beyond general intelligence (g).

School assessments, like standardized tests, often emphasize analytical and memory-based skills. For example, the SAT, widely used in the United States for college admissions, measures, among other things, analysis of reading passages and solution of mathematics problems. The A-Levels used in the United Kingdom measure memory for knowledge learned in secondary school and basic analysis of this knowledge. These memory and analytical skills are precisely the abilities in which many children of the middle- and upper middle class excel, resulting in a fairly substantial correlation between test scores and socioeconomic class (Lemann, 1999; Sternberg, 1997). Of course, there are exceptions. But on the whole, the system of selective admissions based on tests is geared to favor these children, who have had opportunities that children of the working class may not have had. The system also is stacked against children from the middle and upper middle classes who may be nontraditional learners. So testing has the potential advantage of creating equity by admitting students because of their abilities and achievements, and the potential disadvantage of destroying equity by favoring, on bases other than abilities and achievements, some groups of students over others.

Success in life depends on a broader range of abilities than what conventional tests measure. For example, memory and analytical abilities may be sufficient to produce As in science courses, but they are probably not sufficient to produce outstanding research, even if they are relevant, as in deciding whether one's ideas are good ones (Lubinski, Benbow, Webb, & Bleske-Rechek, 2006). In particular, outstanding researchers must be creative in generating ideas for theories and/or experiments, analytical in discerning whether their ideas are good ones, and practical in getting their ideas funded and accepted by competitive refereed journals. Conventional tests thus may well be a good beginning to identifying the gifted, but, over the years, they also seem to have become the end.

My colleagues and I have been involved in three related projects exploring whether broader quantitatively-based assessments might be helpful in the university admissions process. The first of these projects is the Rainbow Project, the second, the Kaleidoscope Project, and the third, the Aurora Project. Our goal here is not to present the projects in detail, which is done elsewhere, but rather to discuss their relevance to identifying the gifted.

The basic claim of the three projects is the same: that assessments of giftedness need to measure not only g-based abilities, but also creative and practical abilities, and in the ideal case, wisdom-based abilities as well. In the ideal, the person will have, or be able to acquire or capitalize on in others, high levels of all these skills. Consider why.

Someone who is analytically gifted, but not gifted in other ways, may do well on standardized tests and in school. He or she may also do well in entry-level jobs that require analysis. But higher level jobs almost inevitably require one to come up with one's own ideas and to sell them. People who are analytically gifted and no more may find their best days in terms of achievement behind them after they leave school.

Someone who is creatively gifted, but not gifted in other ways, may come up with many novel ideas. But creatively gifted people need as well to be able to decide whether their novel ideas are good ones, and they need to be able to persuade others of the value of their ideas. People who are creatively gifted and no more may find themselves frustrated that they cannot acquire an audience; or they may find that even if they have an audience, it fails to be persuaded by their ideas.

Someone who is practically gifted, but not gifted in other ways, may be able to sell ideas or products, but in general, will have to rely on others to supply the ideas. They could as easily sell bad used cars as clever new inventions. Without wisdom, they may sell ideas that are harmful to those around them. They need not be career salespeople, per se. They may be, for example, cynical politicians more concerned about enhancing their power than about achieving a common good for those they supposedly represent.

1. The Rainbow Project

One of the primary venues for identifying the gifted is selective university admissions. When universities make decisions about selective admissions, the main quantitative information they have available to them typically is grade-point average in high school or its equivalent and scores on standardized tests (Lemann, 1999). Is it possible to create assessments that are psychometrically sound and that provide incremental validity over existing measures, without destroying the cultural and ethnic diversity that makes a university environment a place in which students can interact with and learn from others who are different from themselves? Can one create assessments recognizing that people's gifts differ and that many of the variety of gifts they possess are potentially relevant to university and life success (Sternberg & Davidson, 2005)? And can one do so in a way that is not a mere proxy for socioeconomic class (Golden, 2006; Kabaservice, 2004; Karabel, 2006; Lemann, 1999; McDonough, 1997) or for IQ (Frey & Detterman, 2004)?

The Rainbow Project (for details, see Sternberg & the Rainbow Project Collaborators, 2006; see also Sternberg, 2005a, 2006; Sternberg & the Rainbow Project Collaborators, 2005; Sternberg, the Rainbow Project Collaborators, & the University of Michigan Business School Collaborators, 2004) was a first project designed to enhance selective university admissions procedures at the undergraduate level. The Rainbow measures were intended, in the US, to supplement the SAT, but they can supplement any conventional standardized test of abilities or achievement. They were created before wisdom became a part of the theory described here (Sternberg, 2003b), so they do not assess wisdom.

The SAT is a comprehensive examination currently measuring verbal comprehension and mathematical thinking skills, with a writing component recently added. A wide variety of studies have shown the utility of the SAT and similar tests as predictors of university and job success, with success in college typically measured by GPA (grade-point average) (Hezlett et al., 2001; Kobrin, Camara, & Milewski, 2002; Schmidt & Hunter, 1998). Taken together, these data suggest reasonable predictive validity for the SAT in predicting undergraduate performance. Indeed, traditional intelligence or aptitude tests have been shown to predict performance across a wide variety of settings. But as is always the case for a single test or type of test, there is room for improvement.

The augmented theory of successful intelligence provides one basis for improving prediction and possibly for establishing greater equity and diversity, which is a goal of most higher-educational institutions (Bowen, Kurzweil, & Tobin, 2006). It suggests that broadening the range of skills tested to go beyond analytic skills, to include practical and creative skills as well, might significantly enhance the prediction of undergraduate performance beyond current levels. Thus, the theory does not suggest replacing, but rather, augmenting the SAT and similar tests such as the ACT or the A-Levels in the undergraduate admissions process. A collaborative team of

investigators sought to study how successful such an augmentation could be. Even if societies did not use the SAT, ACT, or A-Levels, in particular, we still would need some kind of assessment of the memory and analytical abilities the tests assess.

1.1. Methodological considerations

In the Rainbow Project, data were collected at 15 schools across the United States, including 8 four-year undergraduate institutions, 5 community colleges, and 2 high schools.

The participants were 1013 students predominantly in their first year as undergraduates or their final year of high school. In this report, analyses only for undergraduate students are discussed because they were the only ones for whom the authors had data available regarding undergraduate academic performance. The final number of participants included in these analyses was 793.

Baseline measures of standardized test scores and high school grade-point average were collected to evaluate the predictive validity of current tools used for undergraduate admission criteria, and to provide a contrast for the current measures. Students' scores on standardized university entrance exams were obtained from the College Board. Measures are described briefly in Table 1.

Table 1
Assessments measuring cognitive skills in the Rainbow Project.

Multiple-choice
              Analytical                                Creative               Practical
Verbal        Learning meanings of words from context   Novel analogies        Everyday problems of adolescents
Quantitative  Number series                             Novel number systems   Practical mathematics
Figural       Matrices                                  Series with mappings   Route-planning for complex routes

Performance
              Creative                Practical
Task 1        Captioning cartoons     School-based practical problems
Task 2        Written story-telling   Job-based practical problems
Task 3        Oral story-telling      Practical problems presented as movies

The measure of analytical skills was provided by the SAT plus multiple-choice analytical items my colleagues and I added measuring inference of meanings of words from context, number series completions, and figural matrix completions.

Creative skills were measured by multiple-choice items and by performance-based items. The multiple-choice items were of three kinds. In Novel Analogies, students are presented with verbal analogies preceded by counterfactual premises (e.g., money falls off trees). They have to solve the analogies as though the counterfactual premises were true. In Novel Number Systems, students are presented with rules for novel number operations, for example, “flix,” which involves numerical manipulations that differ as a function of whether the first of two operands is greater than, equal to, or less than the second. Participants have to use the novel number operations to solve presented math problems. In Figure Series with Mapping, participants are first presented with a figural series that involves one or more transformations; they then have to apply the rule of the series to a new figure with a different appearance, and complete the new series. These measures are not typical of assessments of creativity and were included for relative quickness of participants' responses and relative ease of scoring.

Creative skills also were measured using open-ended measures. In Captioning Cartoons, students were given a cartoon and had to provide a caption for it. Written Story Telling required students to write two very short stories from a selection among unusual titles, such as “The Octopus's Sneakers.” Oral Story Telling required orally telling two stories based upon choices of picture collages. Open-ended performance-based answers were rated by trained raters for novelty, quality, and task-appropriateness. Multiple judges were used for each task and satisfactory reliability was achieved (Sternberg & the Rainbow Project Collaborators, 2006).

Multiple-choice measures of practical skills were of three kinds. In Everyday Problems of Adolescents, students are presented with a set of everyday problems in the life of an adolescent and have to select the option that best solves each problem. In Practical Mathematics, students are presented with scenarios requiring the use of math in everyday life (e.g., buying tickets for a ballgame), and have to solve math problems based on the scenarios. In Route-Planning for Complex Routes, students are presented with a map of an area (e.g., an entertainment park) and have to answer questions about navigating effectively through the area depicted by the map.

Practical skills also were assessed using three situational-judgment inventories tapping different types of tacit knowledge. The general format of tacit knowledge inventories has been described in Sternberg et al. (2000), so only the content of the inventories used in this study will be described here. School-Based Practical problems provided everyday university situations for which a solution was required. Job-Based Practical problems provided everyday business problems, such as being assigned to work with a coworker whom one has difficulty tolerating. One had to figure out what to do. In Practical Problems Presented as Movies, movies presented everyday situations that confront undergraduate students, such as asking for a letter of recommendation from a professor who shows, through nonverbal cues, that he does not recognize you very well. One then has to rate various options for how well they would work in response to each situation.

Unlike the creativity performance tasks, in the practical performance tasks the participants were not given a choice of situations to rate. For each task, participants were told that there was no “right” answer, and that the options described in each situation represented variations on how different people approach different situations.

Consider examples of the kinds of items one might find on the Rainbow Assessment. An example of a creative item might be to write a story using the title “3516” or “It's Moving Backward.” Another example might show a collage of pictures in which people are engaged in a wide variety of activities helping other people. One would then orally tell a story that takes off from the collage. An example of a practical item might show a movie in which a student has just received a poor grade on a test. His roommate had a health crisis the night before, and he had been up all night helping his roommate. His professor hands him back the test paper, with a disappointed look on her face, and suggests to the student that he study harder next time. The movie then stops. The student then has to describe how he would handle the situation. Or the student might receive a written problem describing a conflict with another individual with whom she is working on a group project. The project is getting mired down in the interpersonal conflict. The student has to indicate how she would resolve the situation to get the project done.

All materials were administered in either of two formats. A total of 325 of the university students took the test in paper-and-pencil format, whereas a total of 468 different university students took the test on the computer via the World Wide Web.

No strict time limits were set for completing the tests. The time taken to complete the battery of tests ranged from 2 to 4 h.

As a result of the lengthy nature of the complete battery of assessments, participants were administered parts of the battery using an intentional incomplete overlapping design. The participants were randomly assigned to the test sections they were to complete. Details of the use of the procedure are in Sternberg and the Rainbow Project Collaborators (2006).

Creativity in this (and the subsequent Kaleidoscope) Project was assessed on the basis of the novelty and quality of responses. Level of demonstrated practical intelligence was assessed on the basis of

ratings of the feasibility of the products with respect to human and material resources and how persuasive the product was.

1.2. Findings

The analysis described below is a conservative one that does not correct for differences in the selectivity of the institutions at which the study took place. In a study across so many undergraduate institutions differing in selectivity, validity coefficients will seem to be lower than is typical, because an A at a less selective institution counts the same as an A at a more selective institution. When the authors corrected for institutional selectivity, the results described below became stronger. But correcting for selectivity has its own problems (e.g., on what basis does one evaluate selectivity?), and so uncorrected data are used in this report. My colleagues and I also did not control for university major: Different universities may have different majors, and the exact course offerings, grading, and populations of students entering different majors may vary from one university to another, rendering control difficult.

When examining undergraduate students alone, the sample showed a slightly higher mean level of SAT than that found in undergraduate institutions across the United States. The standard deviation was above the normal 100-point standard deviation, meaning our study did not suffer from restriction of range. Our means, although slightly higher than typical, are within the range of average undergraduate students.

Another potential concern is pooling data from different institutions. My colleagues and I pooled data because in some institutions we simply did not have large enough numbers of cases for the data to be meaningful.

Some scholars believe that there is only one set of skills that is highly relevant to school performance, what is sometimes called “general ability,” or g (e.g., Jensen, 1998). These scholars believe that tests may appear to measure different skills, but when statistically analyzed, show themselves just to be measuring the single general ability. Does the test actually measure distinct analytical, creative, and practical skill groupings? Factor analysis addresses this question. Three meaningful factors were extracted from the data: practical performance tests, creative performance tests, and multiple-choice tests (including analytical, creative, and practical). In other words, multiple-choice tests, regardless of what they were supposed to measure, clustered together (see also Sternberg, Castejón, Prieto, Hautamäki, & Grigorenko, 2001, for similar findings). Thus, method variance proved to be very important. The results show the importance of measuring skills using multiple formats, precisely because method is so important in determining factorial structure. The results show the limitations of exploratory factor analysis in analyzing such data, and also of dependence on multiple-choice items outside the analytical domain. In the ideal, one wishes to ensure that one controls for method of testing in designing aptitude and other test batteries.

Undergraduate admissions offices in selective institutions are not interested, exactly, in whether these tests predict undergraduate academic success. Rather, they are interested in the extent to which these tests predict school success beyond those measures currently in use, such as the SAT and high school academic grade-point average (GPA). In order to test the incremental validity provided by Rainbow measures above and beyond the SAT in predicting GPA, a series of hierarchical regressions was conducted that included the items analyzed above in the analytical, creative, and practical assessments.

If one looks at the simple correlations in Table 2, the SAT-V, SAT-M, high school GPA, and the Rainbow measures all predict first-year GPA. Thus, there is no clear reason solely from the simple correlations to choose one set of measures or another.

Table 2
Simple correlations with first-year college GPA in the Rainbow Project.(a)

Measure                                r
SAT Verbal                             .26
SAT Math                               .28
Analytic (STAT)                        .24
Creative (STAT)                        .35
Creative (Cartoons)                    .08 (ns)
Creative (Written stories)             .12
Creative (Oral stories)                .29
Practical (STAT)                       .25
Practical reasoning (Movies)           .14
Practical reasoning (Common sense)     .27
Practical reasoning (College life)     .16

(a) All correlations are statistically significant unless otherwise noted.

Table 3 shows the results of a factor analysis of the measures. The results were only partially what we expected, which was a set of three factors: analytical, creative, and practical. We obtained differentiable factors for the creative and practical performance measures. However, we did not obtain an analytical factor in the sense we had intended. Rather, we obtained a method factor corresponding to all the multiple-choice tests we used. It appears that when multiple-choice is used as a methodology, it produces an analytical or, roughly, what appears to be a g factor.

How do the Rainbow measures fare on incremental validity? In one set of analyses, the SAT-V, SAT-M, and High School GPA were included in the first step of the prediction equation because these are the standard measures used today to predict undergraduate performance. Only High School GPA contributed uniquely to prediction of undergraduate GPA. Inclusion of the Rainbow measures roughly doubled prediction (percentage of variance accounted for in the criterion) versus the SAT alone.

In particular, adding our analytical measure hierarchically to the SAT only increased the squared multiple correlation between our measures and first-year GPA from .098 to .099. Thus, our analytical measures, like the SAT, are probably largely g-based. All measures of g assess roughly the same abilities. Adding practical measures increased the squared multiple correlation to .129, a more substantial increase from .098. Adding creative measures increased the squared multiple correlation to .186, a notable increase. And adding all the measures to SAT increased the squared multiple correlation from .098 to .209, or more than double the value for the SAT alone. In general, our correlations were lower than in many other studies because we counted grades from many different institutions, not correcting the grades for institutional quality. When we did correct, we got the same patterns of results, but with higher squared correlations.

These results suggest that the Rainbow tests added considerably to the prediction of first-year GPA beyond that of SAT scores alone. They also suggest the power of high school GPA in prediction, particularly because it is an atheoretical composite that includes within it many variables, including motivation and conscientiousness.

Studying group differences requires careful attention to methodology and sometimes has led to erroneous conclusions (Hunt & Carlson, 2007). Although one important goal of the present study was to predict success in the undergraduate years, another important goal involved developing measures that reduce ethnic-group differences in mean levels. There has been a lively debate as to why there are socially-defined racial group differences, and as to whether scores for members of underrepresented minority groups are over- or underpredicted by SATs and related tests (see, e.g., Bowen & Bok, 2000; Camara & Schmidt, 1999; Rowe, 2005; Rushton & Jensen, 2005;

Table 3
Factor loadings for Rainbow measures.

Measure            Factor 1   Factor 2   Factor 3
Oral stories         0.57      −0.06      −0.06
Written stories      0.79       0.01      −0.02
Cartoons             0.20       0.28      −0.08
STAT-creative        0.00       0.73       0.09
STAT-analytic       −0.06       0.80      −0.04
STAT-practical       0.03       0.81      −0.02
Movies               0.12       0.05       0.52
College life        −0.13       0.01       1.00
Common sense         0.12      −0.01       0.92
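As one way to read the factor loadings in Table 3, the short Python sketch below assigns each measure to the factor on which its absolute loading is largest. The loadings are copied from Table 3. The factor labels F1 to F3, the helper names, and the 0.30 salience cutoff are illustrative assumptions (a common rule of thumb), not part of the reported analysis.

```python
# Loadings copied from Table 3. The labels F1-F3 are placeholders (assumption):
# the table as extracted carries no column headers.
LOADINGS = {
    "Oral stories":    (0.57, -0.06, -0.06),
    "Written stories": (0.79, 0.01, -0.02),
    "Cartoons":        (0.20, 0.28, -0.08),
    "STAT-creative":   (0.00, 0.73, 0.09),
    "STAT-analytic":   (-0.06, 0.80, -0.04),
    "STAT-practical":  (0.03, 0.81, -0.02),
    "Movies":          (0.12, 0.05, 0.52),
    "College life":    (-0.13, 0.01, 1.00),
    "Common sense":    (0.12, -0.01, 0.92),
}
FACTORS = ("F1", "F2", "F3")
SALIENT = 0.30  # rule-of-thumb cutoff (assumption), not taken from the article


def dominant_factor(loadings):
    """Return (factor label, loading) for the largest absolute loading."""
    idx = max(range(len(loadings)), key=lambda i: abs(loadings[i]))
    return FACTORS[idx], loadings[idx]


def group_by_factor(table, cutoff=SALIENT):
    """Group measures by dominant factor; collect those below the cutoff."""
    groups = {factor: [] for factor in FACTORS}
    weak = []
    for measure, loadings in table.items():
        factor, value = dominant_factor(loadings)
        if abs(value) >= cutoff:
            groups[factor].append(measure)
        else:
            weak.append(measure)
    return groups, weak


if __name__ == "__main__":
    groups, weak = group_by_factor(LOADINGS)
    for factor in FACTORS:
        print(factor, groups[factor])
    print("no salient loading:", weak)
```

Under these assumptions, the two story tasks group on the first factor, the three STAT multiple-choice subtests group together on the second factor regardless of whether they were written as analytical, creative, or practical items, and the three practical performance tasks group on the third. Cartoons has no loading at or above the cutoff.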

Sternberg, Grigorenko, & Kidd, 2005; Turkheimer, Haley, Waldron, D'Onofrio, & Gottesman, 2003). There are a number of ways one can test for group differences in these measures, each of which involves a test of the size of the effect of ethnic group. Two different measures were chosen: omega squared (ω²) and Cohen's d.

There were two general findings. First, in terms of overall differences, the Rainbow tests appeared to reduce ethnic-group differences relative to traditional assessments of abilities like the SAT. Second, in terms of specific differences, it appears that the Latino students benefited the most from the reduction of group differences. The black students, too, seemed to show a reduction in difference from the white mean for most of the Rainbow tests, although a substantial difference appeared to be maintained with the practical performance measures.

As an example of these results, omega squared, computed by comparing scores of whites and Asian-Americans with scores of members of underrepresented minority groups (African-Americans, Hispanic-Americans, and American Indians), was .09 for the SAT Verbal and .04 for the SAT Math. For our measures, the median value was .02 (see Fig. 1).

Fig. 1. Omega squared indices for the measures used in this study. The indices compare whites and Asians versus members of underrepresented minority groups. Higher values indicate greater differences in scores.

Table 4 shows the results for the Cohen's d analyses. The units in the table are mean differences between groups in standard deviation units. The results are framed in terms of differences of various groups from the results for white students. In general, the SATs show larger effects than do the Rainbow measures. The Rainbow measures also show some variation across ethnic groups in the patterns of results. In general, differences for the Rainbow measures are less than for the SATs.

Although the group differences are not perfectly reduced, these findings suggest that measures can be designed that reduce ethnic and racial group differences on standardized tests, particularly for historically disadvantaged groups like black and Latino students. These findings have important implications for reducing adverse impact in undergraduate admissions.

Table 4
Cohen's d with Whites as a reference group.

                   Blacks   Latinos   Asians   Nat. Am.
SAT-M              −0.74    −0.98      0.35    −1.00
SAT-V              −0.67    −1.10     −0.23    −0.62
SAT-T              −0.73    −1.10      0.04    −0.76
STAT-A             −0.19    −0.36      0.34    −0.33
STAT-C             −0.67    −0.46     −0.03    −1.15
STAT-P             −0.47    −0.53      0.09    −0.66
Movies             −0.51    −0.35      0.05    −0.77
Common sense       −0.89    −0.22      0.21    −0.40
College life       −0.68    −0.22     −0.22     0.20
Cartoons           −0.24    −0.51     −0.16    −0.39
Oral stories       −0.14    −0.46     −0.50     0.50
Written stories    −0.26    −0.11     −0.25     0.01

The SAT is based on a conventional psychometric notion of cognitive skills. Using this notion, it has had substantial success in predicting undergraduate academic performance. The Rainbow measures alone roughly doubled the power to predict undergraduate GPA compared with the SAT alone. The Rainbow measures predicted substantially beyond the contributions of the SAT and high school GPA. These findings, combined with encouraging results regarding the reduction of between-ethnicity differences, make a compelling case for furthering the study of the measurement of analytic, creative, and practical skills for predicting success in the university.

One important goal for the current study, and future studies, is the creation of standardized assessments that reduce differences in outcomes between groups as much as possible while still maintaining test validity. The measures described here yielded results that move toward this end. Although the group differences in the tests were not reduced to zero, the tests did substantially attenuate group differences relative to other measures such as the SAT. This finding could be an important step toward ultimately ensuring fair and equal treatment for members of diverse groups in the academic domain.

The principles behind the Rainbow Project apply at other levels of selective admissions as well. For example, Hedlund, Wilt, Nebel, Ashford, and Sternberg (2006) have shown that the same principles can be applied in admissions to business schools, also with the result of increasing prediction and decreasing ethnic- (as well as gender-) group differences.

Stemler, Grigorenko, Jarvin, and Sternberg (2006) have found that including creative and practical items in augmented psychology and statistics Advanced Placement Examinations can reduce ethnic-group differences on the tests. Such examinations are generally taken by high school students who are identified as sufficiently gifted to take college-level courses. My colleagues and I modified Advanced Placement tests in Psychology and Statistics additionally to assess analytical, creative, and practical skills. Here is an example in psychology:

A variety of explanations have been proposed to account for why people sleep.

a) Describe the Restorative Theory of sleep (memory).
b) An alternative theory is an evolutionary theory of sleep, sometimes referred to as the “Preservation and Protection” theory. Describe this theory and compare and contrast it with the Restorative Theory. State what you see as the two strong points and two weak points of this theory compared to the Restorative Theory (analytical).
c) How might you design an experiment to test the Restorative Theory of sleep? Briefly describe the experiment, including the participants, materials, procedures, and design (creative).
d) A friend informs you that she is having trouble sleeping. Based on your knowledge of sleep, what kinds of helpful (and health-promoting) suggestions might you give her to help her fall asleep at night (practical)?

My colleagues and I found that by asking such questions, as in the other studies, we were able both to increase the range of skills we tested and substantially to reduce ethnic-group differences in test scores. Thus, it is possible to reduce group differences not only in tests of aptitudes but also in tests of achievement. Recently, we have found results for AP Physics very similar to those we found for AP Psychology and Statistics (Stemler, Sternberg, Grigorenko, Jarvin, & Sharpes, 2009).

It is one thing to have a successful research project, and another actually to implement the procedures in a high-stakes situation where one is trying to identify the most gifted students in an applicant pool. My colleagues and I have had the opportunity to do so. The results of a second project, Project Kaleidoscope, are reviewed here.

2. The Kaleidoscope Project

Tufts University in Medford, Massachusetts, USA, has strongly emphasized the role of active citizenship in education. It has put into practice some of the ideas from the Rainbow Project. In collaboration with Dean of Admissions Lee Coffin, my colleagues and I instituted
Project Kaleidoscope, which represents an implementation of the ideas of Rainbow, but goes beyond that project to include in its assessment the construct of wisdom (Sternberg, 2007b,c; for details, see Sternberg et al., submitted for publication).

Tufts University is one of the more selective universities in the United States. In conventional terms, the average SAT scores are in the low 700s and the large majority of students graduate in the top 10% of their high school class. By conventional standards, then, they would likely have been classified as gifted at some stage of their school careers. But our interest in admissions was to go beyond traditional notions of what it means to be gifted.

Lee Coffin and the Tufts Undergraduate Admissions Office placed on the 2006–2007 application for all of the over 15,000 students applying to Arts, Sciences, and Engineering at Tufts, questions designed to assess wisdom, analytical and practical intelligence, and creativity synthesized (WICS), which is the augmented form of the theory of successful intelligence (Sternberg, 2003b). The program was continued for 2007–2008 and 2008–2009, but the data reported here are for the first year, for which we have more nearly complete data.

The questions were optional in the first two years. Whereas the Rainbow Project was done as a separate high-stakes test administered with a proctor, the Kaleidoscope Project was done as a section of the Tufts-specific supplement to the Common Application. It was not practical to administer a separate high-stakes test such as the Rainbow assessment for admission to one university. Moreover, the advantage of Kaleidoscope is that it got us away from the high-stakes testing situation in which students must answer complex questions in very short amounts of time under incredible pressure.

Students were encouraged to answer just a single question so as not overly to burden them. Tufts University competes for applications with many other universities, and if our application was substantially more burdensome than those of our competitor schools, it would put us at a real-world disadvantage in attracting applicants. In the theory of successful intelligence, successfully intelligent individuals capitalize on strengths and compensate for or correct weaknesses. Our format gave students a chance to capitalize on a strength.

Measures for Year 1 of the project are described briefly in Table 5. The items change every year.

Table 5
Assessments measuring cognitive skills in the Kaleidoscope Project.

Analytical
  1. The late scholar James O. Freedman referred to libraries as “essential harbors on the voyage toward understanding ourselves.” What work of fiction or non-fiction would you include in a personal library? Why?
  2. An American adage states that “curiosity killed the cat.” If that is correct, why do we celebrate people like Galileo, Lincoln, and Gandhi, individuals who thought about longstanding problems in new ways or who defied conventional thinking to achieve great results?
Creative
  3. History's great events often turn on small moments. For example, what if Rosa Parks had given up her seat on that Montgomery bus in 1955? What if Pope John Paul I had not died in 1978 after a month in office? What if Gore had beaten Bush in Florida and won the 2000 U.S. Presidential Election? Using your knowledge of American or world history, choose a defining moment and imagine an alternative historical scenario if that key event had played out differently.
  4. Create a short story using one of the following topics:
     a. The End of MTV
     b. Confessions of a Middle School Bully
     c. The Professor Disappeared
     d. The Mysterious Lab
  7. Using an 8.5×11 in. sheet of paper, create an ad for a movie, design a house, make an object better, illustrate an ad for an object.
Practical
  5. Describe a moment in which you took a risk and achieved an unexpected goal. How did you persuade others to follow your lead? What lessons do you draw from this experience? You may reflect on examples from your academic, extracurricular or athletic experiences.
Wisdom
  6. A high school curriculum does not always afford much intellectual freedom. Describe one of your unsatisfied intellectual passions. How might you apply this interest to serve the common good and make a difference in society?

Creativity and practicality were assessed in the same way as in the Rainbow Project. Analytical quality was assessed by the organization, logic, and balance of the essay. Wisdom was assessed by the extent to which the response represented the use of abilities and knowledge for a common good by balancing one's own, others', and institutional interests over the long and short terms through the infusion of positive ethical values.

Note that the goal is not to replace the SAT and other traditional admissions measurements like grade-point averages and class rank with some new test. Rather, it is to re-conceptualize applicants in terms of academic/analytical, creative, practical, and wisdom-based abilities, using the essays as one but not the only source of information. For example, highly creative work submitted in a portfolio also could be entered into the creativity rating, as could evidence of creativity through winning of prizes or awards. The essays were major sources of information, but if other information was available, the trained admissions officers used it.

My colleagues and I now have some results of our first year of implementation, and they are very promising. Applicants were evaluated for creative, practical, and wisdom-based skills, if sufficient evidence was available, as well as for academic (analytical) and personal qualities in general.

Among the applicants who were evaluated as being academically qualified for admission, approximately half completed an optional essay. Doing these essays had no meaningful effect on chances of admission. However, quality of essays or other evidence of creative, practical, or wisdom-based abilities did have an effect. For those rated as an “A” (top rating) by a trained admission officer in any of these three categories, average rates of acceptance were roughly double those for applicants not getting an A. Because of the large number of essays (over 8000), only one rater rated each applicant, except for a sample that was rated by more than one rater to ensure that inter-rater reliability was sufficient, which it was.

Many measures do not look like conventional standardized tests, but have statistical properties that mimic them. My colleagues and I were therefore interested in convergent–discriminant validation of our measures. The correlations of our measures with a rated academic composite that included SAT scores and high school GPA were modest but significant for creative, practical, and wise thinking. The correlations with a rating of quality of extracurricular participation and leadership were moderate for creative, practical, and wise thinking. Thus, the pattern of convergent–discriminant validation was what we had hoped for: very modest correlations with the SAT and moderate correlations with measures of extracurricular and leadership activities.

The average academic quality of applicants in Arts & Sciences rose slightly in 2006–2007, the first year of the pilot, in terms of both SAT and high school grade-point average. In addition, there were notably fewer students in what before had been the bottom third of the pool in terms of academic quality. Many of those students, seeing the new application, seem to have decided not to bother to apply. Many more strong applicants applied.

Thus, adopting these new methods does not result in less qualified applicants applying to the institution and being admitted. Rather, the applicants who are admitted are more qualified, but in a broader way. Perhaps most rewarding were the positive comments from large numbers of applicants that they felt our application gave them a chance to show themselves for who they are. Of course, many factors are involved in admissions decisions, and Kaleidoscope ratings were only one small part of the overall picture.

My colleagues and I did not get meaningful differences across ethnic groups, a result that surprised us, given that the earlier Rainbow Project reduced but did not eliminate differences. And after a number of years in which applications by underrepresented minorities were relatively flat in terms of numbers, this year they went up substantially. In the end, applications from African-Americans and Hispanic-Americans increased significantly, and admissions of African-Americans were up 30% and of Hispanic-Americans up 15%.
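
The group comparisons discussed for both projects rest on standardized effect sizes such as Cohen's d and omega squared. The article does not give the formulas that were used, so the following minimal Python sketch assumes the textbook definitions: a pooled-standard-deviation d for one group against a reference group, and the one-way ANOVA estimate of omega squared. The function names and the score lists are hypothetical and purely illustrative; they are not data from either project.

```python
import math
from statistics import mean, variance


def cohens_d(group, reference):
    """Standardized mean difference (group minus reference) using a pooled SD.

    Assumed textbook formula; the article does not state which variant was used.
    """
    n1, n2 = len(group), len(reference)
    pooled_var = ((n1 - 1) * variance(group) + (n2 - 1) * variance(reference)) / (n1 + n2 - 2)
    return (mean(group) - mean(reference)) / math.sqrt(pooled_var)


def omega_squared(*groups):
    """One-way ANOVA omega squared:
    (SS_between - df_between * MS_within) / (SS_total + MS_within).
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_within = ss_within / df_within
    return (ss_between - df_between * ms_within) / (ss_between + ss_within + ms_within)


if __name__ == "__main__":
    # Made-up scores for illustration only.
    group_scores = [1.0, 2.0, 3.0]
    reference_scores = [2.0, 3.0, 4.0]
    print("Cohen's d:", cohens_d(group_scores, reference_scores))
    print("omega squared:", omega_squared(group_scores, reference_scores))
```

Under this sign convention, a negative d means the first group's mean falls below the reference group's mean, which matches the way Table 4 is framed with whites as the reference group.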

We found, at the end of the first year, that students admitted with very high scores on Kaleidoscope did just as well academically as did students who were also excellent but who were admitted to Tufts for other reasons. But we also found that the students admitted with high Kaleidoscope scores excelled, on average, in participation in extracurricular and leadership activities.

To give some sense of what we found in Kaleidoscope, here are correlations of our measure with other measures used in undergraduate admissions for applicants to the Class of 2011 at Tufts. The correlation with admissions officers' academic rating (based on information about students' high school academic performance) was .10. The correlation with SAT-V was .07 and with SAT-M was .00. The correlation with high school GPA was .00, although high school GPAs mean different things in schools of different quality, so it is hard to interpret this correlation. The correlation with admissions officers' ratings of personal qualities was .25 and the correlation with rated extracurricular activities was .49. Thus, the assessment, as desired, measured leadership skills beyond what SAT and high school GPA measure.

So our results, like those of the Rainbow Project, showed that it is possible to increase academic quality and diversity simultaneously, and to do so for an entire undergraduate class at a major university, not just for small samples of students at some scattered schools. Most importantly, my colleagues and I sent a message to students, parents, high school guidance counselors, and others, that we believe that there is more to a person than the narrow spectrum of skills assessed by standardized tests, and that these broader skills can be assessed in a quantifiable way.

3. The Aurora Project

Aurora is an assessment designed for children roughly 10 to 12 years of age that can be used for identification of gifted performers (Chart, Grigorenko, & Sternberg, 2008). Two parts comprise the battery under current development, a newly designed, augmented part (Aurora-a or Aurora-a-battery) and a more conventional, intelligence-based part (Aurora-g or Aurora-g-battery). Both are paper-and-pencil assessments intended for group administration to mainstreamed students at the elementary to middle school levels at which gifted programming is most prevalent. The augmented assessment is more substantial and is grounded in the theory of successful intelligence as presented earlier. The conventional assessment of general intellectual ability has been developed as a supplement. Of greatest importance and significance is the former, which, accordingly, I discuss more extensively here. We do not yet have analyzed validity data for Aurora, so the purpose of this discussion is to describe the assessment, not to present data.

In designing the augmented assessment, Elena Grigorenko, Mei Tan, and their colleagues used a basic grid structure to depict graphically the broad range of item types to be developed. Analytical, creative, and practical domains are depicted as columns and figural, verbal, and quantitative modes as rows (see Table 6). Subtests are created such that their dominant properties fulfill the criteria of each cell of the grid (see, for another example of such item development, Sternberg & Clinkenbeard, 1995). The result is nine different types of subtests that together assess each combination of domain and modal specificity. This design is implemented for three reasons: to anchor the assessment securely in the augmented theory of successful intelligence, to allow students balanced opportunities to demonstrate multiple and varied abilities, and to serve as a clear guide for assessing abilities across and between domains and modes.

Augmented assessment items differ in ways beyond the categorical properties of the grid. Difficulty varies from subtest to subtest, and from item to item within these. A central goal of task creation is the elimination of ceilings on each subtest to the extent possible (and reasonable) without compromising the capacity of the assessment to be given not only to students already thought to be gifted but to whole student populations without generating undue distress or anxiety. Both subtests and tasks range in length and individual questions take many forms. Some items require receptive answers, those chosen from a discrete set of options, and others productive answers, generated by the student with varying degrees of constraint. Among other variations, there are multiple-choice and fill-in-the-blank questions answered, math problems solved, lists generated, short selections written, pieces of information classified and ordered, money allocated, paths drawn, and subjective decisions made. A glance at the assessment reveals photographs, arrangements of numbers, drawings, short paragraphs, and computer-generated images. The content domains are not “pure.” For example, creative stories are told about numbers, which involves a numerical component but also a verbal one.

Table 6
Assessments measuring cognitive skills in the Aurora Project.

Images (visual/spatial)
  Analytical
    Shapes (Abstract Tangrams): complete shapes with missing pieces. (10 items) (MC)
    Floating Boats: identify matching patterns among connected boats. (5 items) (MC)
  Creative
    Book Covers: interpret an abstract picture and invent a story to accompany it. (5 items) (OE)
    Multiple Uses: devise three new uses for each of several household items. (5 items) (OE)
  Practical
    Paper Cutting: identify the proper unfolded version of a cut piece of paper. (10 items) (MC)
    Toy Shadows: identify the shadow that will be cast by a toy in a specific orientation. (8 items) (MC)

Words (verbal)
  Analytical
    Words That Sound the Same (Homophone Blanks): complete a sentence with two missing words using homonyms. (20 items) (RW)
    (Limited) Metaphors: explain how two somewhat unrelated things are alike. (10 items) (OE)
  Creative
    (Inanimate) Conversations: create dialogues between objects that cannot typically talk. (10 items) (OE)
    Interesting (Figurative) Language: interpret what sentence logically comes next after one containing figurative language. (12 items) (MC)
  Practical
    (Silly) Headlines: identify and explain an alternative “silly” meaning of actual headlines. (11 items) (RW)
    Decisions: list elements given in a scenario on either “good” or “bad” side of a list in order to make a decision. (3 items) (RW)

Numbers (numerical)
  Analytical
    Number Cards (Letter Math): find the single-digit number that letters represent in equations. (5 items) (RW)
    Story Problems (Algebra): (before any algebra training) devise ways to solve logical math problems with two or more missing variables. (5 items) (RW)
  Creative
    Number Talk: imagine reasons for various described social interactions between numbers. (7 items) (OE)
  Practical
    Maps (Logistics Mapping): trace the best carpooling routes to take between friends' houses and destinations. (10 items) (RW)
    Money (Exchange): divide complicated “bills” appropriately between friends. (5 items) (RW)

MC: Multiple-Choice.
OE: Open-ended items that need to be scored by an individual using a rating scale.
RW: Answers are either Right or Wrong.
( ) in subtest titles: Subtest titles or portions of titles no longer in use.
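
The grid logic behind Table 6 can be made concrete with a short sketch. The Python below transcribes the subtests, item counts, and scoring formats from Table 6 (using the current subtest titles, without the parenthesized portions no longer in use) and checks that every mode-by-domain cell is covered. The data layout and the function names are assumptions made for illustration; they are not part of the Aurora materials.

```python
from collections import defaultdict

MODES = ("Images", "Words", "Numbers")               # rows of the grid
DOMAINS = ("Analytical", "Creative", "Practical")    # columns of the grid

# (mode, domain, subtest title, number of items, scoring format) from Table 6.
SUBTESTS = [
    ("Images", "Analytical", "Shapes", 10, "MC"),
    ("Images", "Analytical", "Floating Boats", 5, "MC"),
    ("Images", "Creative", "Book Covers", 5, "OE"),
    ("Images", "Creative", "Multiple Uses", 5, "OE"),
    ("Images", "Practical", "Paper Cutting", 10, "MC"),
    ("Images", "Practical", "Toy Shadows", 8, "MC"),
    ("Words", "Analytical", "Words That Sound the Same", 20, "RW"),
    ("Words", "Analytical", "Metaphors", 10, "OE"),
    ("Words", "Creative", "Conversations", 10, "OE"),
    ("Words", "Creative", "Interesting Language", 12, "MC"),
    ("Words", "Practical", "Headlines", 11, "RW"),
    ("Words", "Practical", "Decisions", 3, "RW"),
    ("Numbers", "Analytical", "Number Cards", 5, "RW"),
    ("Numbers", "Analytical", "Story Problems", 5, "RW"),
    ("Numbers", "Creative", "Number Talk", 7, "OE"),
    ("Numbers", "Practical", "Maps", 10, "RW"),
    ("Numbers", "Practical", "Money", 5, "RW"),
]


def items_per_cell(subtests=SUBTESTS):
    """Sum the item counts for each (mode, domain) cell of the grid."""
    totals = defaultdict(int)
    for mode, domain, _title, n_items, _scoring in subtests:
        totals[(mode, domain)] += n_items
    return dict(totals)


def uncovered_cells(subtests=SUBTESTS):
    """Return grid cells with no subtest (an empty list if all nine are covered)."""
    covered = {(mode, domain) for mode, domain, *_rest in subtests}
    return [(m, d) for m in MODES for d in DOMAINS if (m, d) not in covered]


def open_ended_subtests(subtests=SUBTESTS):
    """Titles of OE subtests, which need a human rater using a rating scale."""
    return [title for _m, _d, title, _n, scoring in subtests if scoring == "OE"]


if __name__ == "__main__":
    for cell, n_items in sorted(items_per_cell().items()):
        print(cell, n_items)
    print("Uncovered cells:", uncovered_cells())
    print("Open-ended subtests:", open_ended_subtests())
```

Keeping the grid as plain tuples makes it easy to select a subset, for example only the creative subtests or only the figural mode, and to see how many items and how much rater scoring that subset would involve.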

Progressing across the grid of the Aurora-a-battery as if reading cells from left to right and then top to bottom, example subtests are described. Floating Boats allows students to match patterns of connected toys whose arrangement changes from one photograph to another. Book Covers allows students to generate a brief story plot to describe somewhat abstract pictures described as children's book covers. Toy Shadows allows students to choose the shadow that is made by a toy oriented in a particular way in relation to a light. Strange Metaphors allows students to generate a link between two somewhat unrelated nouns. Inanimate Conversations allows students to imagine what certain objects might say to each other if they could speak. Tough Decisions allows students to categorize given information in pro- or con-lists to make an everyday choice. Letter Math allows students to find numerical solutions to math problems with letters in place of some “missing” values. Number Talk allows students to explain the reason for a social interaction briefly described and illustrated between two cartoon numbers. Logistics Mapping allows students to compare different routes to destinations based on incremental distances provided. This selection offers a sample of the range of tasks developed for the augmented assessment.

As a supplement to the analytical, creative, and practical measures described above, a g-factor assessment has also been developed (the so-called Aurora-g-battery). Its design is likewise guided by a grid structure with identical modes, but with task types rather than skill areas informing the second axis. These are analogy, series completion, and classification tasks, all typical traditional measures of general intelligence. Analogy requires students to analyze a relationship between a pair of stimuli (images, words or numbers) and extend this relationship to a second, unfinished pair by choosing the correct stimulus from choices. Series completion requires students to evaluate the logic of a progressive series of stimuli (images, words or numbers) and choose the next stimulus in the series from choices. Finally, classification tasks require students to compare and contrast the properties of a list of stimuli (images, words or numbers) and select the one that conforms least to the others. Exactly nine subtests were developed such that the criteria of each cell of the design grid are met.

The two sections that make up the Aurora Battery are intended to complement each other by reserving a place for traditionally valued g-factor skills while expanding the scope of identification methods to recognize less formally appreciated creative and practical skills with the augmented assessment. The inclusion of both tests grants schools the ability to demonstrate the relative effectiveness of each for assessing the abilities valued in their stated definitions of giftedness and fostered through their programming. Educators are also given the opportunity to consider how the augmented assessment compares with a more traditional one in identifying students in the school's particular context without employing multiple test batteries. This single battery might therefore be uniquely applied in accordance with the needs and goals of particular schools.

Depending on the variable definitions of giftedness adopted, types of programs offered, and particular concerns of gifted educators, the Aurora Battery may be viewed as a series of assessments and therefore employed in several ways. Because the g-factor assessment (g-battery) is intended as a supplement, the use of only this portion of the battery is likely to offer schools little beyond what is already available. Conversely, the augmented assessment is designed to allow for several alternative uses. First, schools that are uninterested in or discouraged by the performance of traditional instruments with their population might use the a-battery independent of the g-battery. Alternatively, schools seeking to better identify only a particular skill, either as a complement to existing identification measures, or for selection for more specialized gifted programming, might use only part of the Aurora-a assessment. For example, creativity subtests, or only those dealing with figures as opposed to verbal and numerical modes, might be administered alone. Particularly with employment of the entire augmented assessment, educators may better meet a number of goals and enhance their gifted identification possibilities in important ways.

Aurora has been translated into a number of languages and has been standardized in diverse countries. The challenges of translating the measure and adapting it to diverse cultures are discussed in Tan et al. (2008).

4. Limitations

There are many limitations of these studies that circumscribe the conclusions that can be drawn from them.

A first limitation is that socioeconomic class is confounded with ethnicity. So ethnicity differences may be attributable, in unknown measure, to socioeconomic class differences. The differences are unlikely to be solely a function of socioeconomic class, in that, where my colleagues and I obtained differences, others have obtained similar patterns of differences (see, e.g., Loehlin, Lindzey, & Spuhler, 1975). For example, Asian-Americans did better on quantitative analytical measures than did White Americans (see also review in Lynn, 2006) and worse on the creative measures than did White Americans, but in a result with Chinese and American college students in comparably selective universities, we obtained the same result regarding creativity, regardless of whether we used Chinese or American university professors as raters (Niu & Sternberg, 2001). Moreover, Asian-Americans are generally not at a higher socioeconomic level than are whites but performed better on the quantitative analytical tests, such as the math SAT, here and in other studies (Lynn, 2006). The reason my collaborators and I did not control for socioeconomic class is that we were unable to obtain the data that would have enabled us to do so.

A second limitation is that there were problematical methodological issues in both the Rainbow and Kaleidoscope Projects. In Rainbow, my collaborators and I used an incomplete design, meaning that not all students took all tests. This made the statistical analysis complex to the point where we would not recommend the use of this design by others. In Kaleidoscope, unlike in Rainbow, assessments were done without proctoring. Thus, we cannot be certain of the conditions under which the assessments were taken, or even that it was the applicant who took the assessment. The nature of the assessments, though, makes it questionable whether parents or others who might take the assessment would do better than the applicants (for example, to cite one of the essays, many parents know far less about MTV than do their children). Moreover, an advantage of doing the assessment at home is that students have more time to think carefully and deeply than they do in a timed proctored test; often it is hard to think creatively, practically, or wisely without having sufficient time to do so.

A third limitation is that the new assessments require more time, resources, and money to score. We had to hire raters and train them. Although reliability was good, it could only be achieved with training. Schools would therefore have to decide that the additional information was worth the cost. In the Rainbow Project, my colleagues and I got substantially better prediction than SAT alone (double) or SAT plus high school GPA (roughly 50% increase) and decreased ethnic-group differences. In the Kaleidoscope Project, we did not see academic differences between groups, but these results were considered excellent, given the absence of ethnic-group differences in Kaleidoscope. So to the extent one wishes to increase diversity and maintain academic standards, these measures seem promising.

A fourth limitation is that our follow-up data at this time are restricted. For Rainbow, we had only first-year university grades. For Kaleidoscope, we have only first-semester performance and are currently analyzing the full first-year performance. For this project, we will be following up by measuring progress broadly, including nonacademic measures, during the four years the students are at the university.

A fifth limitation, in Kaleidoscope, is selection bias. Students who completed the essays were not a random sample of applicants: They chose to do extra work. However, because admission probabilities were not
related to the fact of completing the essays, only to quality of essays for those who did complete them, the bias may not have been an important factor in the results.

5. Conclusion

In sum, the augmented theory of successful intelligence appears to provide a strong theoretical basis for identification of the gifted. There is evidence to indicate that it has good incremental predictive power, and serves to increase equity. As teaching improves and university teachers emphasize more the creative and practical skills needed for success in school and life (Sternberg, Jarvin, & Grigorenko, 2009), the predictive power of the test may increase. Cosmetic changes in testing over the last century have made relatively little difference to the construct validity of assessment procedures. The augmented theory of successful intelligence (WICS) could provide a new opportunity to increase construct validity at the same time that it reduces differences in test performance between groups. It may indeed be possible to accomplish the goals of affirmative-action through tests such as the Rainbow assessments, either as supplements to traditional affirmative-action programs or as substitutes for them.

Other modern theories of intelligence, such as those mentioned earlier in the article (e.g., Ceci, 1996; Gardner, 1983), may also serve to improve prediction and increase diversity. Moreover, other approaches to supplementing the SAT, and the Rainbow tests, may be called for. For example, Oswald, Schmitt, Kim, Ramsay, and Gillespie (2004) have found biodata and situational-judgment tests (the latter of which my colleagues and I also used) to provide incremental validity to the SAT. Sedlacek (2004) has developed non-cognitive measures that appear to have had

References

Binet, A., & Simon, T. (1916). The development of intelligence in children. Baltimore: Williams & Wilkins. (Originally published in 1905.)
Bowen, W. G., & Bok, D. (2000). The shape of the river: Long-term consequences of considering race in college and university admissions. Princeton, NJ: Princeton University Press.
Bowen, W. G., Kurzweil, M. A., & Tobin, E. M. (2006). Equity and excellence in American higher education. Charlottesville, VA: University of Virginia Press.
Camara, W. J., & Schmidt, A. E. (1999). Group differences in standardized testing and social stratification. (College Board Research Rep. No. 99-5). New York, NY: The College Board. Retrieved 12/21/2006 from http://www.collegeboard.com/research/home/.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. NY: World Book Co.
Cattell, R. B. (1971). Abilities: Their structure, growth and action. Boston: Houghton Mifflin.
Ceci, S. J. (1996). On intelligence: A bioecological treatise on intellectual development. Cambridge, MA: Harvard University Press.
Chart, H., Grigorenko, E. L., & Sternberg, R. J. (2008). Identification: The Aurora Battery. In J. A. Plucker, & C. M. Callahan (Eds.), Critical issues and practices in gifted education (pp. 281–301). Waco, TX: Prufrock.
Dweck, C. S. (1999). Self-theories: Their role in motivation, personality, and development. Philadelphia: Psychology Press.
Frey, M. C., & Detterman, D. K. (2004). Scholastic assessment or g? The relationship between the Scholastic Assessment Test and general cognitive ability. Psychological Science, 15, 373–378.
Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York: Basic.
Gardner, H. (2006). Multiple intelligences: New horizons. New York: Perseus.
Golden, D. (2006). The price of admission. New York: Crown.
Guilford, J. P. (1967). The nature of human intelligence. New York: McGraw-Hill.
Hedlund, J., Wilt, J. M., Nebel, K. R., Ashford, S. J., & Sternberg, R. J. (2006). Assessing practical intelligence in business school admissions: A supplement to the Graduate Management Admissions Test. Learning and Individual Differences, 16, 101–127.
Herrnstein, R. J., & Murray, C. (1994). The bell curve. New York: Free Press.
Hezlett, S., Kuncel, N., Vey, A., Ones, D., Campbell, J., & Camara, W. J. (2001). The effectiveness of the SAT in predicting success early and late in college: A comprehensive meta-analysis. Paper presented at the annual meeting of the National Council of Measurement in Education, Seattle, WA.
Hunt, E., & Carlson, J. (2007). Considerations relating to the study of group differences
success in enhancing the university admissions process. in intelligence. Perspectives on Psychological Science, 2, 194−213.
The theory and principles of assessment described in this article can be Jensen, A. R. (1998). The g factor. Westport, CT: Praeger/Greenwood.
extended beyond the United States (Sternberg, 2004, 2007a). My Kabaservice, G. (2004). The guardians: Kingman Brewster, his circle, and the rise of the
liberal establishment. New York: Henry Holt.
colleagues and I have used assessments based on the theory of successful Karabel, J. (2006). The chosen: The hidden history of admission and exclusion at Harvard,
intelligence on five continents, and found that the general principles seem Yale, and Princeton. New York: Mariner.
to hold, although the content used to assess abilities need to differ from Kaufman, S. B., & Sternberg, R. J. (2008). Conceptions of giftedness. In S. Pfeiffer (Ed.),
Handbook of giftedness in children: Psycho-educational theory, research, and best
one locale to another. At present, we are starting a collaboration with practices (pp. 71−92). New York: Springer.
psychologists in Germany to determine whether the instruments we have Kobrin, J. L., Camara, W. J., & Milewski, G. B. (2002). The utility of the SAT I and SAT II for
used in the United States might, in suitable form, be useful there as well. admissions decisions in California and the Nation.New York: College Entrance
Examination Board College Board Report No. 2002-6.
There is no question but that the methods used in the Rainbow Lemann, N. (1999). The big test: The secret history of the American meritocracy. New York:
Project, the Kaleidoscope Project, and related projects are at early stages Farrar, Straus, & Giroux.
of development. They do not have more than 100 years of experience Loehlin, J. C., Lindzey, G., & Spuhler, J. N. (1975). Race differences in intelligence. New
York: Freeman.
behind them, as do traditional methods. What the results suggest is that Lubinski, D., Benbow, C. P., Webb, R. M., & Bleske-Rechek, A. (2006). Tracking
an argument is to be made for broader assessments—that broader exceptional human capital over two decades. Psychological Science, 17, 194−199.
assessments are not synonymous with fuzzy-headed assessments. Such Lynn, R. (2006). Race differences in intelligence: An evolutionary analysis. Augusta, GA:
Washington Summit.
assessments can improve prediction and increase diversity, rather than
McDonough, P. M. (1997). Choosing colleges: How social class and schools structure
trading off the one for the other. Broader assessments do not replace opportunity. Albany, NY: State University of New York Press.
conventional ones: They supplement them. Our results show an Niu, W., & Sternberg, R. J. (2001). Cultural influences on artistic creativity and its
important role for traditional analytical abilities in academic and other evaluation. International Journal of Psychology, 36(4), 225−241.
Oswald, F. L., Schmitt, N., Kim, B. H., Ramsay, L. J., & Gillespie, M. A. (2004). Developing a
forms of success. But these are not the only abilities that matter, and biodata measure and situational judgment inventory as predictors of college
should not be the only abilities we measure. Giftedness is not just about student performance. Journal of Applied Psychology, 89, 187−207.
g or conventional abilities. It is about wisdom, intelligence, and Rowe, D. C. (2005). Under the skin: On the impartial treatment of genetic and
environmental hypotheses of racial differences. American Psychologist, 60(1), 60−70.
creativity, synthesized. Rushton, J. P., & Jensen, A. R. (2005). Thirty years of research on race differences in
cognitive ability. Psychology, Public Policy, and Law, 11, 235−294.
Acknowledgments Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in
personnel psychology: Practical and theoretical implications of 85 years of research
findings. Psychological Bulletin, 124, 262−274.
I am grateful to the Rainbow Project Collaborators, the Kaleidoscope Sedlacek, W. E. (2004). Beyond the big test: Noncognitive assessment in higher education.
Project Collaborators, and the Aurora Project Collaborators for making this San Francisco: Jossey-Bass.
Spearman, C. (1927). The abilities of man. New York: Macmillan.
research possible. I especially want to thank Hilary Chart, Elena Stemler, S. E., Grigorenko, E. L., Jarvin, L., & Sternberg, R. J. (2006). Using the theory of
Grigorenko, Linda Jarvin, Steven Stemler, and Mei Tan. Robert J. Sternberg successful intelligence as a basis for augmenting AP exams in psychology and
is Dean of the School of Arts and Sciences and Professor of Psychology at statistics. Contemporary Educational Psychology, 31(2), 344−376.
Stemler, S., Sternberg, R. J., Grigorenko, E. L., Jarvin, L., & Sharpes, D. K. (2009). Using the
Tufts University, as well as Honorary Professor of Psychology at the
theory of successful intelligence as a framework for developing assessments in AP
University of Heidelberg. Preparation of this article was supported by Physics. Contemporary Educational Psychology, 34, 195−209.
CASL–IES grant R305H030281, ROLE–NSF grant REC 440171, and REESE– Sternberg, R. J. (1997). Successful intelligence. New York: Plume.
NSF grant REC 0633952. The Rainbow Project was supported by the Sternberg, R. J. (1998a). Abilities are forms of developing expertise. Educational Researcher,
27(3), 11−20.
College Board, and the Kaleidoscope Project has been supported by Tufts Sternberg, R. J. (1998b). A balance theory of wisdom. Review of General Psychology, 2,
University. The Aurora Project has been supported by Karen Jensen. 347−365.

Sternberg, R. J. (1999a). Intelligence as developing expertise. Contemporary Educational Psychology, 24, 359–375.
Sternberg, R. J. (1999b). The theory of successful intelligence. Review of General Psychology, 3, 292–316.
Sternberg, R. J. (2003a). Teaching for successful intelligence: Principles, practices, and outcomes. Educational and Child Psychology, 20(2), 6–18.
Sternberg, R. J. (2003b). Wisdom, intelligence, and creativity synthesized. New York: Cambridge University Press.
Sternberg, R. J. (2004). Culture and intelligence. American Psychologist, 59(5), 325–338.
Sternberg, R. J. (2005a). Accomplishing the goals of affirmative action—With or without affirmative action. Change, 37(1), 6–13.
Sternberg, R. J. (2005b). The theory of successful intelligence. Interamerican Journal of Psychology, 39(2), 189–202.
Sternberg, R. J. (2006). How can we simultaneously enhance both academic excellence and diversity? College and University, 81(1), 17–23.
Sternberg, R. J. (2007a). Culture, instruction, and assessment. Comparative Education, 43(1), 5–22.
Sternberg, R. J. (2007b). Finding students who are wise, practical, and creative. The Chronicle of Higher Education, 53(44), B11.
Sternberg, R. J. (2007c). How higher education can produce the next generation of positive leaders. In M. E. Devlin (Ed.), Futures Forum 2007 (pp. 33–36). Cambridge, MA: Forum for the Future of Higher Education.
Sternberg, R. J., Castejón, J. L., Prieto, M. D., Hautamäki, J., & Grigorenko, E. L. (2001). Confirmatory factor analysis of the Sternberg triarchic abilities test in three international samples: An empirical test of the triarchic theory of intelligence. European Journal of Psychological Assessment, 17(1), 1–16.
Sternberg, R. J., & Clinkenbeard, P. R. (1995). The triarchic model applied to identifying, teaching, and assessing gifted children. Roeper Review, 17(4), 255–260.
Sternberg, R. J., Coffin, L., Bonney, C. R., Gabora, L., Karelitz, T., & Jarvin, L. (submitted for publication). Broadening the spectrum of undergraduate admissions.
Sternberg, R. J., & Davidson, J. E. (Eds.). (2005). Conceptions of giftedness, 2nd ed. New York: Cambridge University Press.
Sternberg, R. J., Forsythe, G. B., Hedlund, J., Horvath, J., Snook, S., Williams, W. M., et al. (2000). Practical intelligence in everyday life. New York: Cambridge University Press.
Sternberg, R. J., & Grigorenko, E. L. (Eds.). (2002). The general factor of intelligence: How general is it? Mahwah, NJ: Lawrence Erlbaum Associates.
Sternberg, R. J., & Grigorenko, E. L. (2007). Teaching for successful intelligence, 2nd ed. Thousand Oaks, CA: Corwin.
Sternberg, R. J., Grigorenko, E. L., & Kidd, K. K. (2005). Intelligence, race, and genetics. American Psychologist, 60(1), 46–59.
Sternberg, R. J., Jarvin, L., & Grigorenko, E. L. (2009). Teaching for wisdom, intelligence, creativity, and success. Thousand Oaks, CA: Sage.
Sternberg, R. J., & the Rainbow Project Collaborators. (2005). Augmenting the SAT through assessments of analytical, practical, and creative skills. In W. Camara, & E. Kimmel (Eds.), Choosing students: Higher education admission tools for the 21st century (pp. 159–176). Mahwah, NJ: Lawrence Erlbaum Associates.
Sternberg, R. J., & the Rainbow Project Collaborators. (2006). The Rainbow Project: Enhancing the SAT through assessments of analytical, practical and creative skills. Intelligence, 34, 321–350.
Sternberg, R. J., the Rainbow Project Collaborators, & the University of Michigan Business School Project Collaborators. (2004). Theory based university admissions testing for a new millennium. Educational Psychologist, 39(3), 185–198.
Tan, M. T., Aljughaiman, A., Elliott, J. G., Kornilov, S. A., Ferrando Prieto, M., Bolden, D. S., et al. (2008). Considering language, culture, and cognitive abilities: The international translation and adaptation of the Aurora Assessment Battery. In E. L. Grigorenko (Ed.), Assessment of abilities and competencies in the era of globalization. New York: Springer.
Thurstone, L. L. (1938). Primary mental abilities. Chicago, IL: University of Chicago Press.
Turkheimer, E., Haley, A., Waldron, M., D'Onofrio, B., & Gottesman, I. I. (2003). Socioeconomic status modifies heritability of IQ in young children. Psychological Science, 14(6), 623–628.
Vernon, P. E. (1971). The structure of human abilities. London: Methuen.
