4 AUTHORS, INCLUDING:
Patrick S R Davidson
University of Ottawa
Claude Messier
University of Ottawa
DOI: 10.1177/0734282912467961
Article
Abstract
New editions of the Wechsler Adult Intelligence and Memory scales are now available. Yet, given
the significant changes in these new releases and the skepticism that has met them, independent
evidence on their psychometric properties is much needed but currently lacking. We administered the WAIS-IV and the Older Adult version of the WMS-IV to 145 older adults. We examined how closely our data matched the normative sample by comparing our scaled scores with
those of the publisher and by evaluating interrelations among subtests using confirmatory factor analysis. Not surprisingly, scaled scores from our sample were somewhat higher than those
from the normative sample on some tests. Factor analysis on our sample provided support for
a higher-order model of the WAIS-IV/WMS-IV Older Adults battery combined. In addition, allowing some subtests to load on more than one factor significantly improved model fit. The best
fitting model for our sample was also the best for the normative sample. Overall, the data suggest that the factor analysis models generated from the normative samples for the new WAIS-IV
and WMS-IV are reliable.
Keywords
memory, intelligence, neuropsychological tests, normative data, confirmatory factor analysis
The Wechsler Adult Intelligence and Memory scales are among the most commonly used by
neuropsychologists (Butler, Retzlaff, & Vanderploeg, 1991; Rabin, Barr, & Burton, 2005; Sullivan
& Bowden, 1997) and have been considered by many to be the gold standard (Hartman, 2009;
Stanos, 2004). Recently, new editions of both tests have been released: the Wechsler Adult Intelligence Scale–Fourth Edition (WAIS-IV; Wechsler, 2008) and the Wechsler Memory Scale–Fourth Edition (WMS-IV; Wechsler, 2009). These new versions were deemed necessary to
improve the match with the psychological constructs they are purported to measure and to provide updated norms. Yet, given the significant changes in these new releases and the questions
that have met them (e.g., Loring & Bauer, 2010), independent evidence on their psychometric
properties is much needed but currently lacking. Here, we present data from 145 older adults
who completed the WAIS-IV and the Older Adult version of the WMS-IV. We examined how
closely our data matched the normative sample by comparing our scaled scores with those of the
publisher and by evaluating interrelations among subtests in our data using covariances and factor analyses. We then examined the factor structure of the WAIS-IV and the WMS-IV Older
Adult battery in the normative sample. In addition, we tested for measurement invariance in the
covariance structure across the two samples.
Corresponding Author:
Claude Messier, School of Psychology, University of Ottawa, 136 Jean-Jacques Lussier Room 2076A, Ottawa, Ontario,
Canada K1N 6N5.
Email: cmessier@uottawa.ca
Miller et al.
Table 1. Evolution of Factors in the Various Versions of the WAIS and WMS.

WAIS (1955)
  FSIQ
    Verbal scale: Information, Comprehension, Arithmetic, Similarities, Digit span, Vocabulary
    Performance scale: Digit symbol, Picture completion, Block design, Picture arrangement, Object assembly

WAIS-R (1981)
  FSIQ
    VIQ: Information, Comprehension, Arithmetic, Digit span, Similarities, Vocabulary
    PIQ: Picture arrangement, Picture completion, Block design, Object assembly, Digit symbol

WMS (1945)
  Memory scale score: Information, Orientation, Mental control, Logical memory, Digit span (forward and backward), Visual reproduction, Associate learning

WMS-R (1987)
  General memory composite: Figural memory, Logical memory I, Visual paired associates I, Verbal paired associates I, Visual reproduction I
  Attention/concentration composite: Mental control, Digit span, Visual memory span
  Verbal memory: Logical memory, Verbal paired associates
  Visual memory: Figural memory, Visual paired associates, Visual reproduction
  Delayed recall index: Logical memory II, Visual paired associates II, Verbal paired associates II, Visual reproduction II

WAIS-III (1997)
  FSIQ
    VIQ
      VCI: Vocabulary, Similarities, Information
      WMI: Arithmetic, Digit span, Letter number sequencing
    PIQ
      POI: Picture completion, Block design, Matrix reasoning
      PSI: Digit symbol coding, Symbol search

WMS-III (1997)
  Immediate memory
    Auditory immediate: Logical memory I, Verbal paired associates I
    Visual immediate: Faces I, Family pictures I
  General memory
    Auditory (delayed): Logical memory II, Verbal paired associates II
    Auditory recognition delayed: Logical memory II recognition, Verbal paired associates II
    Visual (delayed): Faces II, Family pictures II
  Working memory
    Auditory: Letter-number sequencing
    Visual: Spatial span

WAIS-IV (2008)
  FSIQ
    VCI: Similarities, Vocabulary, Information
    PRI: Block design, Matrix reasoning, Visual puzzles
    WMI: Arithmetic, Digit span
    PSI: Coding, Symbol search

WMS-IV (2009)
  Auditory memory: Logical memory I and II, Verbal paired associates I and II
  Visual memory: Visual reproduction I and II, Designs I and II (adult battery only)
  Visual working memory (adult battery only): Spatial addition (adult battery only), Symbol span
  Immediate memory: Logical memory I, Verbal paired associates I, Visual reproduction I
  Delayed memory: Logical memory II, Verbal paired associates II, Visual reproduction II

Note. FSIQ = full scale IQ; PIQ = performance IQ; VIQ = verbal IQ; PSI = processing speed index; VCI = verbal comprehension index;
PRI = perceptual reasoning index; POI = perceptual organization index; WMI = working memory index.
because they completed the WMS-IV Older Adult battery only. Thus, to date, no study has evaluated the factor structure of the WAIS-IV and WMS-IV in older adults.
Method
Participants
The study presented here was approved by the Ethics Committee of the University of Ottawa.
One hundred and forty-five (94 females: 65%) community-dwelling people between 65 and 92
years of age (mean = 73.17 years, SD = 6.50) were recruited from diverse socioeconomic backgrounds, using advertisements in two free magazines for seniors and flyers in community centers and subsidized housing buildings. Participants' education ranged from 7 to 22 years (mean
= 13.96 years, SD = 2.83); 2.1% of participants had Grade 8 or less, 13.8% had between Grade
9 and Grade 12, 33.1% had a high school diploma, 17.9% had some college or university, and
33.1% had a bachelor's, graduate, or professional degree. The exclusion criteria included age
younger than 65, lack of proficiency in English, diabetes, brain disease, chronic hepatitis, and
presence of mental health problems such as anxiety and depression. Participants were compensated CAN$100. In the sample, 87.6% were Caucasian, 0.7% African American, 3.4% Asian,
4.8% South Asian, 0.7% Hispanic, and 2.8% were from a mixed background. Sixty-six percent
of the sample reported experiencing memory problems.
The publisher's normative sample consisted of 286 participants who completed both the
WAIS-IV and the WMS-IV Older Adults battery. The mean age of participants in this subset of
the normative sample was 78.78 years (SD = 6.91). In this sample, 17% of people had Grade 8
or less, 13% had between Grade 9 and Grade 12, 38% had a high school diploma, 19% had some
college or university, and 13% had a bachelor's, graduate, or professional degree (see Note 1).
Measures
Wechsler Adult Intelligence Scale–Fourth Edition. The 10 core subtests yield four index
scores (verbal comprehension, perceptual reasoning, working memory, and processing speed), as
well as Full Scale IQ. The WAIS-IV was normed on 2,200 people aged 16 to 90 years old, 600
of whom were over the age of 65 (mean age of 75.68 years, SD = 7.68). In that sample, 14% of
people had Grade 8 or less, 12% had between Grade 9 and Grade 12, 35% had a high school
diploma, 20% had some college or some university education, and 19% had a bachelor's, graduate, or professional degree. Full Scale IQ construct validity was assessed by the publisher using
a number of other cognitive measures including the WAIS-III (r = 0.94) and the subtests of the
WMS-III (rs ranging from 0.34 to 0.69). For people 65 years of age and older, reliability
coefficients for the WAIS-IV subtests range from r = 0.78 to r = 0.96, and those for the WAIS-IV composite scores range from r = 0.91 to r = 0.98. The reliability coefficient for Full Scale IQ is r =
0.98 (Wechsler, 2008).
Wechsler Memory Scale–Fourth Edition. The Older Adult battery (for people 65 to 90 years
old) consists of seven subtests: logical memory 1 and 2, verbal paired associates 1 and 2, visual
reproduction 1 and 2, and symbol span, yielding four indexes: auditory memory, visual memory,
immediate memory, and delayed memory. The WMS-IV Older Adult battery was normed on 500
people aged 65 to 90 (mean age of 77.35 years, SD = 7.11). In that sample, 13% of people had
Grade 8 or less, 13% had between Grade 9 and Grade 12, 35% had a high school diploma, 19%
had some college or some university education, and 20% had a bachelor's, graduate, or professional degree. According to the publisher, the WAIS-IV FSIQ index's correlations with the different subtests of the WMS-IV Older Adult battery range from r = 0.44 to r = 0.62, and those with the
WMS-IV index scores range from r = 0.57 to r = 0.71. The reliability coefficients for the WMS-IV Older Adult battery subtests range from r = 0.74 to r = 0.96, and those for the indexes range from
r = 0.92 to r = 0.97.
Analyses
Analyses of Variance. To determine how similar the normative data were to our new
sample, we obtained the scaled scores (i.e., age-adjusted; mean = 10, SD = 3) for healthy older
adults from the normative samples for the WAIS-IV (n = 600) and WMS-IV (n = 500) from the
publisher. We compared their data against ours using a pair of mixed analyses of variance (ANOVAs), one for WAIS-IV subtests and the other for WMS-IV subtests that were included in the
factor analyses. We report effect sizes (Cohen's d) in all post hoc comparisons to help interpret
the practical significance of these findings: d = 0.2 is considered small, d = 0.5 moderate, and d
= 0.8 large (Cohen, 1988).
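The effect-size computation described above can be sketched as follows. This is a minimal illustration rather than the authors' own procedure: it assumes the pooled-standard-deviation form of Cohen's d, takes the coding mean and SD for the present sample (11.44 and 2.60) as read from Table 2, and assumes the nominal normative scaled-score mean of 10 and SD of 3 for the 600 normative participants. The function name `cohens_d` is hypothetical.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# Present sample (coding subtest, values as read from Table 2) versus the
# nominal normative scaled-score distribution (mean 10, SD 3; an assumption).
d_coding = cohens_d(11.44, 2.60, 145, 10.0, 3.0, 600)
print(round(d_coding, 2))  # 0.49
```

Under these assumptions the result agrees with the d = 0.49 reported for coding in the Results, which sits just below the conventional 0.5 benchmark for a moderate effect.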
Correlations. Before proceeding with our factor analyses, we ran exploratory Pearson correlations among subtest scores (shown in Table 2). Following the procedure employed by Holdnack et al. (2011), we omitted the immediate versions of the WMS-IV subtests (e.g., logical
memory 1) from our analyses.
Confirmatory Factor Analyses. We used AMOS-18 and AMOS-19 to identify the best-fitting of
our four main a priori specified models. Confirmatory factor analysis (CFA) is preferred over exploratory factor analysis when
a specific theoretical model exists (Tabachnick & Fidell, 2007).
Invariance Analyses. We used AMOS-19 to test for strong factorial invariance across the two
groups by constraining factor loadings and intercepts to be equal (constraints were imposed on
all factor loadings and latent factors in the model).
Models. We began by replicating the typical model for the WAIS-IV alone, given that the WAIS-IV model is very similar to its previous versions, and has been relatively well accepted.
Higher-order models presented below include general ability as an overarching second-order
factor, whereas first-order models do not. The typical WAIS model (shown in Figure 1) is a
higher-order model (HO WAIS-IV), that includes a second-order general ability factor and
first-order verbal comprehension (similarities, vocabulary, and information subtests), perceptual reasoning (block design, matrix reasoning, and visual puzzles subtests), working memory (arithmetic and digit span subtests), and processing speed (coding and symbol search
subtests) factors. We also evaluated a first-order model of the WAIS-IV (FO WAIS-IV),
which was identical to the higher-order model except that it did not include the second-order
general ability factor. We examined the modification indices for potential cross-loading paths
that would improve the model fit.
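The difference between the first-order and higher-order specifications can be made concrete by counting free parameters. The sketch below is an illustration under stated assumptions, not the authors' AMOS setup: it assumes one common identification scheme (all loadings free with factor variances fixed at 1 in the first-order model; one marker loading per factor and a general-factor variance fixed at 1 in the higher-order model). The function names and the `extra_free` argument (for cross-loadings or correlated unique errors) are hypothetical.

```python
def n_moments(p):
    """Number of distinct variances and covariances among p observed subtests."""
    return p * (p + 1) // 2


def df_first_order(p, k, extra_free=0):
    """df for a first-order model with p indicators and k correlated factors."""
    # p loadings + p unique error variances + k(k-1)/2 factor covariances
    free = p + p + k * (k - 1) // 2 + extra_free
    return n_moments(p) - free


def df_higher_order(p, k, extra_free=0):
    """df for a model in which one second-order factor explains the k first-order factors."""
    # (p - k) free first-order loadings + p unique error variances
    # + k first-order disturbances + k second-order loadings
    free = (p - k) + p + k + k + extra_free
    return n_moments(p) - free


# WAIS-IV alone: 10 core subtests, 4 first-order factors
print(df_first_order(10, 4), df_higher_order(10, 4))  # 29 31
```

With 10 subtests and 4 factors this counting gives 29 and 31 degrees of freedom, the values listed for the FO and HO WAIS-IV models in Table 3. The higher-order model is the more parsimonious of the two because it replaces six factor covariances with four second-order loadings.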
We then added scores from the WMS-IV to evaluate the best-fitting possible model
advanced by Holdnack et al. (2011). First we tested the first-order model, which consisted
of the same verbal comprehension, perceptual reasoning, working memory, and processing
speed factors as the WAIS-IV-only models, but also included the publisher's delayed memory factor from the WMS-IV (logical memory 2, verbal pairs 2, and visual reproduction 2
subtests) and added the symbol span subtest to the working memory factor. We examined the
modification indices for cross-loading paths that would improve the model fit. In addition,
we examined whether the cross-loadings described in Holdnack et al. (2011) would also
improve the model fit in our models. The variants included freeing up the correlated uniqueness of error terms 8 and 9, which was also kept for all subsequent variants (FOa. WAIS/
WMS-IV), allowing the arithmetic subtest to cross-load on the verbal comprehension and
working memory factors (FOb. WAIS/WMS-IV), allowing the logical memory 2 subtest to
cross-load on the delayed memory and the verbal comprehension factors (FOc. WAIS/
WMS-IV), allowing the visual reproduction 2 subtest to cross-load on the perceptual reasoning and delayed memory factors (FOd. WAIS/WMS-IV), and allowing the visual reproduction 2 subtest to cross-load on the perceptual reasoning and delayed memory factors and the
symbol span subtest to cross-load on the delayed memory and the working memory factors
(FOe. WAIS/WMS-IV; see Table 4).
Table 2. Correlation Matrix With Means and Standard Deviations of Subtest Scaled Scores for Our Sample.

           BD     Si    Dsp Matrix    Voc  Arith     SS   VPuz     In     CD    LM1    LM2    VP1    VP2    VR1    VR2    SSp
Si      0.308      1
Dsp     0.201  0.216      1
Matrix  0.514  0.411  0.319      1
Voc     0.193  0.638  0.308  0.394      1
Arith   0.333  0.332  0.436  0.395  0.351      1
SS      0.360  0.257  0.260  0.273  0.273  0.213      1
VPuz    0.483  0.199  0.208  0.433  0.152  0.355  0.389      1
In      0.283  0.429  0.163  0.419  0.494  0.307  0.183  0.268      1
CD      0.459  0.311  0.242  0.445  0.320  0.292  0.589  0.353  0.258      1
LM1     0.065  0.049  0.010  0.107  0.107  0.069  0.081  0.068  0.061  0.026      1
LM2     0.160  0.249  0.120  0.228  0.344  0.328  0.218  0.189  0.361  0.152  0.017      1
VP1     0.007  0.131  0.033  0.009  0.018  0.025  0.113  0.028  0.011  0.014  0.338  0.132      1
VP2     0.159  0.236  0.253  0.240  0.316  0.170  0.311  0.182  0.281  0.202  0.283  0.451  0.042      1
VR1     0.221  0.009  0.041  0.029  0.005  0.036  0.166  0.063  0.039  0.137  0.342  0.109  0.245  0.121      1
VR2     0.246  0.145  0.077  0.332  0.015  0.063  0.252  0.221  0.202  0.141  0.065  0.268  0.053  0.301  0.109      1
SSp     0.338  0.220  0.372  0.498  0.310  0.207  0.345  0.351  0.320  0.405  0.038  0.310  0.084  0.388  0.155  0.256      1
Means    9.96   9.91  10.01  10.25  10.68  10.35  10.66   9.74  10.68  11.44  11.01  11.01  11.07  11.37   9.91   9.03  10.22
SD       2.63   3.03   3.07   3.24   3.01   2.77   2.55   2.52   2.53   2.60   2.79   2.73   2.81   2.85   3.15   3.23   2.72

Note. BD = block design; Si = similarities; Dsp = digit span; Matrix = matrix reasoning; Voc = vocabulary; Arith = arithmetic; SS = symbol search; VPuz = visual puzzles; In = information;
CD = coding; LM1 = logical memory 1; LM2 = logical memory 2; VP1 = verbal pairs 1; VP2 = verbal pairs 2; VR1 = visual reproduction 1; VR2 = visual reproduction 2; SSp = symbol
span.
Figure 1. Higher-order model for the WAIS-IV using the present sample.
We then evaluated a higher-order model for our sample by adding a second-order general ability factor (HOa WAIS/WMS-IV; shown in Figure 2). This provided information regarding the
statistical contribution of the general ability factor to the model fit.
Next, we conducted the same factor analyses on the normative sample, using the publisher's
data on the 286 older adults who completed both the WAIS-IV and the Older Adult battery of the
WMS-IV (see Figure 3).
For all models, we used a χ² test to evaluate goodness of fit (Byrne, 2001). However, because
χ² is potentially over-sensitive to larger sample sizes, we examined additional fit indices (as suggested by Barrett, 2007; Byrne, 2001): the adjusted goodness-of-fit index (AGFI; Bentler,
1983), root mean squared error of approximation (RMSEA; Steiger, 1990), standardized root
mean square residual (SRMR; Bentler & Wu, 1995), Tucker–Lewis nonnormed fit index (TLI;
Tucker & Lewis, 1973), comparative fit index (CFI; Bentler, 1990), and Schwarz's Bayesian
information criterion (BIC; Schwarz, 1978). RMSEA indicates the extent of fit between the
model and the population covariance matrix under optimal parameter values; adequate fit is
indicated by RMSEA values of 0.05 or less. SRMR indicates the match between the observed
and implied model covariance matrices; smaller residuals indicate better fit, and values less
than 0.08 are considered a good fit (Hu & Bentler, 1999; Meade, Johnson, & Braddy, 2008). CFI
reflects how well the hypothesized model fits relative to the independence model, in which all correlations among variables are zero; a good fit occurs when CFI is 0.95 or higher (Hu & Bentler,
1999). Smaller BIC values are preferred, and a difference of more than 10 points in the indices
suggests a better model fit (Raftery, 1993).
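Two of these indices can be recomputed directly from a model's χ², its degrees of freedom, and the sample size. The sketch below is illustrative only. The RMSEA formula is the standard point estimate; the BIC form (χ² plus the number of free parameters times ln N) is an assumption, chosen because it reproduces the BIC values printed in Table 3, and other software may define BIC with different constants. The function names are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """Point estimate of RMSEA from the model chi-square, its df, and sample size n."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def bic(chi2, n_free_params, n):
    """BIC as chi-square plus a per-parameter penalty of ln(n) (assumed form)."""
    return chi2 + n_free_params * math.log(n)


# First-order WAIS-IV model in the present sample (Table 3):
# chi2 = 39.139, df = 29, N = 145, and 55 - 29 = 26 free parameters.
fo_rmsea = rmsea(39.139, 29, 145)
fo_bic = bic(39.139, 26, 145)

# Higher-order WAIS-IV model: chi2 = 42.860, df = 31, 55 - 31 = 24 free parameters.
ho_rmsea = rmsea(42.860, 31, 145)
ho_bic = bic(42.860, 24, 145)

print(round(fo_rmsea, 3), round(ho_rmsea, 3))
print(round(fo_bic, 1), round(ho_bic, 1))
```

Under these assumptions the higher-order model has the slightly larger χ² but the smaller BIC, because it estimates two fewer parameters; this is the sense in which BIC rewards parsimony.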
Results
ANOVA
Our sample's WAIS-IV and WMS-IV scores are shown in Table 2. For WAIS-IV, a mixed 2
(sample: ours vs. normative) × 10 (subtest) ANOVA yielded no significant main effect of sample
(F[1, 743] = 1.91, MSE = 46.47, p = .17), but a significant effect of subtest (F[9, 6687] = 6.50,
MSE = 4.63, p < .001), and a significant interaction between sample and subtest (F[9, 6687] =
9.71, MSE = 4.63, p < .001). Post hoc independent t-tests with the alpha level Bonferroni corrected to 0.005
indicated that two of our sample's subtest scores were significantly above the normative means
(for all normative WAIS and WMS scaled scores, mean = 10 and SD = 3): information (t[743]
= 3.07, p = .002, d = 0.31) and coding (t[743] = 5.10, p < .001, d = 0.49). Our sample's vocabulary scores were marginally higher than the normative group's (t[743] = 2.78, p = .006, d = 0.26).
The Cohen's d values suggested that the differences between our sample and the normative data
were small (on vocabulary and information) to moderate (on coding).
For WMS-IV, a mixed 2 (sample: independent vs. normative) × 7 (subtest) ANOVA indicated a main effect of sample (F[1, 642] = 6.29, MSE = 30.33, p = .01), a main effect of subtest
(F[6, 3852] = 17.70, MSE = 5.37, p < .001), and a significant interaction between sample and
subtest (F[6, 3852] = 16.62, MSE = 5.37, p < .001). Post hoc independent t-tests (Bonferroni
corrected to 0.007) showed that four of our scores were significantly above the normative mean:
logical memory 1 (t[643] = 4.05, p < .001, d = 0.38), logical memory 2 (t[643] = 2.91, p = .004,
d = 0.28), verbal paired associates 1 (t[643] = 4.45, p < .001, d = 0.43), and verbal paired associates
2 (t[643] = 4.12, p < .001, d = 0.39); these differences were in the small-to-moderate range. One
of our scores was significantly below the normative mean: visual reproduction 2 (t[643] = 4.27,
p < .001, d = 0.39).
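The corrected thresholds used in these post hoc tests are consistent with a simple Bonferroni division. The sketch below assumes a family-wise alpha of 0.05, which the text does not state explicitly, divided by the number of subtests compared in each battery. The function names are hypothetical.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-comparison alpha after a Bonferroni correction."""
    return family_alpha / n_tests


def is_significant(p_value, alpha):
    """True when the p value falls below the corrected threshold."""
    return p_value < alpha


wais_alpha = bonferroni_alpha(0.05, 10)  # 10 WAIS-IV subtests
wms_alpha = bonferroni_alpha(0.05, 7)    # 7 WMS-IV subtests

# Reported p values from the post hoc comparisons above
print(is_significant(0.002, wais_alpha))  # information: True
print(is_significant(0.006, wais_alpha))  # vocabulary: False
print(is_significant(0.004, wms_alpha))   # logical memory 2: True
```

This is why the vocabulary difference (p = .006) is described as only marginal: it would pass an uncorrected 0.05 criterion but not the corrected threshold of 0.005.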
Correlations
As expected, all correlations among the subtests were positive and almost all were statistically
significant (even when we used a stringent alpha level of 0.005, to adjust for multiple correlations), as shown in Table 2. Particularly high correlations occurred between scores that are part
of the same index. For example, vocabulary and similarities both load on the verbal comprehension index and yielded r = 0.64, and symbol search and coding both load on the processing speed
index and yielded r = 0.59.
Table 3. First-Order and Higher-Order Models for the WAIS-IV Using the Present Sample.

Model                   χ²      df  AGFI   RMSEA  SRMR   CFI    TLI    BIC
FO WAIS-IV our sample   39.139  29  0.905  0.049  0.049  0.976  0.963  168.534
HO WAIS-IV our sample   42.860  31  0.901  0.052  0.052  0.972  0.960  162.301

Note. AGFI = adjusted goodness-of-fit index; RMSEA = root mean squared error of approximation; SRMR = standardized root mean square residual; CFI = comparative fit index; TLI = Tucker–Lewis nonnormed fit index; BIC = Schwarz's
Bayesian information criterion.
Table 4. First-Order and Higher-Order Models for the WAIS-IV/WMS-IV Using the Present Sample.

Model              χ²       df  AGFI   RMSEA  SRMR   CFI    TLI    BIC
FO WAIS/WMS-IV     123.693  67  0.835  0.077  0.064  0.904  0.870  312.809
FOa. WAIS/WMS-IV   101.767  66  0.856  0.061  0.061  0.940  0.917  …
FOb. WAIS/WMS-IV   101.525  65  0.854  0.062  0.061  0.938  0.914  …
FOc. WAIS/WMS-IV   101.212  65  0.855  0.062  0.060  0.939  0.914  …
FOd. WAIS/WMS-IV   97.317   65  0.860  0.059  0.058  0.945  0.924  …
FOe. WAIS/WMS-IV   96.348   64  0.859  0.059  0.057  0.945  0.922  …
HOa. WAIS/WMS-IV   110.216  71  0.855  0.062  0.062  0.934  0.915  279.425
factor) would significantly improve the model fit. Freeing up the two unique error variances
led to a χ² reduction of 22 points, df = 1, p < .001, and a higher CFI (0.940), higher TLI (0.917),
and lower RMSEA (0.061). In addition, we examined whether the cross-loadings described in
Holdnack et al. (2011) would also improve the model fit in our models. Only one of the cross-loadings (allowing the visual reproduction subtest to cross-load on the perceptual reasoning
and delayed memory factors) had a significant χ² value; however, the factor loading was small
(less than 0.25), so we did not retain this path in our final model. We then evaluated a higher-order model for our sample by adding a second-order general ability factor (HOa WAIS/
WMS-IV; shown in Figure 2). The fit statistics of the two models were comparable; however,
as with the WAIS-IV-only model, the BIC value favoured the more parsimonious higher-order model. Thus, in the end we retained the higher-order model with freed up unique error
variances of the arithmetic and symbol span subtests (HOa WAIS/WMS-IV).
Once we had completed the analyses for our sample, we returned to the normative dataset and
replicated our initial analyses with it (Figure 3 and Table 5). We found essentially the same patterns in the normative dataset as we found in our sample. That is, freeing up the unique error
variances of the same two subtests significantly improved the model fit. In the normative sample
model, the same cross-loading path led to a significant χ² value (allowing the visual reproduction subtest
to cross-load on the perceptual reasoning and delayed memory factors). In this model, however,
Figure 2. Higher-order model for the combined WAIS-IV and WMS-IV batteries for the present sample.
Figure 3. Higher-order model for the combined WAIS-IV and WMS-IV batteries for the normative sample.
Table 5. First-Order and Higher-Order Models for the WAIS-IV/WMS-IV Using the Normative Sample.
χ²
TLI
BIC
df
FO WAIS/WMS-IV
FOa. WAIS/WMS-IV
FOb. WAIS/WMS-IV
145.974 67 0.891
134.731 66 0.898
129.773 65 0.900
0.064
0.060
0.059
FOc. WAIS/WMS-IV
128.572 65 0.902
0.059
FOd. WAIS/WMS-IV
109.829 65 0.913
0.049
FOe WAIS/WMS-IV
102.439 64 0.918
0.046
0.051
7.39
.000
.026
r = 0.24
.013
r = 0.27
.000
r = 0.40
.007
r = 0.28
the factor loading was higher (0.38), thus we retained the cross-loading path in the model. Adding
the general ability factor as a second-order factor in the model led to a better fitting model as
indicated by the lower BIC value. The model that we retained was the higher-order model with
freed up unique error variances for the arithmetic and symbol span subtests and a cross-loading
of the visual reproduction subtest to the perceptual reasoning and delayed memory factors (HOd.
WAIS/WMS-IV normative sample).
In addition to examining the factor structure of the WAIS-IV and WMS-IV batteries combined in our sample and the normative sample, we conducted analyses of invariance to test the
assumption of equal variance across the two samples. We imposed equality of variance constraints on all factor loadings and all latent variables, including the second-order general ability
factor (Model 2; see Table 6). The model we used to test the equality of variance assumption was
the most parsimonious higher-order model with freed up unique error variances for the arithmetic and symbol span subtests. Results indicated that there was a statistically significant difference
between Model 1 (the model with no constraints imposed) and Model 2, Δχ² = 57.002, df = 16,
p < .001; thus we failed to establish strong measurement invariance. Next, we removed the constraints imposed on the second-order general ability factor to evaluate the contribution of the
factor to the lack of invariance across the two samples and compared this model to Model
2. Results indicated a statistically significant departure from invariance across the two samples on
the general ability factor, Δχ² = 5.164, df = 1, p = .023. Next, we systematically
evaluated the invariance in the other five factors by removing the equality constraints for one
factor at a time (see Table 7). The only other factor for which results were statistically significant
was the perceptual reasoning factor, Δχ² = 7.624, df = 2, p = .022. Thus, we established weak
measurement invariance between our sample and the normative sample, but failed to establish
strong measurement invariance. The two factors contributing to the lack of invariance in the most constrained model were the general ability factor and the perceptual reasoning factor.
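The nested-model comparisons reported above are χ² difference tests, which can be sketched with the standard library alone. This is an illustration rather than the AMOS procedure the authors ran: the helper `chi2_sf` computes the upper-tail probability in closed form for integer degrees of freedom, and both function names are hypothetical. The χ² and df values are those listed for Models 1 to 3 and Model 5 in Table 6.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 case and add one term per extra 2 df
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total


def chi2_difference_test(chi2_constrained, df_constrained, chi2_free, df_free):
    """Return (delta chi-square, delta df, p) for two nested models."""
    d_chi2 = chi2_constrained - chi2_free
    d_df = df_constrained - df_free
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)


# Model 2 (all constraints) versus Model 1 (no constraints)
strong = chi2_difference_test(308.985, 158, 251.983, 142)
# Model 2 versus Model 3 (general ability constraint removed)
general = chi2_difference_test(308.985, 158, 303.821, 157)
# Model 2 versus Model 5 (perceptual reasoning constraints removed)
perceptual = chi2_difference_test(308.985, 158, 301.361, 156)

for d_chi2, d_df, p in (strong, general, perceptual):
    print(round(d_chi2, 3), d_df, round(p, 3))
```

A significant difference means that the equality constraints worsen the fit, so the constrained parameters cannot be treated as equal across the two samples.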
Table 6. Measurement Invariance Testing Between the Present and the Normative Sample.

Model    Specifications                                   χ²       df   p     Δχ²     Δdf  p
Model 1  No constraint                                    251.983  142  .000
Model 2  Strong invariance testing                        308.985  158  .000  57.002  16   .000
Model 3  General ability factor constraint removed        303.821  157  .000  5.164   1    .023
Model 4  Verbal comprehension factor constraint removed   303.372  156  .000  5.613   2    .060
Model 5  Perceptual reasoning factor constraint removed   301.361  156  .000  7.624   2    .022
Model 6  Working memory factor constraint removed         303.510  156  .000  5.475   2    .065
Model 7  Processing speed factor constraint removed       303.813  156  .000  5.172   2    .075
Model 8  Delayed memory factor constraint removed         303.762  156  .000  5.223   2    .073
Discussion
We collected independent evidence on the new WAIS-IV and WMS-IV for older adults. Our
scaled scores for the WAIS-IV and WMS-IV subtests were relatively close to the published
norms, albeit slightly higher. This finding is similar to many previous studies bringing community members into a university for testing (e.g., Glisky, Rubin, & Davidson, 2001; Salthouse,
2010; Soubelet & Salthouse, 2011; Tucker-Drob, 2011). We had a slightly younger, more highly
educated sample than the WAIS and WMS normative groups. The largest differences between
our sample and the normative one were for coding (d = 0.49) and verbal paired associates 1 (d
= 0.43), but these were still only approximately half a standard deviation in size. Note too that
our mean scaled scores were not always above the norm: visual reproduction 2 was significantly
below the normative mean.
Our best-fitting models were very similar to the ones previously published in young and
middle-aged people from the normative sample (Holdnack et al., 2011). This was the case even
though we had to omit the designs and spatial addition WMS-IV subtests used by those authors,
because those subtests are not part of the WMS-IV Older Adult battery. In addition, the freeing
of the same unique error variance led to a significant improvement in the model fit in our sample
and the normative sample. Once we had ascertained the best-fitting model for our data, we found
that it also fit well with the normative data.
Thus, even though cognition declines with age (especially in memory and processing speed),
the interrelations among the factors that make up the WAIS-IV and WMS-IV appear to remain
relatively stable in aging. Consistent with this idea, Salthouse and Saklofske (2010) reported that
the factor structure of the WAIS-IV normative sample data was similar in younger and older
adults (see also Bowden, Weiss, Holdnack, & Lloyd, 2006).
We were able to establish weak measurement invariance between our sample and the normative sample, indicating that the factor loading variances remained the same across the two samples. However, we failed to establish strong measurement invariance; the two factors contributing
to the variability across the two samples were general ability and perceptual reasoning. This
finding makes clear the need for new, independent samples to be collected and compared against
the normative one.
Future Work
The present study is the first to independently examine the factor structure of the combined
WAIS-IV and WMS-IV Older Adult batteries. In future work, along with further replication of
the aging findings in new datasets, we must also examine performance in dementia as well as in
other (e.g., developmental and psychiatric) disorders. When constructing the new norms, the
publisher screened out possibly impaired participants using a new brief cognitive status test. The
publisher also provides normative data from people with Alzheimer's disease and mild cognitive
impairment, but these need to be supplemented by researchers in the field. For instance, only 36
people with MCI (collapsed across subtype) were administered the Older Adult WMS-IV
(Wechsler, 2009). Arguably the best strategy would be to follow cognitively normal and mildly
impaired participants for a few years and then retroactively exclude those who end up showing
signs of dementia. Of course, very few studies do this, for reasons of feasibility.
Finally, further theoretical and empirical work is needed on WAIS-IV and WMS-IV. On a
theoretical level, although both the WAIS and WMS have evolved to better conform to current
theories of intelligence, cognition, and neuropsychology (Coalson et al., 2010; Drozdick,
Wahlstrom, Zhu, & Weiss, 2012; Kaufman, 2010), in particular the WAIS remains the focus of
considerable controversy. For example, many researchers have argued that the WAIS is better
described by the Cattell–Horn–Carroll theory than by the model outlined in the Wechsler manual
(e.g., Benson et al., 2010; Ward, Bergman, & Hebert, 2012; for a review, see McGrew, 2009),
but there is still disagreement over this issue, and competing theories and measures of intelligence exist (e.g., Reynolds & Kamphaus, 2003).
On an empirical level, our confirmatory analyses were guided by Holdnack et al. (2011), but
we needed to make adjustments to our models because the WMS-IV Older Adult battery does not
include two of the subtests used in the general WMS-IV battery that Holdnack et al. used in their
study. Thus, we could not test all their possible models. Future work using both the Older
Adult and the standard WMS-IV battery is therefore potentially fruitful. Not only confirmatory, but also
further exploratory factor analyses (especially with cognitively impaired groups) will likely be
useful. Exploratory factor analyses have yielded several interesting findings with previous editions of the WAIS and WMS (e.g., Bowden et al., 1999; Bowden et al., 2001; Burton, Ryan,
Axelrod, Schellenberger, & Richards, 2003; Millis, Malina, Bowers, & Ricker, 1999; Price,
Tulsky, Millis, & Weiss, 2002; Tulsky & Price, 2003), and it is likely that such work with the new
versions will too.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by grants from the Natural Sciences and Engineering
Research Council of Canada (D. M., P. D., C.M.).
Note
1. In the normative sample, the education level of participants was entered as a categorical variable,
which is why means and standard deviations are not reported. Overall, our sample appears to have
been slightly younger and more highly educated than the normative one(s).
References
Barrett, P. (2007). Structural equation modeling: Adjusting model fit. Personality and Individual Differences, 42, 815-824.
Benson, N., Hulac, D. M., & Kranzler, J. H. (2010). Independent examination of the Wechsler adult intelligence scale-fourth edition (WAIS-IV): What does the WAIS-IV measure? Psychological Assessment,
22(1), 121-130.
Bentler, P. M. (1983). Some contributions to efficient statistics for structural models: Specification and
estimation of moment structures. Psychometrika, 48, 493-571.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238-246.
Bentler, P. M., & Wu, E. J. C. (1995). EQS for Windows user's guide. Encino, CA: Multivariate Software.
Bowden, S. C., Carstairs, J. R., & Shores, E. A. (1999). Confirmatory factor analysis of combined Wechsler
adult intelligence scale-revised and Wechsler memory scale-revised scores in a healthy community
sample. Psychological Assessment, 11, 339-344.
Bowden, S. C., Ritter, A. J., Carstairs, J. R., Shores, E. A., Pead, J., Greeley, J. D., & Clifford, C. C. (2001).
Factorial invariance for combined WAIS-R and WMS-R scores in a sample of patients with alcohol
dependency. Clinical Neuropsychologist, 15, 69-80.
Bowden, S. C., Saklofske, D. H., & Weiss, L. G. (2011). Augmenting the core battery with supplementary subtests: Wechsler adult intelligence scale-IV measurement invariance across the United States and
Canada. Assessment, 18(2), 133-140.
Bowden, S. C., Weiss, L. G., Holdnack, J. A., & Lloyd, D. (2006). Age-related invariance of abilities measured with the Wechsler adult intelligence scale-III. Psychological Assessment, 18, 334-339.
Brooks, B. L., Holdnack, J. A., & Iverson, G. L. (2011). Advanced clinical interpretation of the WAIS-IV
and WMS-IV: Prevalence of low scores varies by level of intelligence and years of education. Assessment, 18, 156-167.
Burton, D. B., Ryan, J. J., Axelrod, B. N., Schellenberger, T., & Richards, H. M. (2003). A confirmatory
factor analysis of the WMS-III in a clinical sample with cross-validation in the standardized sample.
Archives of Clinical Neuropsychology, 18, 629-641.
Butler, M., Retzlaff, P., & Vanderploeg, R. (1991). Neuropsychological test usage. Professional Psychology: Research and Practice, 22, 510-512.
Byrne, B. M. (2001). Structural equation modeling: Perspectives on the present and the future. International Journal of Testing, 1, 327-334.
Canivez, G. L., & Watkins, M. W. (2010). Investigation of the factor structure of the Wechsler adult intelligence scale-fourth edition (WAIS-IV): Exploratory and higher order factor analyses. Psychological
Assessment, 22, 827-836.
Coalson, D., Raiford, S. E., Saklofske, D. H., & Weiss, L. (2010). Advances in the assessment of intelligence. In L. Weiss, D. H. Saklofske, D. Coalson, & S. E. Raiford (Eds.), WAIS-IV clinical use and
interpretation: Scientist-practitioner perspectives (pp. 3-24). San Diego, CA: Academic Press.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ:
Erlbaum.
Drozdick, L. W., & Cullum, C. M. (2011). Expanding the ecological validity of WAIS-IV and WMS-IV
with the Texas functional living scale. Assessment, 18(2), 141-155.
Drozdick, L. W., Wahlstrom, D., Zhu, J., & Weiss, L. G. (2012). The Wechsler adult intelligence scale-fourth
edition and the Wechsler memory scale-fourth edition. In D. P. Flanagan & P. L. Harrison
(Eds.), Contemporary intellectual assessment (3rd ed., pp. 197-223). New York, NY: Guilford Press.
Glisky, E. L., Rubin, S. R., & Davidson, P. S. (2001). Source memory in older adults: An encoding or
retrieval problem? Journal of Experimental Psychology: Learning, Memory, and Cognition, 27,
1131-1146.
Hartman, D. E. (2009). Test review Wechsler adult intelligence scale IV (WAIS IV): Return of the gold
standard. Applied Neuropsychology, 16, 85-87.
Hoelzle, J. B., Nelson, N. W., & Smith, C. A. (2011). Comparison of Wechsler memory scale-fourth edition
(WMS-IV) and third edition (WMS-III) dimensional structures: Improved ability to evaluate auditory and
visual constructs. Journal of Clinical and Experimental Neuropsychology, 33, 283-291.
Holdnack, J. A., Zhou, X., Larrabee, G. J., Millis, S. R., & Salthouse, T. A. (2011). Confirmatory factor
analysis of the WAIS-IV/WMS-IV. Assessment, 18, 178-191.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional
criteria versus new alternatives. Structural Equation Modeling, 6(1), 1-55.
Kaufman, A. S. (2010). Foreword. In L. Weiss, D. H. Saklofske, D. Coalson, & S. E. Raiford (Eds.), WAIS-IV clinical use and interpretation: Scientist-practitioner perspectives (pp. xiii-xxi). San Diego, CA: Academic Press.
Loring, D. W., & Bauer, R. M. (2010). Testing the limits: Cautions and concerns regarding the new Wechsler
IQ and memory scales. Neurology, 74, 685-690.
McGrew, K. S. (2009). CHC theory and the human cognitive abilities project: Standing on the shoulders of
the giants of psychometric intelligence research. Intelligence, 37, 1-10.
Meade, A. W., Johnson, E. C., & Braddy, P. W. (2008). Power and sensitivity of alternative fit indices in
tests of measurement invariance. Journal of Applied Psychology, 93, 568-592.
Millis, S. R., Malina, A. C., Bowers, D. A., & Ricker, J. H. (1999). Confirmatory factor analysis of the WMS-III. Journal of Clinical and Experimental Neuropsychology, 21, 87-93.
Price, L., Tulsky, D., Millis, S., & Weiss, L. (2002). Redefining the factor structure of the Wechsler memory
scale-III: Confirmatory factor analysis with cross-validation. Journal of Clinical and Experimental Neuropsychology, 24, 574-585.
Rabin, L. A., Barr, W. B., & Burton, L. A. (2005). Assessment practices of clinical neuropsychologists in
the United States and Canada: A survey of INS, NAN, and APA Division 40 members. Archives of Clinical
Neuropsychology: The Official Journal of the National Academy of Neuropsychologists, 20(1), 33-65.
Raftery, A. E. (1993). Bayesian model selection in structural equation models. In K. A. Bollen, & J. S. Long
(Eds.), Testing structural equation models (pp. 163-180). Newbury Park, CA: SAGE.
Reynolds, C. R., & Kamphaus, R. W. (2003). Reynolds intellectual assessment scales and the Reynolds
intellectual screening test. Lutz, FL: Psychological Assessment Resources.
Salthouse, T. A. (2010). Does the meaning of neurocognitive change change with age? Neuropsychology,
24, 273-278.
Salthouse, T. A., & Saklofske, D. H. (2010). Do the WAIS-IV tests measure the same aspects of cognitive
functioning in adults under and over 65? In L. G. Weiss, D. H. Saklofske, D. L. Coalson, & S. E. Raiford
(Eds.), WAIS-IV: Clinical use and interpretation (pp. 217-235). San Diego, CA: Elsevier.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464.
Soubelet, A., & Salthouse, T. A. (2011). Personality-cognition relations across adulthood. Developmental
Psychology, 47, 303-310.
Stanos, J. F. (2004). Test review: Wechsler abbreviated scale of intelligence. Rehabilitation Counseling
Bulletin, 48, 56-57.
Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25, 173-180.
Sullivan, K., & Bowden, S. C. (1997). Which tests do neuropsychologists use? Journal of Clinical Psychology, 53, 657-661.
Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics (5th ed.). Boston, MA: Allyn and
Bacon.
Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1-10.
Tucker-Drob, E. M. (2011). Global and domain-specific changes in cognition throughout adulthood. Developmental Psychology, 47, 331-343.
Tulsky, D. S., & Price, L. R. (2003). The joint WAIS-III and WMS-III factor structure: Development
and cross-validation of a six-factor model of cognitive functioning. Psychological Assessment,
15(2), 149-162.
Ward, L. C., Bergman, M. A., & Hebert, K. R. (2012). WAIS-IV subtest covariance structure: Conceptual
and statistical considerations. Psychological Assessment, 24, 328-340.
Wechsler, D. (2008). Wechsler adult intelligence scale (4th ed.). San Antonio, TX: Pearson.
Wechsler, D. (2009). Wechsler memory scale (4th ed.). San Antonio, TX: Pearson.