M. van Dam
National Court of Audit, 's-Gravenhage, The Netherlands
M. H. van IJzendoorn
Department of Education, Leiden University
A. Mooijaart
Department of Psychology, Leiden University
In this paper, we present a structural equation approach to modelling infant
behaviour in the Strange Situation. A model was developed on a Dutch data set, and
was subsequently cross-validated for an American data set containing the original
Ainsworth data. Model building is reported in some detail as no previous similar
analyses of the Strange Situation exist in the literature. The latent variables in the
preferred model are stranger wariness, minimization or deactivation of attachment
concerns, and maximization or hyperactivation of attachment concerns. Stranger
wariness influences only the subsequent behaviour towards the mother, and
behaviour in the second reunion episode is dependent on the same mother
behaviour in the first reunion episode, and not on other mother behaviours.
Structural equation modelling of behaviour in the Strange Situation is shown to
provide further insight into the dynamics of the procedure.
In this paper we will present models of infants' behaviour in the Strange Situation
procedure which were developed and tested through a structural equation approach.
We treat not only the substantive matters which go into the modelling
and the results emerging from the models, but also, in some detail, procedural
considerations with respect to the modelling process itself. Given that the
paper is a first attempt at a fully fledged analysis of the Strange Situation with
structural equation modelling, a detailed presentation of the considerations which
went into the selection of appropriate models seemed called for.
The Strange Situation is a standardized laboratory procedure to assess the
organization of attachment behaviours (Ainsworth, Blehar, Waters & Wall, 1978;
* Requests for reprints.
P. M. Kroonenberg et al.
Sroufe & Waters, 1977). The procedure has been applied in hundreds of studies on
the development of infant attachment and its sequelae in various countries (Van
IJzendoorn & Kroonenberg, 1988). The Strange Situation procedure consists of
eight three-minute episodes that have been arranged so as to create increasing levels
of stress and to activate the attachment behavioural system (Bowlby, 1969). The
infants are consecutively confronted with a strange laboratory playroom, with an
unknown female, and with two brief separations from their caregivers. Because of its
standardized nature and its series of separate episodes the Strange Situation can be
considered a mini-longitudinal design (Connell & Goldsmith, 1982). In their paper,
Connell & Goldsmith presented an early but somewhat problematic attempt at
modelling the Strange Situation in a longitudinal fashion. Apart from the sample
size being too small, their structural model was also seriously flawed (for a detailed critique
see Van Dam, 1993, pp. 42 ff.). Lamb, Thompson, Gardner & Charnov (1985)
expected structural equation modelling to be of help in exploring the origin of
Strange Situation behaviour and other individual differences in attachment, and our
paper may be seen as a response to the questions raised by Lamb et al. (1985).
The Strange Situation is generally used to classify the attachment relationship
between infant and caregiver into three main categories: insecure-avoidant
attachment, secure attachment and insecure-resistant attachment. Securely attached
infants strike a balance between exploration of the new environment and their need
to be comforted by the caregiver in stressful circumstances. Insecure avoidantly
attached infants tend to continue exploring the environment even when they are
stressed, and they tend to minimize the display of attachment concerns. The insecure
resistantly attached infants are inclined to discontinue their exploration in favour of
close but angry proximity to the caregiver, and they are therefore said to maximize
the display of attachment concerns (Kobak & Sceery, 1988; Main, 1990). The
reliability and validity of the Strange Situation classifications have been established
in various cross-sectional, longitudinal and experimental studies (Bretherton, 1985;
Van IJzendoorn, Juffer & Duyvesteyn, 1995).
The Strange Situation classifications of attachment relationships are based upon
several interactive behaviours between the infant and the stranger and the caregiver
during the eight episodes (Ainsworth et al., 1978). Richters, Waters & Vaughn (1988)
showed that about 88 per cent of the classifications can be predicted through
discriminant functions consisting of some core interactive behaviours, in particular
during the two episodes in which the infants are reunited with the caregiver after a
brief separation. Nevertheless, most attachment researchers take the internal structure
and dynamics of the Strange Situation procedure for granted, and focus exclusively
on the classifications. Although research on the antecedents and consequences of the
classifications has been very successful, the Strange Situation itself has remained a
black box. In this study we would like to shed some light on the internal structure
and dynamics of this important assessment procedure, and to derive some models
that adequately and efficiently describe the black box in terms of structure and
dynamics of the interactive attachment behaviours.
The most important interactive behaviours between infants and caregivers or
strangers are the following : proximity seeking, contact maintaining, resistance,
avoidance and distance interaction. The behaviours are coded on seven-point rating
correlated. More elaborate models can be conceived, in which also specific relations
between the latent variables are specified, for instance that one latent variable has an
influence on one, but not another latent variable. The part of such models that
describes the relations between observed and latent variables is called the
measurement model, and the part that describes the relations between the latent
variables themselves is called the latent-variable model. Together they form a
structural model for the observed covariance matrix.
Measurement models with respect to the stranger and to the caregiver were tested
and then combined to find a joint measurement model. The initial separate evaluation
of the measurement models with respect to the caregiver and the stranger was based
on the expectation that it would be easier to spot and assess inconsistencies if the
complexity of the models was kept as low as possible. Based on the joint
measurement model, an integrated structural model for the entire Strange Situation
was developed. Different structural equation models were tested on a set of 326
Dutch Strange Situations, and cross-validated with data from 155 American Strange
Situations (including the original 105 Strange Situations that Ainsworth et al., 1978,
presented). Earlier studies have made clear that the Strange Situation can be validly
applied in various Western, industrialized countries such as the USA and The
Netherlands (Main, 1990; Van IJzendoorn & Kroonenberg, 1988).
Method
Participants
A total of 326 Dutch infants, or rather infant-mother pairs, were included in the analyses. They
originate from five different studies conducted at the Centre for Child and Family Studies of the
Department of Education, Leiden University. The primary references for these studies are Goossens
(1986; see also Van IJzendoorn, Goossens, Kroonenberg & Tavecchio, 1985), Goossens & Van
IJzendoorn (1990), Hubbard & Van IJzendoorn (1991), Lambermon (1991; see also Lambermon &
Van IJzendoorn, 1989) and Van Dam & Van IJzendoorn (1988); a comprehensive description can be
found in Van Dam (1993). A summary of the reliability of the Dutch measurements can be found in
Kroonenberg, Basford & Van Dam (1995).
In order to evaluate the appropriateness of the data for structural equation modelling, we checked
the distributions of the variables using Browne's MUTMUM program (Browne, 1990). This program
computes both univariate and multivariate measures for the kurtosis (for details, see Browne, 1982,
section 1.5; 1984). The univariate estimated relative kurtosis (Browne, 1982, equation 1.5.23a) varied
between 0.56 and 2.65, where the relative kurtosis of the normal distribution is 1, and the multivariate
estimate of the relative kurtosis was 1.11 (Browne, 1982, equation 1.5.23~). Thus there is little reason
to doubt the multivariate normality of our observations. Furthermore, our original sample was
sufficiently large (i.e. N = 326) to allow for structural equation modelling (e.g. see simulation studies
by Boomsma, 1985; see also Tanaka, 1987).
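As a rough illustration of the univariate check described above, the sketch below computes a relative kurtosis as the fourth central moment divided by three times the squared variance, so that a normal distribution scores 1. This moment-based definition is an assumption made for illustration and is not taken from Browne (1982); the function name `relative_kurtosis` and the example ratings are hypothetical.

```python
def relative_kurtosis(values):
    """Fourth central moment divided by 3 * variance**2 (normal distribution = 1).

    Assumed definition for illustration only; not Browne's (1982) estimator.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / (3.0 * m2 ** 2)


# Hypothetical seven-point ratings, not data from the study.
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 7]
print(round(relative_kurtosis(ratings), 2))
```

Values well above 1 would point to heavier tails than the normal distribution, values well below 1 to lighter tails.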
The second (cross-validation) sample was kindly provided by Dr Everett Waters. It consisted of the
105 infants from the original Ainsworth samples and another 50 infants from a study by Waters (1978).
For the cross-validation set the kurtosis figures were also satisfactory, viz. 0.56 to 1.99 (univariate relative
kurtosis) and 1.10 (multivariate relative kurtosis). Unfortunately, the size of this sample falls below the
size recommended for structural equation modelling, but as it was primarily used for cross-validation,
we decided to continue with this data set, mainly because no real alternative was available.
For the cross-validation to be successful the two samples have to be reasonably alike. The distribution
of attachment classifications in the Dutch sample was 80 A (= 25 per cent), 209 B (= 64 per cent) and
37 C (= 11 per cent) classifications, and in the US sample the distribution was 33 A (= 21 per cent),
99 B (= 64 per cent) and 23 C (= 15 per cent) classifications, so that no real imbalance exists with
respect to the classification categories. Further general information on the two samples is provided in
Table 1, which gives the means and standard deviations on the interactive scales used. Table 1 shows
that the samples are also comparable with respect to the trend in the means. In particular, in both
samples the means increase from the earlier to the later episode for both mother and stranger episodes,
except for avoidance towards the mother, where they decrease.
Table 1. Means and standard deviations for the Dutch (NL) and US samples

                      Resistance    Crying        Avoidance     Proximity     Contact
                      NL     US     NL     US     NL     US     NL     US     NL     US
Means
  Stranger episodes
    S4                2.1    2.0    2.2    2.1    2.4    1.7
    S7                2.6    2.7    3.2    2.4    2.7    1.9
  Mother episodes
    M5                2.0    1.7    1.6    2.0    3.0    2.7    3.4    3.5    2.4    2.7
    M8                2.5    2.3    2.0    2.4    2.4    1.7    3.9    4.4    3.4    4.4
Standard deviations
  Stranger episodes
    S4                1.6    1.8    2.2    1.7    1.3    1.3
    S7                2.0    2.0    2.5    1.6    1.6    1.5
  Mother episodes
    M5                1.4    1.4    1.2    1.5    1.5    1.9    2.0    2.1    2.0    2.1
    M8                1.5    1.6    1.6    1.6    1.5    2.0    2.1    1.9    2.4    2.1
matrix and the implied (or fitted) covariance matrix, can only be used as a global indication of the fit of
the overall model, as has been extensively demonstrated in the literature (see Bollen, 1989 and Sugawara
& MacCallum, 1993, for references). The asymptotic distribution of $(N-1)F$ is a χ²
distribution with $\tfrac{1}{2}p(p+1) - f$ degrees of freedom, where N is the sample size, p is the number of observed variables,
and f is the number of independent free parameters. F will also indicate the value of the statistic
evaluated for the final estimates. Rather than using F itself, it is often easier to use F/d.f. for model
comparisons because its value is independent of the degrees of freedom.
The Steiger & Lind (1980) measure is called the Root Mean Square Error of Approximation
(RMSEA). It is defined as $\mathrm{RMSEA} = (F_0/\mathrm{d.f.})^{1/2}$, where $F_0$ is the minimal population discrepancy
function, which is replaced in practice with its estimate $\max\{F - \mathrm{d.f.}/(N-1),\ 0\}$. Values below .10
represent a reasonable fit, and values below .05 represent a very good fit (Steiger, 1989). Browne &
Cudeck (1992, p. 239) state that:
[p]ractical experience has made us feel that a value of the RMSEA of 0.05 or less would indicate
a close fit of the model in relation to its degrees of freedom. This figure is based on subjective
judgement. It cannot be regarded as infallible or correct, but is more reasonable than the
requirement of exact fit with the RMSEA = 0.0. We are also of the opinion that a value of about
0.08 or less for the RMSEA would indicate a reasonable error of approximation and would not
want to employ a model with a RMSEA greater than 0.1.
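The fit measures defined above can be sketched in a few lines of code. The example assumes that the reported χ² equals (N - 1)F, so that the RMSEA estimate reduces to the square root of max((χ²/d.f. - 1)/(N - 1), 0); the function names are hypothetical, and the input values are those reported for the joint measurement model in the Dutch sample (χ² = 158, 81 d.f., N = 326).

```python
import math


def degrees_of_freedom(p, f):
    """Distinct variances and covariances of p variables minus f free parameters."""
    return p * (p + 1) // 2 - f


def rmsea(chi2, df, n):
    """Point estimate sqrt(max(F/df - 1/(N - 1), 0)), assuming chi2 = (N - 1) * F."""
    f_hat = chi2 / (n - 1)
    return math.sqrt(max(f_hat / df - 1.0 / (n - 1), 0.0))


# Joint measurement model, Dutch sample (values as reported in Table 2).
print(round(158 / 81, 2))             # chi-square / d.f. -> 1.95
print(round(rmsea(158, 81, 326), 3))  # -> 0.054
```

The truncation at zero explains why a model whose χ² is smaller than its degrees of freedom is reported with an RMSEA of exactly zero.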
In order to select adequate models, comparisons are made between different models in a hierarchical
fashion starting with a fairly unrestricted model and introducing increasingly stringent restrictions.
Anderson & Gerbing (1988) proposed estimating measurement submodels prior to the simultaneous
estimation of measurement and latent variable submodels. When, during the simultaneous estimation of
the two submodels, the regression coefficients from the measurement models differ only trivially from
their initial values, one knows that so-called interpretational confounding (Burt, 1976) has not occurred. We
will not follow Anderson & Gerbing's proposal in all its detail, but take it as a general framework
within which we develop our models. In the Appendix, we discuss the procedure we have followed to
select adequate models, as well as some more technical issues.
General modelling considerations. The conceptualization of the Strange Situation as a longitudinal design
determines for a large part the general characteristics of our models. First, as mentioned above, latent
variables measured in different episodes always have the same indicators. This implies that the strength
of the relations between theoretical constructs and their indicators may change over time, but measured
variables always are indicators for the same latent variables. Second, we assume a priori that the same
indicators for a latent variable have correlated measurement errors between two points in time. In other
words, there exists a certain amount of variation which is specifically connected with the measurement
itself. Occasionally, we had to drop this assumption, especially in variables with low variance, in order
to prevent numerical problems during the analysis. Third, only relations between latent variables with
a temporal order are assumed to be causal. Fourth, the same latent variables in different episodes are
always assumed to be causally related. In other words, earlier behaviours always have a direct effect on
the same behaviours in later episodes.
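The first two considerations can be made concrete with a small data-structure sketch. The indicator lists below anticipate the measurement model described in the Results (including the a posteriori coefficient of contact maintaining on hyperactivation) and reflect our reading of the text rather than a LISREL specification; all names are hypothetical.

```python
# Indicators per latent variable, as read from the Results section (an assumption).
STRANGER = {"wariness": ["crying", "resistance", "avoidance"]}
MOTHER = {
    "hyperactivation": ["crying", "resistance", "contact"],
    "deactivation": ["avoidance", "contact", "proximity"],
}
EPISODES = {"stranger": ("S4", "S7"), "mother": ("M5", "M8")}


def ratings_in(block):
    """Distinct rating scales used as indicators within a block of episodes."""
    return sorted({r for indicators in block.values() for r in indicators})


def observed_variables(block, episodes):
    """The same ratings are observed in every episode of the block."""
    return [(ep, r) for ep in episodes for r in ratings_in(block)]


def correlated_error_pairs(block, episodes):
    """Pair each rating in the earlier episode with itself in the later one."""
    first, second = episodes
    return [((first, r), (second, r)) for r in ratings_in(block)]


observed = (observed_variables(STRANGER, EPISODES["stranger"])
            + observed_variables(MOTHER, EPISODES["mother"]))
pairs = (correlated_error_pairs(STRANGER, EPISODES["stranger"])
         + correlated_error_pairs(MOTHER, EPISODES["mother"]))
print(len(observed), len(pairs))  # 16 8
```

Under these assumptions there are 16 observed variables and 8 candidate pairs of correlated measurement errors, of which some may have to be fixed at zero for numerical reasons.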
To guard ourselves against overly optimistic model acceptance and to put the results on a firm basis
the main models of this paper were cross-validated with the independently collected US data set, so that
we have a calibration sample and a validation sample.
To estimate the models, Jöreskog & Sörbom's (1988) LISREL 7 (as implemented in SPSS, 1988) was
used on a VAX mainframe. In accordance with standard practice, all analyses were performed on
covariance matrices.
Results
In line with the approach by Anderson & Gerbing (1988) mentioned above, we first
developed the measurement model. In particular, to simplify spotting misspecifications, we first developed measurement models for the stranger and mother
episodes separately, followed by a joint measurement model. The resulting
measurement model was cross-validated before we proceeded to the construction of
Table 2. Evaluation of measurement models for stranger episodes (S4, S7) and
mother episodes (M5, M8) (Dutch sample)

                                       No. of       Model evaluation
Model                                  factors(a)   χ²     d.f.   χ²/d.f.   RMSEA
Stranger episodes (S4, S7)
  1. Stranger wariness                                            0.70      0.00
Mother episodes (M5, M8)
  2. SS behaviour                      1            261    29     9.00      0.16
  3. Resistance/Pos. contact(b)        2            229    27     8.48      0.15
  4. Deactivation/hyperactivation(c)   2            130    25     5.20      0.11
  5. Deactivation/hyperactivation(d)   2            68     23     2.95      0.08
CFA based on 5                                      158    81     1.95      0.05

(a) The numbers in this column are the numbers of different latent variables. From a practical modelling
aspect the number should be doubled, because each latent variable is present in two episodes.
(b) The correlated error variances for crying and contact maintaining (M5, M8) were set to 0, and the
unique variance for contact maintaining in M5 was set to .01.
(c) The correlated error variance for crying (M5, M8) was set to 0.
(d) The correlated error variance for crying (M5, M8) was set to 0, and the unique variance for crying
in M5 was set to .01.
CFA = Confirmatory factor analysis model.
The measurement model had a very good fit with an RMSEA of zero, and the
signs of the parameter estimates were all in the same direction, in accordance with our
expectations.
Mother episodes. Given that we have five variables for each episode (resistance,
crying, avoidance, contact maintaining and proximity seeking), measurement models
with two latent variables were deemed acceptable. Two possibilities seemed to exist
on theoretical grounds: one with a reservation-with-mother latent variable (avoidance,
resistance, crying (positive)) and a positive-contact-with-mother latent variable (proximity seeking, contact maintaining (positive)). The other model would have a latent
variable minimization or deactivation of attachment concerns (avoidance (positive); contact
maintaining (negative); proximity seeking (negative)) and a latent variable maximization or hyperactivation of attachment concerns (resistance and crying (positive)) (see also
Kobak & Sceery, 1988; Kobak, Cole, Ferenz-Gillies & Fleming, 1993; Main, 1990).
Of these measurement models only the last had a more or less acceptable solution (see
Table 2). It had, however, a marginal RMSEA. The only nonsignificant parameters
US sample
NL sample
Manifest
variables
Episode
Wary
S4/S7
Crying
Resistance
s4
s4
2.00a
1.41
(07)
0.66
(*07)
Avoidance
s4
Crying
M5
Resistance
Contact
M5
M5
Avoidance
Proximity
M5
M5
Crying
Resistance
s7
s7
Avoidance
s7
Crying
M8
2.32"
1.57
(J9)
0.77
(08)
Resistance
Contact
M8
M8
Avoidance
Proximity
M8
M8
Hyper
M5/M8
Deact
M5/M8
Wary
S4/S7
Hyper
M5/M8
Deact
M5/M8
Table 3 (cont.)
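Factor correlations

NL sample
Latent var.        Wary    Hyper.  Deact.  Wary    Hyper.  Deact.
                   S4      M5      M5      S7      M8      M8
Wary    S4         1.00
Hyper.  M5          .66    1.00
Deact.  M5          .61     .50    1.00
Wary    S7          .69     .53     .57    1.00
Hyper.  M8          .51     .73     .45     .75    1.00
Deact.  M8          .50     .36     .65     .63     .53    1.00

US sample
Latent var.        Wary    Hyper.  Deact.  Wary    Hyper.  Deact.
                   S4      M5      M5      S7      M8      M8
Wary    S4         1.00
Hyper.  M5          .83    1.00
Deact.  M5          .62     .50    1.00
Wary    S7          .47     .44     .52    1.00
Hyper.  M8          .34     .43     .23     .38    1.00
Deact.  M8          .32     .27     .62     .39     .26    1.00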
S4
M5
M5
S7
M8
M8
US sample
NL sample
.07
.09
.08
.08
.10
.08
.06
.08
.12
.06
.06
.08
.09
.14
.16
.19
.45
.46
.31
.37
.40
.36
.30
.31
.18
.13
.17
.15
.14
.12
Note. The numbers in parentheses in the first part of the Table indicate the standard errors of the factor
patterns, and a superscript a indicates that the parameter in question was fixed during the analysis. The fixed starting values
(indicated by a) are derived from the separate analyses for mother and stranger episodes.
Key. Wary = stranger wariness; Hyper. = hyperactivation of attachment concerns; Deact. = deactivation of attachment concerns.
were the unique variances of crying and contact maintaining in episode 5 and the
error correlation between contact maintaining in episodes 5 and 8. Model
modification indices (see e.g. Bollen, 1989, p. 299) suggested that allowing contact
maintaining to have a (positive) coefficient on hyperactivation would increase the fit
considerably and would give a considerably improved RMSEA, as is evident from
Table 2. From a substantive point of view such a coefficient is entirely acceptable, and
it does not affect the interpretation of the hyperactivation variable. Because of the
close agreement between the empirical results and the theoretical acceptability, it was
decided to accept the a posteriori change in the measurement model.
As expected there was considerable consistency between the crucial reunion
episodes 5 and 8. The regression parameters had more or less the same values except
for somewhat higher values in episode 8. The correlations between the latent
variables varied between .36 and .74. The high correlations between the same latent
variables across episodes (.74 and .65) indicated their substantial stability over time.
The generally moderately correlated measurement errors confirmed the necessity of
estimating these effects.
Combining measurement models. The next step was to check whether an acceptable
measurement model could be found for the mother and stranger episodes combined.
The results of this investigation are also included in Table 2. The model for the
mother and stranger episodes based on the separate models proved to behave
adequately in all respects. Note that this model would have been a simple
confirmatory factor analysis model if there had been no correlated errors. Incidentally,
in the model without correlated errors χ²(88) = 398 (RMSEA = .104). It is
important to note that an adequate model from a substantive point of view, also
provided a reasonable statistical fit.
Cross-validation of measurement model. Before developing the structural model, it is
important to see how well the measurement model cross-validates in the US sample.
The loosest form of cross-validation (Bentler, 1980; MacCallum, Roznowski, Mar
& Reith, 1994) is to assume that only the structure of the model cross-validates, but
none of the actual values. Such a model was called configural invariant by Thurstone
(1947, p. 365). Requiring the US sample to have the same structure or
configuration as the Dutch (NL) one led to a χ² of 128 and RMSEA = .042,
compared to 155 and .054 for the Dutch sample. As the χ² values were not
comparable due to the different sample sizes, we multiplied the χ²/d.f. value of the
US sample by 326/155 for comparability, which meant that χ²/d.f. = 1.95 (NL)
and χ²/d.f. = 3.32 (US). If we had fixed the factor pattern of the US sample using
the values of the Dutch measurement model we would have obtained a χ²/d.f. =
4.78 and RMSEA = .091. These results showed that only a loose cross-validation
was feasible, and that there was a modest agreement between the Dutch and the US
structure.
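The rescaling used for the US sample is simple arithmetic, sketched below. The function name is hypothetical, and the 81 degrees of freedom for the US configural model are assumed to equal those of the Dutch model, since the structure is the same.

```python
def scaled_chi2_per_df(chi2, df, n_sample, n_reference):
    """chi-square / d.f. multiplied by the ratio of reference to actual sample size."""
    return (chi2 / df) * (n_reference / n_sample)


# US configural model: chi-square 128 with an assumed 81 d.f., N = 155,
# rescaled to the size of the Dutch sample (N = 326).
print(round(scaled_chi2_per_df(128, 81, 155, 326), 2))  # -> 3.32
```

The rescaled ratio is meant only to place the two samples on a comparable footing; it is not itself a test statistic.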
Nature of the measurement model. During the model search to be reported, it turned out
that the parameter estimates of the measurement model were in general sufficiently
stable to lead to identical assessment and interpretations. As indicated above, this
suggested that there is no serious interpretational confounding. Therefore, we will
now present the parameters from the confirmatory factor analysis model of both
Dutch and US samples, so as to be able to concentrate entirely on the latent variables
in the sequel.
In Table 3 the factor patterns of both samples are listed, as well as the factor
correlations. The latter form the basis of the latent-variable models to be discussed
later. The values for the 2 × 2 factors for the mother episodes showed considerable
similarity and near-perfect rank correlations both across samples and across reunion
episodes. The values for the stranger episodes were less similar across samples: the
Dutch values tended to be higher than the US ones, and the relative importance of
crying and resistance was reversed. The value of 0.19 for avoidance on the stranger
wariness factor in the US sample was the only non-significant pattern value, i.e. it had
a t value between -2.5 and +2.5 (Jöreskog & Sörbom, 1988).
There was considerable overall similarity in the structure of the factor correlation
matrices providing a basis for searching for similar latent-variable models, but there
were also systematic differences. All but the (S4, M5) correlations were lower in the
US sample compared to the Dutch one. All correlations were significant at the .05
level.
In the Dutch sample the error variances were all significant (α = .05), and in the
US sample this was true for all variables except resistance in episode S7. The error
variances of crying in episode M5 in the Dutch sample and crying in episode M8 in
the US sample had to be fixed at a small positive level (here: .01) in order to obtain
a solution at all. All this meant was that in the confirmatory model nearly all variance
in crying in episode M5 (or M8) was estimated to be common variance. The
consequence was that also the test-retest correlation between crying in episodes M5
and M8 had to be fixed at zero in both samples. The other test-retest correlations in
both the Dutch and the US samples were all significant except the one for proximity
seeking.
In conclusion, it can be said that the confirmatory factor analysis model with
correlated error terms (test-retest correlations) provided a reasonable model for the
covariances in both samples, but that, notwithstanding considerable similarities, the
values of the pattern of the Dutch sample did not cross-validate sufficiently well in
the US sample for them to be considered equal. Clearly, fixing the parameter estimates
for the factor correlations of the US sample at those of the Dutch sample would lead
to a further decrease of fit.
Pseudo chi-square tests (Bentler & Bonett, 1980). Above we concluded that the
developed measurement model was acceptable from both a modelling and a
substantive perspective. Following Anderson & Gerbing (1988), we constructed
pseudo chi-square tests (see Appendix) for the Dutch sample to assess the existence
of acceptable structural models. In the saturated model χ²(81) = 155. As in the null
model without any paths between the latent variables, 15 (= ½ × 6 × 5) factor
correlations did not need to be estimated, the null model had 97 d.f. The resulting
pseudo chi-square/d.f. ratio was thus 1.6 (RMSEA = .043), which is a very good
value, and thus a search for a more parsimonious structural model than the saturated
confirmatory factor analysis model was warranted.
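The pseudo chi-square ratio reported above can be reproduced as follows. The sketch takes the χ² of the saturated model and the degrees of freedom of the null model as given in the text, and assumes the RMSEA is computed from the ratio as the square root of max((ratio - 1)/(N - 1), 0); the function names are hypothetical.

```python
import math


def pseudo_chi2_ratio(chi2_saturated, df_null):
    """Chi-square of the saturated model divided by the null model's d.f."""
    return chi2_saturated / df_null


def rmsea_from_ratio(ratio, n):
    """RMSEA point estimate from a chi-square / d.f. ratio and sample size N."""
    return math.sqrt(max((ratio - 1.0) / (n - 1), 0.0))


ratio = pseudo_chi2_ratio(155, 97)  # values as reported for the Dutch sample
print(round(ratio, 1), round(rmsea_from_ratio(ratio, 326), 3))  # -> 1.6 0.043
```

A ratio this close to 1 indicates that even the most parsimonious latent-variable structure cannot be ruled out in advance, which is why the search for a simpler structural model is worthwhile.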
Latent-variable models
In this section we will concentrate on finding acceptable structural models, and we
will only refer to the latent variable part of these models. All diagrams will thus omit
the measurement part. The procedure will be similar to the measurement model
search. Using the Dutch sample we will search for acceptable models, and
subsequently we will use the US sample for (primarily loose) cross-validation. The
principles behind the search and the decisions taken therein are explained
in detail in the Appendix; here we will concentrate on the results of the search.
Model search: Results. The results of the model search are summarized in Table 4, and
the corresponding latent-variable models are depicted in Figs 1 and 2. The models
have been examined in accordance with the procedure outlined in the Appendix.
First the long-distance paths between S4 and M8 have been removed: M0 → M1; then
the less interesting paths from M5 → S7: M1 → M2. In these cases, the three versions
(see below) led to equivalent models. This is an example of the so-called replacing rule
(Lee & Hershberger, 1990, p. 318; see also MacCallum, Wegener, Uchino &
Fabrigar, 1993, pp. 187 ff.). In the next step, the cross-lagged paths between M5 →
158
81
270
358
352
89
89
89
89
437
304
426
89
89
89
89
207
206
201
212
332
3.04
4.02
3.96
5.02
3.42
4.79
2.38
2.37
2.31
2.38
3.32
2.28
194
85
87
87
87
89
91
2.12
176
1.95
X2/d.f.
83
81
x2
d.f.
128
.054
.079
.096
.095
.110
.086
.lo8
198
4.68
3.38
4.03
4.01
143
171
170
3.97
4.58
3.28
3.29
3.26
3.26
3.33
3.32
3.24
4.78
3.32
.089
.063
.077
.077
.076
.087
.060
.060
.060
.060
.061
.061
.059
.091
.042
X2/d.f.a RMSEA
168
194
136
136
135
138
144
134
.063
.065
.065
.063
.065
.090
128
.059
184
x2
RMSEA
US (N= 155)
a The χ²/d.f. ratio for the US sample has been multiplied by k = 326/155 to facilitate comparisons.
Notes. For all NL (US) models the error variance of crying in episode 5 (8) has been set at .01 and the test-retest correlation of the error variances of
crying in M5 and M8 has been set at 0.
Key.  = no admissible or identified model found; W = stranger wariness; Hyp = hyperactivation of attachment concerns; De = deactivation of
attachment concerns.
M4
M3
M2
M1
MO
Model description
NL (N= 326)
M8: M2 → M3 were eliminated. This destroyed the replacing rule for the M8 episode
but not for M5, so that the three versions were no longer equivalent. The next set
of possibly acceptable models were the M4 models, and this set consisted of two
models each with three versions.
[Figure 1. Latent-variable models; panels: Model M0, Model M1, Model M2, Model M3a, Model M3c, Model M4Ba]
There were three different versions of the M3 and M4 models because the factor
correlations between stranger wariness in S4 (S7), deactivation in M5 (M8) and
hyperactivation in M5 (M8) can be modelled in three different ways. In the M3
models this was done via two direct paths from S4 (S7) to M5 (M8) and a connection
between deactivation and hyperactivation. The latter can be done in one of three
ways, hence the three versions. In the set M4, only M4Ab and M4Ba modelled all three
correlations and hence provided a more or less adequate fit, while the other versions
each failed to model one of the correlations.
From the point of view of fit, all models from the M3 set seemed equally
acceptable, and M4Ba from the M4 set was the next best. Removing paths after this
model led to a quickly increasing χ² and values for RMSEA well above 0.10.
Model search: Cross-validation. Another way of choosing a model is to investigate
which models cross-validate better than other models. This is a reasonable strategy, as
in this study there was only loose cross-validation (see MacCallum et al., 1994). Table
4 also shows the results of this process. Surprisingly, eliminating paths up to and
including one of the models in the M4 set for the US sample did not change the
RMSEA very much, although some versions in the M4 set did not result in
admissible models. Going beyond the M4 set mainly led to inadmissible models. The
disadvantage of both M3d and M4Ba for the US sample was that there were still three
non-significant paths, while in M3e, a model without paths between S7 and M8, there
was only one.
To complement the unrestricted cross-validation, two more restricted cross-validations were carried out for M3d and M4Ab, respectively, using the parameter
values of the Dutch measurement model just as was done for the confirmatory factor
analysis model (M0). The differences in χ² between all restricted cross-validated
models and their original models (56, 50 and 28, respectively) seemed to indicate that
cross-validation performed somewhat better for more restricted models. In Table 4
we have discounted the gain in degrees of freedom by fixing the pattern, and used
the degrees of freedom of the original model.
Model search: Conclusion. The results of the model search led to a selection of the M3
set of models. Only one M4 model was more or less acceptable but it did not perform
as well as the M3 ones, even though it cross-validated nearly as well. From the point
of view of fit, there was not much difference between the versions of the models in
the M3 set, both in the Dutch and in the US samples, though one of them, M3d, had
more degrees of freedom, thus illustrating the general difficulty of accepting models
rather than rejecting them.
From a theoretical substantive point of view a model without correlated error
terms was to be preferred over the other models, because no directional decision had
to be taken within the M5 and M8 episodes. In addition, its interpretation was
simpler, because stranger wariness has a direct influence on both the latent variables
deactivation of attachment concerns and hyperactivation of attachment concerns in
the subsequent period without any additional indirect paths. Therefore, we are
inclined to favour model M3d on substantive grounds.
Model parameters. It is possible to make statements about parameters which are
present in all models considered. If the choice of model does not influence a particular
path coefficient, we can evaluate that parameter irrespective of the particular model,
and if there is a difference in values, we can try to explain this both in terms of the
different structures of the models themselves, and in terms of different theoretical
implications. To make the values comparable across models, the solutions had to be
standardized by equalizing the variances of the latent variables (e.g. see Bollen, 1989,
pp. 349, 350).
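The rescaling involved can be illustrated with a minimal sketch. It assumes the common convention in which an unstandardized coefficient for a path from a cause to an effect is multiplied by the ratio of their standard deviations, so that both latent variables have unit variance. The function name and the numerical values are hypothetical and are not taken from the analyses reported here.

```python
import math


def standardize_path(b, var_cause, var_effect):
    """Rescale an unstandardized path coefficient b (cause -> effect)
    to the metric in which both latent variables have unit variance."""
    if var_cause <= 0 or var_effect <= 0:
        raise ValueError("variances must be positive")
    return b * math.sqrt(var_cause / var_effect)


# Hypothetical values for illustration only (not from Table 5).
b_std = standardize_path(b=0.8, var_cause=1.0, var_effect=4.0)
print(round(b_std, 3))  # 0.4
```

Because the rescaling depends only on the two latent variances, coefficients from models with different scalings of the latent variables become directly comparable after this step.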
In Table 5 (see also Fig. 2) we have provided the partial regression coefficients for
both the Dutch and US samples for the selected latent-variable models. The first
conclusion from this table is that, independent of the specific model preferred, the
stabilities in the Dutch sample were about .69, .49 and .44 for stranger wariness,
hyperactivation and deactivation, respectively. The parallel values in the US sample
were .53, .34 and .56, respectively.
The values for the paths from S4 (S7) to M5 (M8) were also fairly stable in the M3
set of models. For the Dutch sample, the approximate overall strength of the
Table 5. Standardized solutions of latent-variable models for stranger episodes (S4, S7) and mother episodes (M5, M8): Dutch
and US samples (standard errors in parentheses)
[Table body not recoverable from the source. Rows: Dutch sample, models M3a (Hyp→De), M3b (De→Hyp), M3c (CorrEr), M3d (No CorrEr), M4Ba (Hyp→De); US sample, models M3a, M3b, M3c, M3d, M3e, M3Ba. Column groups: Stabilities; S4→M5 (Hyp, De); M5; S7→M8 (Hyp, De); M8.]
Figure 2. Preferred latent-variable models for Dutch and US samples with path coefficients
(for standard errors see Table 5). Both panels show Model M3d.
episode, but it did not include any cross-lagged influences of a particular attachment
strategy in an earlier episode on the other strategy in a later episode. The structural
model was derived using data from a Dutch sample, but it appeared to fit the data
from a US sample in a satisfactory way. At the same time it was also clear that there
were substantial differences between the regression coefficients of the two samples,
which await further investigation, especially with other large samples of Strange
Situation data.
In terms of our hypotheses we may conclude the following. First, the infants'
Strange Situation behaviour towards the parent was indeed patterned according to
two main attachment strategies: minimization or deactivation of attachment concerns,
as indicated by intensive avoidant (and exploratory) behaviour and lack of proximity
seeking and contact maintaining; and maximization or hyperactivation of attachment
concerns, as indicated by strong resistant and crying behaviours as well as strong
contact maintaining. The two patterns or latent variables fit nicely into the
classification system of the Strange Situation procedure (Ainsworth et al., 1978), in
which two insecure attachment categories, avoidant and resistant attachment, are
being discriminated. The model also concurred with Kobak & Sceery's (1988)
analysis of the main attachment strategies displayed by adults in the context of the
Adult Attachment Interview (Main, Kaplan & Cassidy, 1985). Kobak & Sceery
(1988), however, considered deactivation versus hyperactivation of attachment as
two extremes of the same continuum. In our structural modelling of the Strange
Situation behaviours we found that the two strategies were related but at the same
time they could also be clearly differentiated, as was evident from the inadequate fit
of a measurement model with a single latent variable for the reunion episodes.
Furthermore, deactivation of attachment in an earlier episode did not affect
hyperactivation of attachment in a later episode, although both latent variables were
(negatively) correlated within the same episode. The structural model therefore
seems to support Main's (1990) analysis of two separate attachment strategies (the
minimization and maximization of attachment) and to be in line with the
discrimination of two insecure attachment classifications that cannot easily be
reduced to a single underlying dimension.
With respect to the second hypothesis, the infants' interactive behaviours did not
show qualitative changes of structure or dynamics across episodes. The Strange
Situation procedure indeed seemed to create a gradual increase of stress by adding
more stressors successively: the strange environment, the stranger and the separations
from the attachment figure. This was also supported by the generally increasing
coefficients for the interactive behaviours on the latent variables. The infants'
interactive behaviours as well as the latent variables were highly stable across
episodes. The increasing stress was manifest in more intensive interactive behaviour
but not in different configurations or patterns of attachment behaviours. This
followed from the good fit of our models, which were symmetric in the two stranger
episodes (S4 and S7) and in the two reunion episodes (M5 and M8). Therefore, the
Strange Situation appears to contain a built-in replication of the essential
separation-reunion sequence: the behavioural pattern in the first separation-reunion
sequence (episodes 4 and 5) appears to be replicated and confirmed in the second
separation-reunion sequence (episodes 7 and 8). The second sequence does not add
qualitatively new information to what is observed in the first sequence but merely
intensifies the behavioural pattern. The replicated nature of the Strange Situation
procedure may be one of the reasons for its robustness and its validity despite its
relatively short duration. The only caveat is that in the US sample the same pattern
was observed as in the Dutch sample, but the influence of the last stranger episode
(S7) was far less pronounced, and even a model without this influence would also fit
the US data.
Third, stranger wariness indeed seems an important component of Strange
Situation behaviour. The infants differed from each other in the degree to which they
seemed to be able and willing to interact with the stranger in a positive way. Stranger
wariness was stable across episodes, and it also seemed to be one of the causes for
the subsequent attachment strategy towards the parent. If infants were wary of the
stranger in an earlier episode they more intensively displayed their attachment
concerns in the subsequent episode. If they were more friendly and sociable with the
stranger, they seemed more inclined to minimize the display of their attachment
concerns in the following episode. This outcome may be interpreted in different
ways, and concurs with earlier findings of Sagi et al. (1986), who measured stranger
sociability in a separate procedure prior to the Strange Situation assessment (see also
Frodi, 1983; Main & Weston, 1981; Thompson & Lamb, 1983). Stranger wariness
may be considered as an indicator of some temperamental characteristic related to
behavioural inhibition or shyness (Fox, 1992). In that case the structural model
would support the idea that temperamental differences may cause some differences in
patterns of attachment, maybe at the level of the two insecure strategies (Vaughn,
Lefever, Seifer & Barglow, 1989). An alternative interpretation may be that stranger
wariness is part of an overall pattern of dealing with stressful circumstances, and
therefore fits into a certain attachment strategy instead of independently causing it.
It is not possible to choose between these alternative interpretations on the basis of
the Strange Situation data alone.
In developing the structural model we have taken the problem of equivalent
models into account. Recently, MacCallum et al. (1993) showed that many structural
analyses in the behavioural and social sciences have failed to consider the possibility
of equivalent models and assumed the adequateness of the preferred model if it fit the
data. In our case, the measurement model was based on substantive theory, and
whenever equivalent models could be defined this was looked into. Furthermore, we
were able to crossvalidate the selected model in a different sample from another
country. Although exact replication of the model and its parameters was not possible,
the model basically appeared to fit the data from the validation sample. Because of
the differences between the two samples, which were collected in different countries
under different circumstances, we would have been surprised if more than configural
confirmation of the model had been possible. Last, the selection of an
adequate model for the interactive behaviours in the Strange Situation procedure was
based on a sufficiently large number of cases (N = 326). In an earlier attempt to
construct a structural model, Connell & Goldsmith (1982) used a sample of only
55 participants. It has been shown, however, that replicable models may only be
expected in samples of at least 200 participants (Boomsma, 1985).
Structural modelling of Strange Situation behaviour raises at least two further
issues. We labelled the latent factors of the reunion episodes in terms of deactivation
and hyperactivation of attachment concerns. Of course it would be important to try
and assess the regulation of emotions inherent in these attachment strategies more
directly, for example through observations of facial expressions of emotions (Izard,
Haynes, Chisholm & Baak, 1991) or through psychophysiological indicators of the
infants' stress during the Strange Situation (Gunnar, Mangelsdorf, Larson &
Hertsgaard, 1990; Spangler & Grossmann, 1993). Furthermore, the current approach
raises the issue of the dimensional versus the categorical nature of Strange Situation
behaviour. The structure and dynamics of the procedure appear to be adequately
reflected in a linear model based on continuous variables. There is, however, an
important caveat in this respect. Recent work by Bartholomew (1993) and by
Molenaar & Von Eye (1994) seems to indicate that '[t]he covariance structure
associated with an arbitrary common factor model can be represented by a latent
profile model [a model with categorical latent classes]. Hence, at the level of
second-order moments the two latent variable models [i.e. the one with continuous latent
variables and the one with discrete latent variables] are completely equivalent'
(Molenaar & Von Eye, 1994, p. 227). If this statement is also true for more complex
linear structural equation models, then a well-fitting model with continuous
variables cannot be used as proof or even indication that the underlying processes
must be continuous as well.
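The flavour of this equivalence can be illustrated in its simplest special case. The sketch below is not the general result of Molenaar & Von Eye (1994); it only assumes a two-class latent profile model with locally independent indicators and equal within-class variances, and compares its implied covariance matrix with that of a one-factor model whose loadings are chosen as sqrt(p(1 - p)) times the difference between the class means. All function names and parameter values are hypothetical.

```python
import math


def latent_profile_cov(p, mu1, mu2, psi):
    """Covariance matrix implied by a two-class latent profile model with
    class proportion p, class means mu1 and mu2, and within-class
    variances psi (local independence, equal across classes)."""
    k = len(mu1)
    d = [a - b for a, b in zip(mu1, mu2)]
    return [[p * (1 - p) * d[i] * d[j] + (psi[i] if i == j else 0.0)
             for j in range(k)] for i in range(k)]


def one_factor_cov(loadings, theta):
    """Covariance matrix implied by a one-factor model with a
    unit-variance factor and unique variances theta."""
    k = len(loadings)
    return [[loadings[i] * loadings[j] + (theta[i] if i == j else 0.0)
             for j in range(k)] for i in range(k)]


# Hypothetical parameter values, for illustration only.
p = 0.3
mu1, mu2 = [1.0, 0.5, 2.0], [0.0, -0.5, 0.5]
psi = [1.0, 0.8, 1.2]

# Loadings that make the continuous model mimic the categorical one.
loadings = [math.sqrt(p * (1 - p)) * (a - b) for a, b in zip(mu1, mu2)]

sigma_profile = latent_profile_cov(p, mu1, mu2, psi)
sigma_factor = one_factor_cov(loadings, psi)
max_diff = max(abs(sigma_profile[i][j] - sigma_factor[i][j])
               for i in range(3) for j in range(3))
print(max_diff < 1e-12)  # True
```

Since the two covariance matrices coincide, a method that fits covariances only cannot distinguish the discrete from the continuous latent structure in this case.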
Thus whether the model derived in this paper shows predictive validity similar to that of
the traditional classification system still has to be documented empirically. And the
dimensional and categorical interpretations of the Strange Situation may not be
incompatible but may constitute two sides of the same coin. The choice between the
two approaches may therefore be a pragmatic one dependent on the issue to be
addressed.
Acknowledgements
Part of this work was supported by a Pioneer grant awarded to Marinus H. van IJzendoorn by the
Netherlands Organization of Scientific Research (NWO).
References
Ainsworth, M., Blehar, M., Waters, E. & Wall, S. (1978). Patterns of Attachment. Hillsdale, NJ:
Erlbaum.
Anderson, J. G. & Gerbing, D. W. (1988). Structural equation modelling in practice: A review and
recommended two-step approach. Psychological Bulletin, 103, 411-423.
Bartholomew, D. J. (1993). Estimating relationships between latent variables. Sankhya, 35, 409-419.
Bentler, P. M. (1980). Multivariate analysis with latent variables: Causal modelling. Annual Review of
Psychology, 31, 419-456.
Bentler, P. M. & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of
covariance structures. Psychological Bulletin, 88, 588-606.
Bollen, K. A. (1989). Structural Equations with Latent Variables. New York: Wiley.
Boomsma, A. (1985). Nonconvergence, improper solutions, and starting values in LISREL maximum
likelihood estimation. Psychometrika, 50, 229-242.
Bowlby, J. (1969). Attachment and Loss, vol. 1, Attachment. New York: Basic Books.
Bretherton, I. (1985). Attachment theory: Retrospect and prospect. In I. Bretherton & E. Waters (Eds),
Growing Points of Attachment Theory and Research, pp. 3-38. Monographs of the Society for Research in Child
Development, 50 (1-2, serial no. 209).
Pseudo chi-square tests. Our measurement model is a confirmatory factor model with a full correlation
matrix between the factors or latent variables, or a saturated latent-variable model. The opposite
confirmatory factor analysis model has uncorrelated factors and no paths between the latent variables.
Any other structural model will be in between the saturated model and the no-paths model. Whether
it is at all fruitful to search for a parsimonious latent-variable model can be assessed by supposing that
setting all path coefficients to zero has no effect on the fit of the model. In other words, the increase
in restrictions in the model has no influence on the fit. The assessment is accomplished with Bentler &
Bonett's (1980) pseudo chi-square test (see Anderson & Gerbing, 1988, for a further discussion of this
strategy). This statistic is constructed from the chi-square value for the saturated model with the degrees
of freedom of the no-paths model. If the statistic is significant, then no structural model will give an
acceptable fit, because it would have a chi-square value greater than or equal to the value for the
saturated model with fewer degrees of freedom than for the no-paths model. When a non-significant
pseudo chi-square statistic results, one can investigate several (nested) structural models by means of
sequential chi-square tests. A drawback of this procedure is the dependence of chi-square tests on
sample size. For rough model comparisons we have used both the chi-square/d.f. ratios and the
RMSEA.
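The quantities mentioned in this paragraph can be computed as in the following minimal sketch. It assumes the pseudo chi-square test simply refers the chi-square value of the saturated model to a chi-square distribution with the degrees of freedom of the no-paths model, and it uses one common definition of the RMSEA, sqrt(max(chi-square - d.f., 0) / (d.f. (N - 1))); some programs use N instead of N - 1. The function names and the input values are hypothetical and do not reproduce results from this paper.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with df
    degrees of freedom, via the regularized incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    log_pref = a * math.log(z) - z - math.lgamma(a)
    if z < a + 1.0:
        # Series expansion for the lower tail.
        term = total = 1.0 / a
        n = a
        for _ in range(10000):
            n += 1.0
            term *= z / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return 1.0 - total * math.exp(log_pref)
    # Continued fraction (modified Lentz method) for the upper tail.
    tiny = 1e-300
    b = z + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_pref)


def pseudo_chi_square_test(chi2_saturated, df_no_paths, alpha=0.05):
    """Return (p_value, search_worthwhile). A significant result means
    that no structural model can be expected to fit acceptably."""
    p_value = chi2_sf(chi2_saturated, df_no_paths)
    return p_value, p_value >= alpha


def chi2_df_ratio(chi2, df):
    return chi2 / df


def rmsea(chi2, df, n):
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Hypothetical fit values, for illustration only.
p_value, worthwhile = pseudo_chi_square_test(chi2_saturated=250.0, df_no_paths=200)
print(worthwhile, chi2_df_ratio(250.0, 200), round(rmsea(250.0, 200, n=326), 4))
```

The tail probability is implemented with the standard library only, so the sketch runs without a statistics package; a library routine for the chi-square distribution could be substituted.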
Latent-variable model. In order to conduct a fairly systematic search, certain principles had to be
devised. First, it was decided to follow a backwards strategy, i.e. removing paths from the saturated
model, rather than adding paths to a minimal or null model. In fact that kind of strategy was already
implicit in first looking at the measurement model, which contained a saturated latent-variable model.
Second, the S7 and M8 section of the model was to be treated in the same manner as the S4 and M5
section. In other words, the models with a path from S4 to M5 should also have a path from S7 to M8.
Not adhering to the principle opens up a large number of parallel models between which it would be
difficult to decide and which could be difficult to interpret. Third, all models should include the stability
paths of the three latent variables. Fourth, long-distance paths and theoretically least interesting paths
should be removed first.
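The second and third principles can be made concrete with a small sketch. It is only an illustration under assumed labels: the variable names (wariness4, deactivation5 and so on) and the helper function are hypothetical, and the code does not fit any model. It merely builds the set of paths of a candidate model so that every path specified for the S4 and M5 section is mirrored in the S7 and M8 section, and so that the three stability paths are always present.

```python
# Hypothetical labels: each latent variable of the first section (S4, M5)
# is mapped to its counterpart in the second section (S7, M8).
MIRROR = {
    "wariness4": "wariness7",
    "deactivation5": "deactivation8",
    "hyperactivation5": "hyperactivation8",
}

# The stability paths run from each latent variable to its later counterpart,
# so they coincide with the mirror pairs.
STABILITY_PATHS = frozenset(MIRROR.items())


def build_symmetric_model(first_section_paths):
    """Return the full set of directed paths (source, target) of a model.

    first_section_paths lists paths among the S4 and M5 latent variables
    only; each one is copied to the S7 and M8 section (principle 2), and
    the stability paths are always included (principle 3).
    """
    paths = set(STABILITY_PATHS)
    for source, target in first_section_paths:
        paths.add((source, target))
        paths.add((MIRROR[source], MIRROR[target]))
    return paths


# A candidate in which stranger wariness influences both mother-episode
# latent variables in the subsequent episode.
model = build_symmetric_model([
    ("wariness4", "hyperactivation5"),
    ("wariness4", "deactivation5"),
])
print(len(model))  # 7
for path in sorted(model):
    print(path)
```

Generating candidates in this way keeps the number of models to be compared small, which is the purpose the symmetry principle serves in the search.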
This strategy led us to first eliminate the long-distance paths from S4 to M8, and then the paths from
M5 to S7 (see Fig. 1). Hereafter, the situation was unclear. There were three options open. One option
was to first eliminate one or more of the links between the stranger and the mother episodes, another
was to start with the elimination of the simultaneous path(s) between the latent variables at M5 and M8
(in a similar fashion), and the final one was to start with the elimination of the cross-lags between M5
and M8. On theoretical grounds, it was not clear which route to follow; on empirical grounds one could
look at the non-significant parameters in the matrix of regression coefficients of the then current model.
This latter approach suggested eliminating the cross-lagged path deactivation5 → hyperactivation8 in
one version of the model and the hyperactivation5 → deactivation8 path in the other version (see below
for information on different versions of models). Given this situation, we decided to follow the route
of first eliminating the cross-lagged paths, but also to inspect the other two possibilities. It turned out that
eliminating the cross-lagged paths first was the best strategy, and we have therefore
reported only those results.
A further remark should be made with respect to the simultaneous paths within each of the mother
episodes, i.e. between deactivation5 and hyperactivation5, and between deactivation8 and
hyperactivation8. From a substantive point of view there is no reason to suggest a direction from deactivation
to hyperactivation or vice versa, and we would have preferred an undirected path, or two directed paths.
Owing to modelling considerations this is unfortunately not possible, as it leads to unidentified models. As
we preferred not to express a directional statement about the influence of deactivation on hyperactivation
or vice versa in the same episode, we have had to consider correlated error terms between the latent
variables. Substantively, this means that there were external (i.e. non-specified) influences which caused
the simultaneous correlation. Therefore, for each model, we had to investigate three versions, one
version with paths from deactivation → hyperactivation, one with paths from hyperactivation →
deactivation, and one with correlated errors. As indicated in the main body of the text, there were
situations in which, on theoretical grounds, the three versions were indistinguishable with respect to
the fit of the model.