
THE FORECASTING ACCURACY AND EFFECTIVENESS OF

COMPLEXITY MANAGER

LINDY-JO SMART

A Thesis

Submitted to the Faculty of Mercyhurst College

In Partial Fulfillment of the Requirements for

The Degree of

MASTER OF SCIENCE
IN
APPLIED INTELLIGENCE

DEPARTMENT OF INTELLIGENCE STUDIES


MERCYHURST COLLEGE
ERIE, PENNSYLVANIA
APRIL 2011
DEPARTMENT OF INTELLIGENCE STUDIES
MERCYHURST COLLEGE
ERIE, PENNSYLVANIA

THE FORECASTING ACCURACY AND EFFECTIVENESS OF COMPLEXITY


MANAGER

A Thesis
Submitted to the Faculty of Mercyhurst College
In Partial Fulfillment of the Requirements for
The Degree of

MASTER OF SCIENCE
IN
APPLIED INTELLIGENCE

Submitted By:

LINDY-JO SMART

Certificate of Approval:

___________________________________
Kristan J. Wheaton
Associate Professor
Department of Intelligence Studies

___________________________________
William J. Welch
Instructor
Department of Intelligence Studies

___________________________________
Phillip J. Belfiore
Vice President
Office of Academic Affairs

April 2011
Copyright © 2011 by Lindy-Jo Smart
All rights reserved.

ACKNOWLEDGEMENTS

First, I would like to thank Kris Wheaton for his incredible guidance and patience

through this process and for always having the time to sit down and work through

challenges.

I would like to thank Bill Welch for taking on the role as my secondary reader.

I would like to thank Hema Deshmukh for helping me complete all statistics in this thesis

and for her patience throughout the process.

I would like to thank Richards Heuer for his personal correspondence throughout the

experiment creation process.

I would like to thank all faculty members in the Intelligence Department at Mercyhurst

College for their dedication, guidance, and for offering such a rewarding challenge that is

the Intelligence Studies graduate program.

I would also like to thank my friends and family for their continual support,

encouragement, and patience with me the past two years.

ABSTRACT OF THE THESIS

The Forecasting Accuracy and Effectiveness of Complexity Manager

A Critical Examination

By

Lindy-Jo Smart

Master of Science in Applied Intelligence

Mercyhurst College, 2011

Associate Professor Kristan J. Wheaton, Chair

The purpose of this study was to assess the forecasting accuracy and effectiveness

of the structured analytic technique, Complexity Manager. The study included an

experiment with Mercyhurst College Intelligence Studies graduate and undergraduate

students placed into small groups to assess an intelligence problem and forecast using

intuition or the structured analytic technique, Complexity Manager. Data was collected

using a researcher-created Forecasting Answering Sheet that included the variables that

each group considered and a researcher-created questionnaire to capture individual

responses to the study. Students that used Complexity Manager spent significantly more

time working with their groups than the groups that used intuition alone. Students that

used intuition alone generated a greater number of variables. However, there was no

connection between the generation of variables and forecasting accuracy; the experiment

group produced more accurate forecasts. The results of the study show that the use of

Complexity Manager may increase forecasting accuracy; three out of 24 control groups

forecasted accurately while six out of 23 experiment groups forecasted accurately.

Therefore, the use of this structured analytic technique increases collaboration and produces

more accurate results than intuition alone. To produce more statistically sound results, further

studies will require a higher level of participation.

TABLE OF CONTENTS

Page
TABLE OF CONTENTS……………………………………………………… vii
LIST OF FIGURES…………………………………………………………..... x
CHAPTER 1: INTRODUCTION………………………………………………. 1
CHAPTER 2: LITERATURE REVIEW……………………………………….. 5
Intelligence Failures……………………………………………………... 6
Cognitive Bias…….……………………………………………………... 11
Unstructured and Structured Techniques………………………………... 19
Collaboration ……..……………………………………………………... 29
Complexity Manager…………………………………………………….. 43
Hypothesis ………..……………………………………………………... 47
CHAPTER 3: METHODOLOGY……………………………………………… 48
Setting……………………………………………………………........... 49
Participants …………………………………………………………….. 49
Intervention and Materials ……...……………………………………… 51
Measurement Instruments ………………………………………….…... 52
Data Collection and Procedures ………………..……………………... 53
Data Analysis ……….………………………………………...………... 56
CHAPTER 4: RESULTS……………………………………………………..… 58
Survey Responses……………………………………………………... 58
Group Analytic Confidence ……………………………………………. 62
Group Source Reliability …………………………………………….. 62
Variables………………………………………………………………. 63
Forecasting Accuracy……..……………………………………………. 65
Quality of Variables.............……………………………………………. 66
CHAPTER 5: CONCLUSIONS………………………………………………... 70
Discussion...…………………………………………………………... 70
Limitations…………..………………………………………………….. 76
Recommendations for Further Research………………………………... 78
Conclusions……………………………………………………………... 80
REFERENCES………………………………………………………………...... 83
APPENDICES………………………………………………………………...... 87
Appendix A……………………………………………………………. 87
Appendix B………………………………………………................... 88
Appendix C………………………………………………................... 89
Appendix D………………………………………………................... 90
Appendix E………………………………………………................... 91
Appendix F………………………………………………................... 92
Appendix G………………………………………………................... 108
Appendix H………………………………………………................... 109
Appendix I.………………………………………………................... 110
Appendix J….………………………………………………................... 112
Appendix K.………………………………………………...................... 113

LIST OF FIGURES

Page

Figure 2.1 Complexity Manager: Cross-Impact Matrix 46

Figure 3.1 Number of Participants Per Academic Class 50

Figure 4.1 Source Reliability Per Group 63

Figure 4.2 Forecasting Per Group 66

Figure 5.1 Analytic Confidence Per Group 71


CHAPTER 1: INTRODUCTION

The entire project cost $40,000 and consisted of 1,200 experiments in a 14-month

period (“Edison Gets the Bright Light Right,” 2009). The team searched the world,

testing materials from beard hair to fishing line to bamboo. Over 40,000 pages of notes

were taken (“Edison’s Lightbulb at The Franklin Institute,” 2011). Then, in 1879, after

testing over 1,600 materials, Thomas Edison and his associates found a filament that

would burn for 15 hours. Edison stated, "I tested no fewer than 6,000 vegetable growths,

and ransacked the world for the most suitable filament material" (“Edison’s Lightbulb at

The Franklin Institute,” 2011). By 1880, Edison had produced a bulb that could last for

1,500 hours and was then placed on the market (“Light Bulb History – Invention of the

Light Bulb,” 2007).

The success of Edison’s invention, as with other scientific findings, is evident

because it produces tangible and, in this case, visual results. The longer a filament would

burn, the more successful it was. Failure was depicted through nothingness; a filament

didn’t burn for any noticeable length of time. When considering Edison’s filament

experiments compared to experiments in other fields of study, are the results just as

definite and stunning?

In the context of intelligence, the two couldn’t be more different. If intelligence is

successful, nothing may happen. But if intelligence fails, the results could be

catastrophic. It is not something that can be tested within the confines of a vacuum bulb,

nor can thousands of possible solutions be tested before the problem can be resolved. But

it can and must improve somehow. One improvement can be through the use of

structured analytic techniques. However, though structured techniques exist, not all

analysts use them because effectiveness has not been proven and constraints further

hinder use. Therefore, the testing of the methods is needed to increase use within the

Intelligence Community (IC) and determine the validity of each method.

In “Assessing the Tradecraft of Intelligence Analysis,” Gregory F. Treverton and

C. Bryan Gabbard define structured analytic techniques as “technologies, products, or

processes that will help the analyst in three ways…searching for and dealing with data…

building and testing hypotheses…and third, in communicating more easily both with

those who will help them do their work” (2008, p. 18). Though these techniques are

created to assist analysts, there is no consensus on the need for or value of them

(Treverton & Gabbard, 2008). Some analysts willingly use them while others prefer not

to. Therefore, even though a large number of structured analytic techniques are available

to analysts, without proper training and understanding, the techniques are useless.

Another major issue is that many of these structured analytic techniques have yet

to be tested. Richards Heuer states that the concept of structured analytic techniques

began in the 1980s when Jack Davis began teaching and writing about “alternative

analysis” (Heuer & Pherson, 2010, p. 8). Even though the concept has been around for

almost thirty years, it remains largely untested. Therefore, without proper training for the

use of structured analytic techniques and without proven results to give the methods

authority, the use of structured analytic techniques will remain limited.

Richards Heuer states that all structured analytic techniques provide the same

benefit: they guide communication among analysts who need to share evidence, provide

alternative perspectives, and discuss the significance of evidence (2009). Structured analytic

techniques involve a step-by-step process that externalizes an analyst’s thoughts.

Therefore, their thoughts and ideas can be reviewed and discussed at each step of the

process. The techniques provide structure to individual thought processes and to the

interaction among collaborators to help generate hypotheses and mitigate cognitive

limitations (Heuer, 2009).

Heuer suggests the evaluation of structured analytic techniques because the only

testing the IC has done is through experience and through a small number of colleges and

universities that offer intelligence courses (Heuer, 2009). He further states that there is no

systematic program established for evaluating and validating the techniques (Heuer &

Pherson, 2010). To resolve this issue, Heuer suggests conducting experiments with

analysts using the technique to analyze typical intelligence issues (Heuer and Pherson,

2010). Heuer states that the most effective approach to evaluating the “techniques is to

look at the purpose for which a technique is being used, and then to determine whether or

not it actually achieves that purpose, or if there is some better way” (2009, p. 5).

Structured analytic techniques are needed to mitigate limitations such as

organizational and individual bias and to decrease the number and negative effects of

intelligence failures. Though structured analytic techniques may be an effective way to

mitigate these issues, few have been tested, including Complexity Manager, the subject of

this study. Therefore, testing Complexity Manager would increase the validity of

structured analytic techniques, particularly this specific technique.

The purpose of this study was to conduct an experiment to test the effectiveness

of the methodology Complexity Manager according to Richards Heuer’s

recommendations of using intelligence analysts to analyze a typical intelligence issue.

This thesis is one of thousands of filaments that need to be tested to ensure that structured

analytic techniques are serving their purpose: to give the visible, tangible results needed

in the intelligence field of study.



CHAPTER 2: LITERATURE REVIEW

Since the creation of the Intelligence Community (IC), issues surrounding the

validity and soundness of its forecasts have surfaced due to intelligence failures. The IC

has taken steps to proactively avoid further significant intelligence failures. However,

there exist debates both about the causes of intelligence failures and about how preventable

they actually are. Further, the methods used for reaching a forecast are not universally

agreed upon; whether it is more effective to use structured analytic techniques or to rely

strictly on intuition varies among analysts. Using structured analytic techniques

gives the analyst greater confidence in their forecast, especially when managing a

complex situation with multiple hypotheses. However, the very techniques that the

analyst uses have not been proven effective. One reason is that structured analytic

techniques assist the analyst in forecasting the outcome of future events, so it is difficult to

state with absolute certainty that the technique is effective. In other words, the technique

may have been used properly, but the forecast may still have been wrong. Regardless, the

testing of structured analytic techniques is essential for not only validating the method

itself, but for the validity of the IC’s ability to forecast. Therefore, the testing of Richards

Heuer’s Complexity Manager will be an initial step toward assessing the strength of this

structured analytic technique.

The literature will address four areas related to the testing of Richards Heuer’s

technique, Complexity Manager. The first section will address research related to

intelligence failures and will be followed in the second section by a proposed cause of

them, cognitive bias. The third section will focus on research about the use of

unstructured versus structured analytic techniques. Finally, the fourth section will discuss

research related to the use of collaboration and its connection to Complexity Manager.

Intelligence Failures

For the purpose of this thesis, “intelligence” will be defined according to the

Mercyhurst College Institute for Intelligence Studies (MCIIS) definition created by

Kristan Wheaton and Michael Beerbower which states that intelligence is “a process

focused externally, designed to reduce the level of uncertainty for a decision maker using

information derived from all sources” (2006, p. 319). For the purpose of this thesis,

“intelligence failure” will be defined according to Gustavo Diaz’s definition from

“Methodological Approaches to the Concept of Intelligence Failure” which states that an

intelligence failure is: “the failure of the intelligence process and the failure of decision

makers to respond to accurate intelligence” (Diaz, 2005, p. 2).

Diaz cites both Mark Lowenthal and Abram N. Shulsky for the creation of this

definition. Mark Lowenthal’s definition emphasizes that an intelligence failure is a failing

in the intelligence cycle: “the inability of one or more parts of the intelligence process

(collection, evaluation, analysis, production, dissemination) to produce timely, accurate

intelligence on an issue or event of importance to national interest” (Lowenthal, 1985, p.

51, as cited by Diaz, 2005). Shulsky’s definition acknowledges the connection between

intelligence and policy: “a misunderstanding of the situation that leads a government to

take actions that are inappropriate and counterproductive to own interests” (Shulsky,

1991, p. 51, as cited by Diaz, 2005).

The discussion of intelligence failures is not only to understand why they

occurred, but to what extent they can be minimized in the future. There is a lack of

consensus regarding the cause of intelligence failures. Some researchers state that

analytic failure causes intelligence failures while others state that they are caused by the

decision maker’s potential misunderstanding of the intelligence product given to them. In

other words, it can either be caused by faulty analysis or by miscommunication with the

decision maker. Regardless of the cause, researchers often agree that intelligence failures

are inevitable.

In “Methodological Approaches to the Concept of Intelligence Failure” Gustavo

Diaz claims that intelligence failures are inevitable because of unavoidable limitations.

He suggests that there are two schools of thought for intelligence failure: the traditional

point of view and the optimistic point of view. The traditional point of view believes that

policymakers are responsible for the failures because they do not listen to the analysis or

they misinterpret it (Diaz, 2005). The optimistic point of view believes that intelligence

can be improved through the use of technology and that failures can be reduced by the

use of new techniques (Diaz, 2005).

Diaz suggests a third approach, an alternative approach that captures both; there is

not one single source of guilt for intelligence failures. An intelligence failure, like any

human activity, is inevitable because failures and imperfections are normal. Intelligence

cannot always give the same result even with the same environment because there are

always factors that cannot be controlled (Diaz, 2005). Diaz states that accidents in

complex systems, such as a country’s national security, are inevitable because it is

impossible to account for all failures. Not only is it impossible to account for every

factor, but there are limitations to the amount and relevance of data collection and the

reliability of sources. Also, funding limits the amount of resources available to fight the

threats and leads to a need to prioritize them.



Richard Betts, in “Analysis, War, and Decision: Why Intelligence Failures are

Inevitable” notes barriers that are inherent to the nature of intelligence that include:

ambiguity of data, ambivalence of judgment due to conflicting data, and useless reforms

in response to previous intelligence failures (1978). Therefore, the inability of

intelligence to be infallible and its intrinsic ties to decision making make intelligence

failures inevitable. Betts further states that if decision makers had more time, then

intelligence failures would not occur because problems could be resolved, just as academic issues

are resolved (Betts, 1978). Because time will always be a main concern and reason for

needing intelligence analysis, Betts concludes by suggesting “tolerance for disaster”

(Betts, 1978, p. 89).

Intelligence failures can be inevitable due to the nature of intelligence work, or by

human nature. According to Simon Pope and Audun Josang, co-authors of “Analysis of

Competing Hypothesis using Subjective Logic,” intelligence failures, or errors in general,

are due to problems with framing, resistance to change, risk aversion, limitations of short-

term memory, and cognitive bias. These issues can negatively affect intelligence,

especially where issue outcomes appear similar (Pope & Josang, n.d.). Because of the

inevitability of error with human reason and past intelligence failures, the researchers

conclude that to continue to rely solely on intuition would be irresponsible. As the

authors state, “management of intelligence analysis should encourage the application of

products that allow clear delineation of assumptions and chains of inference” (Pope &

Josang, n.d., p. 2). The inevitability of intelligence failures due to cognitive bias will be

discussed in more detail after the discussion of intelligence failures.

The inevitability of intelligence failures may not only be natural, but may be a

necessary part of intelligence. Stephen Marrin, a former CIA analyst, suggests in



“Preventing Intelligence Failures by Learning from the Past” that intelligence failures

occur as a trade-off to another action that could have caused future, unavoidable failures.

Also, imperfections in the intelligence process are the result of unavoidable tradeoffs to

structures. He states, for example, that any changes that would have been made to prevent

the September 11, 2001 attacks would have caused other unavoidable future failures (Marrin, 2004).

Therefore, the only way to make improvements is to understand that everything has

tradeoffs and either work to minimize them or find new ways of doing things that move

beyond the tradeoffs.

An intelligence failure often implies a negative impact on U.S. national

security; however, failures occur every day in varying degrees. Though these things occur

on a daily basis, it is not until the information is applied to a high profile situation that it

is then known as an intelligence failure (Marrin, 2004). Marrin also makes the point that

though intelligence failures are becoming more public through investigations, successes

are often not discussed to avoid losing sources and methods (Marrin, 2004).

Consequently, in the public view, failures outnumber successes; the degree of success is

not known. If intelligence failures are inevitable, then the misconception that failures

outnumber successes may also be a necessary part of intelligence to maintain source

confidentiality.

John Hollister Hedley states in “Learning from Intelligence Failures” that, from

the United States’ perspective, anything that catches the U.S. by surprise or

is unintended is then seen as an intelligence failure. Aside from the perception of

intelligence failure, Hedley also notes that failure is inevitable because analysts must be

willing to take risks to do their jobs well; even when information is incomplete,

inaccurate, or contradictory, a decision must be made (Hedley, 2005). Even under these

circumstances, and though it is impossible to learn how to prevent something inevitable

like intelligence failures, the ratio of success to failure could be improved (Hedley, 2005).

Therefore, to improve the ratio, structures and methods could be applied to increase the

likelihood of success.

In Intelligence Analysis in Theater Joint Intelligence Centers: An Experiment in

Applying Structured Methods, Master Sergeant Robert D. Folker, Jr. states that the root

cause of intelligence failures is analytic failure: the lack of analysis of the collected raw

data. He states that regardless of what analysts believe about intelligence failures, it is

the opinion of the decision maker that matters most; if the decision maker doesn’t believe

the accuracy of the intelligence, then it will not be useful. Therefore, improvements

should focus on improving accuracy and producing timely and useful products (Folker,

2000). Regardless of how inevitable failures are, Folker emphasizes a need for

improvements in the quality of analysts’ work so decision makers can make quality

decisions.

The literature reviewed on intelligence failures stated that failures are caused by

cognitive bias, analytic failure, or the decision maker’s misunderstanding of the

intelligence product they are given. Whatever the cause, the IC agrees that intelligence

failures are inevitable because failure and imperfection are normal. Also,

the nature of intelligence requires judgments to be made on time-sensitive issues, so

tradeoffs must be made. Though failure is inevitable, it is neither justifiable nor excusable to

make no attempt to prevent or reduce the severity of its consequences. Rather, the better

intelligence failures and their causes are understood, the more likely it is that their effects

can be lessened, especially through the factors within the analyst’s control, including

an understanding of their own cognitive bias and its ramifications on their intelligence

products.

Cognitive Bias

In “Fixing the Problem of Analytic Mind-Sets: Alternative Analysis,” Roger

George describes cognitive bias as a mindset from which both the analyst and the

decision maker develop a series of expectations based on past events and draw their own

conclusions. As both are presented with new data, they either validate it because it is

consistent with earlier data or they disregard it if it does not fit into the pattern. As new

events occur, data consistent with earlier patterns of beliefs are more likely to be accepted

as valid, while data that conflict with expectations lack precedent. It is human nature for

individuals to ‘‘perceive what they expect to perceive,’’ making the mindset unavoidable

(George, 2004, p. 387). Initially, the mindset can help create experts for data collection;

however, eventually the mindset will make the experts obsolete as they are unable to

accept or process new information or changing events.

Heuer identifies cognitive bias as a mental error that is consistent and predictable.

In Jack Davis’ introduction to Heuer’s Psychology of Intelligence Analysis, Davis

identifies three factors that Heuer recognizes as the cognitive challenges that analysts

face: the mind cannot effectively deal with uncertainty; even if the analyst has an

increased awareness of their biases, this does little to help analysts deal effectively with

uncertainty; and tools and techniques help the analyst apply higher levels of critical

thinking and improve analysis on complex issues, especially when information is

incomplete or deceptive (Davis, 1999). In the chapter, “Thinking About Thinking,”

Heuer notes that weakness and bias are inherent in the human thinking process. However,

they can be alleviated by the analyst’s conscious application of tools and techniques

(Heuer, 1999). Though bias is present, techniques can help mitigate its effects.

Patterns are necessary for analysts to know what to look for and what is

important. These patterns then form the analyst’s mindset and create their perception of

the world (Davis, 1999). Mindsets are unavoidable and objectivity is achieved only by

making assumptions as clear as possible so when others view the analysis, they can

assess its validity. Cognitive bias may be unavoidable, but overt acknowledgment

reduces its negative effect on intelligence analysis.

Cognitive biases can develop within individuals and collectively in the

organization they work for. David W. Robson states in “Cognitive Rigidity: Methods to

Overcome It” that organizations develop mental models that serve as the basis for belief

within the organization. Like an individual’s mindset, these mental models can be

difficult to overcome (Robson, n.d.). This cognitive rigidity can lead to reliance on

hypotheses that purely reinforce what the organization believes to be true and valued.

Just as an expert’s judgment can become obsolete, cognitive rigidity within an

organization can ultimately lead it to dismiss radical alternatives to its approach and

restrict its ability to change over time, even when change is necessary.

Organizations often struggle with cognitive rigidity because, by its very nature, it

is undetectable (Robson, n.d.). Robson notes that this is especially true for organizations

that handle complex problems and forecast possible outcomes. The danger of cognitive

rigidity lies in the experience of the organization; the more experienced the organization,

the more susceptible it is to being set in its mental model. Over time, the organization’s

cognitive framework becomes self-reinforcing as it accepts only data that confirms its

core assumptions on which it was built. For organizations that deliver actionable

intelligence, these frameworks consequently influence the estimation of probability and

may negatively influence the intended solution and possibly lead to an intelligence failure

(Robson, n.d.).

Rob Johnston identifies two general types of bias in “Integrating Methodologists

into Teams of Substantive Experts” that include pattern bias and heuristic bias. Pattern

bias, more commonly known as confirmation bias, is looking for evidence that confirms

rather than rejects a hypothesis; heuristic bias is the use of inappropriate guidelines or rules to

make predictions (Johnston, 2005). Johnston looks at how each affects experts. He states

that unreliable expert forecasts are often caused by both pattern and heuristic bias.

Becoming an expert requires years of viewing the world through a particular lens;

however, because of these biases, it can also lead to poor intelligence.

Johnston states that intelligence analysis is like other complex tasks that demand

expertise to solve complex problems; the more complex the task, the longer it takes to

build necessary expertise. However, this level of expertise paradoxically makes expert

forecasts unreliable. Johnston notes that experts outperform novices with pattern

recognition and problem solving, but expert predictions are “seldom as accurate as

Bayesian probabilities” (2005, p. 57). Johnston attributes this to cognitive bias and time

constraints. Experts are effective but only to the extent to where their bias does not affect

the quality of their analysis.

Johnston also discussed bias in Analytic Culture in the US Intelligence

Community: An Ethnographic Study. Johnston conducted a series of 439 interviews, focus

groups, and other forms of participation from members of the IC (2005). The purpose of

the work was to identify and describe elements that negatively affect the IC. Within this

study, he focused a section of his work on confirmation bias. Johnston found through

interviews and observation that confirmation bias was the most prevalent form of bias in

the study (Johnston, 2005, p. 21). For example, when Johnston asked the participants to

describe their work process, they responded that the initial steps to investigating an

intelligence issue were to conduct a search of previous literature. The issue that

Johnston notes with this is that such searches can quickly turn into unintentional searching for

confirming information. Therefore, the evidence collection could quickly become a

search that only confirmed the analyst’s own thoughts and assumptions.

A weakness of Johnston’s ethnographic study on confirmation bias may be the

presence of his own bias. Johnston chose to include only four quotations from interviews

with analysts during his discussion on confirmation bias. All answers that he provides in

the body of his results conclude the same thing: initial searches are done by reading

previous products and past research. However, with 439 participants in the study, it

is doubtful that only four participants answered this question and it is even less likely that

all 439 participants only discussed literature searches. A more comprehensive look at

confirmation bias within the IC would have been to measure, in a more quantitative form,

the responses to see exactly where the bias stems from. Without this, it appears that Johnston

chose those quotes only to make his point that analysts most often use literature searches

to begin the analysis process instead of presenting all the responses.

To test cognitive bias, researchers Brant A. Cheikies, Mark J. Brown, Paul E.

Lehner, and Leonard Adelman assessed the effectiveness of a structured analytic

technique in their study “Confirmation Bias in Complex Analyses.” The researchers

believe that most studies of confirmation bias involve abstract, unrealistic experiments

that do not mirror complex analysis tasks managed by the IC. Therefore, the purpose of

this study was to recreate a study of an actual event, using techniques to assess the

presence of confirmation and anchoring bias and whether the structured analytic technique,

Analysis of Competing Hypotheses (ACH), successfully reduces it. For their study, the

researchers define an anchoring effect as a “tendency to resist change after an initial

hypothesis is formed” (Cheikies et al., 2004, p. 9). The researchers define confirmation

bias as the “tendency to seek confirming evidence and/or bias the assessment of available

evidence in a direction that supports preferred hypotheses” (Cheikies et al., 2004, p. 10).

This is the first recorded experiment that looked to test ACH’s ability to minimize

confirmation bias (Cheikies et al., 2004). The researchers replicated a study by Tolcott,

Marvin, and Lehner, conducted in 1989, to obtain the same confirmation bias results.

Therefore, they could then test ACH’s ability to mitigate confirmation bias using the

previous study as a control. For the study, the researchers used 24 employees from a

research firm. The participants averaged 9.5 years of intelligence analysis experience. All

participants were emailed 60 pieces of evidence regarding the USS Iowa explosion that

occurred in April 1989, including three hypothesized causes of the explosion: Hypothesis

1 (H1), an inexperienced rammerman inadvertently caused the explosion; Hypothesis 2 (H2), friction

ignited the powder; and Hypothesis 3 (H3), the gun captain placed an incendiary device. The experiment group

was given the same information but was also given an ACH tutorial.

To test for confirmation and anchoring bias, the researchers had H1 and H3

receive the most confirming evidence in the first two rounds while H2 received the

least confirming evidence. Also, H1 and H3 were constructed to be the easiest to visualize. To

analyze the results, two analyses of variance (ANOVA) were performed on the

confidence ratings of the participants.
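As an illustrative sketch only, and not drawn from the Cheikies et al. study, a one-way ANOVA comparing mean confidence ratings between two groups could be computed as follows; the group labels and ratings below are hypothetical.

# Illustrative sketch only: a one-way ANOVA on hypothetical confidence ratings
# for a control (no ACH) group and an experiment (ACH) group. The ratings are
# invented for demonstration and are not data from the study.
from scipy.stats import f_oneway

control_confidence = [6, 7, 5, 8, 6, 7, 5, 6]      # hypothetical ratings on a 1-10 scale
experiment_confidence = [5, 6, 6, 5, 7, 5, 6, 4]   # hypothetical ratings on a 1-10 scale

f_stat, p_value = f_oneway(control_confidence, experiment_confidence)
print("F = %.2f, p = %.3f" % (f_stat, p_value))  # a small p suggests a reliable group difference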

The results showed that as the participants assessed new evidence, their assessments were greatly

affected by the beliefs that they held at the time the evidence was given. The evidence

that confirmed the participants’ current belief was given more weight than the

disconfirming evidence (Cheikies et al., 2004). In this study, ACH reduced confirmation

bias, but only for the participants that did not have professional analysis experience.

The study was able to show that an anchoring effect was present and also that

ACH was able to minimize an analyst’s tendency toward confirmation bias. The

researchers were effective at establishing an anchoring effect because they built on a

successful study that had previously done the same thing. However, the researchers’ use of

participants that were not experienced in intelligence analysis is a weakness of the study.

Though all participants were interested in analysis, only 12 had analysis experience. The

varying abilities of the participants call into question the validity of the results because

those not trained or experienced in analysis would not only be less aware of the purpose

of weighing criteria and the use of a structured analytic technique, but also of the

presence of cognitive bias.

In his Applied Intelligence Master’s Thesis, “Forecasting Accuracy and Cognitive

Bias in the Analysis of Competing Hypothesis,” Andrew Brasfield looked to further

investigate the inconclusive and varying results from previous studies about the

structured analytic technique, Analysis of Competing Hypotheses (ACH) (2009).

Specifically, Brasfield looked at ACH’s goals of increased forecasting accuracy and

decreased cognitive bias.

Seventy undergraduate and graduate Intelligence Studies student participants were

divided into control and experiment groups and further divided into groups based on

political affiliation to detect the presence of a pre-existing mindset regarding the topic,

the 2008 Washington State gubernatorial election. The two possible outcomes were that

either the incumbent governor or the challenger would win the election. The

participants had access to open source material to gather evidence for their forecast. The

control group used an intuitive process and the experiment group used ACH to structure

their analysis. The participants were given a full week to complete the assignment.

Brasfield tested accuracy by comparing the results of the control group to the

results of the experiment group. To test for cognitive bias, Brasfield looked to see if there

was a pattern between the participants’ party affiliation and their forecasts. Also, for the

experiment group, Brasfield examined whether the evidence in the participants’ ACH matrices

overwhelmingly supported a particular party affiliation; this indicated the presence of

cognitive bias.

The results of the election showed that the incumbent won. The results of the

study showed that the experiment group was 9 percent more accurate than the control

group; 70 percent of the participants in the experiment group forecasted the winner and

61 percent of the participants in the control group forecasted the winner (Brasfield,

2009). Brasfield states that structured analytic techniques “should only improve overall

forecasting accuracy incrementally since intuitive analysis is, for the most part, an

effective method itself” (Brasfield, 2009, p. 39). Therefore, though the improvement is

only minor, the findings show that ACH does improve analysis.
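As an illustrative sketch only, and not a reproduction of Brasfield's analysis, a chi-square test of independence can indicate whether a difference such as 70 percent versus 61 percent is statistically distinguishable at a given sample size; the group sizes and counts below are hypothetical approximations.

# Illustrative sketch only: chi-square test on a hypothetical 2x2 table of
# correct vs. incorrect forecasts for an ACH group and an intuition group.
# Counts approximate a 70% vs. 61% split with roughly 35 participants per group.
from scipy.stats import chi2_contingency

table = [[25, 10],   # hypothetical ACH group: correct, incorrect
         [21, 14]]   # hypothetical intuition group: correct, incorrect

chi2, p_value, dof, expected = chi2_contingency(table)
print("chi2 = %.2f, p = %.3f" % (chi2, p_value))
# A large p indicates that a difference of this size could easily arise by
# chance with samples this small.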

Regarding cognitive bias, ACH appeared to mitigate bias of party affiliation

among Republicans but not Democrats. Brasfield notes that this may be due to the

Democratic candidate winning the election. For the experiment group using ACH,

participants used more evidence and applied it appropriately. Nearly all control group

participants used evidence that only supported the candidate they forecasted to win,

suggesting confirmation bias in the control group (Brasfield, 2009). Therefore, ACH does

appear to mitigate this bias.



As intelligence failures are inevitable, so is the cognitive bias that contributes to

them. Cognitive bias within the IC is most prevalent in the form of confirmation bias

where analysts and decision makers seek information that conforms most to the data that

they currently have. Within organizations, this takes the shape of confirming the beliefs

and values that the organization already holds. Because intelligence relies on judgment,

an analyst’s cognitive bias can affect the accuracy of the forecast and the decision

maker’s cognitive bias can affect the action they take. The biases of both are a major

contributor to whether intelligence succeeds or fails. Cognitive bias is a mindset that

either rejects or confirms information based on previous experiences. By consciously

recognizing cognitive bias, the analyst is taking the first proactive measure to mitigate

their contribution to an intelligence failure. However, the analyst must do more than just

recognize their bias; they must take the next step to distance their bias from their forecast

either by using intuition or structured analytic techniques.

Unstructured and Structured Techniques

The use of structured analytic techniques for effective forecasting and decision making

is debated within the IC. One side believes unstructured, intuitive

thinking is effective and comes with experience working in the field. The other side

believes that the use of structured analytic techniques organizes and manages complex

situations more effectively than intuition alone. The argument for both, however,

recognizes a need to overcome bias and the need for a process to effectively aid in

strategic decision-making. Because bias is inherent and present in every decision that is

made, the analyst must make a conscious application of analytic techniques outside of

their own mind to help reduce the effects of bias. However, intuition is also a powerful

tool that can be used alone or in conjunction with analytic techniques.



Intuition

David Meyers, a professor of psychology at Hope College, describes the

intuitive process in his book, Intuition: Its Powers and Perils. He states that memory is

not a single, unified system. Rather, it is “two systems operating in tandem” (Meyers,

2002, p. 22). Implicit memory, or procedural memory, is learning how to do something

whereas explicit memory, or declarative memory, is being able to state what and how

something is known. Meyers offers an example: as infants, we learn reactions and skills

used throughout our lives; however, we cannot explicitly recall anything from our first

three years (2002). This phenomenon continues throughout our lifetime. Though we may

not explicitly recall much of our past, we implicitly and intuitively remember skills.

Beyond a basic idea of intuition, Meyers also discusses intuitive expertise.

Compared to novices, experts know more through learned expertise. Meyers describes

William Chase and Herbert Simon’s chess expert study. The researchers found that the

chess experts could reproduce the board layout after looking at the board for only 5

seconds and could also perceive the board in clusters of positions that they were familiar

with. Therefore, the experts could intuitively play 5 to 10 seconds a move without

compromising their level of performance. Through this example, Meyers relays that

experts are able to recognize cues that enable them to access information they have stored

in their memories; expert knowledge is more organized and, therefore, more efficiently

accessible (Meyers, 2002). Experts see large, meaningful patterns while novices are only

able to see the pieces. Another difference between experts and novices is that experts

define problems more specifically. Meyers does note, however, that expertise is

limited in scope; it applies within a particular field for each individual (Meyers,

2002).

Though intuition allows us to access experiences and apply them efficiently, it

does have its drawbacks. Meyers recognizes three forms of bias that intuition is prone to:

hindsight bias, self-serving bias, and overconfidence bias (Meyers, 2002). Hindsight bias

is when events and problems become obvious retrospectively. Then, once the outcome is

known, it is impossible to revert to the previous state of mind (Meyers, 2002). In other

words, we can easily assume in hindsight that we know and knew more than we actually

did. Another disadvantage of intuition is a self-serving bias. Meyers states that in past

experiments, people more readily accepted credit for successes but attributed failure to

external factors or “impossible situations” (Meyers, 2002, p. 94). A third drawback to

intuition is an overconfidence bias that can surface from judgments of past knowledge in

estimates of “current knowledge and future behavior” (Meyers, 2002, p. 98).

Overconfidence is then sustained by seeking information that will confirm decisions

(Meyers, 2002).

Intuition can be a powerful tool, especially when quick decisions need to be

made. However, intuition is subject to bias and an individual’s expertise is limited in

scope and personal ability. Because of this, it is necessary for analysts to remain

cognizant of its possible disadvantages and to use tools that most effectively make use of

their intuition.

Naresh Khatri and H. Alvin in “Role of Intuition in Strategic Decision Making”

look to fill the gap in the field of research on the role that intuition serves in decision

making. At the time of their study, the researchers state, there were only a few scholarly

works on intuition and even less research conducted in the field.

The researchers define intuition as a “sophisticated form of reasoning based on

‘chunking’ that an expert hones over years of job-specific experience...in problem-



solving and is founded upon a solid and complete grasp of the details of the business”

(Khatri & Alvin, p. 4). Khatri and Alvin (n.d.) state that intuition can be developed

through exposure to and experience with complex problems, especially when a mentor

guides the individual through the process.

They note six important properties of intuition. The first is that intuition is a

subconscious drawing on experience. The second is that intuition is complex because it

can handle more complex systems than the conscious mind (Parikh, 1994, as cited by

Khatri & Alvin, p. 5). The rational mind thinks more linearly while intuition can

overcome those limitations. A third property is that the process of intuition is quick.

Intuition can recall a number of experiences in a short period of time, compressing years

of learned behavior into seconds (Isenberg, 1984, as cited by Khatri & Alvin, p.5). A

fourth property of intuition is that it is not emotion. Intuition does not come from

emotion; rather, emotions such as anger or fear cloud the subtle intuitive signals. The

fifth property is that intuition is not bias. The researchers state that there are two sides to

the bias debate over intuition. The first is that cognitive psychology research states

decision making “is fraught with cognitive bias” (Khatri & Alvin, p. 6). However,

another body of research suggests that intuition is not necessarily biased but “uncannily

accurate” (Khatri & Alvin, p.6). The researchers’ line of reasoning follows that the same

cognitive process that is used for valid judgments is the same one that generates the

biased ones; therefore, “intuitive synthesis suffers from biases or errors, so does rational

analysis” (Khatri & Alvin, p.6). Finally, intuition is part of all decisions. It is used in all

decisions, even decisions based on concrete facts. As the researchers note, “at the very

least, a forecaster has to use intuition in gathering and interpreting data and in deciding

which unusual future events might influence the outcome” (Goldberg, 1973, cited by

Khatri & Alvin). The researchers show through these properties that intuition is not an

irrational process because it is based on a deep understanding and rooted in years of

experience surfacing to help make quick decisions.

Khatri and Alvin state that strategic decisions are characterized by incomplete

knowledge; that decision makers cannot rely solely on formulas to solve problems, so a

deeper understanding of intuition is necessary. The authors note that intuition

should not be viewed as the opposite of quantitative analysis, nor does it mean that analysis should not

be used. Rather, “the need to understand and use intuition exists because few strategic

business decisions have the benefit of complete, accurate, and timely information”

(Khatri & Alvin, p. 8).

Khatri and Alvin surveyed senior managers of computer, banking, and utility

industries in the Northeastern United States and found that intuitive processes play a

strong role in decision making within each respective industry. The industries were

chosen based on each environment; the computer industry is least stable, the banking

industry is moderately stable, and the electric and gas companies are the most stable but

the least competitive of the three. Khatri and Alvin acknowledged the effect the size of

the organization has on the culture; small organizations tend to “use more of

informal/intuitive decision making and less of formal analysis than large organizations”

(Khatri & Alvin, p. 14).

The researchers narrowed their scope by sampling organizations that fell within a

specified sales volume range. For the scope of the study, organizations in the computer

and utility industries all had sales of over $10 million and nine banks ranged from $50

million to $350 million in assets. The researchers used both subjective and objective

indicators for measurement of performance. The researchers had a response rate of 68

percent, or 281 individuals from 221 companies.

The industry mean scores were examined using the Newman-Keuls procedure and

a hierarchical regression analysis. The results showed that the computer industry uses a

higher level of intuitive synthesis than banks and the banking industry uses a higher level

of synthesis than the utility industry. The researchers’ three indicators of intuition include

judgment, experience, and gut-feeling. Each of these varied according to industry. They

found that managers in banks and computer companies use more judgment and rely on

their previous experience more so than the utility company managers. Managers of

computer companies rely on gut-feelings significantly more than bank or utility

managers. Therefore, Khatri and Alvin found that intuition is used more in unstable

environments than in stable ones.

Due to their findings, the researchers suggest that intuition be used for strategic

decision making in less stable environments and cautiously in more stable environments.


Khatri and Alvin note that their geographic range, size, and choice of industries

were limitations to their study. The Northeast was the geographic area studied and

its economy can vary widely not only regionally, but nationally. They note that further

research should draw from large sample sizes of varying industries. Another limitation of

the study may be the researchers’ use of indicators and definition of a “stable”

environment. The researchers noted that the indicators were subjective. Without making

the indicators as objective as possible, it becomes increasingly difficult to use the

indicators as a standard for comparing data across the industry. Also, Khatri and Alvin

state that the use of intuition is more effective in an unstable environment. However,

without having a clear definition of what a “stable” vs. an “unstable” environment is, the

appropriate use of intuition for a specified environment may not be properly determined.

Structured Analytic Techniques

“The National Commission on Terrorist Attacks Upon the United States”, or the

9/11 Commission, was created to evaluate and report on the causes of the terrorist

attacks on September 11, 2001 (Grimmet, 2004). The Commission also reported on the

evidence collected by all related government agencies about what was known

surrounding the attacks, and then reported the findings, conclusions, and

recommendations to the President and Congress on what proactive measures could be

taken against terrorist threats in the future (Grimmet, 2004).

Throughout the document, there are repeated recommendations stressing the

necessity of information sharing not only throughout United States agencies but also through

international efforts. The Commission recommended information sharing procedures that

would create a trusted information network balancing security and information sharing

(Grimmet, 2004). An emphasis on improved analysis also appears throughout. A Key

Recommendation of the Joint Inquiry of the House and Senate Intelligence Committees stated

that the IC should increase the depth and quality of its domestic intelligence collection

and analysis (Grimmet, 2004). The committees also suggested an “information fusion

center” where all-source terrorism analysis could be improved in both quality and focus

(Grimmet, 2004). The committees also stated that the IC should “implement and fully

utilize data mining and other advanced analytical tools, consistent with applicable law”

(Grimmet, 2004, p. 20). By stating this, the committee is recognizing the value in using

structured analytic techniques to improve intelligence. Therefore, the use of analytic

techniques is necessary within the IC, especially when working directly with terrorism.

Folker states in Intelligence Analysis in Theater Joint Intelligence Centers: An

Experiment in Applying Structured Methods that a debate exists between unstructured

and structured analytic techniques. It is a difference in thinking of intelligence as either

an art form or a science. The researcher states that only a small number of analysts

occasionally use structured analytic techniques when working with qualitative data; most

instead rely on unstructured methods (Folker, 2000). In the context of this experiment,

structured analytic techniques are defined as “various techniques used singly or in

combination to separate and logically organize the constituent elements of a problem to

enhance analysis and decision making” (Folker, 2000, p. 5).

Folker states that advocates of unstructured methods feel that intuition is more

effective because structured analytic techniques too narrowly define the intelligence

problem and ignore other important pieces of information (Folker, 2000). Those who use

structured analytic techniques claim that the results are more comprehensive and

accurate. The methods can be applied to a broad range of issues to assist the analyst to

increase objectivity. The emphasis on structured analytic techniques is not to replace the

intuition of the analyst, but to implement a logical framework to capitalize on intuition,

experience, and subjective judgment (Folker, 2000). However, no evidence exists for

either; at the time of the experiment, Folker states that there has not been a study done to

adequately assess if the use of structured analytic techniques actually improves

qualitative analysis (2000). Folker points to the advantages of structured analytic

techniques when he states:



A structured methodology provides a demonstrable means to reach a conclusion.

Even if it can be proven that, in a given circumstance, both intuitive and scientific

approaches provide the same degree of accuracy, structured methods have

significant and unique value in that they can be easily taught to other analysts as a

way to structure and balance their analysis. It is difficult, if not impossible, to

teach an intelligence analyst how to conduct accurate intuitive analysis. Intuition

comes with experience. (2000, p. 14)

The ability of structured analytic techniques to be taught and replicated is shown to be a

clear advantage over intuition, which is learned through personal experience. Folker states

that even though structured analytic techniques have advantages, they are not used

because of time constraints, a sense of increased accountability, and because there is no

proof that they will actually improve analysis.

Analysts are faced with an ever increasing amount of qualitative data that is used

for solving intelligence problems. In order to use the data in a more objective way, Folker

designed an experiment to test the effectiveness of a structured analytic technique and its

ability to improve qualitative analysis. This was accomplished by comparing analytic

conclusions drawn from two groups; between those that used intuition and those that used

a structured analytic technique. Then the participants’ answers were scored as correct or

incorrect and compared statistically to determine which group performed better.

There were 26 total participants in this study; 13 in the control and 13 in the

experiment group. The low participation level was taken into account and Fisher’s Exact

Probability Test was used to determine “statistical significance for the hypotheses and for

the influence of the controlled factors (rank, experience, education, and branch of

service)” (Folker, 2000, p. 16). The participants completed a questionnaire to provide



demographic information and identify any prior training and experience they had. Both groups were

given a map, the same two scenarios, and an answer sheet. All were given one hour to

complete the first scenario and 30 minutes to complete the second scenario. Both

scenarios were built using extensive testing and were based on actual events. The results

indicated that the use of structured analytic techniques improved qualitative analysis and

that the controlled factors did not seem to affect the results.
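As an illustrative sketch only, and not Folker's actual data, Fisher's Exact Test on a 2x2 table of correct versus incorrect answers could be computed as follows; the counts shown are hypothetical.

# Illustrative sketch only: Fisher's Exact Test on a hypothetical 2x2 table of
# correct vs. incorrect answers for a 13-person intuition group and a 13-person
# structured-method group. The counts are invented for demonstration.
from scipy.stats import fisher_exact

table = [[5, 8],   # hypothetical intuition group: correct, incorrect
         [9, 4]]   # hypothetical structured-method group: correct, incorrect

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print("odds ratio = %.2f, p = %.3f" % (odds_ratio, p_value))
# Fisher's exact test is appropriate here because the cell counts are too small
# for a reliable chi-square approximation.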

Folker noted that time constraints for learning the methodology were a limiting

factor. Folker allotted one hour for teaching Analysis of Competing Hypotheses (ACH)

and stated that the complexity of the scenarios may have affected the results. Therefore,

for future studies, it is necessary either to use analysts with varying degrees of experience who

are familiar with the methodology, or to allot more time for learning it.

In “The Evolution of Structured Analytic Techniques,” Heuer states that

structured analytic techniques are “enablers of collaboration”; that the techniques are the

process by which effective collaboration occurs (Heuer, 2009). Structured analytic

techniques and collaboration should be developed together for the success of both. Heuer

states that there is a need for evaluating the effectiveness of structured techniques beyond

just experience; each needs to be tested.

Heuer states that testing the accuracy of a methodology is difficult because it

assumes that the accuracy of intelligence can be measured. Also, testing for accuracy is

problematic when most intelligence questions are probabilistic (2009). Heuer notes that

this would require a large number of experiments to acquire a distinguishable comparison

between the accuracy of one technique over another. Further, the number of participants

that would be needed for these would be unrealistically high. Heuer states that the most

feasible and effective approach for evaluating a technique is to look at the purpose for

which the technique is being used and then determine whether it achieves that purpose or

if there is a better way to achieve that purpose; simple empirical experiments can be

created to test these.

Rather than a debate between the use of unstructured versus structured analytic

techniques, it is possible to view the two sides as existing on either end of a spectrum, with both being necessary and useful for decision making and forecasting. The emphasis

of structured analytic techniques is not to replace intuition but to create support for the

analyst’s intuition, experience, and subjective judgment; analytic tools increase

objectivity. As shown through the literature, intuition is especially effective for highly

experienced professionals, and analytic tools help analysts attain a higher level of critical

thinking and improve analysis on complex issues. This is necessary in intelligence,

especially when information can be deceptive or datasets can be incomplete. In sum,

structured analytic techniques can be taught and intuition cannot. Though both are

valuable, the use of structured analytic techniques can increase objectivity, especially for

the novice that may have a less developed intuition than an experienced analyst.

Collaboration

Collaboration may be the first step toward reducing confirmation bias. As

Johnston stated in Analytic Culture in the US Intelligence Community: An Ethnographic

Study, analysts often use literature searches as the initial step for assessing an intelligence

problem. However, collaboration could generate multiple hypotheses that a literature

search would miss. Though collaboration could alleviate issues with confirmation bias,

problems with group dynamics could hinder multiple hypotheses generation.

In The Wisdom of Crowds, James Surowiecki proposes the use of groups not only

to generate more ideas, but to increase the quality of the decisions made. Surowiecki

affirms that groups make far better judgments than individuals, even experts. Surowiecki

states that experts are important, but their scope is narrow and limited; there is no

evidence to support that someone can be an expert at something broad like decision

making or policy (2009). Instead, groups of cognitively diverse individuals make better

forecasts than the most skilled decision maker (Surowiecki, 2009).

Surowiecki furthers the argument that cognitively diverse groups are important by

stating that a diverse group of people with varying levels of knowledge and insight is more capable of making major decisions than "one or two people, no matter how smart

those people are” (Surowiecki, 2004, p. 31). When making decisions, it is best to have a

cognitively diverse group (Surowiecki, 2009). This is because diversity adds perspective

and it also eliminates or weakens destructive group decision making characteristics such

as overly influential members. In other words, conscientious selection helps to prevent dominant personalities from taking over the group.

Surowiecki cites James Shanteau, one of the leading thinkers on the nature of

expertise, to back up his claim. Shanteau asserts that many studies have found individual

expert judgment to be inconsistent with that of other experts in the same field of study (Surowiecki,

2009). Also, they are prone to what Meyers would call the “overconfidence bias”;

experts, like anyone else whose judgments are not calibrated, often overestimate the

likelihood that their decisions are correct. In other words, being an expert does not

necessarily mean accurate decision-making. Experts should be integrated into a group to

make them the most effective they can be.

Surowiecki uses scientific research as an example to show the effectiveness of

collaboration. He states that because scientists collaborate and openly share their data

with others, the scientific community’s knowledge continues to grow and solve complex

problems (Surowiecki, 2009). Collaboration not only improves research, it also fully

utilizes experts’ abilities. Individual judgment is not as accurate or consistent as a

cognitively diverse group. Therefore, diverse groups are needed for sound decision

making.

Wesley Shrum, in his work "Collaborationism," discusses the motivations and

purpose of collaboration. Shrum states that collaboration should not be generalized

because it occurs in many forms across a wide range of disciplines. For example, some

disciplines require collaboration while others can easily opt not to use it. Shrum questions what motivates individuals and groups to collaborate when their field does not necessarily require it. The most common motivation is

resources such as technology and funding. By collaborating, individuals are more likely

to have access to resources they need. Others are motivated by bettering their discipline

through strategic efforts. In other words, if a less established discipline collaborates with

a well established discipline, the less established discipline will gain legitimacy (Shrum,

n.d.). A third motivation is to gain information from other disciplines to solve complex

problems. Shrum (n.d.) states that cross discipline collaboration is increasing.

A major issue that Shrum sees in current collaboration is that it is often

technology-based. Collaboration is designed to “produce knowledge later rather than

now" (Shrum, p. 19). The collaboration is not being used to solve problems or produce

results at that present moment, but to create things such as databases to be used at a later

point in time. Shrum states that the knowledge produced later may not even involve the

same individuals that originally collaborated to create it. This is a problem because the

farther all disciplines move away from the “interactivity of collaboration…the farther we

move from the essential phenomenon that the idea of collaboration centrally entails:

people working together with common objectives” (Shrum, p.19). Shrum looks at

collaboration in a realistic sense that individuals are using collaboration to get what they

need out of it as individuals and can abandon the process at any point they feel it is no

longer useful. Collaboration is being used for individual instant gratification rather than

strategic pursuits. By analyzing collaboration in its current state, Shrum is able to identify

the benefits and issues surrounding modern collaboration.

In his study, “Processing Complexity in Networks: A Study of Informal

Collaboration and its Effect on Organizational Success” Ramiro Berardo seeks to identify

how individual organizations “active in fragmented policy arenas” are able to achieve

their goals through collaborations and what motivates collaboration (Berardo, 2009, p.

521). The basis of the study is the resource exchange premise that individuals or

organizations rarely have enough resources to pursue their goals; therefore, they must

exchange resources. The more resources the individual or organization is able to acquire,

the more likely they will be able to achieve their goals. Berardo states that it is through

the expansion of connections that the individuals or organizations will be most

successful. It is not only the number of connections but more importantly, the way that

the collaborations are connected to others in the network.

Berardo studied a multi-organizational project that was addressing water-related

problems in southwest Florida. Berardo used data collected from the applicants that were part of the project; this data determined whether or not the applicant received funding.

The data also contained detailed information about the nature of work the applicant did.

Then, using this information, Berardo contacted the 92 applicants that worked on the

project through a semi-structured phone survey. The information provided through the

survey gave Berardo the names of other organizations that participated in the project as

“providers of resources that could be used to strengthen the application” (Berardo, 2009,

p. 527). The data was then put into a matrix that contained information about the

organizations and their common relationships (Berardo, 2009). It then showed the pattern

of participation of organizations in the project.

The main organization controlled 50 percent of the budget and other organizations

became a part of the project through an application process, hoping to obtain funding

from the main organization. The applicants had diverse backgrounds ranging from

financial to legal to technical expertise. Therefore, both the funder and the applicants

benefited because the funder received knowledge from the expertise of the applicants and

the applicants received funding (Berardo, 2009). Berardo explains that all involved in

this process, from the main organization to the experts, were part of an informal

collaboration and that these types of collaborations are becoming more common (2009).

The results of the study showed that a larger number of partners increases the likelihood of a project being funded and that organizations that are most active and take a leadership role are most likely to be funded. This study confirmed the resource exchange

theory that the more partners, the more resources available to improve quality. Berardo

found that the leading organization is most successful when its partners collaborate with

each other in other projects. Also, once the number of partners exceeds a certain threshold, about seven, the likelihood of getting funding for the project declines. This is because it creates

a level of unmanageable complexity for the main funding organization (Berardo, 2009).

Berardo states, “there is a limit to the benefits of engaging in collaborative activities with

more and more partners, and that limit is given by the increasing complexity that an

organization is likely to face when the partners provide large amounts of nonredundant

resources” (Berardo, 2009, p. 535).



A weakness with the study was that it looked at only one collaborative effort.

Therefore, future studies would need to confirm the results by looking at other types of

collaborations, ranging in size and areas of expertise. When considering this study in terms of collaboration within the intelligence community, the relevant factors may differ from those in this study. For example, agencies may not be collaborating to mutually benefit

because of funding. Therefore, incentives to collaborate may be different within the IC

than through nonprofit organizations or for-profit companies. The question is, then, what

is the incentive to collaborate within the IC when the resource may not be as

straightforward as funding? In other words, what would be the incentive for an expert

working in the for-profit sector to collaborate with an analyst? This may be why the

individual analyst lacks the motivation to collaborate and mandates for collaboration are

necessary in the field.

In their study, “A Structural Analysis of Collaboration between European

Research Institutes,” researchers Bart Thijs and Wolfgang Glänzel investigate the

influence the research profile has on an institute’s collaborative trends. Thijs and Glänzel

note that there is extensive research on the collaborative patterns of nations, institutes,

and individuals with most of them finding a positive correlation between collaboration

and scientific productivity (2010). The researchers aimed to provide a more micro look at

collaborative behavior; focusing less on nations or institutions as a whole, but instead

looking at the research institute and its international and domestic collaborations. The

researchers classified a research institute by its area of expertise in order to establish what

other types of research institutes it collaborated with and why. The researchers then

looked to find the group that, according to its research profile, was the most preferred

partner for collaboration.



Thijs and Glänzel used data from the Web of Science database of Thomson Reuters

and limited their scope to include only articles, letters, or reviews indexed between 2003

and 2005. The documents were classified into a subject category system, dividing them

into eight different groups: Biology (BIO), Agriculture (AGR), a group of institutions

with a multidisciplinary focus (MDS), Earth and space sciences (GSS), Technical and

natural sciences (TNS), Chemistry (CHE), General medicine (GRM) and, Specialized

medicine (SPM) (Thijs & Glänzel, 2010).

The researchers found that institutions from the multidisciplinary group are most

likely to be partners. Also, groups that are more closely related are more likely to

collaborate than the other groups. For example, biology with agriculture; technical and

nature sciences with chemistry; and general medicine with specialized medicine (Thijs &

Glänzel, 2010).

Aside from showing what the collaboration strengths are within the sciences, this

study shows that collaborations are usually the strongest within a certain field or focus.

Also, instead of the multidisciplinary group collaborating with a more specialized group,

it partners with others like itself. In other words, instead of a multidisciplinary group

seeking the expertise of a particular field for collaboration, it seeks others like itself.

Blaskovich looked into group dynamics in her research, “Exploring the Effect of

Distance: An Experimental Investigation of Virtual Collaboration, Social Loafing, and

Group Decisions.” Global businesses use technology to virtually collaborate with a

dispersed workforce. Past studies have shown that virtual groups have improved brainstorming capabilities and produce more thorough analysis (Blaskovich, 2008).

While virtual collaboration (VC) has potential benefits, it may be counterproductive,

resulting in social loafing; “the tendency for individuals to reduce their effort toward a

group task, resulting in sub-optimal outcomes” (Latane, et al., 1979, as cited by

Blaskovich, 2008). Social loafing has long been considered a contributor to poor group performance, and it is a critical problem that VC can intensify.

In her study, participants were grouped into teams and given a hypothetical

situation. They were to be management accountants responsible for the company’s

resources for information technology investments. The groups were asked to give one of

two recommendations: “(1) expend resources to invest in the internal development of a

new technology system (insource) or (2) use the resources to contract with a third-party

outsourcing company (outsource)” (Blaskovich, 2008, p. 33). The groups were given a

data set with mixed evidence as their source of information.

A total of 279 undergraduate and graduate students were placed randomly into

groups of three. The control groups worked face-to-face in a conference room and the

experiment group worked from individual computers in separate rooms; the VC group

used text-chat as their form of communication. To measure the communication of the

groups, Blaskovich had the groups continually update their recommendations as new

pieces of evidence were introduced. The recommendation pattern of the face-to-face

groups moved toward the outsourcing option regardless of the evidence order. However,

the VC groups were dependent on the order of the evidence. Therefore, Blaskovich

concluded that group recommendations were influenced by the mode and order of the

evidence introduced. The groups made their decisions and submitted them through the

designated group recorder. Then, the face-to-face group members were moved to separate

computers. The VC members logged-off of their chat session and all completed a

questionnaire about the experiment (Blaskovich, 2008).



Social loafing was recorded as being present according to time spent on the task,

the participants’ ability to recall information about the task, and self-reported evidence

about their personal effort (Blaskovich, 2008). The face-to-face groups spent an average

of 20.6 minutes on the task while the VC groups spent an average of 22.0 minutes

(Blaskovich, 2008). The accuracy score for the participants' recall ability was 8.28 items on the test for the face-to-face groups and 7.93 for the VC groups. For the self-reporting,

the VC groups perceived their efforts and level of participation to be lower than the face-

to-face groups (Blaskovich, 2008).

Blaskovich concludes that VC causes group performance to decline and that

social loafing may be a contributing factor to this. Also, the VC group decisions may be

of poorer quality than that of face-to-face groups because their judgments were

influenced by the order of evidence instead of the quality of evidence (Blaskovich, 2008).

Blaskovich’s research shows that virtual collaboration should be used cautiously if the

virtual group is making a decision or recommendation. Collaboration has shown to be

beneficial for brainstorming, especially when a diverse group of experts contribute.

However, Blaskovich’s study raises the issue of exactly what the advantages and

disadvantages of VC are. Also, applied to the IC, this study brings up the question of what level of collaboration may be appropriate virtually: whether VC is effective at brainstorming and whether decisions or recommendations made by VC participants should be considered reliable.

The Office of the Director of National Intelligence (ODNI) created the report

“United States Intelligence Community: Information Sharing Strategy” which discusses

the increased need for information sharing, especially after September 11, 2001. The

“need to know” culture that was formed during the Cold War now impedes the

Intelligence Community's ability to respond properly to terrorist threats (ODNI, 2008).

Therefore, the IC needs to move towards a “need to share” mindset; a more collaborative

approach to properly uncover the threats it now faces (ODNI, 2008). The report stresses

that “information sharing is a behavior and not a technology” (ODNI, 2008, p. 3).

Information sharing has to take place within the community; it has to happen through

effective communication and not just through the availability of new technology.

The Office of the Director of National Intelligence supports the transformation of

the IC culture to emphasize information sharing. However, it recognizes the difficulty

that would come with overhauling the entire culture and mindset of the IC. The new

environment that the ODNI proposes could include the same information, but it would

make the information available to all authorized agencies that would benefit from the

collaborative analysis (ODNI, 2008). ODNI’s vision and model stresses a “responsibility

to provide” that would promote greater collaboration in the IC and to its stakeholders.

Ultimately, the report is stating that the IC has a responsibility to improve communication

and collaboration to effectively manage new threats.

Douglas Hart and Steven Simon’s “Thinking Straight and Talking Straight:

Problems of Intelligence Analysis” discusses the need for structured arguments and

dialogues in intelligence. Hart and Simon note that the 9/11 Commission Report, the 9/11

Joint Congressional Inquiry Report, and other reports have cited that a lack of

collaboration is one of the causes for more recent intelligence failures (Hart & Simon,

2006). Hart and Simon propose that dialogues encourage analysts from different

backgrounds to develop common definitions and understandings to decrease potential

misunderstandings. Communication also encourages the exchange of different viewpoints



to reduce confirmation bias. Through the use of communication and brainstorming,

conversations evolve into critical thinking sessions for both the individual and the group:

Critical thinking can be enabled by collaboration, especially when it involves

compiling, evaluating, and combining multi-disciplinary perspectives on complex

problems. Effective collaboration; however, is possible only when analysts can

generate and evaluate alternative and competing positions, views, hypotheses and

ideas (Hart & Simon, 2006, p. 51).

The authors view collaboration as necessary and effective; however, the authors

state that documents like the National Intelligence Estimates (NIE) seem to discourage

collaboration between individuals and agencies: “Enforced consensus relegating

alternative assessments to footnotes…has been a disincentive to collaboration…in

addition, collaboration and sharing generally require extra work that competes with time

spent on individual assignments” (Hart & Simon, 2006, p. 51). Less time is being spent

on collaboration because individual assignments are the priority.

Researchers Jessica G. Turnley and Laura A. McNamara address collaboration

issues in “An Ethnographic Study of Culture and Collaborative Technology in the

Intelligence Community.” The goal of the study was to research improvements in

intelligence analysis that could be implemented through methods that effectively merged

sources and analysis through multi-agency teams. The researchers conducted their

ethnographic study at two intelligence agencies located at three different sites to address

the question: “What does collaboration mean in the analytic environment, and what is the

role of technology in supporting collaboration” (Turnley & McNamara, p. 2).

The research was conducted through interviews of analysts and through group

and daily work routine observations. The researchers visited three sites. Two sites were

within the same agency which the researchers called Intelligence Agency One (IA-1).

This agency focused on strategic intelligence. The third site was an agency that

developed software tools for tactical intelligence. The researchers called this site

Intelligence Agency Two (IA-2). One researcher spent five and one-half weeks observing

and interviewing analysts at IA-1. She collected data through 30 interviews and 40

hours of observation. At IA-2, the other researcher spent 20 hours becoming familiar with

the site and organization and spent 40 hours interviewing and observing operations.

In the sites the researchers studied, the word “collaboration” is intrinsically tied to

information, hierarchy, and power in the IC. Therefore, the analyst’s ability to collaborate

was only effective if the collaboration did not have a negative impact on the investments

of the individual within the organization. The structure of IA-1 was noted to be

hierarchical with each analyst given a specific area of responsibility and subject focus.

The researcher noted collaboration issues at IA-1 because of the hierarchical structure.

Participants’ responses about issues with collaboration were placed into these five

categories: introversion, a feeling of ownership over subject matter, privilege of

individual effort over group effort for rewards, organizational knowledge, and over-

classification of information.

IA-2, the site responsible for information management technology to produce

tactical intelligence, also had issues with collaboration but for a different reason. The

issues at this site stemmed from multiple companies working together, but having

different agendas for their participation in the contract. An even bigger issue was defining

ownership of the technology used for collaboration. At this site, the analysts could call up

an inquiry and multiple resources from multiple sensors are displayed on a single

platform. This is an issue for the organization because it raises the question of who owns, controls, and

manages the data. Due to power struggles or fear of diminished confidentiality of

sources, certain sensors may refuse to give up necessary information and the

collaboration can be stalled or stopped.

This research was effective at showing how collaboration may already be used,

but that the organization’s culture greatly affects the use of it. The limitation of this study

was the lack of facilities the researchers were able to visit. Because they only visited

three sites, two of which operated under the same agency, they had a less comprehensive

view of the IC and its use of collaboration.

In “Small Group Processes for Intelligence Analysis,” Heuer discusses the role of

collaboration in the production of quality intelligence products and the elements needed

for successful collaboration. Heuer states that intelligence analysis is requiring more of a

group effort rather than an individual effort (Heuer, 2008). This is because intelligence

products need input from multiple agencies and from subject matter experts outside of

their field. Collaboration is also encouraged within agencies that have multiple locations

and can work online together to save time and travel costs.

However, there are issues within groups that can be counterproductive.

Individuals can be late to the group’s sessions or may be unprepared. The groups may be

dominated by certain types of individuals, which prevents others from speaking up and limits the full generation of ideas. Also, the position that an individual holds in the agency can affect the group's performance. For example, top-level professionals are often

less likely to express dissent for fear of retribution or even embarrassment (Heuer, 2008).

Group dynamics play an important role in the effectiveness of collaboration.

To avoid these issues or to mitigate them, Heuer suggests the use of small, diverse

groups of analysts that openly share ideas and an increased use of structured analytic

techniques (2008). Using analysts from multiple agencies will broaden perspectives

“leading to more rigorous analysis” (Heuer, 2008, p. 16). Structured analytic techniques

can give structure to individual thoughts and the interaction between analysts. By using

structured techniques, analysts are providing group members with a written example of

their thought process; this can then be compared and critiqued by the other members

(Heuer, 2008). Heuer states that each step of the structured analytic technique process

induces more divergent and novel discussion than just collaboration alone (2008).

Analysts should not only use tools but should also collaborate with other analysts

or subject matter experts to make sure personal, individual cognitive bias is not affecting

the product and to generate multiple hypotheses. When describing the need for

collaboration in the form of subject matter experts, Heuer stated that expertise is needed

because the methodology itself does not solve the problem. The combination of expertise

and methodology “is always needed” because it is the methodology that guides the

expertise (R. Heuer, personal communication, June 2010). Collaboration also allows

those with diverse backgrounds from various fields to apply their expertise to the

intelligence problem. In other words, brainstorming can identify more hypotheses than a literature search alone could provide. With collaboration,

communication and dialogue evolve into critical thinking for both the individual and the

group.

The use of collaboration and structured analytic techniques has gained the attention of the IC as it considers solutions for minimizing the frequency of intelligence failures. Also, it is necessary for individual analysts and organizations to be aware of the presence of cognitive bias and to take safeguards against its negative effects on intelligence analysis. There exists little evidence of the effectiveness of structured analytic techniques or of intuition; both need to be empirically tested to establish the validity of the methods for managing complex situations within the IC.

Complexity Manager

According to Richards Heuer, the origin of his idea for Complexity Manager goes

back “over 30 years ago” when a future forecasting technique called Cross-Impact

Analysis was tested at the CIA (R. Heuer, personal communication, September 2010).

Heuer recalls taking a group of analysts through the development of a cross-impact

matrix, used in Complexity Manager, and was inspired by the technique’s effectiveness

as “a learning experience for all the analysts to develop a group understanding of each of

the relationships” (R. Heuer, personal communication, September, 2010).

Taking this experience, along with a broad understanding of how increasingly

complex the world has become, Heuer looked to create a technique that dealt with this

new level of complexity while still allowing ease of use to the analyst. Heuer states that

research organizations often deal with complexity by developing complex models that are

expensive and time-consuming (R. Heuer, personal communication, September 2010).

However, much of the benefit from such modeling comes in the early stages when

identifying the variables, rating their level of significance, and understanding the

interactions between each. As Heuer states: “that [variable identification and interaction]

is easy to do and can be sufficient enough to generate new insights, and that is what I

tried to achieve with Complexity Manager” (R. Heuer, personal communication,

September, 2010). By using Complexity Manager, the analyst is breaking down the

complex system into its smallest component parts before moving forward to analyze the

entire system. By doing so, the analyst can understand potential outcomes and

unintended side effects of a potential course of action (R. Heuer, personal

communication, June 2010).

Complexity Manager is a structured analytic technique that also makes use of

collaboration to brainstorm multiple hypotheses for a complex issue. Therefore, if proven

effective, Complexity Manager would help to further decrease an analyst’s contributions

to intelligence failures by limiting the influence of cognitive bias. That bias would be alleviated through the use of collaboration and through the process of using this structured technique.

Complexity Manager combines the advantages of both structured analytic

techniques and collaboration through small teams of subject matter experts. Complexity

Manager “is a simplified approach to understanding complex systems—the kind of

systems in which many variables are related to each other and may be changing over

time” (Heuer & Pherson, 2010, p. 269). Complexity Manager, as a decision support tool,

helps to organize all options and relevant variables in one matrix. It also provides an

analyst with a framework for understanding and forecasting decisions that a leader,

group, or country is likely to make as well as their goals and preferences. Complexity

Manager is most useful at helping the analyst to identify the variables that are most

significantly influencing a decision. As Heuer states, Complexity Manager “enables

analysts to find a best possible answer by organizing in a systematic manner the jumble

of information about many relevant variables” (Heuer & Pherson, 2010, p. 273).

Complexity Manager is an eight-step process. The following are Richards Heuer's directions for use of the structured analytic technique:

1. Define the problem
2. Identify and list relevant variables
3. Create a Cross-Impact Matrix
4. Assess the interaction between each pair of variables
5. Analyze direct impacts
6. Analyze loops and indirect impacts
7. Draw conclusions
8. Conduct an opportunity analysis (Heuer & Pherson, 2010, p. 273-277).

For a more detailed description of each of the eight steps, consult Heuer and Pherson’s

Structured Analytic Techniques for Intelligence Analysis. Below, Figure 2.1 shows the Cross-Impact Matrix that is used for recording the nature of the relationships between all the variables (Heuer & Pherson, 2010, p. 273). Heuer recognizes that the Cross-Impact Matrix includes the same initial steps that are required to build a computer model or simulation (Heuer & Pherson, 2010, p. 272). Therefore, when analysts do not have the time or budget to build a social network analysis or use the Systems Dynamics approach, they can gain the same benefits using Complexity Manager through:

“identification of the relevant variables or actions, analysis of all the interactions between

them, and assignment of rough weights or other values to each variables or interaction”

(Heuer & Pherson, 2010, p. 272).

Figure 2.1. The Cross-Impact Matrix is used to assess interactions between variables.
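To illustrate the mechanics of Steps 2 through 5, the following is a minimal sketch of a cross-impact matrix in Python. The variable names and impact scores are hypothetical and are intended only to show how pairwise ratings can be totaled to surface the variables that most strongly drive, or are driven by, the rest of the system.

# Minimal sketch of a cross-impact matrix (hypothetical variables and scores).
# Each cell rates the impact of the row variable on the column variable:
# 2 = strong impact, 1 = weak impact, 0 = no impact.
variables = ["Security", "Economy", "International pressure", "Public opinion"]

impact = {
    "Security":               {"Economy": 2, "International pressure": 1, "Public opinion": 2},
    "Economy":                {"Security": 1, "International pressure": 0, "Public opinion": 2},
    "International pressure": {"Security": 2, "Economy": 1, "Public opinion": 1},
    "Public opinion":         {"Security": 1, "Economy": 0, "International pressure": 1},
}

# Row totals approximate how strongly each variable drives the rest of the system;
# column totals approximate how strongly each variable is driven by the others.
for var in variables:
    drives = sum(impact[var].values())
    driven = sum(impact[other].get(var, 0) for other in variables if other != var)
    print(f"{var}: drives others = {drives}, driven by others = {driven}")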

In theory, Complexity Manager is able to mitigate cognitive bias through the use

of both small group collaboration and a structured analytic technique. However, there is no

research proving the effectiveness of this technique, nor is there literature on it being

used in the field. When used in the appropriate context, Complexity Manager may be one

effective tool that may be used to reduce the risks of intelligence failures caused by

cognitive bias. However, unless tested or used in the field by analysts, this will not be

known. Therefore, it is necessary to test the effectiveness of Complexity Manager

through teams of analysts.

Hypothesis

Assessing Complexity Manager as an intelligence analysis tool, I have developed

four testable hypotheses. My first hypothesis is that the groups using Complexity

Manager will have a higher level of confidence in their forecast than those using intuition

alone. My second hypothesis is that analysts using Complexity Manager will produce

higher quality variables than those using intuition alone. My third hypothesis is that the

groups using Complexity Manager will identify more variables than those that used

intuition alone. My fourth hypothesis is that those using Complexity Manager will

produce more accurate forecasts than those that used intuition alone.

CHAPTER 3: METHODOLOGY

The purpose of this study is to assess the forecasting accuracy and effectiveness

of Complexity Manager, a structured technique. For the validity of structured techniques

and for all tools and techniques used by professionals in the Intelligence Community, it is

necessary to evaluate effectiveness through multiple experiments. This study is one of

many evaluations needed for that purpose.

The following research questions were addressed in this study:

1. Do analysts that use Complexity Manager have a higher level of confidence than those that use intuition alone?
2. Do analysts that use Complexity Manager assess higher quality variables before delivering their forecast than those that use intuition alone?
3. Do analysts that use Complexity Manager assess a higher number of variables before delivering their forecast than those that use intuition alone?
4. Do analysts that use Complexity Manager produce a more accurate forecast than those that use intuition alone?

The study was designed to compare a structured analytic technique, Complexity

Manager, to intuition alone when forecasting. If Complexity Manager is effective,

advantages of its use would be shown through the data collected. Data was collected by

standardized questionnaires and forms created by the researcher. Pre- and post-intervention data were collected and analyzed through statistics and descriptive analysis

of results between the control and experiment group.



Setting

The research was conducted at Mercyhurst College in Erie, Pennsylvania. The

researcher recruited Intelligence Studies students through classroom visits across the

campus and the intervention was completed at the computer labs at the Intelligence

Studies building. Classroom visits were conducted two weeks before the intervention to

allow for the students to plan for the intervention and to maximize the number of sign-

ups for the researcher. The intervention was conducted in the computer labs at the

Intelligence Studies building because the department supports such endeavors and the

researcher could reserve the two computer labs exclusively for the purpose of the study.

This controlled environment allowed for each student to utilize a computer for pre-

intervention collection. Both computer labs were equipped with a projector so the

researcher could present a tutorial to the experiment group on how to use Complexity Manager.

Participants

To ensure that this study was conducted in an ethical manner, the researcher

submitted the study to Mercyhurst College’s Institutional Review Board and obtained

permission before starting the study. A copy of the consent form and all related

documents can be found in the appendix of this thesis.

The participants were selected through purposive sampling to meet the needs and

criteria of the study. The participants were restricted to undergraduate and graduate level

Intelligence Studies students only because of their understanding of the Intelligence

process and the need for analysts to evaluate Complexity Manager. Freshman Intelligence Studies students were able to participate even though they had very limited experience in the

field because the researcher created groups with each freshman paired with an

undergraduate upperclassman or second year graduate student. This maximized the

sample size for the study and varied the level of expertise within each group.

Figure 3.1 shows the distribution of students according to their academic class. There were 56 females and 106 males for a total of 162 participants. Freshman through second year graduate students participated in the study: 43 freshmen; 37 sophomores; 28 juniors; 18 seniors; 26 first year graduate students; and 10 second year graduate students.

Figure 3.1. Number of participants per academic class.

When completing initial sign-ups for the experiment, the researcher requested that the participants disclose information about their education. For the undergraduates, because they were all intelligence majors, the researcher requested that the participants list any minors they may have. For the

graduate students, the researcher requested that these participants list their undergraduate

major and minor if applicable. This was done to show the range of expertise of the

participants. Not all undergraduate intelligence students had minors; 28 of the 126 had a

declared minor. Undergraduate intelligence students that participated in the study had the

following minors: Russian Studies, Business Intelligence, Business Administration,

History, Criminal Justice, Criminal Psychology, Philosophy, Psychology, Political

Science, Spanish, Computer Systems, and Asian Studies.

The 36 graduate students that participated in the study disclosed the following

undergraduate majors: Intelligence, Social and Political Thought, History, English,

Political Science, Spanish, International Affairs, Russian Studies, Psychology,

Telecommunication and Business, International Business, Security and Intelligence,

Biochemistry, Mathematics, French, Forensics, Sociology, Social Work, and Criminal

Justice. 17 of the 36 graduate students had the following minors at the undergraduate

level: Science and Technology, International Affairs, Life Sciences, Political Science,

Mandarin, Spanish, Russian Studies, French, Middle Eastern Studies, Asian Pacific

Studies, Economics, Public Policy, and East Asian Studies.

Intervention and Materials

The independent variable for this intervention was the experiment group’s use of

Complexity Manager, and the dependent variables were the accuracy of the groups' forecasts and the number and quality of the variables that the groups produced.

The researcher first consulted Richards Heuer and Randy Pherson’s book

Structured Analytic Techniques for Intelligence Analysis because it contained the step-by-

step procedure for using Complexity Manager. Then the researcher created forms based

on the procedure and on further instruction obtained through email correspondence with Richards Heuer. The methodology form was created to replicate the step-by-step
Richards Heuer. The methodology form was created to replicate the step-by-step

procedure while allowing for participants’ maximum understanding of the methodology

in a short period of time; each step of the procedure and instructions guiding the

participant were put on separate pages (See Appendix F). Each form created was used to

collect data directed towards the research questions and all forms were approved by the

Institutional Review Board at Mercyhurst College.

Measurement Instruments

The researcher collected data through the use of a post-intervention questionnaire,

through assessment of the groups’ forecasting accuracy, and by the number and quality of

the variables documented by the groups.

Questionnaire Answers

The questionnaire for the control group contained nine questions. Four of the

questions, Questions 1-4, asked for the amount of time and the number of variables that

the individual contributed compared to the amount of time and the number of variables

that the group produced. Questions 1-4 asked for quantitative amounts that could be

compared to other individuals and other groups. Four questions, Questions 5-8, asked the

participant to rank their knowledge of the intelligence issue, the clarity of instructions,

the availability of open source information, and the helpfulness of working in teams for

the assigned task. Questions 5-8 asked the participants to rank their experience on a scale

of one to five. The final question asked for general comments about the experiment.

The questionnaire for the experiment group consisted of thirteen questions.

Questions 1-9 were identical to the control group questions. Questions 10-13 were

specific to the use of Complexity Manager including: the usefulness of Complexity

Manager for assessing significant variables, understanding of Complexity Manager

before the experiment, understanding of Complexity Manager after the experiment, and if

the participant would use Complexity Manager for future tasks. All questions allowed for

space for the participant to comment further if they wanted to do so. (See Appendix H

and I for both questionnaires).

Forecasting Accuracy

All participants were tasked with forecasting whether the vote for the Sudan

Referendum set for January 9, 2011, would occur as scheduled or if it would be delayed.

The use of an actual event allowed for definite results to compare the groups’ forecasts

against. (See Appendix G for the forecasting worksheet.)

Number of Variables

Along with forecasting if the Sudan Referendum would occur on the set date, the

participants were also tasked with identifying the variables that were most influential for

deciding the course of the Sudan Referendum. The researcher calculated the number of

variables that the control group produced compared to the experiment group to assess if

Complexity Manager aided in the production of an increased number of variables

considered.

Quality of Variables

The quality of the variables recorded by the control group was qualitatively

compared to the experiment group. The researcher assessed quality by visually

comparing the thoroughness and comprehensiveness of the control versus the experiment

groups’ variables.

Data Collection and Procedures

Pre-Intervention

From October 11, 2010, to October 19, 2010, the researcher visited eleven

Intelligence Studies classes. Recruitment occurred at the beginning of the class period.

The researcher handed out sign-up sheets requesting general information that included:

name, email address, undergraduate minor if applicable, and a ranking for preferred days

to participate in the study for November 1, 2010, to November 4, 2010. The form also

requested graduate students to include their undergraduate major and minor, if applicable.

Also, because many of the second year graduate students did not have a class during the

week of recruitment, emails requesting participation for the experiment were sent to only

second year graduate students. By October 19, 2010, 239 participants had volunteered to

participate. All undergraduate and first year graduate professors offered extra credit to

those students that participated in the study.

The researcher then entered all the sign-up data into a spreadsheet and organized

the participants into one of the four dates, November 1 through 4, 2010, with nearly all

the participants receiving their first-ranked choice. Once the participants were organized

into days, the researcher then organized the participants into groups; each group had three

members. Each group had at least one freshman. On October 25,

2010, the researcher emailed the participants to let them know their assigned date and

time. From October 25, 2010, to October 31, 2010, participants that were unable to

participate emailed the researcher; at this point, 17 participants withdrew from the study.

From November 1 to November 4, 2010, the researcher then emailed the participants the

morning of their assigned date and time to remind them that the experiment was to occur

that evening.

Intervention

At the beginning of the intervention, participants were asked to sit with their

assigned group. At the front of the room was a list of all the participants organized into

groups of three and four. The groups all had a number and the participants were to sit at

the computers with corresponding numbers. All documentation the participants would

need was placed at the computers before the intervention began. First, after all

participants were seated, they signed a consent form. After all participants completed

this, the researcher gave instructions for the intervention. Both documents are located in

Appendix C and D. After all forms that would be used for the intervention were

explained, the researcher then addressed the participants who had group members

missing. Those that did not have a full group of three were asked to come to the front of

the room so they could be moved into another group. This instruction and reconfiguration

of groups took 10 minutes.

For the control group, the next step was to begin collection. The groups were

given a list of possible sources they could use to begin their collection process and the

groups independently divided the workload. Please see Appendix E for this document.

After an hour of collection, the groups reconvened to brainstorm possible variables and to

give their forecast on a Forecasting Answer Sheet the researcher created. Please see

Appendix G for this document.

Before the experiment group began collection, the researcher gave a brief

PowerPoint tutorial on how to use Complexity Manager. The researcher then described a

packet that was created for the groups to work through the methodology step-by-step.

The participants were also given the directions as written by Richards Heuer and Randy

Pherson in their book, Structured Analytic Techniques for Intelligence Analysis. This

tutorial and explanation took 10 minutes. The groups were then given an hour for

collection and given the same list of possible sources as the control group. After an hour

of collection, the groups reconvened to brainstorm possible variables using Complexity

Manager. The experiment group participants then gave their forecast on the Forecasting

Answer Sheet the researcher created. Please see Appendix F for the methodology packet.

All participants were given two and one half hours to complete the experiment.

Post-Intervention

The post-intervention period included completing the questionnaire described in

the Measurement Instruments section. The students were also given a debriefing

statement describing the purpose of the experiment. Please see Appendix J for the

debriefing statement. On November 4, 2010, the researcher emailed the names of all the

students that participated in the study to the professors that offered extra credit.

Data Analysis

Descriptive and inferential statistics were used for data analysis of the survey

responses, group analytical confidence, group source reliability, and the number of

variables both the control and experiment group considered. The data was subdivided for

analysis purposes and Statistical Package for the Social Sciences (SPSS) software was

used to identify the mean and standard deviation for the control and experiment group.

An independent sample t test was used to compare the mean scores and to identify any

significant differences between the control and experiment data sets’ mean scores. The

survey questions comparing the control and experiment group are: individual amount of

time spent working in the study; group amount of time spent working in the study;

previous knowledge of the Sudan Referendum before beginning the study; clarity of

instructions; availability of open source materials; and how helpful it was to work in

teams. The variable categories compared between the control and experiment groups are: economic, social, political, geographic, military, and technology, as well as the total number of variables for each group. The quality of the variables was analyzed descriptively and assessed for content.
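The independent sample t test described above was run in SPSS. The sketch below shows an equivalent comparison in Python; the minute counts are hypothetical and are not the study's raw data, but the logic mirrors the comparison of control and experiment group means.

# Hypothetical illustration of the independent-samples t test used to compare
# control and experiment group means (the actual analysis was run in SPSS).
from scipy import stats

control_minutes = [35, 40, 30, 45, 50, 38, 42, 36]      # hypothetical values
experiment_minutes = [70, 85, 60, 90, 75, 80, 65, 72]   # hypothetical values

t_stat, p_value = stats.ttest_ind(control_minutes, experiment_minutes)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # p < 0.05 indicates a significant difference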

CHAPTER 4: RESULTS

The results will be presented in order of reference from the Methods section of

this study: survey responses, group analytical confidence, group source reliability, the

number of variables the control and experiment groups considered, and the forecasting

accuracy of both groups. This will be followed by the descriptive analysis of the quality

of variables. Please see Appendix K for complete SPSS data.

Survey Responses

Surveys were distributed to each individual after their group completed and

returned their forecasting answer sheet to the researcher.

Time

Surveys asked each individual to state the approximate amount of time they spent

working individually and the amount of time that they spent working with their groups.

80 control group members and 65 experiment group members answered the survey

question regarding the amount of time they each spent working individually. The control

group’s individual amount of time spent ranged from 20 minutes to 110 minutes. The

experiment group’s individual amount of time ranged from 15 minutes to 120 minutes.

Using SPSS software, the results showed that there is no difference between the control

and experiment group for the individual amount of time spent working, t (142) = -.797, p

(.455) > (α = 0.05).

80 control group members and 67 experiment group members answered the

question regarding the amount of time they spent working as a group. The control group

amount of time spent working together ranged from 5 to 90 minutes. The experiment

group amount of time spent working together ranged from 25 to 150 minutes. Using

SPSS software, the results showed that there is a difference between the control and

experiment group for the amount of time spent working together, t (145) = -7.71, p

(0.00) ˂ (α = 0.05). The experiment group had a greater mean (M = 74.1045 minutes, SD

= 30.30058) than the control group (M = 40.4375 minutes, SD = 20.71802). The

experiment group spent more time working in their groups than the control group did.

Knowledge of Sudan Referendum

To gauge the understanding of the subject matter used for the intervention, the

researcher asked the students to state their knowledge of the Sudan Referendum prior to

beginning the study. The students were given a scale ranging from 1 to 5 with 1

indicating that the individual had little knowledge of the Sudan Referendum and 5

indicating that the individual had great knowledge about the Sudan Referendum.

83 control group members and 67 experiment group members answered the

survey question regarding their prior knowledge of the Sudan Referendum. Using SPSS

software, the results showed that there was no difference between the control and

experiment group's knowledge of the Sudan Referendum, t (115.143) = -1.699, p (0.092) >

(α = 0.05). The experiment group had a slightly greater mean (M = 1.7463, SD =

1.17219) than the control group (M = 1.4578, SD = .83083).

Clarity of Instructions

To gauge the participants’ perception of the clarity of the researcher’s

instructions, the researcher asked the students to rate it on a scale of 1 to 5. 1 indicated

little clarity and 5 indicated that the directions were entirely clear.

83 control group members and 67 experiment group members answered the

survey question regarding the clarity of instruction. Using SPSS software, the results

showed that there is a difference between the control and experiment group’s perception

of the clarity of instructions provided by the researcher, t (140.859) = 6.098, p (0.000) <

(α = 0.05). The control group had a greater mean (M = 4.3614, SD = .77426) than the

experiment group (M = 3.5821, SD = .78140). The control group perceived the

instructions to be clearer compared to the experiment group. This is likely due to the

differences in the directions between the control group and the experiment group. The

control group had more straightforward instructions; collaborate with the team to come

up with a forecast. The experiment group’s task was more ambiguous with the added

instruction of learning and using Complexity Manager. Though the process of the experiment was explained to both groups, the experiment group may have perceived the instructions to be less clear because of the added complexity of learning and applying a structured analytic technique.

Open Source Availability

To gauge the participants’ perception of the availability of open source

information regarding the Sudan Referendum, the researcher asked the students to rate it

on a scale of 1 to 5. 1 indicated little availability and 5 indicated an abundance of open

source information regarding the Sudan Referendum.

83 control group members and 67 experiment group members answered the

survey question regarding the availability of open source materials. Using SPSS software,

the results showed that there is no difference between the control and experiment group’s

perception of the availability of open source materials regarding the Sudan Referendum,

t (138.980) = -0.914, p (0.3620) > (α = 0.05). The experiment group had a slightly greater mean (M = 4.2985, SD = .79801) than the control group (M = 4.1807, SD = .76739).

Team Helpfulness

Individuals were placed into groups of 3 or 4 to complete a team forecast. To

gauge the participants’ perception of how helpful it was to work in teams for the study,

the researcher asked the students to rate it on a scale of 1 to 5. 1 indicated that working in

teams was not helpful and 5 indicated that it was very helpful to work in teams.

83 control group members and 67 experiment group members answered the

survey question regarding the helpfulness of teamwork. Using SPSS software, the results

showed that there is no difference between the control and experiment group’s perception

of the helpfulness of working in teams for this study, t (115.648) = 1.175, p (0.242) > (α

= 0.05). The control group had a slightly greater mean (M = 4.4819, SD = .70471)

than the experiment group (M = 4.3134, SD = .98794). Both the control group and the

experiment group found that working in a team was helpful.

Initially, it would seem that those that use a structured analytic technique would

value teamwork more because it consciously facilitates collaboration. However, the

academic major may have overshadowed this and played a larger role in the participants’

perception of team helpfulness. The Intelligence Studies major at Mercyhurst College

values and draws heavily on the use of groups to facilitate learning and collaboration.

Therefore, all students likely came into the experiment with the mindset that teamwork

adds value and validity to the forecast. Another factor when considering the shared

perception of team helpfulness for both the control and experiment groups is the nature of

the task. The amount of learning that had to be done would have been very difficult for

one person to do in a two and one half hour timeframe. Therefore, a team would likely be

a welcome solution to the workload regardless of whether a structured analytic technique was used. A third factor is the varied level of individual experience with analysis. 53% of

the participants in the control group were freshmen or sophomores and 39% of

participants in the experiment group were freshmen or sophomores. Collectively,

freshmen and sophomores accounted for 46% of the total participants. Therefore it is

likely that many of the freshmen and sophomores valued working on a team with more

experienced upperclassmen.

Group Analytic Confidence

On the group forecasting answer sheet, the researcher requested that the groups

give their analytic confidence for their forecast regarding the Sudan Referendum. The

participants were to gauge their confidence with “High” being the most confident and

“Low” being the least confident.

24 control groups and 23 experiment groups gauged their analytic confidence.

Normality assumptions were not satisfied because the sample size was small, less than

30, so the Mann-Whitney test was used. The results showed that there is no difference

between the control and experiment groups’ analytic confidence, p (0.458) > (α = 0.05).

The implications of this finding will be explored in more detail in the Conclusions

chapter.
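
The same nonparametric comparison can be sketched outside SPSS by coding the ordinal

confidence ratings as 1 = Low, 2 = Medium, and 3 = High and applying the Mann-Whitney test;

the ratings below are illustrative only, not the groups' actual responses, and the SciPy

library is assumed to be available.

    # Hypothetical illustration only: confidence ratings coded 1=Low, 2=Medium, 3=High.
    from scipy.stats import mannwhitneyu

    control_confidence = [2, 2, 3, 1, 2, 2, 3, 2]      # one rating per control group
    experiment_confidence = [2, 1, 2, 2, 3, 1, 2, 2]   # one rating per experiment group

    # Two-sided Mann-Whitney U test; suitable for small, ordinal samples
    # where normality cannot be assumed.
    u_stat, p_value = mannwhitneyu(control_confidence, experiment_confidence,
                                   alternative="two-sided")

    print(f"U = {u_stat:.1f}, p = {p_value:.3f}")
    # p > 0.05 indicates no detectable difference between the groups' ratings.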

Group Source Reliability

On the group forecasting answer sheet, the researcher requested that the groups

give their source reliability for their forecast regarding the Sudan Referendum. The

participants were to gauge their confidence with “High” being the most confident in the

sources used for forecasting and “Low” being the least confident.

24 control groups and 23 experiment groups gauged their source reliability. Normality

assumptions were not satisfied because the sample size was small, less than 30, so the

Mann-Whitney test was used. The results showed that there is no difference between the

control and experiment groups’ source reliability, p (0.914) > (α = 0.05). Figure 4.1

shows the majority of the control and experiment groups had medium reliability in

sources. No group indicated that they had low source reliability.

Figure 4.1. Source reliability per group.

Variables

Individuals were placed into groups of three or four, 24 control groups and 23

experimental groups, and were asked to give a team forecast that included a list of

variables that were used for considering their group forecast. The researcher created

categories of variables for the groups’ consideration that include: economic, social,

political, geographic, military, and technology. The variables were examined by the

researcher through both statistics and descriptive analysis, recognizing that it is not only

the quantity but also the quality of the variables that makes a forecast accurate.

Implications of all the variable findings will be explored in more detail in the

Conclusions chapter.

Economic Variables

Using SPSS software, the results showed that there is a difference between the

control and experiment group’s number of economic variables considered, t (34.545) =

4.476, p (0.000) < (α = 0.05). The control group had a greater mean (M = 2.9583, SD =

1.26763) than the experiment group (M = 1.6522, SD = .64728).

Social Variables

Normality was not satisfied for the experiment group, so the Mann-Whitney test

for independent samples was used. The results showed that there is a difference between

the control and experiment groups’ number of social variables considered, p (0.000) < (α

= 0.05). The experiment group produced 34% fewer variables than the control group.

Political Variables

Using SPSS software, the results showed that there is a difference between the

control and experiment group’s number of political variables considered, t (36.222) =

4.608, p (0.000) < (α = 0.05). The control group had a greater mean (M = 3.3333, SD =

1.43456) than the experiment group (M = 1.7826, SD = .79524).

Geographic Variables

Normality was not satisfied for the experiment group, so the Mann-Whitney test

for independent samples was used. The results showed that there is a difference between

the control and experiment groups’ number of geographic variables considered, p (0.000)

< (α = 0.05). The experiment group produced 48% fewer variables than the control group.

Military Variables

Using SPSS software, the results showed that there is a difference between the

control and experiment group’s number of military variables considered, t (42.888) =

7.178, p (0.000) < (α = 0.05). The control group had a greater mean (M = 2.8750, SD =

.89988) than the experiment group (M = 1.2174, SD = .67126).

Technology Variables

Using SPSS software, the results showed that there is a difference between the

control and experiment group’s number of technology variables considered, t (31.176) =

3.519, p (0.001) < (α = 0.05). The control group had a greater mean (M = 2.3750, SD =

1.43898) than the experiment group (M = 1.1739, SD = .83406).

Total Variables

Using SPSS software, the results showed that there is a difference between the

control and experiment group’s number of total variables considered, t (39.195) = 8.295,

p (0.000) < (α = 0.05). The control group had a significantly greater mean (M = 17.1250,

SD = 4.08936) than the experiment group (M = 8.8696, SD = 2.59903). Again,

implications regarding all variables considered can be found in the Conclusions chapter.

Forecasting Accuracy

24 control groups and 23 experiment groups forecasted whether the vote for the

Sudan Referendum would occur on January 9th, 2011 or if it would be delayed. On

January 9th, 2011, the voting process did begin as scheduled (Ross, 2011). 3 of the 24

control groups and 6 of the 23 experiment groups accurately forecasted the event. Using

SPSS software, it was determined that there was no statistical difference between the

control and experiment group’s ability to accurately forecast (P-value = 0.2367) > (α =

0.05). Although assumptions of normality were not satisfied due to the small sample size,

the raw data does show that twice as many experiment groups accurately forecasted the

event (see figure 4.2).

19 of the 24 control groups and 16 of the 23 experiment groups inaccurately

forecasted the event. Using SPSS software, it was determined that there was no

difference between the control and experiment group’s inaccurate forecast (P-value =

0.4505) > (α = 0.05). Three groups’ forecasts were not included in the statistical testing:

one control group and one experiment group did not give a forecast, and one control group

forecasted that the chances were even. Therefore, these three forecasts were not included

in the analysis.
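
The study does not record which SPSS procedure produced the p-value of 0.2367; a pooled

two-proportion z-test on the raw counts (6 accurate forecasts among the 23 experiment groups

versus 3 among the 24 control groups) yields approximately the same value. A minimal sketch

of that test, assuming the statsmodels library is available, is:

    # Hypothetical reconstruction of the accuracy comparison: 6 of 23 experiment groups
    # versus 3 of 24 control groups forecasted the January 9th start date correctly.
    from statsmodels.stats.proportion import proportions_ztest

    accurate = [6, 3]        # accurate forecasts: experiment group, control group
    group_totals = [23, 24]  # total groups in each condition

    # Pooled two-proportion z-test of the difference in accuracy rates.
    z_stat, p_value = proportions_ztest(accurate, group_totals, alternative="two-sided")

    print(f"z = {z_stat:.3f}, p = {p_value:.4f}")
    # With these counts the two-sided p-value is approximately 0.237, so the difference
    # is not statistically significant despite the roughly 2:1 ratio of accurate forecasts.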

Figure 4.2. Forecasting per group.

Quality of Variables

Because quantity may not reflect the quality of information, the researcher also descriptively analyzed the

written variables completed by the 24 control groups and 23 experiment groups. Quality,

according to Merriam-Webster’s Collegiate Dictionary, is defined as “a degree of



excellence, superiority in kind” (n.d.). In Structured Analytic Techniques for Intelligence

Analysts, Heuer and Pherson discuss a three-step approach to the evaluation of structured

analytic techniques. In this evaluation, they note that quality of analysis is not restricted

to just accuracy. Heuer and Pherson suggest that quality of analysis is measured by

“clarity of presentation, transparency in how the conclusion was reached, and

construction of an audit trail for subsequent review” (2010, p. 317). Considering the

definition of quality and Heuer and Pherson’s measure of quality of analysis, the

researcher defined quality variables as “variables that are superior as shown through

clarity of presentation and transparency in how conclusions were reached.”

The researcher did not include “construction of an audit trail” because both the

control and the experiment group were asked to write out the variables considered.

Therefore, this instruction required both groups to leave an audit trail of their variables

and significant findings. Using the above definition of quality, the researcher found that

quality variables were presented in two ways: completeness of the description and

specificity.

Completeness of the Description

Both the control and the experiment groups consistently cited similar variables

for consideration when forecasting. Both groups consistently spoke of border disputes,

issues involving oil rights, and ethnic tensions. However, the teams in the control group

routinely used full sentences while only one team in the experiment group used full

sentences. Though this does not increase the validity of the data, it does show the

completeness of the team’s thought; it showed clarity of presentation. The teams that

used complete sentences were also able to show cause and effect. Therefore, the ability to

show cause and effect allowed for transparency in how the analysts arrived at their

conclusions.

For example, one team in the control group writes, “Southern Sudan’s economy is

mostly comprised from oil revenue that it receives from the North. However, because the

north has ceased paying its share of oil revenues in foreign currency, turmoil internally

will likely result.” One team in the experiment group writes on the same topic, “Share of

oil revenues.” Both groups are stating that oil revenues are a variable to consider when

forecasting whether the Sudan Referendum would occur on the date scheduled, but the

team in the control group conveyed why oil revenues would affect the possibility of

delay.

Specificity

The completeness of the description allowed for more specific variables to be

considered. The teams in the control group more frequently cited specific pieces of

evidence for consideration while the experiment group used broader concepts for

consideration. For example, one team in the control group writes, “The Northern

government has control over the TV and radio signals, and only allows broadcasts that

are in line with their policies.” One team in the experiment group writes, “Low tech

capabilities affect other key variables.” The team in the control group is citing specifics;

the Northern government has control over certain technologies in the country. The team

in the experiment group is stating that a low level of technical capability in Sudan affects

other variables. In this control group example, as in most of the control group responses,

the team is not only stating a specific piece of evidence but also showing the split

between north and south Sudan, the reason for the Sudan Referendum. In the experiment

group example, the team is stating that there is a connection

between a low level of technology and how that affects the other variables; however, this

broad generalization does not allow for an understanding of the urgency of the

technological issues in Sudan and how it could affect the possible delay of the

referendum.

Two factors could influence the result of variable specificity. The first factor is

possibly a lack of experience with the method and the topic. By using broad topics such

as “political pressure” and “security issues” the students were decreasing the complexity

of the task by equating the variables to things they were more familiar with. “Political

pressure” and “security issues” are more common to the students than the specifics of the

Sudan Referendum. Therefore, by broadening the variables, the students could more

easily use the cross impact matrix; assessing how politics affected security, rather than

how one specific instance in Sudan affected the other. The other factor could be a lack of

clarity. While the control group may have cited specific instances for their variables, the

experiment group may have placed those specific instances into broader categories,

which they identified as variables. Therefore, clarity may be lacking: is a variable a

specific example (“The Northern government has control over the TV and radio signals,

and only allows broadcasts that are in line with their policies”) or a broadened

understanding of specific instances (“Low tech capabilities affect other key variables”)?

A definition of “variable” in the context of Complexity Manager may help to clarify what

exactly is needed for the structured technique to be most effective.



CHAPTER 5: CONCLUSION

Throughout the IC’s history, intelligence failures have driven reform in

organizational structure, information sharing, and the use of structured analytic techniques.

However, if the use of structured analytic techniques is the solution to decreasing the

possibility or severity of future intelligence failures, then structured analytic techniques

should be tested to ensure each is a valid means of reducing such risks. Though

encouraged, testing of the techniques is limited. The intent of this study was to test one

structured analytic technique and offer recommendations for further testing. The purpose of this

study was to assess the validity of the structured method, Complexity Manager. To do so,

the researcher designed an experiment comparing the use of Complexity Manager versus

intuition alone. The research was conducted at Mercyhurst College’s Intelligence Studies

Program with participants from every academic class level, from freshmen to second-year

graduate students.

Discussion

Do analysts that use Complexity Manager have a higher level of confidence than

those that used intuition alone?

Analytic confidence is based on the use of a structured analytic technique, source

reliability, source corroboration, level of expertise on the subject, amount of

collaboration, task complexity, and time pressure (Peterson, 2008). The results of the

survey questions asking groups to rate their level of analytic confidence show that those

that used Complexity Manager did not have a higher level of confidence than those that

used intuition alone. Furthermore, looking at the results from the other survey questions

that identify components of analytic confidence confirms that the experiment group did

not experience a higher level of confidence than the control group (see Figure 5.1, analytic

confidence per group). The control group

and the experiment group had no difference for their source reliability; level of expertise

on subject matter; and amount of collaboration, shown through “team helpfulness.”

Assessing task complexity and time pressure, both the control group and experiment

group were given the same task with the same amount of time. However, the experiment

group (M = 74.1045 minutes, SD = 30.30058) spent a greater amount of time working

than the control group (M = 40.4375 minutes, SD = 20.71802). The time that the

experiment groups spent working together may have been related to the teams’ need to

learn and then use Complexity Manager.

This finding suggests that using a structured analytic technique may not increase

analytic confidence, but may better calibrate the analyst. The analysts could have lacked

confidence not only in their ability to use the structured analytic technique, but also in the

analysis that it helped to produce. Task complexity was high and there

was a time constraint of 2.5 hours; however, the majority of both the control and

experiment groups had a medium analytic confidence. Nine groups gave low analytic

confidence; three control and six experiment. Overall, the experiment group had a lower

confidence level, but had a greater forecasting accuracy. Therefore, this suggests that

there may not be a connection between analytic confidence and the use of a structured

analytic technique. Analytic confidence may have no bearing on forecasting accuracy. In

other words, having a high analytic confidence may not suggest that the analyst is more

likely to forecast accurately. In summary, this finding suggests that using a structured

analytic technique could assist the analyst in assessing their own analytic confidence, but

does not improve their overall analytic confidence. Further studies yielding a higher

number of group or individual analytic confidence ratings would be needed to statistically

confirm this.

Do analysts that use Complexity Manager have higher quality variables assessed

before delivering their forecast than those that used intuition alone?

The researcher concluded after a descriptive analysis comparing the control to the

experiment group that those that used Complexity Manager did not have higher quality

variables than those that used intuition alone. The teams that used Complexity Manager

often used short, broad generalizations while those that used intuition alone wrote out

complete sentences that detailed specific points and conflicts between north and south

Sudan. These complete sentences allowed the control group to fully explain the cause and

effect of each variable. The experiment group spent a greater amount of time working

together than the control group, yet the quality of their variables was less than that of the

control group.

Two factors may have influenced the style of reporting. The experiment group

was given an answer sheet for their forecast along with a methodology packet, a step-by-

step guide to using Complexity Manager where they could complete the steps directly on

those pages. However, the control group was only given an answer sheet. The first factor

the researcher considered was redundancy (having to write the variables twice), which may

have influenced the experiment group to write only short, broad generalizations on their

answer sheet. However, the methodology packets reveal the same statements. This leads

to a second factor that may have influenced the style of writing. For Complexity

Manager, the experiment group was tasked with completing a cross-impact matrix and to

do so, the variables first had to be listed in the left hand column. The teams may have

judged the lines too short to include whole sentences and therefore only transferred those

statements onto the answer sheet. Though this may account for the length of the sentence,

it does not account for why the experiment group often used broader concepts, such as

referencing oil refineries, while the control group used more specific statements, such as

stating where the refineries were and why it was a conflict.

Do analysts that use Complexity Manager have a higher number of variables

assessed before delivering their forecast than those that used intuition alone?

As shown through the intervention results, analysts that used Complexity

Manager did not have a higher number of variables assessed before delivering their

forecast. The control group had a greater number of variables assessed in every category.

Also, the control group had a greater mean total number of variables (M = 17.1250, SD =

4.08936) than the experiment group (M = 8.8696, SD = 2.59903).

The number of variables assessed did not connect to more accurate forecasts.

Therefore, quantity had no bearing on quality. Increasing the pieces of evidence could

easily bias the analyst into thinking that the more evidence found, the more likely it would be

that the Sudan Referendum would not occur on the scheduled date. Though the

experiment group had fewer variables assessed, they produced a greater number of accurate

forecasts. This suggests that the experiment group weighed the significance of each

variable, rather than totaling the pieces of evidence confirming the likelihood of one

event over another. Therefore, using a structured analytic technique may have helped

decrease analyst bias when forecasting.

Do analysts that use Complexity Manager produce a more accurate forecast than

those that use intuition alone?

There was no statistical difference between the control and experiment group for

producing more accurate forecasts. This may have been due to the small sample size of

both the control and experiment groups. However, the p-value of 0.2367, though above the

0.05 threshold, does not rule out a real difference between the groups. Furthermore, looking at the

raw data, 6 out of 23 experiment groups had accurate forecasts while only 3 out of 24

control groups had accurate forecasts. Both of these facts suggest that Complexity

Manager assisted the experiment group with producing more accurate forecasts.

Analytic Confidence findings showed that 3 control groups and 6 experiment

groups gave a low confidence rating. Forecasting Accuracy findings also showed that 3

control groups and 6 experiment groups produced accurate forecasts. However, only one

experiment group that forecasted accurately gave a low confidence rating; 8 of the 9

groups that forecasted accurately gave a medium analytic confidence rating.

Two further conclusions can be drawn from the forecasting accuracy of the

experiment group. The first conclusion is that forecasting accuracy may be connected to

collaboration, a required process within the Complexity Manager technique. The

experiment group spent more time working collaboratively than the control group and

also produced more accurate forecasts (time in minutes: experiment group mean =

74.1045 and control group mean = 40.4375). The second conclusion is that there appears

to be no connection between the number of variables assessed and forecasting accuracy;

the control group recorded a greater number of variables than the experiment group but

did not forecast more accurately.

Shannon Ferrucci recorded similar findings in her 2009 Master’s thesis, “Explicit

Conceptual Models: Synthesizing Divergent and Convergent Thinking.” When assessing

the size of conceptual models that participants produced in her study, Ferrucci found that

though the experimental group’s conceptual models were larger than the control group’s

models, the control group forecasted better than the experiment group (Ferrucci, 2009).

Ferrucci suggested that the large number of concepts that the experiment group created in

their models caused confusion and decreased their ability to understand the most relevant

information for completing their forecast (2009). As in Ferrucci’s experiment, the large

number of variables that the control group created may have overwhelmed the analysts

and made it more difficult to select the most relevant variables that would affect the

possible delay for the Sudan Referendum.

Another factor that may account for the control groups’ low forecasting accuracy

could be a connection between the number of variables assessed and cognitive bias.

Robert Katter, John Montgomery, and John Thompson found in their 1979 study,

“Cognitive Processes in Intelligence Analysis: A Descriptive Model and Review of

Literature,” that intelligence is conceptually driven rather than data driven. This

understanding is important because it shows how the analyst arrives at their conclusions.

An analyst’s forecast is not pure data. Instead, it is a process driven by how the analyst

interprets that data after moving through their cognitive model (Katter et al., 1979). The

purpose of the cognitive model is to account for “inputs” of the analyst, with input

meaning stimuli from the external world or what is in their internal memory (Katter et al.,

1979). The model has three parts that are summarized below:

1. The individual’s initial processing of outside information is automatically

conducted in less than a second. Then the new information is automatically

compared with information already stored in the memory. When even just a

gross match is found, the new information that matches existing memory

patterns is stored.

2. New information that may not fit into the existing memory patterns can be

automatically ignored or viewed as irrelevant or uninteresting.



3. The central cognitive function consists of a continuous “Compare/Construct”

cycle that modifies the memory-storage. Three types of information

modification are: sensory information filtering, memory information

consolidation, and memory access interference.

In this study on Complexity Manager, the control group did not have anything in place to

force themselves to be more cognizant of their cognitive models as they recorded all of

their variables for forecasting. Therefore, without having an external regulator such as a

structured analytic technique, the control groups’ forecasts were more negatively affected

by their cognitive models, taking the form of cognitive bias.

Limitations

A limitation to the study was the sample size. 162 individuals participated, but the

number of forecasts was limited because the individuals were grouped into teams of three

and four. Therefore, instead of 162 forecasts, only 47 were given. Time constraints were

also a limitation. The study was conducted during a two and one half hour time period.

This did not allow for full development or understanding of the issue. The researcher

intentionally chose this time period to maximize the number of participants and decrease

the number of drop-outs. Also, the participants were students with time constraints due to

other classes and obligations. A third limitation was the level of expertise from the

participants. The participants were students with limited knowledge of the field and very

limited knowledge of both Complexity Manager and the intelligence topic, the Sudan

Referendum.

Three other limitations of this study relate directly to the implementation of the

intervention. The first limitation was the amount of time that may be appropriate for

learning not only Complexity Manager, but any structured analytic technique. One of the

main considerations when restricting the intervention to 2.5 hours was maximizing the

sample size. The participants, being students, had many other obligations. If the

researcher had asked participants to commit to a longer intervention, the sample size may

have decreased significantly. However, the time restriction may not have allowed for

proper understanding and absorption of Complexity Manager. Besides time restrictions

with learning the structured analytic technique, assessing when to provide the Complexity

Manager tutorial was a limitation. The researcher gave the tutorial directly after giving

the tasking for the analysis. This was done to allow the groups to work at their own pace

and complete their analysis earlier if they desired to. However, this turned into a

limitation because it overloaded the participants with information. Collection may have

suffered because the students were more concerned with understanding the technique.

The third limitation is the timing of the intervention. Mercyhurst College operates on a

trimester system, with each term lasting ten weeks. The researcher gave the intervention

during the eighth week of the term. Though students may have attended to earn extra

credit knowing the end of the term was near, this may also have negatively affected the

intervention. Participants dropped out of the intervention because they had other

obligations such as team meetings and projects. Those that attended the intervention

may have worked more quickly through it than if it had been held earlier in the term

when they had less pressing obligations to manage. If the technique had been presented

separately and more thoroughly, and if the intervention had taken place earlier in the

term, the results may have more accurately reflected the purpose of the study.

Recommendations for Future Research

Based on the results of this study, there are several recommendations for future

research. Limitations concerning time constraints for training the participants using

Complexity Manager could be mitigated or eliminated if the training is separate from the

intervention. Not only could understanding increase, but this would allow the participants

to analyze more complex issues that have multiple outcomes, a major function of

Complexity Manager. Along with taking more time to train the participants, having

participants that are trained in a particular area of expertise could also improve the

intervention.

Therefore, the second recommendation is to either include professionals that have

had more experience in the field or to choose a topic within an area of expertise that

would be more familiar to all participants. The participants for this intervention had to

familiarize themselves with a topic that was largely unknown to the majority of the

participants and also learn a new structured analytic technique. Having participants that

are subject matter experts or have a higher level of expertise could assist the participants

in more fully utilizing Complexity Manager to explore potential outcomes and

unintended side effects of a potential course of action. In other words, involving subject

matter experts and working through issues with multiple outcome possibilities are two

major components of Complexity Manager that could be tested in further studies.

The researcher of this intervention focused on variables, analyst confidence, and

the forecasting accuracy of Complexity Manager. The third recommendation is to

compare Complexity Manager to another well-tested structured technique. The researcher

compared intuition to the use of Complexity Manager. Doing so showed that those that

used an intuitive process produced a greater number of specified variables compared to

those that used Complexity Manager. However, those that used Complexity Manager

worked significantly longer in their groups. Having both the control and experiment

group use a structured analytic method could possibly eliminate the dramatic difference in

control and experiment group working time and isolate the question: “Is Complexity Manager

more effective at assisting analysts in brainstorming variables that impact a complex issue than

other techniques?”

The fourth recommendation is to obtain a higher number of participants or

forecasts for the study. Collaboration was necessary for Complexity Manager; therefore,

the researcher organized the participants into groups of three to four; this significantly

minimized the number of forecasts. Therefore, increased participation or a method that

would allow for individual forecasting would minimize this limitation. Further studies

using these recommendations could more fully assess the validity of Complexity

Manager.

Conclusions

Complexity Manager originated as a way for analysts to develop a group

understanding of each of the relationships within a complex system. Heuer created

Complexity Manager to help analysts generate new insights through variable

identification and interaction in order to understand the potential outcomes and

unintended side effects of a potential course of action. Heuer states that Complexity

Manager is useful for identifying the variables that are most significantly influencing the

decision at hand and enables the analyst to find the best possible answer to an intelligence

question by organizing information into the structured technique.
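
The mechanics behind that variable identification are the cross-impact matrix reproduced in

Appendix F: each cell holds a signed weight for the impact of the row variable on the column

variable, and the row and column totals point to the variables that most drive, and are most

driven by, the system. A minimal sketch of that scoring step, using hypothetical variables

and weights rather than any group's actual matrix, is:

    # Hypothetical illustration of the cross-impact scoring described in Appendix F.
    # impact[row][col] is the impact of the row variable on the column variable, on a
    # signed scale from +3 (strong positive) to -3 (strong negative); 0 means no impact.
    variables = ["Oil revenue sharing", "Border security", "Voter registration"]
    impact = [
        [0, 2, 3],    # impact of "Oil revenue sharing" on the other variables
        [-1, 0, 2],   # impact of "Border security"
        [0, 1, 0],    # impact of "Voter registration"
    ]

    for i, name in enumerate(variables):
        row_total = sum(impact[i])                                    # how strongly this variable drives the system
        col_total = sum(impact[r][i] for r in range(len(variables)))  # how strongly it is driven by the others
        print(f"{name}: drives the system by {row_total:+d}, driven by others by {col_total:+d}")

    # Variables with large column totals are candidates for monitoring as indicators;
    # variables with large row totals are key drivers of the outcome.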

The dynamics of the variable interactions were not noted in this study because

this experiment focused on how effective Complexity Manager is at variable

identification and its correlation to forecasting accuracy, or, the best possible answer to

an intelligence question. The results show that those that used Complexity Manager

identified fewer variables and less specific variables, but had a higher number of accurate

forecasts than those that did not use Complexity Manager. Therefore, it can be concluded

that Complexity Manager is effective at identifying variables that lead to more accurate

forecasts than using intuition alone.

Using Complexity Manager did not assist participants with identifying more

variables than those that did so intuitively. This may suggest that those that used intuition

may have actually used a divergent process, brainstorming, before forecasting or that

Complexity Manager is not effective at assisting analysts with drawing out a higher

volume of variables or higher quality variables. However, the increased number of variables

recorded did not connect to an increased forecasting accuracy.

The use of teams greatly reduced the number of forecasts, which in turn, reduced

the sample size for extracting statistically significant results. However, the results of this study

suggest that Complexity Manager increases forecasting accuracy. Collaboration is a

necessary part of Complexity Manager; the teams must brainstorm variables and work

through the matrix together. The increased amount of time spent collaborating and

following the steps of the structured analytic technique increased the number of accurate

forecasts in the experiment group.

This experiment showed that Complexity Manager’s strongest abilities include

effective collaboration, possible improved analytic confidence calibration, and assistance in

increasing forecasting accuracy. An area of improvement would be a stronger definition

of what a variable is in the context of Complexity Manager; is it a specific event that

could be a catalyst to other events? Is it broader? Is a single variable composed of

multiple significant events that are categorized under one general umbrella? Or is it a

combination of both? Thinking of the entire IC, how should analysts balance single,

significant or seemingly insignificant events with more general trends?



Final Thoughts

The testing of the effectiveness of one analytic technique, at this point in time,

seems to be secondary to gathering empirical evidence regarding the collective benefits

and abilities that structured analytic techniques offer. Instead of testing one particular

technique at a time, this researcher recommends testing two techniques against each

other or two techniques against intuition alone. This would increase the number of

techniques tested and would keep the control of intuition in place. More importantly,

comparing techniques against each other could help to show emerging patterns through

the strengths and weaknesses that may overlap within all structured analytic techniques.

This would improve all structured analytic techniques.

Though this is only one study assessing the effectiveness and forecasting accuracy

of one structured analytic technique, it did produce quantitative results that suggest that

structured techniques can decrease bias and increase forecasting accuracy. One by one,

experiments and results such as this add to the validity of each structured analytic

technique and the Intelligence Community as a whole.



REFERENCES

Berardo, R. (2009). Processing complexity in networks: a study of informal collaboration


and its effects on organizational success. Policy Studies Journal, (37)3, 521-539.
Retrieved September 24, 2010, from Academic Search Complete. doi:
10.1111/j.1541-0072.2009.00326.x

Betts, R. (1978). Analysis, war, and decision: why intelligence failures are inevitable.
World Politics, 31(1), 69-89. Retrieved June 20, 2010, from
http://www.jstor.org/stable/200 9967.

Blaskovich, J.L. (2008). Exploring the effect of distance: an experimental investigation of


virtual collaboration, social loafing, and group decisions. Journal of Information
Systems (22)1. 27-46. Retrieved September 3, 2010, from Academic Search
Complete.

Brasfield, A.D. (2009) Forecasting accuracy and cognitive bias in the analysis of
competing hypotheses (Unpublished master’s thesis). Mercyhurst College, Erie,
PA.

Cheikies, B. A., Brown, M. J., Lehner, P.E., & Adelman, L. (October 2004).
Confirmation bias in complex analyses. 1-16. Retrieved June 13, 2010, from
http://www.mitre.org/work/tech_papers/tech_papers_04/04_0985/04_0985.pdf.

Davis, J. (1999). Improving intelligence analysis at CIA: Dick Heuer’s contribution to


intelligence analysis. In Heuer, R., Jr. (1999). Psychology of Intelligence Analysis,
Center for the Study of Intelligence: Central Intelligence Agency, xiii-xxv.
Retrieved May 31, 2010, from https://www.cia.gov/library/center-for-the-study-
of-intelligence/csi-publications/books-and-monographs/psychology-of-
intelligence-analysis/PsychofIntelNew.pdf.

Diaz, G. (January 2005). Methodological approaches to the concept of intelligence


failure. UNISCI Discussion Papers, Number 7, 1-16. Retrieved July 31, 2010,
from http://revistas.ucm.es/cps/16962206/articulos/UNIS0505130003A.PDF.

Edison's Lightbulb at The Franklin Institute. (2011). Retrieved April 1, 2011, from
http://www.fi.edu/learn/sci-tech/edison-lightbulb/edison-lightbulb.php?
cts=electricity.

Folker, R. D., Jr. (2000). Intelligence Analysis in Theater Joint Intelligence Centers: An
Experiment in Applying Structured Methods Occasional Paper Number Seven, 1-
45. Retrieved June 13, 2010, from http://www.fas.org/irp/eprint/folker.pdf.

Ferrucci, S. (2009). Explicit conceptual models: synthesizing divergent and convergent


thinking (Unpublished master’s thesis). Mercyhurst College, Erie, PA.

George, R. Z. (2004). Fixing the problem of analytical mind-sets: alternative analysis.


International Journal of Intelligence and Counterintelligence, 17(3), 385-404.
doi: 10.1080/08850600490446727.

Goodman, M. (2003). 9/11: The failure of strategic intelligence. Intelligence and


National Security, 18(2), 59-71. doi: 10.1080/02684520310001688871.

Grimmet, R.F. (2004). Terrorism: key recommendations of the 9/11 commission and
recent major commissions and inquiries. (Congress Research Service).
Washington, DC. Retrieved September 3, 2010, from
http://www.au.af.mil/au/awc/awcgate/crs/rl32519.pdf.

Hart, D. & Simon, S. (2006). Thinking straight and talking straight: problems of
intelligence analysis. Survival, 48(1), 35-59. doi: 10.1080/00396330600594231.

Hedley, J. (2005). Learning from intelligence failures. International Journal of


Intelligence and Counterintelligence, 18(3), 436. doi:
10.1080/08850600590945416.

Heuer, R., Jr. (2009). The evolution of structured analytic techniques. Presentation to the
National Academy of Science, National Research Council Committee on
Behavioral and Social Science Research to Improve Intelligence Analysis for
National Security. Washington, D.C.

Heuer, R., Jr. (1999). Psychology of Intelligence Analysis. Center for the Study of
Intelligence: Central Intelligence Agency, 1-183. Retrieved May 31, 2010, from
https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-
publications/books-and-monographs/psychology-of-intelligence-
analysis/PsychofIntelNew.pdf.

Heuer, R. J., Jr. (2008). Small group processes for intelligence analysis, 1-38. Retrieved
September 16, 2010 from http://www.pherson.org/Library/H11.pdf.

Heuer, R. J. Jr. & Pherson, R. (2010). Structured Analytic Techniques for Intelligence
Analysis. Washington, D.C.: CQ Press.

Johnston, R. (2005). Analytic culture in the US Intelligence Community: an


ethnographic study. Retrieved July 31, 2010, from
http://www.au.af.mil/au/awc/awcgate/cia/analytic_culture.pdf.

Johnston, R. Integrating methodologists into teams of substantive experts. Studies in


Intelligence, 47(1), 57-65. Retrieved June 13, 2010, from http://www.dtic.mil/cgi-
bin/GetTRDoc?AD=ADA525552&Location=U2&doc=GetTRDoc.pdf.

Katter, R., Montgomery, C., & Thompson, J. (1979). Cognitive processes in intelligence
analysis: a descriptive model and review of the literature (Technical Report 445).
Arlington: US Army Intelligence Security and Command.

Khatri, N. & Alvin, H. Role of intuition in strategic decision making. 1-38. Retrieved
May 31, 2010, from http://www3.ntu.edu.sg/nbs/sabre/working_papers/01-97.pdf.

Lefebvre, S. J. (2003). A look at intelligence analysis. Retrieved from


http://webzoom.freewebs.com/swnmia/A%20Look%20At%20Intelligence
%20Analysis.pdf.

Light Bulb History - Invention of the Light Bulb. (2007). Retrieved April 1, 2011, from
http://www.ideafinder.com/history/inventions/lightbulb.htm.

Marrin, S. (2004). Preventing intelligence failures by learning from the past.


International Journal of Intelligence and Counterintelligence, 17(4), 655-672.
Retrieved June 20, 2010, from http://dx.doi.org/10.1080/08850600490496452.

Myers, D.G. (2002). Intuition: Its Powers and Perils. London: Yale University Press.

National Commission on Terrorist Attacks. (2004). The 9/11 commission report.


Washington, DC: US Government Printing Office. Retrieved October 15, 2010,
from http://govinfo.library.unt.edu/911/report/911Report.pdf.

October 21, 1879: Edison Gets the Bright Light Right. This Day In Tech. Wired.com.
(2009). Retrieved April 1, 2011, from
http://www.wired.com/thisdayintech/2009/10/1021edison-light-bulb/.

The Office of the Director of National Intelligence (2008). United States Intelligence
Community Information Sharing Strategy (ODNI Publication No. A218084).
Washington, DC. Retrieved September 3, 2010, from
http://www.dni.gov/reports/IC_Information_Sharing_Strategy.pdf.

Pope, S. & Josang, A. Analysis of Competing Hypothesis using subjective logic. 10th
International Command and Control Research and Technology Symposium: The
Future of C2 Decisionmaking and Cognitive Analysis. Retrieved May 31, 2010,
from http://www.cs.umd.edu/hcil/VASTcontest06/paper126.pdf.

Quality. (n.d.) In Merriam-Webster’s collegiate dictionary. Retrieved from


http://www.merriam-webster.com/dictionary/quality.

RAND National Security Research Division. (2008). Assessing the tradecraft of


intelligence analysis. Santa Monica: Treverton, G & Gabbard, C.B.

Ross, W. (2011, January 11). Southern Sudan votes on independence. BBC. Retrieved
from http://www.bbc.co.uk/news/world-africa-12144675.

Robson. D.W. Cognitive rigidity: methods to overcome it. Retrieved May 31, 2010, from
https://analysis.mitre.org/proceedings/Final_Papers_Files/40_Camera_Ready_Pap
er.pdf.

Shrum, W. Collaborationism. Retrieved September 24, 2010, from


http://worldsci.net/papers.htm#Collaboration.

Surowiecki, J. (2004). The Wisdom of Crowds. New York: Anchor Books.

Thijs, B. & Glänzel, W. (2010). A structural analysis of collaboration between European


research institutes. Research Evaluation (19)1, 55-65. doi:
10.3152/095820210X492486.

Thomas Edison's biography: Edison Invents! Smithsonian Lemelson Center. (n.d.). .


Retrieved April 1, 2011, from
http://invention.smithsonian.org/centerpieces/edison/000_story_02.asp.

Turnley, J. G. & McNamara, L. An ethnographic study of culture and collaborative


technology in the intelligence community. Sandia National Laboratory, 1-21.
Retrieved May 31, 2010, from http://est.sandia.gov/consequence/docs/JICRD.pdf.

Wheaton, K. J. & Beerbower, M. T. (2006). Towards a new definition of intelligence.


Stanford Law & Policy Review, 17(2), 319-330. Retrieved September 13, 2010,
from LexisNexis.

Appendix A: IRB Approval



Appendix B:
Structured Methods Experiment
Sign Up
Name:

Class Year:

E-mail Address:

Undergraduate Minor (If Applicable):

Graduate Student’s Undergraduate Major (If Applicable):

Graduate Student’s Undergraduate Minor (If Applicable)

Please select a ranked preference for the dates: (Rank Session Preference:
1=Highest, 4=Lowest)

Monday, November 1, 2010: 6 pm _____

Tuesday, November 2, 2010: 6pm _____

Wednesday, November 3, 2010: 6 pm _____

Thursday, November 4, 2010: 6 pm _____

Upon completion, please return this form to Lindy Smart.



Appendix C:
Participation Consent Form

You have been invited to participate in a study about forecasting in Intelligence analysis.
Your participation in the experiment involves the following: team assignments, a team
evaluation of a designated subject, and returning the completed forms back to the
researcher of the experiment. Teams will be given one hour for collection and then will
reconvene for up to an hour and a half to put the analysis together and give a team
forecast.
Your name will only be used to notify professors of your participation in order for them
to assign extra credit. There are no foreseeable risks or discomforts associated with your
participation in this study. Participation is voluntary and you have the right to opt out of
the study at any time for any reason without penalty.

I, ____________________________, acknowledge that my involvement in this research


is voluntary and agree to submit my data for the purpose of this research.

_________________________________
__________________
Signature Date

_________________________________
__________________
Printed Name Class

Name(s) of professors offering extra credit: ____________________________________


Researcher’s Signature: ___________________________________________________
If you have any further question about forecasting or this research you can contact me at

Research at Mercyhurst College which involves human participants is overseen by the


Institutional Review Board. Questions or problems regarding your rights as a
participant should be addressed to Tim Harvey; Institutional Review Board
Chair; Mercyhurst College; 501 East 38th Street; Erie, Pennsylvania 16546-0001;
Telephone (814) 824.3372.

Lindy Smart, Applied Intelligence Master’s Student, Mercyhurst College


Kristan Wheaton, Research Advisor, Mercyhurst College

Appendix D:
Forecasting Thesis Experiment
Instructions

You are an analyst working at the Embassy of the United States of America in Sudan.
You have been tasked with forecasting whether the vote for the Sudan Referendum
set for January 9, 2011 will occur as scheduled or if it will be delayed. You are also
to identify the variables that are most influential for deciding the course of the
Sudan Referendum. The state-level high committees responsible for organizing the
referendum expect delays but the United Nations is committed to conducting it on time.

You and your teammates will be assigned areas of expertise in the economic, political,
military, social, technological, and geographic area of Sudan. You will use open source
information to complete your task. You will be given a list of sources to use as a starting
point for collection. Teams will be given one hour for collection and then will reconvene
for up to an hour and a half to put the analysis together and give a team forecast.

Researcher Contact: Lindy Smart

Appendix E:
Starting Point for Collection

Possible Sources:

http://www.usip.org/

http://www.state.gov/

http://www.bloomberg.com/

http://www.pbs.org/newshour/

http://www.alertnet.org/

http://www.reuters.com/

http://www.hrw.org/

http://news.yahoo.com/

http://www.washingtontimes.com/news/

http://allafrica.com/

http://www.bbc.co.uk/news/world/africa/

http://www.janes.com/

http://w3.nexis.com/new/ (Must have your username and password)

http://merlin.mercyhurst.edu/ (Databases available through the Mercyhurst Library)

Appendix F:
Structured Methods Experiment
Methodology

1. State the problem to be analyzed, including the time period to be covered by the

analysis:

_________________________________________________________________

_________________________________________________________________

__

2. Brainstorming list of relevant variables:

Economic:

Political:

Social:

Technology:

Military:

Geographic:

3. List the variables in the Cross-Impact Matrix. Put the most important
variables at the top.
*Matrix not limited to 10 variables/may not have 10 variables.
A B C D E F G H I J
A
B
C

D
E
F
G
H
I
J
Reading the Matrix: The cells in each row show the impact of the variable represented by that row on each
of the variables listed across the top of the matrix. The cells in each column show the impact of each
variable listed down the left side of the matrix on the variable represented by the column.

Direction and magnitude of the impact:


+ Strong positive impact - Strong negative impact
+ Medium positive impact - Medium negative impact
+ Weak Positive Impact - Weak negative impact

Use plus and minus signs to show whether the variable being analyzed has a positive or negative impact on
the paired variable.
The size of the plus or minus sign signifies the strength of the impact on a three-point scale. 3=strong,
2=medium, 1=weak
If the variable being analyzed has no impact on the paired variable, the cell is left empty.
If a variable might change in a way that could reverse the direction of its impact, from positive to negative or
vice versa, this is shown by using both a plus and a minus sign.

Please note: Size of matrix does not reflect actual size of the matrix given to
students. Students received a matrix that fit a page in its entirety.

DIRECTIONS FOR COMPLETING THE CROSS-IMPACT MATRIX

4. As a team, assess the interaction between each pair of variables and enter the
results into the relevant cells of the matrix. For each pair of variables, ask the
question: Does this variable impact the paired variable in a manner that will
increase or decrease the impact or influence of that variable?

a. When entering ratings in the matrix, it is best to take one variable at a


time, first going down the column and then working across the row. The
variables will be evaluated twice; for example, the impact of variable A on
variable B and the impact of variable B on variable A.
b. After rating each pair of variables, and before doing further analysis,
consider pruning the matrix to eliminate variables that are unlikely to
have a significant effect on the outcome.
c. Measure the relative significance of each variable by adding up the
weighted values in each row and column. Record the totals in each row
and column.
i. The sum of the weights in each row is a measure of each variable’s
impact on the system as a whole.
ii.The sum of the weights in each column is a measure of how much
each variable is affected by all the other variables.
iii.Those variables most impacted by the other variables should be
monitored as potential indicators of the direction in which events
are moving or as potential sources of unintended consequences.

5. Write about the impact of each variable, starting with variable A. (Use the
following pages to write out your answers.)
a. Describe the variable further if clarification is necessary (For example, if
one of the variables you identified is “Weak Government Officials” then
use this space to write exactly what you meant. You may want to include
names, party affiliations, and examples of why the officials are “weak”).

Variable A
1. Describe the variable further if clarification is necessary:

2. Identify all the variables that impact on Variable A with a rating of 2


or 3 (Medium or Strong Effect) and briefly explain the nature,
directions, and, if appropriate, the timing of this impact:

Variables that impact Variable A: (Shown in the COLUMNS)

a. How strong is it and how certain?

b. When might these impacts be observed?

c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable A has an impact


with a rating of 2 or 3 (Medium or Strong Effect):

Variables on which Variable A has an impact: (Shown in the ROWS)

a. How strong is it and how certain?

b. Identify and discuss the potentially good or bad side effects of


these impacts.

Good side effects:

Bad side effects:

Variable B
1. Describe the variable further if clarification is necessary:

2. Identify all the variables that impact on Variable B with a rating of 2


or 3 (Medium or Strong Effect) and briefly explain the nature,
directions, and, if appropriate, the timing of this impact:

Variables that impact Variable B: (Shown in the COLUMNS)

a. How strong is it and how certain?



b. When might these impacts be observed?

c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable B has an impact


with a rating of 2 or 3 (Medium or Strong Effect):

Variables on which Variable B has an impact: (Shown in the ROWS)

a. How strong is it and how certain?

b. Identify and discuss the potentially good or bad side effects of


these impacts.

Good side effects:

Bad side effects:

Variable C
1. Describe the variable further if clarification is necessary:

2. Identify all the variables that impact on Variable C with a rating of 2


or 3 (Medium or Strong Effect) and briefly explain the nature,
directions, and, if appropriate, the timing of this impact:

Variables that impact Variable C: (Shown in the COLUMNS)

a. How strong is it and how certain?

b. When might these impacts be observed?



c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable C has an impact


with a rating of 2 or 3 (Medium or Strong Effect):

Variables on which Variable C has an impact: (Shown in the ROWS)

a. How strong is it and how certain?

b. Identify and discuss the potentially good or bad side effects of


these impacts.

Good side effects:

Bad side effects:

Variable D
1. Describe the variable further if clarification is necessary:

2. Identify all the variables that impact on Variable D with a rating of 2


or 3 (Medium or Strong Effect) and briefly explain the nature,
directions, and, if appropriate, the timing of this impact:

Variables that impact Variable D: (Shown in the COLUMNS)

a. How strong is it and how certain?

b. When might these impacts be observed?



c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable D has an impact


with a rating of 2 or 3 (Medium or Strong Effect):

Variables on which Variable D has an impact: (Shown in the ROWS)

a. How strong is it and how certain?

b. Identify and discuss the potentially good or bad side effects of


these impacts.

Good side effects:

Bad side effects:

Variable E:
1. Describe the variable further if clarification is necessary:

2. Identify all the variables that impact on Variable E with a rating of 2 or


3 (Medium or Strong Effect) and briefly explain the nature, directions,
and, if appropriate, the timing of this impact:

Variables that impact Variable E: (Shown in the COLUMNS)

a. How strong is it and how certain?

b. When might these impacts be observed?

c. Will the impacts be felt only in certain conditions?



3. Identify and discuss all variables on which Variable E has an impact


with a rating of 2 or 3 (Medium or Strong Effect):

Variables on which Variable E has an impact: (Shown in the ROWS)

a. How strong is it and how certain?

b. Identify and discuss the potentially good or bad side effects of


these impacts.

Good side effects:

Bad side effects:

Variable F:
1. Describe the variable further if clarification is necessary:

2. Identify all the variables that impact on Variable F with a rating of 2 or


3 (Medium or Strong Effect) and briefly explain the nature, directions,
and, if appropriate, the timing of this impact:

Variables that impact Variable F: (Shown in the COLUMNS)

a. How strong is it and how certain?

b. When might these impacts be observed?

c. Will the impacts be felt only in certain conditions?



3. Identify and discuss all variables on which Variable F has an impact


with a rating of 2 or 3 (Medium or Strong Effect):

Variables on which Variable F has an impact: (Shown in the ROWS)

a. How strong is it and how certain?

b. Identify and discuss the potentially good or bad side effects of


these impacts.

Good side effects:

Bad side effects:

Variable G:
1. Describe the variable further if clarification is necessary:

2. Identify all the variables that impact on Variable G with a rating of 2


or 3 (Medium or Strong Effect) and briefly explain the nature,
directions, and, if appropriate, the timing of this impact:

Variables that impact Variable G: (Shown in the COLUMNS)

a. How strong is it and how certain?

b. When might these impacts be observed?

c. Will the impacts be felt only in certain conditions?



3. Identify and discuss all variables on which Variable G has an impact


with a rating of 2 or 3 (Medium or Strong Effect):

Variables on which Variable G has an impact: (Shown in the ROWS)

a. How strong is it and how certain?

b. Identify and discuss the potentially good or bad side effects of


these impacts.

Good side effects:

Bad side effects:

Variable H:
1. Describe the variable further if clarification is necessary:

2. Identify all the variables that impact on Variable H with a rating of 2


or 3 (Medium or Strong Effect) and briefly explain the nature,
directions, and, if appropriate, the timing of this impact:

Variables that impact Variable H: (Shown in the COLUMNS)

a. How strong is it and how certain?

b. When might these impacts be observed?

c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable H has an impact


with a rating of 2 or 3 (Medium or Strong Effect):

Variables on which Variable H has an impact: (Shown in the ROWS)

a. How strong is it and how certain?

b. Identify and discuss the potentially good or bad side effects of


these impacts.

Good side effects:

Bad side effects:

Variable I:
1. Describe the variable further if clarification is necessary:

2. Identify all the variables that impact on Variable I with a rating of 2 or


3 (Medium or Strong Effect) and briefly explain the nature, directions,
and, if appropriate, the timing of this impact:

Variables that impact Variable I: (Shown in the COLUMNS)

a. How strong is it and how certain?

b. When might these impacts be observed?

c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable I has an impact with a rating of 2 or 3 (Medium or Strong Effect):

Variables on which Variable I has an impact: (Shown in the ROWS)



a. How strong is it and how certain?

b. Identify and discuss the potentially good or bad side effects of these impacts.

Good side effects:

Bad side effects:

Variable J:
1. Describe the variable further if clarification is necessary:

2. Identify all the variables that impact on Variable J with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:

Variables that impact Variable J: (Shown in the COLUMNS)

a. How strong is it and how certain?

b. When might these impacts be observed?

c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable J has an impact with a rating of 2 or 3 (Medium or Strong Effect):

Variables on which Variable J has an impact: (Shown in the ROWS)

a. How strong is it and how certain?



b. Identify and discuss the potentially good or bad side effects of these impacts.

Good side effects:

Bad side effects:

6. Analyze loops and indirect impacts (see the illustrative sketch following Step 7):


a. Identify any feedback loops.
b. Determine if the variables are static or dynamic.
i. Static: Static variables are expected to remain more or less unchanged during the period covered by the analysis.
ii. Dynamic: Dynamic variables are changing or have the potential to change.
c. Determine if the dynamic variables are either predictable or unpredictable.
i. Predictable: Predictable change includes established trends or established policies that are in the process of being implemented.
ii. Unpredictable: Unpredictable change may be a change in leadership or an unexpected change in policy or available resources.

Feedback loops:

Static Variables:

Dynamic-Predictable:

Dynamic-Unpredictable:

7. Draw conclusions: Using data about the individual variables assembled in Steps 5 and 6, draw conclusions about the system as a whole.
a. What is the most likely outcome or what changes might be anticipated
during the specified time period?

b. What are the driving forces behind the outcome?

c. What things could happen to cause a different outcome?

d. What desirable or undesirable side effects should be anticipated?
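Purely as an illustration of the loop-analysis step above (Step 6), the sketch below shows one way the rated impacts could be recorded and the feedback loops among medium and strong (2 or 3) impacts identified mechanically. The variable names, ratings, classifications, and the row-impacts-column convention are hypothetical assumptions for the sketch, not data or conventions taken from this study.

# Illustrative sketch only: record a Complexity Manager-style cross-impact matrix
# and flag feedback loops among the medium/strong (rating 2 or 3) impacts.
# Variable names, ratings, and the matrix convention are hypothetical assumptions.

# impact[a][b] = assumed rated effect of variable a on variable b (0 = none ... 3 = strong)
impact = {
    "A": {"B": 3, "C": 1},
    "B": {"C": 2},
    "C": {"A": 2},  # closes a possible A -> B -> C -> A feedback loop
}

# Keep only the medium/strong impacts, as the worksheet asks (rating 2 or 3).
edges = {a: [b for b, r in targets.items() if r >= 2] for a, targets in impact.items()}

def find_feedback_loops(edges):
    """Return each simple cycle (feedback loop) once, found by depth-first search."""
    loops, path = [], []

    def dfs(node, start):
        path.append(node)
        for nxt in edges.get(node, []):
            if nxt == start:
                loops.append(path + [start])          # loop closed back to its start
            elif nxt not in path and nxt > start:     # visit each loop from its smallest node only
                dfs(nxt, start)
        path.pop()

    for start in edges:
        dfs(start, start)
    return loops

# Analyst judgments from Steps 6b-6c, recorded as plain data (hypothetical labels):
classification = {
    "A": "dynamic-predictable",
    "B": "static",
    "C": "dynamic-unpredictable",
}

for loop in find_feedback_loops(edges):
    print(" -> ".join(loop))   # e.g. A -> B -> C -> A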



Appendix G:
Forecasting Thesis Experiment
Answer Sheet

Names and Corresponding Professor Offering Extra Credit:

Name____________________ Prof. Offering Extra Credit: ____________________


Name____________________ Prof. Offering Extra Credit: ____________________
Name____________________ Prof. Offering Extra Credit: ____________________
Name____________________ Prof. Offering Extra Credit: ____________________

Forecast:

Variable(s) considered:
Economic:

Social:

Political:

Geographic:

Military:

Technological:

Source Reliability (circle one):


LOW MEDIUM HIGH

Analytic Confidence (circle one):


LOW MEDIUM HIGH

Appendix H:
Follow-Up Questionnaire: Control Group

Thanks for your participation! Please take a few moments to answer the following
questions. Your feedback is greatly appreciated.

1. Individual amount of time spent on the assigned task? _____________

2. Amount of time spent working with your group? _____________

3. Number of variables that you contributed to the group? _____________

4. Total number of variables that the group considered before forecasting? _____________

5. Please rate your level of knowledge of the Sudan Referendum before the experiment with 1 being no knowledge at all and 5 being a very thorough understanding.

1 2 3 4 5

6. Please rate the clarity of the instructions for the task with 1 being not clear at all
and 5 being very clear.
1 2 3 4 5

7. Please rate the availability of open source information with 1 being very difficult
to find and 5 being very easily found.
1 2 3 4 5

8. Please rate how helpful it was to work in teams for this task with 1 being not
helpful at all and 5 being very helpful.
1 2 3 4 5

9. Please provide any additional comments you may have about the experiment:

Appendix I:
Follow-Up Questionnaire: Experiment Group

Thanks for your participation! Please take a few moments to answer the following
questions. Your feedback is greatly appreciated.

1. Individual amount of time spent on the assigned task? _____________

2. Amount of time spent working with your group? _____________

3. Number of variables that you contributed to the group? _____________

4. Total number of variables that the group considered before forecasting? ________

5. Please rate the clarity of the instructions for the task with 1 being not clear at all
and 5 being very clear.
1 2 3 4 5

6. Please rate the availability of open source information with 1 being very difficult
to find and 5 being very easily found.
1 2 3 4 5

7. Please rate how helpful it was to work in teams for this task with 1 being not
helpful at all and 5 being very helpful.
1 2 3 4 5

8. Please rate your level of knowledge of the Sudan Referendum before the experiment with 1 being no knowledge at all and 5 being a very thorough understanding.

1 2 3 4 5
9. Please rate how helpful Complexity Manager was for assessing significant
variables before forecasting the assigned tasks with 1 being not helpful at all and
5 being very helpful.
1 2 3 4 5

Comment:

10. Please rate your level of understanding of Complexity Manager before the
experiment with 1 being no understanding at all and 5 being a very thorough
understanding.
1 2 3 4 5

Comment:

11. Please rate your level of understanding of Complexity Manager after the
experiment with 1 being no understanding and 5 being a very thorough
understanding.
1 2 3 4 5

Comment:

12. Would you use Complexity Manager for future tasks? (circle one)
Yes No

Comments:

13. Please provide any additional comments you may have about Complexity
Manager or the experiment overall:

Appendix J:
Complexity Manager
Participant Debriefing

Thank you for participating in this research. I appreciate your contribution and
willingness to support the student research process.
The purpose of this study is to determine the forecasting accuracy of Complexity
Manager compared to unstructured methods. The experiment was designed to test whether
Complexity Manager helped analysts forecast more accurately than the intuitive process
alone. It is the first experiment conducted on Complexity Manager and adds to previous
experiments on the effectiveness and accuracy of structured methodologies. Participants
were assigned at random, and both the control group and the experiment group were
placed into groups of three to simulate the subject-matter-expert collaboration this
methodology requires.
The results of this experiment will be given to Mr. Richards Heuer, the creator of
Complexity Manager.

If you have any further questions about Complexity Manager or this research you can
contact me.

Appendix K: SPSS Testing

Time in Minutes: Individual


Case Processing Summary

                                              Cases
                                Valid             Missing          Total
                  Group         N     Percent    N     Percent    N     Percent
Time in Minutes   Control       80    100.0%     0     .0%        80    100.0%
                  Experiment    64    100.0%     0     .0%        64    100.0%

Independent Samples Test

                                                 Levene's Test for
                                                 Equality of Variances    t-test for Equality of Means
                                                 F         Sig.           t        df        Sig. (2-tailed)    Mean Difference
Time in Minutes   Equal variances assumed        19.425    .000           -.797    142       .427               -3.35938
                  Equal variances not assumed                             -.750    92.938    .455               -3.35938

Time in Minutes: Group


Group Statistics

                            Group         N     Mean       Std. Deviation    Std. Error Mean
Time in Minutes for Group   Control       80    40.4375    20.71802          2.31635
                            Experiment    67    74.1045    30.30058          3.70181

Independent Samples Test

                                                           Levene's Test for
                                                           Equality of Variances    t-test for Equality of Means
                                                           F         Sig.           t         df         Sig. (2-tailed)    Mean Difference
Time in Minutes for Group   Equal variances assumed        14.017    .000           -7.963    145        .000               -33.66698
                            Equal variances not assumed                             -7.710    113.292    .000               -33.66698
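The group working-time result above can be checked from the summary statistics alone. The following is a minimal sketch (not part of the original analysis) that reproduces the "equal variances not assumed" row using SciPy's Welch t-test from summary statistics.

# Minimal sketch: reproduce the "equal variances not assumed" (Welch) row of the
# group working-time comparison from the summary statistics reported above.
from scipy import stats

t, p = stats.ttest_ind_from_stats(
    mean1=40.4375, std1=20.71802, nobs1=80,   # control groups
    mean2=74.1045, std2=30.30058, nobs2=67,   # experiment (Complexity Manager) groups
    equal_var=False,                          # Welch's t-test, variances not assumed equal
)
print(f"t = {t:.3f}, p = {p:.3g}")            # approximately t = -7.710, p < .001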

Analytic Confidence

Source Reliability

Survey Questions
Group Statistics

                            Group         N     Mean      Std. Deviation    Std. Error Mean
Knowledge of Sudan          Control       83    1.4578    .83083            .09120
                            Experiment    67    1.7463    1.17219           .14321
Clarity of Instructions     Control       83    4.3614    .77426            .08499
                            Experiment    67    3.5821    .78140            .09546
Open Source                 Control       83    4.1807    .76739            .08423
                            Experiment    67    4.2985    .79801            .09749
Team Help                   Control       83    4.4819    .70471            .07735
                            Experiment    67    4.3134    .98794            .12070

Independent Samples Test

                                                           Levene's Test for
                                                           Equality of Variances    t-test for Equality of Means
                                                           F         Sig.           t         df         Sig. (2-tailed)
Knowledge of Sudan          Equal variances assumed        10.211    .002           -1.760    148        .080
                            Equal variances not assumed                             -1.699    115.143    .092
Clarity of Instructions     Equal variances assumed        .011      .917           6.104     148        .000
                            Equal variances not assumed                             6.098     140.859    .000
Open Source                 Equal variances assumed        .044      .834           -.918     148        .360
                            Equal variances not assumed                             -.914     138.980    .362
Team Help                   Equal variances assumed        6.172     .014           1.217     148        .225
                            Equal variances not assumed                             1.175     115.648    .242
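Each item above is reported with both Levene's test and two t-test rows; the row to read depends on whether Levene's test rejects equality of variances. The sketch below shows that decision logic with SciPy, using hypothetical ratings rather than this study's raw responses.

# Illustrative sketch with hypothetical ratings (not this study's raw data):
# use Levene's test to choose between the "equal variances assumed" and
# "equal variances not assumed" t-test rows, mirroring the SPSS output above.
from scipy import stats

control    = [4, 5, 4, 4, 5, 3, 4, 5]   # hypothetical 1-5 survey ratings
experiment = [3, 4, 3, 4, 3, 4, 2, 4]

lev_stat, lev_p = stats.levene(control, experiment)
equal_var = lev_p >= 0.05                 # fail to reject equal variances -> pooled t-test

t, p = stats.ttest_ind(control, experiment, equal_var=equal_var)
row = "equal variances assumed" if equal_var else "equal variances not assumed"
print(f"Levene p = {lev_p:.3f}; reading the '{row}' row: t = {t:.3f}, p = {p:.3f}")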



Variables

Economic Variables

Group Statistics

                       Group         N     Mean      Std. Deviation    Std. Error Mean
Economic Variable      Control       24    2.9583    1.26763           .25875
                       Experiment    23    1.6522    .64728            .13497

Independent Samples Test

                                                      Levene's Test for
                                                      Equality of Variances    t-test for Equality of Means
                                                      F        Sig.            t        df        Sig. (2-tailed)
Economic Variable      Equal variances assumed        9.613    .003            4.419    45        .000
                       Equal variances not assumed                             4.476    34.545    .000

Social Variables

Political Variables

Group Statistics

                       Group         N     Mean      Std. Deviation    Std. Error Mean
Political Variable     Control       24    3.3333    1.43456           .29283
                       Experiment    23    1.7826    .79524            .16582

Independent Samples Test

                                                      Levene's Test for
                                                      Equality of Variances    t-test for Equality of Means
                                                      F        Sig.            t        df        Sig. (2-tailed)
Political Variable     Equal variances assumed        4.010    .051            4.555    45        .000
                       Equal variances not assumed                             4.608    36.222    .000

Geographic Variables

Military Variables

Group Statistics

                       Group         N     Mean      Std. Deviation    Std. Error Mean
Military Variable      Control       24    2.8750    .89988            .18369
                       Experiment    23    1.2174    .67126            .13997

Independent Samples Test

                                                      Levene's Test for
                                                      Equality of Variances    t-test for Equality of Means
                                                      F        Sig.            t        df        Sig. (2-tailed)
Military Variable      Equal variances assumed        3.229    .079            7.133    45        .000
                       Equal variances not assumed                             7.178    42.488    .000

Technology Variables

Group Statistics

                       Group         N     Mean      Std. Deviation    Std. Error Mean
Technology Variable    Control       24    2.3750    1.43898           .29373
                       Experiment    23    1.1739    .83406            .17391

Independent Samples Test

                                                      Levene's Test for
                                                      Equality of Variances    t-test for Equality of Means
                                                      F        Sig.            t        df        Sig. (2-tailed)
Technology Variable    Equal variances assumed        9.883    .003            3.481    45        .001

Total Variables

Group Statistics

                       Group         N     Mean       Std. Deviation    Std. Error Mean
Total Variables        Control       24    17.1250    4.08936           .83474
                       Experiment    23    8.8696     2.59903           .54193

Independent Samples Test

                                                      Levene's Test for
                                                      Equality of Variances    t-test for Equality of Means
                                                      F        Sig.            t        df        Sig. (2-tailed)
Total Variables        Equal variances assumed        6.132    .017            8.219    45        .000
                       Equal variances not assumed                             8.295    39.195    .000

Forecasting Accuracy

Data
Hypothesized Difference            0
Level of Significance              0.05
Group 1 (Control)
  Number of Successes              3
  Sample Size                      24
Group 2 (Experiment)
  Number of Successes              6
  Sample Size                      23

Intermediate Calculations
Group 1 Proportion                 0.125
Group 2 Proportion                 0.260869565
Difference in Two Proportions      -0.135869565
Average Proportion                 0.191489362
Z Test Statistic                   -1.183389197

Two-Tail Test
Lower Critical Value               -1.959963985
Upper Critical Value               1.959963985
p-Value                            0.236654937
Do not reject the null hypothesis
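The comparison above (3 of 24 control groups versus 6 of 23 experiment groups forecasting accurately) is a pooled two-proportion Z-test. The sketch below is offered only as a check that reproduces the intermediate calculations and p-value reported above.

# Minimal sketch: pooled two-proportion Z-test for forecasting accuracy,
# reproducing the intermediate calculations reported above.
from math import sqrt
from scipy.stats import norm

x1, n1 = 3, 24    # control groups that forecasted accurately
x2, n2 = 6, 23    # experiment (Complexity Manager) groups that forecasted accurately

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)                        # average (pooled) proportion
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))  # pooled standard error
z = (p1 - p2) / se
p_value = 2 * norm.sf(abs(z))                         # two-tailed p-value

print(f"z = {z:.3f}, p = {p_value:.3f}")              # approximately z = -1.183, p = 0.237
# |z| < 1.96 (the critical value at alpha = .05), so the null hypothesis is not rejected.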

Data
Hypothesized Difference            0
Level of Significance              0.05
Group 1 (Control)
  Number of Successes              19
  Sample Size                      24
Group 2 (Experiment)
  Number of Successes              16
  Sample Size                      23

Intermediate Calculations
Group 1 Proportion                 0.791666667
Group 2 Proportion                 0.695652174
Difference in Two Proportions      0.096014493
Average Proportion                 0.744680851
Z Test Statistic                   0.754624002

Two-Tail Test
Lower Critical Value               -1.959963985
Upper Critical Value               1.959963985
p-Value                            0.450474618
Do not reject the null hypothesis
