
BEHAVIORAL SCIENCE AND SECURITY:

EVALUATING TSA’S SPOT PROGRAM

HEARING
BEFORE THE

SUBCOMMITTEE ON INVESTIGATIONS AND OVERSIGHT
COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY
HOUSE OF REPRESENTATIVES
ONE HUNDRED TWELFTH CONGRESS

FIRST SESSION

APRIL 6, 2011

Serial No. 112–11

Printed for the use of the Committee on Science, Space, and Technology

Available via the World Wide Web: http://science.house.gov

U.S. GOVERNMENT PRINTING OFFICE


65–053PDF WASHINGTON : 2011

For sale by the Superintendent of Documents, U.S. Government Printing Office


Internet: bookstore.gpo.gov Phone: toll free (866) 512–1800; DC area (202) 512–1800
Fax: (202) 512–2104 Mail: Stop IDCC, Washington, DC 20402–0001
COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY
HON. RALPH M. HALL, Texas, Chair
F. JAMES SENSENBRENNER, JR., Wisconsin
LAMAR S. SMITH, Texas
DANA ROHRABACHER, California
ROSCOE G. BARTLETT, Maryland
FRANK D. LUCAS, Oklahoma
JUDY BIGGERT, Illinois
W. TODD AKIN, Missouri
RANDY NEUGEBAUER, Texas
MICHAEL T. MCCAUL, Texas
PAUL C. BROUN, Georgia
SANDY ADAMS, Florida
BENJAMIN QUAYLE, Arizona
CHARLES J. ‘‘CHUCK’’ FLEISCHMANN, Tennessee
E. SCOTT RIGELL, Virginia
STEVEN M. PALAZZO, Mississippi
MO BROOKS, Alabama
ANDY HARRIS, Maryland
RANDY HULTGREN, Illinois
CHIP CRAVAACK, Minnesota
LARRY BUCSHON, Indiana
DAN BENISHEK, Michigan
VACANCY

EDDIE BERNICE JOHNSON, Texas
JERRY F. COSTELLO, Illinois
LYNN C. WOOLSEY, California
ZOE LOFGREN, California
DAVID WU, Oregon
BRAD MILLER, North Carolina
DANIEL LIPINSKI, Illinois
GABRIELLE GIFFORDS, Arizona
DONNA F. EDWARDS, Maryland
MARCIA L. FUDGE, Ohio
BEN R. LUJÁN, New Mexico
PAUL D. TONKO, New York
JERRY MCNERNEY, California
JOHN P. SARBANES, Maryland
TERRI A. SEWELL, Alabama
FREDERICA S. WILSON, Florida
HANSEN CLARKE, Michigan

SUBCOMMITTEE ON INVESTIGATIONS AND OVERSIGHT


HON. PAUL C. BROUN, Georgia, Chair
F. JAMES SENSENBRENNER, JR., Wisconsin
SANDY ADAMS, Florida
RANDY HULTGREN, Illinois
LARRY BUCSHON, Indiana
DAN BENISHEK, Michigan
VACANCY
RALPH M. HALL, Texas

DONNA F. EDWARDS, Maryland
ZOE LOFGREN, California
BRAD MILLER, North Carolina
JERRY MCNERNEY, California
EDDIE BERNICE JOHNSON, Texas

CONTENTS
Wednesday, April 6, 2011
Page
Witness List ............................................................................................................. 2
Hearing Charter ...................................................................................................... 3

Opening Statements

Statement by Representative Paul C. Broun, Chairman, Subcommittee on
Investigations and Oversight, Committee on Science, Space, and Tech-
nology, U.S. House of Representatives ............................................................... 16
Written Statement ............................................................................................ 17
Statement by Representative Donna F. Edwards, Ranking Minority Member,
Subcommittee on Investigations and Oversight, Committee on Science,
Space, and Technology, U.S. House of Representatives .................................... 18
Written Statement ............................................................................................ 20

Witnesses:

Mr. Stephen Lord, Director, Homeland Security and Justice Issues, Govern-
ment Accountability Office
Oral Statement ................................................................................................. 24
Written Statement ............................................................................................ 26
Mr. Larry Willis, Program Manager, Homeland Security Advanced Research
Projects Agency, Science and Technology Directorate, Department of Home-
land Security
Oral Statement ................................................................................................. 39
Written Statement ............................................................................................ 40
Peter J. DiDomenica, Lieutenant Detective, Boston University Police
Oral Statement ................................................................................................. 42
Written Statement ............................................................................................ 44
Dr. Paul Ekman, Professor Emeritus of Psychology, University of California,
San Francisco, and President and Founder, Paul Ekman Group, LLC
Oral Statement ................................................................................................. 48
Written Statement ............................................................................................ 50
Dr. Maria Hartwig, Associate Professor, Department of Psychology, John
Jay College of Criminal Justice
Oral Statement ................................................................................................. 70
Written Statement ............................................................................................ 71
Dr. Philip Rubin, Chief Executive Officer, Haskins Laboratories
Oral Statement ................................................................................................. 79
Written Statement ............................................................................................ 80

Appendix I: Answers to Post-Hearing Questions

Mr. Stephen Lord, Director, Homeland Security and Justice Issues, Govern-
ment Accountability Office .................................................................................. 114
Mr. Larry Willis, Program Manager, Homeland Security Advanced Research
Projects Agency, Science and Technology Directorate, Department of Home-
land Security ........................................................................................................ 118
Dr. Paul Ekman, Professor Emeritus of Psychology, University of California,
San Francisco, and President and Founder, Paul Ekman Group, LLC .......... 127

Dr. Maria Hartwig, Associate Professor, Department of Psychology, John
Jay College of Criminal Justice .......................................................................... 130
Dr. Philip Rubin, Chief Executive Officer, Haskins Laboratories ....................... 131
Peter J. DiDomenica, Lieutenant Detective, Boston University Police ............... 134

Appendix II: Additional Materials Submitted for the Record

Mr. Stephen Lord, Director, Homeland Security and Justice Issues, Govern-
ment Accountability Office .................................................................................. 140
BEHAVIORAL SCIENCE AND SECURITY:
EVALUATING TSA’S SPOT PROGRAM

WEDNESDAY, APRIL 6, 2011

HOUSE OF REPRESENTATIVES,
SUBCOMMITTEE ON INVESTIGATIONS AND OVERSIGHT,
COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY,
Washington, DC.
The Subcommittee met, pursuant to call, at 10:03 a.m., in Room
2318 of the Rayburn House Office Building, Hon. Paul C. Broun
[Chairman of the Subcommittee] presiding.

HEARING CHARTER

COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY


SUBCOMMITTEE ON INVESTIGATIONS & OVERSIGHT
U.S. HOUSE OF REPRESENTATIVES

Behavioral Science and Security:


Evaluating TSA’s SPOT Program
WEDNESDAY, APRIL 6, 2011
10:00 A.M.—12:00 P.M.
2318 RAYBURN HOUSE OFFICE BUILDING

Purpose
The Subcommittee on Investigations and Oversight meets on April 6, 2011 to ex-
amine the Transportation Security Administration’s (TSA) efforts to incorporate be-
havioral science into its transportation security architecture. The Department of
Homeland Security (DHS) has been criticized for failing to scientifically validate the
Screening of Passengers by Observational Techniques (SPOT) program before oper-
ationally deploying it. SPOT is a TSA program that employs Behavioral Detection
Officers (BDOs) at airport terminals for the purpose of detecting behavior-based in-
dicators of threats to aviation security.
The hearing will examine the state of behavioral science as it relates to the detec-
tion of terrorist threats to the air transportation system, as well as its utility to
identify criminal offenses more broadly. The hearing will examine several inde-
pendent reports: one by the Government Accountability Office (GAO), two by the Na-
tional Research Council, and a number of Defense and Intelligence Community advi-
sory board reports on the state of behavioral science relative to the detection of emo-
tion, deceit, and intent in controlled laboratory settings, as well as in an operational
environment. The Subcommittee will evaluate the initial development of the SPOT
program, the steps taken to validate the science that forms the foundation of the pro-
gram, as well as the capabilities and limitations of using behavioral science in a
transportation setting. More broadly, the hearing will also explore the behavioral
science research efforts throughout DHS.
Background
The terrorist attacks on September 11, 2001 exposed a vulnerability in the na-
tion’s air transportation system. In order to augment other screening processes and
procedures, TSA conducted operational testing of behavior detection techniques at
a limited number of airports in October 2003. 1 In 2007, TSA created new BDO posi-
tions as part of the SPOT program with the goal of identifying persons who may
pose a potential security risk by using behavioral indicators such as stress, fear, or
deception. 2
The indicators BDOs use form a checklist with corresponding values and thresh-
olds. These indicators, values, and thresholds are used to assess passengers while
in line awaiting security screening. When an individual displays behaviors or an ap-
pearance that exceeds a predetermined threshold, they are referred for additional
screening. If, during the course of this secondary screening, individuals display be-
haviors that exceed another threshold, they are referred to law enforcement officers
for further investigation.
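To make the mechanics concrete, the checklist logic described above amounts to an
additive point score compared against cutoffs. The sketch below illustrates only
that logic; the indicator names, point values, and thresholds are invented for illus-
tration (the actual SPOT checklist is not public), and the two observation stages
are collapsed into a single score for brevity.

```python
# Hypothetical sketch of additive checklist scoring with referral thresholds.
# All indicators, weights, and cutoffs below are invented; none are SPOT's.

ILLUSTRATIVE_INDICATORS = {
    "indicator_a": 1,  # hypothetical low-weight behavior
    "indicator_b": 2,  # hypothetical medium-weight behavior
    "indicator_c": 3,  # hypothetical high-weight behavior
}

SCREENING_THRESHOLD = 4        # assumed cutoff for additional screening
LAW_ENFORCEMENT_THRESHOLD = 6  # assumed cutoff for referral to law enforcement

def assess(observed):
    """Sum the point values of observed indicators and map the total to an outcome."""
    score = sum(ILLUSTRATIVE_INDICATORS.get(name, 0) for name in observed)
    if score >= LAW_ENFORCEMENT_THRESHOLD:
        return score, "refer to law enforcement"
    if score >= SCREENING_THRESHOLD:
        return score, "refer for additional screening"
    return score, "no action"

print(assess(["indicator_b", "indicator_c"]))  # (5, 'refer for additional screening')
```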
Initially established to detect terrorist threats to the aviation transportation sys-
tem, 3 the program’s mission has since broadened to include the identification of be-
haviors indicative of criminal activity. 4 Critics of the program have argued that this
expansion reflects the failure of the program to identify any terrorists, and therefore
program success could only be quantified by broadening the goals to include criminal
activity, which has a higher rate of occurrence. 5

1 Aviation Security: Efforts to validate TSA‘s Passenger Screening Behavior Detection Pro-
gram Underway, but Opportunities Exist to Strengthen Validation and Address Operational
Challenges, Government Accountability Office, May 2010. Available at http://www.gao.gov/
new.items/d10763.pdf
2 Ibid.
3 Ibid.
4 Congressional Budget Justification FY2012, Department of Homeland Security.
This may or may not be a fair
critique based on the extremely small sample size that terrorists would represent.
Regardless of the rationale for the program’s expanded scope, questions remain
about whether indicators for terrorism are the same for criminal behavior.
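The base-rate concern can be made concrete with a short calculation; every number
below is an assumption chosen for illustration, not a TSA or GAO figure. Even a
screening method with generous assumed accuracy produces overwhelmingly false
alarms when the behavior being sought is vanishingly rare.

```python
# Illustrative base-rate arithmetic; all inputs are assumptions.
passengers = 2_000_000      # assumed passengers screened
true_threats = 1            # assumed threats among them (extremely low base rate)
sensitivity = 0.90          # assumed P(flagged | threat)
false_positive_rate = 0.05  # assumed P(flagged | no threat)

true_alarms = true_threats * sensitivity
false_alarms = (passengers - true_threats) * false_positive_rate

# Bayes' rule: probability that a flagged passenger is actually a threat.
precision = true_alarms / (true_alarms + false_alarms)

print(f"False alarms: {false_alarms:,.0f}")     # ~100,000
print(f"P(threat | flagged): {precision:.1e}")  # ~9.0e-06, roughly 1 in 111,000
```

Under these assumed numbers, virtually every referral is a false alarm, which is
why the rarity of terrorism cuts both ways when evaluating the program's expanded
scope.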
As of March 2010, TSA employed roughly 3,000 BDOs at approximately 161 air-
ports at a cost of $212 million a year. 6 In the President’s fiscal year 2012 budget
request, the Department seeks to add 175 more BDOs with an increase of $21 mil-
lion, a 9.5% increase over current funding levels. 7 In total, the five-year budget
profile for the SPOT program accounts for roughly $1.2 billion. 8
Relevant Reviews
U.S. Government Accountability Office (GAO)
Aviation Security: Efforts to validate TSA’s Passenger Screening Behavior Detec-
tion Program Underway, but Opportunities Exist to Strengthen Validation and
Address Operational Challenges
In May 2010, GAO issued a report titled ‘‘Efforts to Validate TSA’s Passenger
Screening Behavior Detection Program Underway, but Opportunities Exist to
Strengthen Validation and Address Operational Challenges’’ in response to a Con-
gressional request to review the SPOT program. In preparing the report, GAO ana-
lyzed ‘‘(1) the extent to which TSA validated the SPOT program before deployment,
(2) implementation challenges, and (3) the extent to which TSA measures SPOT’s
effect on aviation security.’’ 9
GAO issued the following findings associated with its review:
Although the Department of Homeland Security (DHS) is in the process of vali-
dating some aspects of the SPOT program, TSA deployed SPOT nationwide
without first validating the scientific basis for identifying suspicious passengers
in an airport environment. A scientific consensus does not exist on whether be-
havior detection principles can be reliably used for counterterrorism purposes,
according to the National Research Council of the National Academy of
Sciences. According to TSA, no other large-scale security screening program
based on behavioral indicators has ever been rigorously scientifically validated.
DHS plans to review aspects of SPOT, such as whether the program is more
effective at identifying threats than random screening. Nonetheless, DHS’s cur-
rent plan to assess SPOT is not designed to fully validate whether behavior de-
tection can be used to reliably identify individuals in an airport environment
who pose a security risk. For example, factors such as the length of time BDOs
can observe passengers without becoming fatigued are not part of the plan and
could provide additional information on the extent to which SPOT can be effec-
tively implemented. Prior GAO work has found that independent expert review
panels can provide comprehensive, objective reviews of complex issues. Use of
such a panel to review DHS’s methodology could help ensure a rigorous, sci-
entific validation of SPOT, helping provide more assurance that SPOT is ful-
filling its mission to strengthen aviation security. 10
Additionally, GAO found issues relating to performance metrics, data integrity,
and reach-back capabilities.
TSA is experiencing implementation challenges, including not fully utilizing the
resources it has available to systematically collect and analyze the information
obtained by BDOs on passengers who may pose a threat to the aviation system.
TSA’s Transportation System Operations Center has the resources to investigate
aviation threats but generally does not check all law enforcement and intel-
ligence databases available to it to identify persons referred by BDOs. Utilizing
existing resources would enhance TSA’s ability to quickly verify passenger iden-
tity and could help TSA to more reliably ‘‘connect the dots.’’ Further, most
BDOs lack a mechanism to input data on suspicious passengers into a database
used by TSA analysts and also lack a means to obtain information from the
Transportation System Operations Center on a timely basis. TSA states that it
is in the process of providing input capabilities, but does not have a time frame

5 Weinberger, Sharon, ‘‘Intent to Deceive? Can the Science of Deception Detection Help to
Catch Terrorists?’’ Nature, Vol. 465, May 26, 2010, available at: http://www.nature.com/news/
2010/100526/pdf/465412a.pdf
6 Supra n.1.
7 Supra n.4.
8 Supra n.1.
9 Ibid.
10 Ibid.
for when this will occur at all SPOT airports. Providing BDOs, or other TSA
personnel, with these capabilities could help TSA ‘‘connect the dots’’ to identify
potential threats.
Although TSA has some performance measures related to SPOT, it lacks out-
come-oriented measures to evaluate the program’s progress toward reaching its
goals. Establishing a plan to develop these measures could better position TSA
to determine if SPOT is contributing to TSA’s strategic goals for aviation secu-
rity. TSA is planning to enhance its evaluation capabilities in 2010 to more
readily assess the program’s effectiveness by conducting statistical analysis of
data related to SPOT referrals to law enforcement and associated arrests. 11
Opportunities to Reduce Potential Duplication in Government Programs, Save
Tax Dollars, and Enhance Revenue
In March of 2011, GAO issued a report to Congress in response to a new statu-
tory requirement that GAO identify federal programs, agencies, offices, and ini-
tiatives, either within departments or governmentwide, which have duplicative
goals or activities. The report contained a section on SPOT and stated:
Congress may wish to consider limiting program funding pending receipt of an
independent assessment of TSA’s SPOT program. GAO identified potential
budget savings of about $20 million per year if funding were frozen at current
levels until validation efforts are complete. Specifically, in the near term, Con-
gress could consider freezing appropriation levels for the SPOT program at the
2010 level until the validation effort is completed. Assuming that TSA is plan-
ning to expand the program at a similar rate each year, this action could result
in possible savings of about $20 million per year, since TSA is seeking about
a $20 million increase for SPOT in fiscal year 2011. Upon completion of the
validation effort, Congress may also wish to consider the study's results, includ-
ing the program's effectiveness in using behavior-based screening techniques to
detect terrorists in the aviation environment, in making future funding decisions
regarding the program. 12
Credibility Assessment at Portals Report
In April 2009, the Portals Committee issued a report for the Defense Academy
for Credibility Assessment titled: ‘‘Credibility Assessment at Portals.’’ 13 The com-
mittee recognized the need for ‘‘advanced and accurate credibility assessment,’’ 14
which is described as ‘‘a decision making process whereby a communication is as-
sessed as to its veracity.’’ The Portals Committee had the following to say about
SPOT:
‘‘The adoption of SPOT occurred despite the fact that no study in the peer-re-
viewed scientific literature suggests that accurate credibility assessments can be
made from unstructured observations. Within SPOT it appears that the observ-
ers are attempting to assess airline passengers by casual observation of facial
micro-expressions (Wilber & Nakashima, 2007). There are several problems
with this. First, scientific research does not support the notion that microexpres-
sions reliably betray concealed emotion (Porter & ten Brinke, 2008). Second,
whereas brief facial activity may reveal the purposeful manipulation of a felt
emotion (Porter & ten Brinke, 2008), the problems of interpretation of such ma-
nipulation renders the approach useless for practical purposes. Third, the
microexpression approach equates deception with manipulated emotion. This
conceptual confusion obscures the fact that most forensically relevant lies are
not lies about feelings but about actions in the past, present or future. In con-
clusion, the use of microexpressions to establish credibility is theoretically
flawed and has not been supported by sound scientific research (Vrij, 2008).’’ 15
JASON
Composed of world-renowned scientists, JASON advises the federal government
on science and technology issues. The vast majority of its work is done at the request

11 Ibid.
12 Opportunities to Reduce Potential Duplication in Government Programs, Save Tax Dollars,
and Enhance Revenue, Government Accountability Office, March 2011, available at: http://
www.gao.gov/new.items/d11318sp.pdf
13 ‘‘Credibility Assessment at Portals,’’ Portals Committee Report, April 17, 2009, available at:
http://truth.boisestate.edu/eyesonly/Portals/PortalsCommitteeReport.pdf
14 Ibid.
15 Ibid.
of the Department of Defense and the intelligence community, so its reports
are typically classified.
However, a 2010 Nature article that discusses the SPOT program in a piece on
deception detection provides the following: ‘‘ ‘No scientific evidence exists to support
the detection or inference of future behaviour, including intent,’ declares a 2008 re-
port prepared by the JASON defense advisory group.’’ 16
National Research Council (NRC) of the National Academies
Workshop Summary on Field Evaluation in the Intelligence and Counterintel-
ligence Context
On September 22-23, 2009, the NRC’s Board on Behavioral, Cognitive, and Sen-
sory Sciences held a workshop on ‘‘the field evaluation of behavioral and cognitive
sciences-based methods and tools for use in the areas of intelligence and counter in-
telligence.’’ 17 The workshop was sponsored by the Defense Intelligence Agency and
the Office of the Director of National Intelligence. The purpose of the workshop was
to ‘‘discuss the best ways to take methods and tools from behavioral science and
apply them to work in intelligence operations. More specifically, the workshop fo-
cused on the issue of field evaluation: the testing of these methods and tools in
the context in which they will be used in order to determine if they are effective
in real-world settings.’’ 18
The NRC published a report in 2010 summarizing the presentations and discus-
sions over the 2-day period. Participants of the workshop included NRC members
and experts in the behavioral sciences and intelligence community. The goal of the
workshop was ‘‘not to provide specific recommendations but to offer some insight,
in large part through specific examples taken from other fields, into the sorts of
issues that surround the area of field evaluations. The discussions covered such
ground as the obstacles to field evaluation of behavioral science tools and methods,
the importance of field evaluation, and various lessons learned from experience with
field evaluation in other areas.’’ 19
While the report identified several obstacles, one of interest to this Subcommittee
hearing is ‘‘the pressure to use new devices and techniques as soon as they become
available, without waiting for rigorous validation. Because lives are at stake, those
in the field often push to adopt new methods and tools as quickly as possible and
before there has been time to evaluate them adequately. Once a method is in wide-
spread use, anecdotal evidence can lead its users to believe in its effectiveness and
to resist rigorous testing, which may show that it’s not as effective as they think.’’ 20
Protecting Individual Privacy in the Struggle Against Terrorists: A Framework for
Program Assessment
From 2005 to 2007, the NRC’s 21-member Committee on Technical and Privacy
Dimensions of Information for Terrorism Prevention and Other National Goals held
several meetings to ‘‘examine the role of data mining and behavioral surveillance
technologies in counterterrorism programs.’’ 21 The ensuing NRC report provides ‘‘a
framework for making decisions about deploying and evaluating those [programs]
and other information based programs on the basis of their effectiveness and associ-
ated risks to personal privacy.’’ 22
The report presented 13 conclusions and 2 broad recommendations. Of interest to
this Subcommittee hearing are the following conclusions:
• ‘‘Conclusion 3: Inferences about intent and/or state of mind implicate privacy
issues to a much greater degree than do assessments or determinations of capa-
bility.
Although it is true that capability and intent are both needed to pose a real
threat, determining intent on the basis of external indicators is inherently a much
more subjective enterprise than determining capability. Determining intent or

16 Supra n.5.
17 ‘‘Field Evaluation in the Intelligence and Counterintelligence Context,’’ National Research
Council of the National Academies, 2010, available at: http://books.nap.edu/
openbook.php?record_id=12854&page=R1
18 Ibid.
19 Ibid.
20 ‘‘Field Evaluation in the Intelligence and Counterintelligence Context,’’ National Research
Council of the National Academies, March 2010, available at: http://
www7.nationalacademies.org/bbcss/Highlights-
Field%20Evaluation%20in%20the%20Intelligence%20and%20Counterintelligence%20Context.pdf
21 ‘‘Protecting Individual Privacy in the Struggle against Terrorists: A Framework for Pro-
gram Assessment,’’ National Research Council of the National Academies, 2008, available at:
http://books.nap.edu/openbook.php?record_id=12452&page=1
22 Ibid.
state of mind is inherently an inferential process, usually based on indicators
such as whom one talks to, what organizations one belongs to or supports, or
what one reads or searches for online. Assessing capability is based on such indi-
cators as purchase or other acquisition of suspect items, training, and so on. Rec-
ognizing that the distinction between capability and intent is sometimes unclear,
it is nevertheless true that placing people under suspicion because of their associa-
tions and intellectual explorations is a step toward abhorrent government behav-
ior, such as guilt by association and thought crime. This does not mean that gov-
ernment authorities should be categorically proscribed from examining indicators
of intent under all circumstances, only that special precautions should be taken
when such examination is deemed necessary.’’
• ‘‘Conclusion 4: Program deployment and use must be based on criteria more de-
manding than ‘it’s better than doing nothing.’ ’’
In the aftermath of a disaster or terrorist incident, policy makers come under in-
tense political pressure to respond with measures intended to prevent the event
from occurring again. The policy impulse to do something (by which is usually
meant something new) under these circumstances is understandable, but it is sim-
ply not true that doing something new is always better than doing nothing. In-
deed, policy makers may deploy new information-based programs hastily, without
a full consideration of (a) the actual usefulness of the program in distinguishing
people or characteristic patterns of interest for follow-up from those not of interest,
(b) an assessment of the potential privacy impacts resulting from the use of the
program, (c) the procedures and processes of the organization that will use the
program, and (d) countermeasures that terrorists might use to foil the program.
• ‘‘Conclusion 10: Behavioral and physiological monitoring techniques might be
able to play an important role in counterterrorism efforts when used to detect
(a) anomalous states (individuals whose behavior and physiological states devi-
ate from norms for a particular situation) and (b) patterns of activity with well-
established links to underlying psychological states.
Scientific support for linkages between behavioral and physiological markers and
mental state is strongest for elementary states (simple emotions, attentional proc-
esses, states of arousal, and cognitive processes), weak for more complex states
(deception), and nonexistent for highly complex states (terrorist intent and beliefs).
The status of the scientific evidence, the risk of false positives, and vulnerability
to countermeasures argue for behavioral observation and physiological monitoring
to be used at most as a preliminary screening method for identifying individuals
who merit additional follow-up investigation. Indeed, there is no consensus in the
relevant scientific community nor on the committee regarding whether any behav-
ioral surveillance or physiological monitoring techniques are ready for use at all
in the counterterrorist context given the present state of the science.’’
• ‘‘Conclusion 11: Further research is warranted for the laboratory development
and refinement of methods for automated, remote, and rapid assessment of be-
havioral and physiological states that are anomalous for particular situations
and for those that have well-established links to psychological states relevant to
terrorist intent.
A number of techniques have been proposed for the machine-assisted detection of
certain behavioral and physiological states. For example, advances in magnetic
resonance imaging (MRI), electroencephalography (EEG), and other modern tech-
niques have enabled measures of changes in brain activity associated with
thoughts, feelings, and behaviors. Research in image analysis has yielded im-
provements in machine recognition of faces under a variety of circumstances (e.g.,
when a face is smiling or when it is frowning) and environments (e.g., in some
nonlaboratory settings).
However, most of the work is still in the basic research stage, with much of the
underlying science still to be validated or determined. If real-world utility of these
techniques is to be realized, a number of issues (practical, technical, and funda-
mental) will have to be addressed, such as the limits to understanding, the largely
unknown measurement validity of new technologies, the lack of standardization
in the field, and the vulnerability to countermeasures. Public acceptability regard-
ing the privacy implications of such techniques also remains to be demonstrated,
especially if the resulting data are stored for unknown future uses or undefined
lengths of time.
For example, the current state-of-the-art of functional MRI technology can identify
changes in the hemodynamics in certain regions of the brain, thus signaling activ-
ity in those regions. But such results are not necessarily consistent across individ-
uals (i.e., different areas in the brains of different individuals may be active
under the same stimulus) or even in the same individual (i.e., a slightly different
part of the brain may become active even in the same individual under the same
stimulus). Certain regions of the brain may be active under a variety of different
stimuli.
In short, understanding of what these regions do is still primitive. Furthermore,
even if simple associations can be made reliably in laboratory settings, this does
not necessarily translate into usable technology in less controlled situations. Be-
havior of interest to detect, such as terrorist intent, occurs in an environment that
is very different from the highly controlled behavioral science laboratory.’’
• ‘‘Conclusion 12: Technologies and techniques for behavioral observation have
enormous potential for violating the reasonable expectations of privacy of indi-
viduals.
Because the inferential chain from behavioral observation to possible adverse
judgment is both probabilistic and long, behavioral observation has enormous po-
tential for violating the reasonable expectations of privacy of individuals. It would
not be unreasonable to suppose that most individuals would be far less bothered
and concerned by searches aimed at finding tangible objects that might be weap-
ons or by queries aimed at authenticating their identity than by technologies and
techniques whose use will inevitably force targeted individuals to explain and jus-
tify their mental and emotional states. Even if behavioral observation and physio-
logical monitoring are used only as a preliminary screening method for identi-
fying individuals who merit additional follow-up investigation, these individuals
will be subject to suspicion that would not fall on others not so identified.’’ 23
Issues
Detection of Emotion
The states of the science relative to the detection of emotion, deceit, and intent are
vastly different. Decades of research have been devoted to the detection of emotion
using verbal, nonverbal, and microfacial expressions. Each of these observational
techniques has shown varying degrees of success at determining an indi-
vidual's emotion, but generally speaking, a scientific foundation does exist to sup-
port the assertion that emotion can be determined through behavioral cues.
Detection of Deceit
Research on detecting expressions of deceit is rooted in the research on emotion.
For example, it is posited that a deceitful person will express emotions
such as stress, and that this stress can be attributed to concealing a lie. The state of
the science in this regard is less solid. Witnesses at the hearing will testify to the
current strengths and weaknesses of this field.
Detection of Intent
Even less certainty exists regarding the ability to determine intent. This ability
is asserted by assuming that a person who intends to do harm will be concealing
this fact, thereby expressing deceitful behaviors - and that deceitful behavioral cues
are founded in stress, which in turn are displayed in emotion. This chain of rea-
soning takes the underlying assumption that behavioral indicators exist for detect-
ing emotion and infers that indicators can therefore be used to detect deceit, and
therefore intent. Very little, if any, evidence exists in the scientific literature to sup-
port this hypothesis, yet this is the goal of the SPOT program - to identify individ-
uals who may pose a threat to aviation security.
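One way to see why this chain is fragile is to treat each inferential link (cues to
emotion, emotion to deceit, deceit to intent) as a classifier with its own accuracy.
Under the simplifying assumption that the links err independently, the probability
that all three links are simultaneously correct is the product of their individual
accuracies; the figures below are illustrative assumptions, not measured values:

```latex
a_{\text{chain}} = a_{\text{emotion}} \cdot a_{\text{deceit}} \cdot a_{\text{intent}},
\qquad \text{e.g.,}\quad 0.9 \times 0.8 \times 0.7 \approx 0.50
```

Even strong individual links compound into a weak chain, which is consistent with
the thin literature noted above.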

23 Ibid.
Laboratory vs. Operational Settings
The vast preponderance of behavioral science research conducted relative to the
detection of emotion, deceit, and intent has been done in a laboratory setting. As
the National Research Council noted in its 2008 report, ‘‘Behavior of interest to de-
tect, such as terrorist intent, occurs in an environment that is very different from
the highly controlled behavioral science laboratory.’’ 24
Utility for Counterterrorism
Even if one were to stipulate that a body of evidence existed to support the claim
that one could detect intent using behavioral indicators, it remains to be seen how
useful this would be in a counterterrorism context. In all likelihood, anyone seeking
to cause harm would employ countermeasures designed to conceal their emotions.
It remains to be seen what impact countermeasures will have on the ability to de-
tect emotions, deception, or intent, but if other deception detection tools (such as the
polygraph) are any indicator, they could severely degrade the capability.
Utility in a U.S. Aviation Transportation Setting
The SPOT program is loosely based on the Israeli model successfully employed by
El Al Airlines. That program employs more agents in more loca-
tions throughout the airport, conducts multiple face-to-face interviews, actively pro-
files passengers, and operates in smaller and fewer airports, which handle far
fewer passengers and far fewer flights than the U.S. air transportation system.
Israeli screeners also receive far more training than the four days of classroom in-
struction and three days of on-the-job training that BDOs receive. Scaling up such
an enterprise to accommodate the U.S. aviation transportation sector would severely
restrict the flow of commerce and passengers.
DHS S&T Validation
In its report, GAO states that ‘‘TSA deployed SPOT nationwide without first vali-
dating the scientific basis for the program.’’ 25 To its credit, DHS S&T initiated a
review two and a half years ago to ‘‘determine whether SPOT is more effective at
identifying passengers who may be threats to the aviation system than random
screening.’’ 26 GAO goes on to point out in its report, ‘‘However, S&T’s current re-
search plan is not designed to fully validate whether behavior detection and appear-
ances can be effectively used to reliably identify individuals in an airport terminal
environment who pose a risk to the aviation system.’’ 27 The report further states
that, according to the National Research Council, ‘‘an independent panel could pro-
vide an objective assessment of the methodologies and findings of DHS’s study to
better ensure that SPOT is based on valid science.’’ 28
These are two important points. First, the S&T review is not designed to validate
the underlying behavioral cues, but rather to simply demonstrate whether the pro-
gram, as a whole, is more successful than random sampling. As GAO stated in its
recent ‘‘Duplication’’ report, ‘‘DHS’s response to GAO’s report did not describe how
the review currently planned is designed to determine whether the study’s method-
ology is sufficiently comprehensive to validate the SPOT program.’’ 29 Second, based
on the Statement of Work associated with S&T’s review, questions remain as to
whether the review is truly independent.
The Statement of Work affirms that S&T had a direct role in selecting peer re-
viewers, as well as planning and structuring workshops that informed the method-
ology to validate the program. The Statement of Work also afforded DHS the ability
to review and provide revision recommendations at numerous points in the process.
Finally, the Statement of Work indicates that deliverables are to be provided to S&T
directly. 30 Whether or not this affected the outcome is uncertain. The validation
work was conducted by the American Institutes for Research, a highly respected and
reputable firm, but ultimately it is contractually bound by the parameters and
scope defined by the Statement of Work negotiated with DHS. It remains to be seen
whether the review was an independent assessment, as recommended by the Na-
tional Research Council, or more of a collaboration.

24 Supra n.21.
25 Supra n.1.
26 Ibid.
27 Ibid.
28 Ibid.
29 Supra n.12.
30 Statement of Work for the Naval Research Laboratory, Project Hostile Intent: Behavioral-
Based Screening Indicators Validation, U.S. Department of Homeland Security, Science and
Technology Directorate, Human Factors and Behavioral Sciences Division, PR# RSHF-11-00007.
Nevertheless, S&T's two-and-a-half-year review (at a cost of $2.5 million) was ini-
tially planned to be delivered in fiscal year 2011, 31 then February 2011, 32 and then
the end of March 2011. Its current release date is April 8th, two days after our
hearing. The Subcommittee postponed this hearing, initially scheduled for March
17th, for a number of reasons, including allowing S&T more time to produce the re-
port.
Witnesses
• Mr. Stephen Lord, Director, Homeland Security and Justice Issues, Govern-
ment Accountability Office
• Transportation Security Administration (Invited)
• Mr. Larry Willis, Program Manager, Homeland Security Advanced Research
Projects Agency, Science and Technology Directorate, Department of Homeland
Security
• Dr. Paul Ekman, Professor Emeritus of Psychology, University of California,
San Francisco, and President and Founder, Paul Ekman Group, LLC
• Dr. Maria Hartwig, Associate Professor, Department of Psychology, John Jay
College of Criminal Justice
• Dr. Philip Rubin, Chief Executive Officer, Haskins Laboratories
• Lieutenant Detective Peter J. DiDomenica, Boston University Police

31 Supra n.1.
32 Supra n.12.
Appendix 1

Department of Homeland Security


Science and Technology Directorate
Human Factors Behavioral Sciences Projects

These projects advance national security by developing and applying the social,
behavioral, and physical sciences to improve identification and analysis of threats,
to enhance societal resilience, and to integrate human capabilities into the develop-
ment of technology.
Commercial Data Sources Project
Project Manager: Patty Wolfhope
Project Overview: The Science and Technology (S&T) Directorate Human Factors
Behavior Sciences Division (HFD) Commercial Data Sources Project will quan-
titatively assess the utility of commercial data sources to augment governmentally
available information about people, foreign and domestic, being screened, inves-
tigated, or vetted by the Department. The use of commercial data sources may pro-
vide a valuable source of corroborating information to ensure that an individual’s
identity and eligibility for a particular license, privilege, or status are correctly evalu-
ated during screening. This project is part of the Personal Identification Systems
Thrust Area and Credentialing Program within HFD.

Community Perceptions of Technology Panel Project


Project Manager: Ji Sun Lee
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Community Perceptions of Technology Panel
(CPT) Project brings together representatives of industry, public interest, and com-
munity-oriented organizations to better understand and integrate community per-
spectives and concerns in the development, deployment, and public acceptance of
technology. This will yield feedback to aid ongoing technology and process develop-
ment and strategies to accurately inform the public of new approaches to securing
the homeland. This is designed to better ensure acceptance of the technology within
affected communities. This project is part of the Human Technology Integration
Thrust Area and Technology Acceptance and Integration Program within HFD.

Community Resilience Project


Project Manager: Michael Dunaway
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Counter-Improvised Explosives Devices (IED)
Community Resilience Project conducts research into methodologies for effective
hazard and risk communications to enhance the ability of local officials to convey
understandable and credible warnings of IED activity to the public. This project will
help local government and civic officials understand how to properly frame risk
warnings and post-event instructions to the public in a manner that maximizes the
public’s understanding of the instructions provided and maintains public trust and
confidence. HFD is executing this project as part of the Counter Improvised Explo-
sive Devices (C-IED) Thrust Area and Mitigate Program within Explosives Division.

Counter-IED Actionable Indicators and Countermeasures Project


Project Manager: Allison Smith, Ph.D.
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Counter-Improvised Explosives Devices (IED)
Actionable Indicators and Countermeasures Project supports the intelligence and
law enforcement communities in identifying actors that pose significant IED threats
in the United States homeland. This project will provide practical tools through the
synthesis of state-of-the-art social and behavioral science databases, case studies,
surveys, and fieldwork and advanced computational modeling, simulation, and vis-
ualization technologies. It will also provide policymakers with scientifically tested
strategies to prevent radicalization and IED attacks before they occur by examining
how social and behavioral science principles can support the development of
counter-radicalization efforts. HFD is executing this project as part of the Counter
Improvised Explosive Devices (C-IED) Thrust Area and Prevent/Deter Program.
Credentialing Project
Project Manager: Patty Wolfhope
Project Overview: The Science and Technology (S&T) Directorate Human Factors
Behavior Sciences (HFD) Division Credentialing Project develops tamper-proof
credentialing systems that incorporate biometric information, such as a biometrics-
based card-and-reader system. The project developed a laboratory test and evalua-
tion protocol for the transportation worker identification card (TWIC) reader and
plans to initiate research and design activities to improve the range and reliability
of secure contactless technologies. This project is part of the Personal Identification
Systems Thrust Area and Credentialing Program within HFD.
Enhanced Screener – Technology Interface Project
Project Manager: Josh Rubinstein, Ph.D.
Project Overview: The Science and Technology (S&T) Directorate Human Factors
Behavioral Sciences (HFD) Division Enhanced Screener-Technology Interface Project
characterizes screener-performance issues, proposes new screener technologies and
procedures, and develops training curricula to optimize security effectiveness and re-
duce human fatigue and injury, while reducing training requirements and overall
cost. This project is part of the Human Technology Integration Thrust Area and
Transportation Technology-Human Integration Program within HFD.

Enhancing Public Response and Community Resilience Project


Project Manager: Michael Dunaway
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Enhancing Public Response and Community Re-
silience Project examines public needs (shelter, food, disaster relief, etc.) that arose
during the evacuation from southern Texas during Hurricanes Katrina and Rita.
The goal is to capture and communicate lessons learned to enhance federal, state,
local, and private sector responses to future catastrophic events. This
project is part of the Social and Behavioral Threat Analysis (SBTA) Thrust Area
and Community Preparedness and Resilience Program within HFD.
High Impact Technological Solution – Biometric Detector Project
Project Manager: Arun Vemury
Project Overview: The Science and Technology (S&T) Directorate High Impact
Technological Solutions (HITS) Project executed by the Human Factors/Behavioral
Science Division (HFD) will provide efficient, high quality, contact less acquisition
of fingerprint biometric signatures for identity management. This will result in sig-
nificantly improved throughput and signal quality, thereby improving recognition
and reducing false positive rates. The goal is to develop a fingerprint acquisition de-
vice that can be transitioned for implementation across Department components.
This project is part of the Innovations Portfolio/Homeland Security Advanced Re-
search Project Agency Program (HSARPA) within the S&T Directorate.

Homeland Innovation Prototypical Solutions – Future Attribute


Screening Technology (FAST) Project
Project Manager: Bob Burns
Project Overview: The Homeland Security Advanced Research Project Agency
(HSARPA) and Science and Technology (S&T) Directorate Human Factors/Behav-
ioral Sciences Division (HFD) Future Attribute Screening Technology (FAST) Project
is an initiative to develop innovative, non-invasive technologies to screen people at
security checkpoints. FAST is grounded in research on human behavior and
psychophysiology, focusing on new advances in behavioral/human-centered screening
techniques. The aim is a prototypical mobile suite (FAST M2) that would be used
to increase the accuracy and validity of identifying persons with malintent (the in-
tent or desire to cause harm). Identified individuals would then be directed to sec-
ondary screening, which would be conducted by authorized personnel. This project
is part of the Innovations Portfolio/Homeland Security Advanced Research Project
Agency (HSARPA) Program within the S&T Directorate.

Hostile Intent Detection – Automated Prototype Project


Project Manager: Larry Willis
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Hostile Intent Detection - Automated Prototype
Project demonstrates real-time automated intent detection using non-invasive and
culturally neutral behavioral indicators. S&T plans to transition the automated hos-
tile intent prototype to the Transportation Security Administration, Customs and
Border Protection, and Immigration and Customs Enforcement. This project is a
part of the Social and Behavioral Threat Analysis Thrust Area and Suspicious Be-
havior Detection Program within HFD.
Hostile Intent Detection – Training & Simulation Project
Project Manager: Larry Willis
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Hostile Intent Detection - Training and Simula-
tion Project develops computer-based simulation to train behavior-based stand-off
detection for future hostile intent using indicators from the interactive screening en-
vironment (Hostile Intent Detection - Automated Prototype) and the observational
environment (Hostile Intent Detection - Validation) to support screening and inter-
viewing interactions at air, land, and maritime portals. This project is part of the
Social and Behavioral Threat Analysis Thrust Area and Suspicious Behavior Detec-
tion Program within HFD.
Hostile Intent Detection – Validation Project
Project Manager: Larry Willis
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Hostile Intent Detection - Validation Project
provides cross-cultural validation of behavioral indicators employed by Department
of Homeland Security’s operational components to screen passengers at air, land,
and maritime ports. The project will integrate these validated behavioral indicators
into the screening curriculum of each component’s existing training program. This
project is part of the Social and Behavioral Threat Analysis Thrust Area and Sus-
picious Behavior Detection Program within HFD.

Human Systems Engineering Project


Project Managers: Darren P. Wilson and Janae Lockett-Reynolds, Ph.D.
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Project develops, demonstrates and evaluates a
standardized process for implementing human systems integration. It will focus on
defining human performance requirements in the development of systems and tech-
nology, and on methods and measures needed to evaluate existing technology in
terms of human performance requirements. This effort also will result in greater un-
derstanding of the needs of the various Department end-user communities, as well
as developing tools to best identify how to recruit, select, train, support, and retain
operational staff. A systematic approach based on the integration of the human com-
ponent will lead to enhanced system design, safety, efficiency, and operational per-
formance. This project is part of the Human Technology Integration Thrust Area
and Human Systems Research and Engineering Program within HFD.

Human Systems Engineering Research Project


Project Manager: Jennifer O’Connor, Ph.D.
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Science Division (HFD) projects examine human perception and ability
to detect targets and threats as they pertain to the design of systems that maximize
human performance, and the effectiveness of the technology operators use in the
field. Results of this research allow the program to focus more closely on the psycho-
logical determinants that impact successful discrimination of threats and reduce false
alarms. In addition to focusing on human perception, the project will also address
how humans process information and how that impacts the human-machine inter-
face. This project is part of the Human Technology Integration Thrust Area and
Human Systems and Engineering Program within HFD.

Insider Threat Detection Program


Project Manager: Jennifer O’Connor, Ph.D.
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Insider Threat Detection Project will detect in-
sider behavior that is likely to present or lead to a threat to critical infrastructure
using behavioral indicators. Department of Homeland Security will collaborate with
other U.S. agencies and international partners to move beyond the current focus on
responses to accomplished hostile insider acts, and begin developing a greater capac-
ity to deter and detect insider threats before substantial harm has been done. The
immediate operational goal is to produce new and better tools to identify behavior
patterns and characteristics identifiable before, during, and after employment that
are associated with insider threats. This project is part of the Social and Behavioral
Threat Analysis Thrust Area and Suspicious Behavior Detection Program
within HFD.
Mobile Biometrics System Project
Project Manager: Patty Wolfhope
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavior Sciences Division (HFD) Mobile Biometrics Project develops prototype
technologies for mobile biometrics screening at remote sites along U.S. borders, dur-
ing disasters and terrorist incidents, at sea, and in other places where communica-
tions access is limited. The goal is to demonstrate mobile biometrics screening capa-
bilities and technologies that meet the future needs of Department operational
users, but currently are not available with conventional biometrics systems. This
project is part of the Personal Identification Systems Thrust Area and Biometrics
Program within HFD.

Multi-modal Biometrics Project


Project Manager: Arun Vemury
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavior Sciences Division (HFD) Multi-modal Biometrics Project develops biomet-
ric technologies that accurately and rapidly identify individuals. The operational
goal is to provide the capability to non-intrusively collect two or more biometrics
(fingerprint, face image, and iris recognition) in less than ten seconds at a ninety-
five percent acquisition rate without impeding the movement of individuals. The
multi-modal technology will allow the Department to compare and match biometric
samples from different sources, collected with different sensor technologies, under
varying environmental conditions, a capability that eludes existing technology.
This project is part of the Personal Identification Systems Thrust Area and Bio-
metrics Program within HFD.

Muslim Community Integration Project


Project Manager: Allison Smith, Ph.D.
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Muslim Community Integration Project con-
ducts ethnographic research to examine the experiences of Muslims and non-Mus-
lims in several communities throughout the U.S. The project will provide insights
into the current state of Muslim communities focusing on their role and status in
America and their perceptions of American society. This project is part of the Social
and Behavioral Threat Analysis Thrust Area and Community Preparedness, Re-
sponse and Recovery Program within HFD.

Predictive Screening Project


Project Manager: Larry Willis
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Counter-Improvised Explosives Devices
(Counter-IED) Predictive Screening Project will derive observable behaviors that
precede a suicide bombing attack and develop extraction algorithms to identify and
alert personnel to indicators of suicide bombing behavior. HFD is executing this
project as part of the Counter-IED Thrust Area and Predict Program.

Risk Prediction Project


Project Manager: Larry Willis
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Counter-Improvised Explosives Devices Risk
Prediction Project will develop high speed software to identify improvised explosive
device (IED) target and staging areas based upon group- and culture-specific tactics,
techniques, and procedures derived from past foreign attacks. The goal is to use this
information to prioritize the risk of likely potential targets of IED attacks within
the United States. HFD is executing this project as part of the Counter-IED Thrust
Area and Predict Program.

Social Network Analysis for Community Resilience Project


Project Manager: Michael Dunaway
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Social Network Analysis for Community Resil-
ience Project develops a modeling capability for identifying formal and informal so-
cial networks that may be useful in enhancing preparedness and community resil-
ience to natural disasters and terrorist events. This effort will leverage social net-
work analysis research for understanding terrorist networks, social and financial
transactions, and the spread of infectious diseases, and apply that knowledge to the
construction of networks dedicated to strengthening local response capabilities and
preparedness. It will also leverage past and on-going work from the Department of
Defense (DOD) and other agencies. This project is part of the Social and Behavioral
Threat Analysis Thrust Area and Community Preparedness and Resilience Program
within HFD.

Violent Intent Modeling and Simulation Project


Project Manager: Ji Sun Lee
Project Overview: The Science and Technology (S&T) Directorate Human Factors/
Behavioral Sciences Division (HFD) Violent Intent Modeling and Simulation Project
develops intelligence analysis frameworks, including extraction of terrorist intention
signatures, systematic estimation of future terrorist behavior based on social and
behavioral sciences, and modeling and simulations of future terrorist behavior influ-
ences. It identifies leading edge social science modeling and simulation technologies
and advances social science modeling and data fusion capabilities in such areas as
hybrids of neural nets, structural equations, genetic algorithms, social networks, etc.
This project is part of the Social and Behavioral Threat Analysis Thrust Area and
Motivation and Intent Program within HFD.
Source: http://www.dhs.gov/files/programs/gcl1218480185439.shtm

Chairman BROUN. The Subcommittee on Investigations and
Oversight will come to order. Good morning. Welcome to today’s
hearing titled ‘‘Behavioral Science and Security: Evaluating TSA’s
SPOT Program.’’ You will find in front of you packets containing
our witness panel’s written testimony, biographies, and Truth-in-
Testimony disclosures.
Before we get started, this being the first meeting of the Inves-
tigations and Oversight Subcommittee for the 112th Congress, I
would like to ask the Subcommittee’s indulgence to introduce my-
self. It is an honor and a pleasure for me to chair the I&O Sub-
committee for this Congress, and it is a position that I do not take
lightly. I want all Members of this Subcommittee to know that my
door is always open, that I will endeavor to serve all Members fair-
ly and impartially, and that I will work to serve the best interests
of Congress, and all Americans, to ensure that the agencies and
programs under our jurisdiction are worthy of the public’s support.
And I recognize myself for five minutes for an opening statement.
Today the Subcommittee meets to evaluate TSA’s SPOT program.
Developed in the wake of September 11, 2001, it was deployed on
a limited basis in a select number of airports in 2003. In 2007, TSA
created new Behavioral Detection Officer (BDO) positions whose
goal was to use behavioral indicators to identify persons who may
pose a potential security risk to aviation. This goal expanded in re-
cent years to include the identification of any criminal activity.
TSA currently employs about 3,000 BDOs in about 161 airports at
a cost of over $200 million a year. The President’s fiscal year
2012 budget request asks for an increase of 9.5 percent and an ad-
ditional 175 BDOs. Over the next five years, the SPOT program
will cost roughly $1.2 billion.
Outside of a few brief exchanges at Appropriations Committee
hearings, Congress has not evaluated this program. That isn’t to
say that Congress wasn’t paying attention, as GAO conducted a
comprehensive review that culminated in a report on the SPOT
program last May. In that report, GAO identified several problems
with the program, most notably that it was deployed without being
scientifically validated.
This is a common theme that this Committee is increasingly
forced to deal with. Expensive programs are rolled out without con-
ducting the necessary analysis. This has become a trend through-
out the Federal Government but particularly at the Department of
Homeland Security.
This Committee has a long history with the development and ac-
quisition of the Advanced Spectroscopic—as a southerner it is hard
to say Spectroscopic—Portal program, but other technology pro-
grams such as the Backscatter Advanced Imaging Technology, ex-
plosives trace-detection portal machines, and the Cargo Advanced
Automated Radiography System all ran into problems because they
were rolled out before they were ready. DHS either fails to properly
test and evaluate the technology, does not conduct a proper risk
analysis, or neglects to conduct a cost/benefit analysis.
A crucial aspect that is oftentimes taken for granted by DHS is
the nexus between those developing the technology and those actu-
ally using it. In the case of SPOT, it seems as though the operators
got out ahead of the developers, but typically what we see is the
opposite; the scientists and engineers developing capabilities that
do not appropriately fit into an operational environment. Unfortu-
nately, this is an issue that the Committee is unable to address
today because of TSA’s refusal to attend.
The goal of this hearing is to shed light on the processes by
which DHS created the SPOT program, to better understand the
state of the science that forms the foundation of the program, to
examine the methodologies by which DHS S&T is evaluating the
program, and to identify any opportunities to improve how behav-
ioral sciences are utilized in the security context. The goal is not
to throw out the proverbial baby with the bath water, but rather
to ensure that the science being used is not oversold or undersold.
SPOT is the first behavioral science program to stick its neck out
for evaluation. This review is an opportunity to look at how behav-
ioral sciences can be used appropriately across the security enter-
prise and to understand its limitations and strengths.
To its credit, DHS S&T is conducting an evaluation of the pro-
gram for TSA. This report was due earlier this year in February,
then at the end of March, and now is expected shortly. And hope-
fully we will get that shortly. While this is a good first step, I am
eager to hear how independent this evaluation truly is. I look for-
ward to understanding the review’s methodology, its assumptions,
and what level of input and access DHS S&T had in its design, for-
mulation, and findings.
As GAO stated in its recent duplication report, ‘‘DHS’s response
to GAO’s report did not describe how the review currently planned
is designed to determine whether the study’s methodology is suffi-
ciently comprehensive to validate the SPOT program.’’ I hope you
all understood that bureaucratese.
The use of behavioral sciences in the security setting is not just
another layer to security. There are clear opportunity costs that have
to be paid. For every BDO employed to identify behaviors, there is
one screener who is not looking at an x-ray of baggage, one intel-
ligence analyst not employed, or one air marshal not in the sky. I
realize this isn’t a one-for-one substitute, but clearly there are
tradeoffs that have to be made in a very difficult fiscal environ-
ment.
Also, I would be remiss if I did not address the clear privacy
issues that this technology and other DHS technologies present.
Privacy, along with the serious Constitutional questions I have,
only compounds the complexity of the issue. While the focus of the
hearing today is the science behind the program, I don’t want these
other important issues to be forgotten.
Now, the Chair recognizes Ms. Edwards for an opening state-
ment. Ms. Edwards?
[The prepared statement of Mr. Broun follows:]
PREPARED STATEMENT OF CHAIRMAN PAUL BROUN
Today the Subcommittee meets to evaluate TSA’s SPOT program. Developed in
the wake of September 11, 2001, it was deployed on a limited basis in a select num-
ber of airports in 2003. In 2007, TSA created new Behavioral Detection Officer
(BDO) positions whose goal was to use behavioral indicators to identify persons who
may pose a potential security risk to aviation. This goal expanded in recent years
to include the identification of any criminal activity. TSA currently employs about
3,000 BDOs in about 161 airports at a cost of over $200 million a year. The Presi-
dent’s FY12 budget request asks for an increase of 9.5%, and an additional 175
BDOs. Over the next five years, the SPOT program will cost roughly $1.2 billion.
Outside of a few brief exchanges at Appropriations Committee Hearings, Congress
has not evaluated this program. That isn’t to say that Congress wasn’t paying atten-
tion, as GAO conducted a comprehensive review that culminated in a report on the
SPOT program last May. In that report, GAO identified several problems with the
program, most notably that it was deployed without being scientifically validated.
This is a common theme that this Committee is increasingly forced to deal with.
Expensive programs are rolled out without conducting the necessary analysis. This
has become a trend throughout the federal government, but particularly at the De-
partment of Homeland Security. This Committee has a long history with the devel-
opment and acquisition of the Advanced Spectroscopic Portal program, but other
technology programs such as the Backscatter Advanced Imaging Technology, explo-
sives trace-detection portal machines, and the Cargo Advanced Automated Radiog-
raphy System all ran into problems because they were rolled out before they were
ready. DHS either fails to properly test and evaluate the technology, does not con-
duct a proper risk analysis, or neglects to conduct a cost-benefit analysis. A crucial
aspect that is often times taken for granted by DHS is the nexus between those de-
veloping the technology, and those actually using it. In the case of SPOT, it seems
as though the operators got out ahead of the developers, but typically what we see
is the opposite, the scientists and engineers developing capabilities that do not ap-
propriately fit into an operational environment. Unfortunately, this is an issue that
the Committee is unable to address today because of TSA’s refusal to attend.
The goal of this hearing is to shed light on the processes by which DHS created
the SPOT program, to better understand the state of the science that forms the
foundation of the program, to examine the methodologies by which DHS S&T is
evaluating the program, and identify any opportunities to improve how behavioral
sciences are utilized in the security context. The goal is not to ‘‘throw the baby out
with the bath water,’’ but rather to ensure that the science being used is not over-
sold, or undersold. SPOT is the first behavioral science program to stick its neck
out for validation. This review is an opportunity to look at how behavioral sciences
can be used appropriately across the security enterprise and to understand its limi-
tations and strengths.
To its credit, DHS S&T is conducting an evaluation of the program for TSA. This
report was due earlier this year in February, then at the end of March, and is now
expected shortly. While this is a good first step, I am eager to hear how independent
this evaluation truly is. I look forward to understanding the review’s methodology,
its assumptions, and what level of input and access DHS S&T had in its design,
formulation and findings. As GAO stated in its recent duplication report, ‘‘DHS’s re-
sponse to GAO’s report did not describe how the review currently planned is de-
signed to determine whether the study’s methodology is sufficiently comprehensive
to validate the SPOT program.’’
The use of behavioral sciences in the security setting is not just another layer to
security. There are clear opportunity costs that have to be paid. For every BDO em-
ployed to identify behaviors, there is one screener who is not looking at an x-ray
of baggage, one intelligence analyst not employed, or one air marshal not in the sky.
I realize this isn’t a one-for-one substitute, but clearly there are trade-offs that have
to be made in a very difficult fiscal environment. Also, I would be remiss if I did
not address the clear privacy issues that this technology and other DHS tech-
nologies present. Privacy, along with the serious Constitutional questions I have,
only compounds the complexity of the issue. While the focus of the hearing today
is the science behind the program, I don’t want these other important issues to be
forgotten.
Ms. EDWARDS. Thank you, Mr. Chairman. And congratulations to
you as you convene the first of what I hope are many oversight
hearings to make sure that we are paying attention to the kind of
oversight that we need to engage in on the Science and Technology
Committee on behalf of the taxpayers.
I would like to say that I, too, am disappointed that TSA is not
here today, wasn’t able to provide a witness. I think they lost an
important opportunity to inform the Congress and the public why
they believe the SPOT program is worthy of our support. And I
hope they will cooperate with this Committee and the Congress in
the future. And I hope it is not terribly distracting as we get to the
witnesses. I don’t want any one of them to be identified as TSA and
I know it is a little confusing for me up here.
Let me just say in opening that I think each one of us has had
an experience of instinctively sensing that something about a situa-
tion or person is wrong or worrying. Police officers, immigra-
tion officers, transportation security officers have those instinctive
feelings all the time. However, it is an open question whether in-
stinctive reactions are reliable as warnings of mal-intent. We also
do not know whether a person can be trained to accurately sort
through their instinctive reactions, choosing to intervene when
faced with a potential threat and to resist reactions based on racial
profiling.
What the Transportation Security Administration has tried to do
is develop behavioral training for officers so they can quickly and
accurately assess and screen passengers. Can hunches be har-
nessed in service of identifying potential threats to air safety? That
is the key question that underlies today’s hearing and I hope we
will be able to dig deeply into those questions.
After Richard Reid’s failed shoe bombing, some in the aviation
security community concluded that we were spending too much
time and money on trying to stop the bomb and not enough to stop
the bomber. Screening of passengers by observation techniques, or
SPOT, was viewed by TSA as a way to get some officers’ eyes off
the scanning screens and onto the passengers.
Those credited with helping to develop the SPOT program, some
of whom are testifying before us today, intended the program to
train Behavior Detection Officers (BDOs) to focus on an individ-
ual’s behavior, appearance, and demeanor. An ongoing concern,
however, with the BDOs and with law enforcement as well is that
they not engage in racial profiling. If BDOs focus on a passenger’s
ethnic, religious, or racial qualities, they are violating the law, and
they are not acting to protect the flying public.
Terrorists have come in all colors, shapes, and sizes, and if secu-
rity personnel were fixated on a profiling approach to finding the
next Mohammed Atta, then they would miss identifying the next
John Walker Lindh, Timothy McVeigh, or Richard Reid.
The SPOT program tries to identify a specific menu of behaviors
that will naturally emerge due to elevated levels of anxiety or
stress. The hypothesis is that terrorists would display those cues
when attempting to enter a secure facility such as an airport. But
behavioral scientists do not agree on these nonverbal cues and they
don’t agree on whether terrorists would exhibit them. Because it is
impossible to get a group of terrorists to participate in a double-
blind experiment, it is hard to validate the theory.
DHS points to the program’s success in identifying people who
have violated the law and are caught, but no one can be certain
criminals and terrorists behave in a similar fashion. TSA relies on
nonverbal cues to help sort through the more than one million pas-
sengers that fly into the United States each day. Nonverbal cues
provide a filtering method to allow officers to determine who they
should engage in discussion looking for verbal signs of deception.
There is more agreement among social scientists that verbal inter-
actions with individuals can actually help in detecting deception.
We would hope that a DHS-funded validation report on the
SPOT program would be available for this hearing today. That re-
port purportedly shows that SPOT-trained Behavior Detection Offi-
cers are much more likely to identify what TSA deems as ‘‘high-
risk passengers’’ as against a purely random sample of passengers.
We look forward to the report’s completion and its findings, but
without it, we are missing an important initial assessment of the
program’s performance.
Over the past ten years since the 9/11 terrorist attacks, Congress
has allocated billions of dollars to the Department of Homeland Se-
curity for the development of tools and technologies to keep our air
travel secure. Too often that investment has been wasted and too
often we have relied on technology that is not adequately tested be-
fore it is deployed. It is not based on adequate scientific evidence
of effectiveness, and almost inevitably, the technology has proven
costly to acquire, deploy, and service.
So I look forward to today’s hearing and to asking questions
about the more than $200 million a year that we are spending to
make sure that we carefully evaluate SPOT’s operational merit.
And with that, I yield.
[The prepared statement of Ms. Edwards follows:]
PREPARED STATEMENT OF RANKING MEMBER DONNA F. EDWARDS
Every one of us has had the experience of instinctively sensing that something
about a situation or a person is wrong, worrying. Police officers, immigration offi-
cers, Transportation Security Officers have those same instinctive feelings all the
time. However, it is an open question whether instinctive reactions are reliable as
warnings of mal-intent. We also do not know whether a person can be trained to
accurately sort through their instinctive reactions, choosing to intervene when faced
with a potential threat and to resist reactions based on racial profiling.
What the Transportation Security Administration (TSA) has tried to do is develop
behavioral training for officers so that they can quickly and accurately screen pas-
sengers. Can hunches be harnessed in service of identifying potential threats to air
traffic safety? That is the key question that underlies today’s hearing.
After Richard Reid’s failed shoe-bombing, some in the aviation security commu-
nity concluded that we were spending too much time and money on trying to stop
the bomb and not enough effort trying to stop the bomber. Screening of Passengers
by Observation Techniques or SPOT was viewed by TSA as the way to get some
officers’ eyes off the scanning screens and onto the passengers.
Those credited with helping to develop the SPOT program, some of whom are tes-
tifying before us today, intended the program to train behavior detection officers
(BDOs) to focus on an individual’s behavior, appearance and demeanor. An ongoing
concern with the BDOs, and with law enforcement as well, is that they not engage
in racial profiling. If BDOs focus on a passenger’s ethnic, religious or racial quali-
ties they are violating the law, and they are not acting to protect the flying public.
Terrorists have come in all colors, shapes and sizes. If security personnel were fix-
ated on a profiling approach to finding the next Mohammed Atta, then they would
miss identifying the next John Walker Lindh, Timothy McVeigh or Richard Reid.
The SPOT program tries to identify a specific menu of behaviors that will natu-
rally emerge due to elevated levels of anxiety or stress. The hypothesis is that ter-
rorists would display those cues when attempting to enter a secure facility such as
an airport. But behavioral scientists do not agree on these non-verbal cues and they
do not agree on whether terrorists would exhibit them. Because it is impossible to
get a group of terrorists to participate in a double-blind experiment, it is hard to
validate the theory. DHS points to the program’s success in identifying people who
have violated the law, and are caught, but no one can be certain criminals and ter-
rorists behave in a similar fashion.
TSA relies on non-verbal cues to help sort through the more than 1 million pas-
sengers that fly in the U.S. each day. Non-verbal cues provide a filtering method
to allow officers to determine who they should engage in discussion looking for
verbal signs of deception. There is more agreement among social scientists that
verbal interactions with individuals can help in detecting deception.
We had hoped that a DHS-funded ‘‘validation report’’ on the SPOT program would
be available for this hearing today. That report purportedly shows that SPOT-
trained behavior detection officers are much more likely to identify what TSA deems
‘‘high risk’’ passengers as against a purely random sample of passengers. We look
forward to the report’s completion and its findings; without it we are missing an
important initial assessment of the program’s performance.
Over the past ten years, since the 9/11 terrorist attacks, Congress has allocated
billions of dollars to the Department of Homeland Security for the development of
tools and technologies to keep our air travel secure. Too often that investment has
been wasted. Too often we have relied on technology that is not adequately tested
before it is deployed, is not based upon adequate scientific evidence of its effective-
ness and almost inevitably the technology has proven costly to acquire, deploy and
service. This Subcommittee has examined some of these DHS technologies in the
past, including the Advanced Spectroscopic Portal (ASP) radiation monitors. DHS
has been forced to withdraw other technologies and to re-scope and re-think pro-
grams, including the ASP program, SBInet, explosive detection ‘‘air puffers’’ and Ad-
vanced Imaging Technology (AIT) to screen passengers.
Costing more than $200 million per year, the SPOT program needs careful evalua-
tion of its operational merit. Is the SPOT program, as it is now constructed, worth-
while? Should it be restructured? Should it be expanded? Can it be improved, and
if so, how? What are the ultimate costs of the program, and could that money be
better spent elsewhere for greater effect, helping to improve security on unsecured
non-aviation transportation modes, for instance?
I hope our witnesses can help address some of these issues today. I again want
to express my disappointment at the lack of cooperation of TSA with the Committee.
One reason it is unclear to me what training TSA provides BDOs re-
garding ‘‘racial profiling’’ in their SPOT program is that TSA has so far refused
to permit Subcommittee staff to observe this training. They have also refused to pro-
vide a witness for this hearing. It is hard to make the case that the SPOT program
is working and worthy of continued Congressional funding and support when the
agency that runs the program refuses to participate in a hearing. I hope that the
agency will rethink their position. I want to thank the Chairman for calling this
hearing and I look forward to hearing the testimony of the witnesses who are here
today.
Chairman BROUN. Thank you, Ms. Edwards. If there are Mem-
bers who wish to submit additional opening statements, those
statements will be added to the record at this point.
At this time I would like to introduce our panel of witnesses.
Mr. Stephen Lord is the GAO executive responsible for directing
GAO’s numerous engagements on aviation and service transpor-
tation issues. Before his appointment to the Senior Executive Serv-
ice in 2007, Mr. Lord led GAO’s work on a number of key inter-
national security, finance, and trade issues. Mr. Lord has received
numerous GAO awards for meritorious service, outstanding
achievement, and teamwork. Congratulations.
Mr. Larry Willis is the Program Director for suspicious behavior
detection within the Human Factors Division of the Homeland Se-
curity Advanced Research Projects Agency, Science and Technology
Directorate, Department of Homeland Security. Boy, your business
card must be a big one with all that.
Detective Lieutenant Peter J.—how do you pronounce your
name, sir?
Mr. DIDOMENICA. DiDomenica.
Chairman BROUN. DiDomenica. Okay. Mine is pronounced
Broun. My family either can’t spell or can’t pronounce, so I am very
cognizant of people’s pronunciation. Detective Lieutenant Peter J.
DiDomenica is employed by the Boston University Police, where he
commands the Police Detective Division. Prior to this he served as
a Massachusetts State Police Officer, as well as the Director of Se-
curity Policy at Boston Logan International Airport, where he de-
veloped innovative antiterrorism programs.
Dr. Paul Ekman is Professor Emeritus of Psychology at UCSF
and is currently the President of the Paul Ekman Group. He has
authored or edited 15 books—wow, you have been busy, sir—and
has consulted with federal and local law enforcement and national
security organizations. The American Psychological Association
identified Dr. Ekman as one of the 100 most influential psycholo-
gists of the 20th century. Quite an honor, sir. ‘‘Time’’ Magazine se-
lected him as one of the 100 most influential people of 2009. He
is also the Scientific Advisor to the dramatic television series on
Fox TV, ‘‘Lie to Me,’’ which was inspired by his research. I hope
you are getting rich with all that. I love the market system. This
is great.
Dr. Maria Hartwig is an Associate Professor in the Department
of Psychology at John Jay College of Criminal Justice. She has
published research on deception in a number of scientific journals,
is on the Editorial Board of Law and Human Behavior. In 2008,
Dr. Hartwig received an Early Career Award by the European As-
sociation of Psychology and Law for her contributions to psycho-
logical research. Congratulations.
Dr. Philip Rubin is the Chief Executive Officer and a Senior Sci-
entist at Haskins Laboratories, a private, nonprofit research insti-
tute affiliated with Yale University and the University of Con-
necticut. In 2010, Dr. Rubin received APA’s Meritorious Research
Service Commendation. Dr. Rubin is the Chair of the National
Academies Board on Behavioral, Cognitive, and Sensory Sciences,
and was previously the Chair of the National Research Council
Committee on Field Evaluation of Behavioral and Cognitive
Sciences Based Methods and Tools for Intelligence and Counter-
intelligence and a member of the NRC Committee on Developing
Metrics for Department of Homeland Security’s Science and Tech-
nology Research.
Noticeably absent from the witness table is the Transportation
Security Administration. TSA was invited to the initial hearing on
March 13 that was postponed. They were invited to this hearing
several weeks ago. In response to these invitations, DHS has re-
fused to send a TSA representative. On another Committee hearing
just yesterday the Department of Homeland Security refused to
have a witness sit on a panel with other witnesses. DHS has
staked out a claim that I think is intolerable. It is unconscionable
that TSA will not send their representative here today to this im-
portant hearing on this program that is slated to spend $1.2 billion
of the taxpayers’ money to talk to us about it, and I find that to-
tally reprehensible.
In a letter to this Committee, DHS sought to detail the Sub-
committee’s interest, presumably quoting from Rule 10 of the
House of Representatives that delineates jurisdiction. In this letter
they state, ‘‘Given the Subcommittee’s interest in scientific re-
search, development, and demonstration in projects, Larry Willis,
Project Manager for the Hostile Intent Detection Validation Project
at DHS’s Science and Technology Directorate (S&T), will represent
DHS at the aforementioned hearing.’’
I find it highly presumptuous that DHS thinks it knows our ju-
risdiction better than we do. It shows their arrogance. I find it ap-
palling, considering this Committee was formed in 1958 and
played an active role in creating the Department of Homeland Se-
curity. While DHS surprisingly cites our black-letter jurisdiction
under Rule 10 correctly, they must have stopped reading there.
Under Rule 11, the Committee on Science, Space, and Technology
is tasked with the responsibility to ‘‘review and study on a con-
tinuing basis laws, programs, and government activities relating to
non-military research and development.’’
Unless TSA and DHS are arguing that science and research
played no role in the development of SPOT program, I see a com-
pelling reason for their attendance here today. The nexus between
science and operations is vitally important to understanding how
programs were developed, why there are problems, and how they
can improve.
If TSA and DHS are, in fact, making a claim that science and
research played no role in the formation of the program whatso-
ever, then this program should be shut down immediately for lack-
ing any scientific basis and being little more than snake oil. If DHS
does not value this Committee’s role in overseeing the Agency and
if TSA does not value S&T’s scientific advice, there are a number
of legislative options that this Committee could employ to change
that impression.
I will also note that DHS has sent Agency officials to testify be-
fore this Committee from Customs and Border Protection and the
Coast Guard. I find it odd that in this instance TSA would not
want to talk about this program. It makes me wonder what they
are trying to hide. When DHS is asking for a 9.5 percent increase
in the fiscal year 2012 budget request for SPOT, you would think
that they could justify that increase to us here in Congress.
Let me be clear. The Administration does not tell Congress how
to run its hearings. We will likely return to this issue once again
after the validation report is delivered. At that point we may seek
TSA’s input once again. If that is decided, this Committee may
seek more aggressive measures to compel TSA’s attendance, includ-
ing the issuance of a subpoena.
This Committee has not needed to issue a subpoena in almost
two decades and has been successful in reaching accommodations
with Republican and Democratic administrations. I am hopeful
that TSA will determine that they have a valuable contribution to
make to this topic in the future so that we do not find it necessary
to go down that road.
Now, as our witnesses should note, spoken testimony is limited
to five minutes each, so if you all would please try to hold it to the
five minutes. If you go over a few seconds, then that will be okay.
But if you just go on and on, then I may have to tap the gavel to let
you know to please wrap up very quickly. Your written testimony will
be included in the record of the hearing. It is the practice of the
Subcommittee on Investigations and Oversight to receive testimony
under oath. Do any of you have any objections to taking an oath?
Any of you? Okay. Let the record reflect that all witnesses were
willing to take an oath. They all showed that by nodding their head
from side to side indicating no. You also may be represented by
counsel. Do any of you have counsel here with you today? No?
Okay. Let the record reflect that none of the witnesses have coun-
sel. Now, if you would, please, stand and raise your right hand.
Do you solemnly swear or affirm to tell the whole truth and
nothing but the truth, so help you, God?
Let the record reflect that all witnesses participating have taken
the oath. Thank you. You all may sit down.
I now recognize our first witness, Mr. Stephen Lord, Director of
Homeland Security Justice Issues, Government Accountability Of-
fice. Mr. Lord, five minutes.
TESTIMONY OF STEPHEN LORD, DIRECTOR,
HOMELAND SECURITY AND JUSTICE ISSUES,
GOVERNMENT ACCOUNTABILITY OFFICE
Mr. LORD. Thank you. Chairman Broun, Ranking Member Ed-
wards, and other Members of the Committee, thank you for invit-
ing me here today to discuss TSA’s behavior-detection program,
also known as SPOT.
Today, I would like to discuss two issues. First, DHS’s ongoing
efforts to validate the program and second, TSA’s efforts to make
better use of the information collected through this program. This
is an important issue as the Department is currently seeking $254
million in fiscal year 2012 funds, including 350 additional Behav-
ioral Officer positions. And as we reported in May 2010, TSA de-
ployed SPOT to 161 airports across the Nation before completing
ongoing validation efforts. Thus, it is still unclear whether behavior
and appearance indicators can be used to reliably identify individ-
uals who may pose a threat to the U.S. aviation system. According
to TSA, the program was deployed before these efforts were com-
pleted to help address potential security threats.
To help ensure the program is based on sound science, our report
recommended that TSA and DHS convene an independent panel of
experts to review the methodology and results of the ongoing vali-
dation effort you mentioned in your opening comments. The good
news is DHS agreed with this recommendation. However, as other
panel members will note in their statements today, a scientific con-
sensus does not yet exist on whether behavior detection principles
can be reliably used for counterterrorism purposes in an airport en-
vironment.
It is also important to note that the current DHS validation ef-
fort will not answer several important questions. For example, how
long can Behavior Detection Officers observe passengers without
becoming fatigued? What is the optimal number of officers needed
to ensure adequate coverage? To what extent are the behavior and
appearance indicators the right mix of indicators? Should the list
of indicators be larger or should the list be smaller? Also, while Mr.
Willis will report that SPOT is nine times more effective than ran-
dom screening in identifying so-called high-risk individuals, the re-
sults of this analysis have yet to be shared with GAO or independ-
ently reviewed.
Our report also highlighted some difficulties that TSA faced in
capturing and analyzing the rich information that it was collecting at
airports. Thus, we recommended that TSA better collect and ana-
lyze SPOT information to help connect the dots on passengers who
may pose a threat to the U.S. aviation system.
For example, we recommended that TSA clarify its guidance to
BDOs for inputting information into the database used to track
suspicious activities. We also recommended that they expand ac-
cess to this database across all SPOT airports. The good news is
TSA agreed with our recommendations and has revised its proce-
dures accordingly. TSA also expanded access to this database to all
SPOT airports as of March of this year.
Our 2010 report also recommended that TSA make better use of
information collected through airport video systems. We noted that
16 individuals who were later charged with or pleaded guilty to ter-
rorism-related offenses transited through eight SPOT airports on
23 different occasions. Thus, we recommended that TSA examine
the feasibility of using airport video systems to refine the
number of behaviors currently assessed and also to use this infor-
mation to help improve the program going forward. We believe such
recordings could help identify behaviors that may be common
among terrorists or could demonstrate that terrorists do not gen-
erally display any identifying behaviors. Again, TSA agreed with
our recommendation and is now exploring ways to better use these
video recordings.
In closing, behavior and appearance monitoring might be able
to play a useful role in airport counterterrorism efforts. However,
it is still an open question whether these techniques can be suc-
cessfully applied on a large scale in the airport environment. And
while I am encouraged that DHS has taken steps to validate the
program, I am still surprised the Department is seeking additional
funding for this program before the issue is fully addressed. Now,
hopefully, today’s hearing will help clarify S&T’s future plans for
validating the program.
Chairman Broun, Ranking Member Edwards, and other Mem-
bers of the Committee, this concludes my statement. I look forward
to your questions.
[The prepared statement of Mr. Lord follows:]
PREPARED STATEMENT OF MR. STEPHEN LORD, DIRECTOR, HOMELAND SECURITY AND
JUSTICE ISSUES, GOVERNMENT ACCOUNTABILITY OFFICE

Chairman BROUN. Thank you, Mr. Lord. I now recognize our
next witness, Dr. Paul Ekman, Professor Emeritus—wait a minute.
I skipped over one and I apologize. I now recognize Mr. Willis—our
next witness, Mr. Larry Willis, Program Manager, Homeland Secu-
rity Advanced Research Project Agency, Science and Technology Di-
rectorate, Department of Homeland Security. Mr. Willis, you have
five minutes. Thank you, sir.
TESTIMONY OF LARRY WILLIS, PROGRAM MANAGER,
HOMELAND SECURITY ADVANCED RESEARCH PROJECTS
AGENCY, SCIENCE AND TECHNOLOGY DIRECTORATE,
DEPARTMENT OF HOMELAND SECURITY
Mr. WILLIS. Thank you. Good afternoon, Chairman Broun, Rank-
ing Member Edwards, distinguished Members of the Subcommittee.
I am honored to appear before you today on behalf of the Depart-
ment of Homeland Security, Science and Technology Directorate, to
discuss our evaluation of the Transportation Security Administra-
tion’s Screening Passenger by Observation Technique, or SPOT re-
ferral report, which is a checklist of predefined behavior indicators
used by TSA to identify potentially high-risk travelers.
For the purpose of S&T’s studies, high-risk travelers are defined
as those passengers in possession of serious prohibited and/or ille-
gal items or individuals engaging in conduct leading to arrest.
For background purposes, the SPOT validation effort began in
2007 as a result of the component-led, S&T-managed People
Screening Capstone Integrated Product Team process that identi-
fied and prioritized capability gaps of DHS operational customers.
As an active participant in this IPT process, TSA identified the
SPOT Referral Report and its associated indicators as a candidate
for the validation study. The SPOT Referral Report contains a dis-
crete list of observable indicators which have been designated by
TSA as Sensitive Security Information, or SSI. TSA’s Behavior De-
tection Officers, or BDOs, are trained to identify these indicators
and use them to make screening decisions, such as referral for ad-
ditional screening at the TSA checkpoint.
It is important to note that behavioral screening isn’t limited
to aviation security and is conducted formally or informally by
DHS agencies, the Department of Defense, the intelligence commu-
nity, and law enforcement worldwide. The SPOT validation re-
search is a rigorous evaluation of TSA’s SPOT Referral Report that
supports our better understanding of the threat and the screening ac-
curacy of the existing indicators, and advances the science of behav-
ior-based screening.
S&T, in cooperation with the American Institutes for Research, de-
signed the Base Rate Study to compare TSA’s SPOT Referral Re-
port process with a random screening process. AIR is one of the
largest non-profit behavioral science research organizations in
North America and has performed numerous validation studies.
Two databases were used for the study.
The first was designed to include case information from ran-
domly selected travelers who were subjected to the SPOT referral
process during the Base Rate Study conducted from December 2009
through October 2010 and included a total of 71,589 referrals from
43 airports. To make direct comparisons between the Base Rate
database and the Operational Referrals, a second dataset was cre-
ated for the 23,265 Operational SPOT Referrals collected during
the same time period and at the same locations as the Base Rate Study.
Together, these two datasets allowed AIR to assess the extent to
which the SPOT Referral Report of observable indicators leads to
correct screening decisions. A number of key findings emerged from
the analysis of the SPOT Referral Report, including the following,
which I would like to share with you.
One, Operational SPOT identifies high-risk travelers at a signifi-
cantly higher rate than random screening. The study data indicate
that a high-risk traveler is nine times more likely to be identified
using Operational SPOT versus random screening. Moreover, to
achieve this outcome, BDOs within the study were able to engage
50,000 fewer travelers using Operational SPOT than with random
selection methods.
The second result is that the population base rate for SPOT indicators
is low. Among those selected for random screening in the Base Rate
Study, the most frequently observed indicator was displayed in
only 2.8 percent of the randomly selected travelers. All of the other
indicators were observed in fewer than two percent of the travelers
selected during the Base Rate Study.
In conclusion, these results indicate that the SPOT program is
significantly more accurate than random screening in identifying
high-risk travelers using the metrics that we employed. Our valida-
tion process, which included an independent and comprehensive re-
view of the SPOT Referral Report, is a key example of how S&T works
to enhance the effectiveness of the Department’s operational activi-
ties.
Chairman Broun, Ranking Member Edwards, I thank you again
for this opportunity to discuss the research to validate the Screen-
ing of Passengers by Observation Technique Referral Report. And
I am happy to answer the questions that the Subcommittee may
have.
[The prepared statement of Mr. Willis follows:]
PREPARED STATEMENT OF MR. LARRY WILLIS, PROGRAM MANAGER FOR THE SCIENCE
AND TECHNOLOGY DIRECTORATE, DEPARTMENT OF HOMELAND SECURITY

Introduction and Study Objective:


Good afternoon, Chairman Broun, Ranking Member Edwards and distinguished
Members of the Subcommittee. I am honored to appear before you today on behalf
of the Department of Homeland Security (DHS) Science and Technology Directorate
(S&T) to discuss our evaluation of the Transportation Security Administration’s
(TSA) Screening of Passengers by Observation Techniques (SPOT) program. SPOT
is a behavior observation and analysis program in which personnel are trained to
identify behaviors that deviate from an established baseline that could be possible
indicators for terrorism or criminal activity. Today, I will describe S&T’s research
assessing the validity of the SPOT Referral Report, which is a checklist of
predefined observable indicators used by TSA to identify potentially high risk trav-
elers. For the purpose of S&T’s study, high risk travelers are defined as those pas-
sengers in possession of serious prohibited and/or illegal items or individuals engag-
ing in conduct leading to an arrest. Specifically, our study offers an assessment of
the extent to which the SPOT Referral Report of observable indicators leads to cor-
rect screening decisions at the security checkpoint.
Research Requirements and Background:
Approximately 1.2 million people fly within the United States daily. The SPOT
program trains TSA personnel to serve as an additional layer of security in airports
by providing a non-intrusive means of identifying individuals who may pose a risk
of terrorism or criminal activity. In behavior-based screening, trained personnel at-
tempt to identify anomalous behaviors by observing passengers and comparing what
they see to an established behavioral baseline of other passengers developed in the
same general location and within the same timeframe. It is important to note that
behavioral screening isn’t limited to aviation security and is conducted formally or
informally by other DHS agencies, the Department of Defense, the Intelligence Com-
munity, and law enforcement worldwide. The SPOT validation effort appears to be
the most rigorous evaluation of behavior-based screening.
The SPOT validation effort began in 2007 as a result of the component-led, S&T-
managed People Screening Capstone Integrated Product Team (IPT) process that
identified and prioritized capability gaps of DHS operational components.
The ‘‘People Screening’’ Capstone IPT established the research requirement to
identify and validate observable behavior indicators of threats and suspicious behav-
iors in a screening environment. As an active participant in this IPT, TSA identified
the SPOT Referral Report and its associated indicators as a candidate for the vali-
dation study. Through a series of interactions with TSA, S&T determined that the
SPOT screening process and the effectiveness of the observable indicators list was
testable. The SPOT Referral Report contains a discrete list of observable indicators
which have been designated by TSA as Sensitive Security Information (SSI). TSA’s
Behavior Detection Officers (BDOs) are trained to identify these indicators and use
them to make screening decisions, such as referral for additional screening at the
TSA checkpoint. Furthermore, TSA records each behavior-based screening event, as
well as its corresponding indicators, screening results, and outcomes to help inform
future screening decisions. The SPOT process leads to three possible actions: the
traveler proceeds through the TSA checkpoint and to their flight as normal; the
traveler is identified as possibly carrying serious prohibited/illegal items and re-
ceives additional screening at the TSA checkpoint; or the traveler is identified to
a Law Enforcement Officer (LEO) for appropriate intervention.
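[A minimal sketch of the three-way flow described above follows. TSA’s actual indicator list and any scoring thresholds are designated SSI; the score values and cutoffs below are hypothetical placeholders added for illustration, not TSA’s actual procedure.]

    # Minimal sketch of the three-way SPOT referral flow described above.
    # The indicator scores and cutoffs here are hypothetical placeholders;
    # the real indicator list and thresholds are designated SSI.

    PROCEED = "proceed to flight"
    ADDITIONAL_SCREENING = "additional screening at TSA checkpoint"
    LEO_REFERRAL = "referral to Law Enforcement Officer"

    def spot_action(indicator_score: int,
                    screen_cutoff: int = 4,       # hypothetical value
                    leo_cutoff: int = 8) -> str:  # hypothetical value
        """Map an observed-indicator score to one of the three SPOT actions."""
        if indicator_score >= leo_cutoff:
            return LEO_REFERRAL
        if indicator_score >= screen_cutoff:
            return ADDITIONAL_SCREENING
        return PROCEED

    assert spot_action(0) == PROCEED
    assert spot_action(5) == ADDITIONAL_SCREENING
    assert spot_action(9) == LEO_REFERRAL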
Research Approach:
S&T, in cooperation with the American Institutes for Research (AIR), designed
the Base Rate Study to compare TSA’s SPOT Referral Report process with a random
screening process and to estimate the population base rate of high-risk travelers.
AIR is one of the largest non-profit behavioral science research organizations in
North America and has performed numerous validation studies. Two databases were
used for this study. The first was designed to include case information from ran-
domly selected travelers who were subjected to the SPOT referral process during the
Base Rate Study from December 1, 2009 through October 31, 2010, including a total
of 71,589 referrals from 43 airports. To make direct comparisons between the Base
Rate database and the Operational SPOT Referrals, a second dataset (SPOT com-
parison dataset) was extracted from TSA’s SPOT Referral database to contain the
23,265 Operational SPOT referrals collected during the same time period and from
locations covered by the Base Rate Study. Together, these two datasets allowed AIR
to assess the extent to which the SPOT Referral Report of observable indicators
leads to correct screening decisions at the security checkpoint.
Research Results:
A number of key findings emerged from the analysis of the SPOT Referral Report,
including four that I would like to share with you:
1. Operational SPOT identifies high-risk travelers at a significantly higher rate
than random screening. The study data indicate that a high risk traveler is
nine times more likely to be identified using Operational SPOT versus random
screening. (Operational SPOT refers to the standard operating procedure of the
BDOs executing the referral reporting process at the checkpoint as opposed to
the program as a whole.) Moreover, to achieve these outcomes, BDOs were able
to engage with 50,000 fewer travelers using Operational SPOT than they did
when using random selection methods.
2. SPOT indicators appear to be observed and utilized consistently across varying
airport characteristics. When we examined the consistency in implementation
overall, we found that observable indicators within the SPOT Referral Report
are used at relatively the same rate regardless of the year, time of year, or size
of airport. Moreover, indicators tended to be consistently related to outcomes
in the same ways across these characteristics, providing further evidence that
the indicators are reliable. These results also serve as initial support for reli-
ability in the use of the SPOT Referral Report, with little to no evidence of
major coding variations or random fluctuations.
3. The population base rate for high-risk travelers is extremely low. In other
words, the large majority of travelers pose no security risks. Results of the
Base Rate Study confirm that the measurable outcomes that represent high-
risk travelers are rare events. These data indicate that the estimated popu-
lation parameter for:
i. Arrested by Law Enforcement Officer is 1 in 10,000 travelers
(or 0.01 percent).
ii. Possession of Fraudulent Documents is 1 in 2,000 travelers
(or 0.05 percent).
iii. Possession of Serious Prohibited/Illegal Items is 1 in 750 travelers
(or 0.13 percent).
iv. Combined Outcome, or presence of any outcome (of the above),
is 1 in 750 travelers (or 0.13 percent).
4. The population base rate for SPOT indicators is low. Among those selected for
random screening in the Base Rate Study, very few travelers (approximately
8 percent) exhibited any SPOT indicators. The most frequently observed indi-
cator (again, SPOT indicators are designated SSI) was displayed in only 2.8
percent of the randomly selected travelers. In contrast, this indicator is exhib-
ited in more than half of SPOT-referred travelers. All of the other indicators
were observed in fewer than 2 percent of the travelers selected by the Base
Rate Study.
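[A rough numerical illustration of these findings follows. The referral counts and the one-in-750 combined-outcome rate are taken from the findings above; the hit counts computed below are estimates inferred from the reported nine-to-one ratio, not outcome counts published by the study.]

    # Illustrative consistency check on the Base Rate Study figures.
    # The inputs come from the findings above; the hit counts are
    # inferred estimates, since raw outcome counts are not published here.

    random_referrals = 71_589  # travelers screened under random selection
    spot_referrals = 23_265    # travelers screened under Operational SPOT

    base_rate = 1 / 750        # combined-outcome rate among random travelers
    relative_rate = 9          # reported SPOT-to-random identification ratio

    hits_random = random_referrals * base_rate               # ~95 travelers
    hits_spot = spot_referrals * base_rate * relative_rate   # ~279 travelers

    print(f"Expected hits, random screening: {hits_random:.0f}")
    print(f"Expected hits, Operational SPOT: {hits_spot:.0f}")
    print(f"Fewer travelers engaged by SPOT: {random_referrals - spot_referrals:,}")
    # The last figure (48,324) underlies the roughly 50,000 fewer
    # travelers cited in finding 1.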

Conclusion:
In conclusion, these results indicate that the SPOT program is significantly more
effective than random screening: a high-risk traveler is nine times more likely to
be identified using Operational SPOT versus random screening. Our validation proc-
ess, which included an independent and comprehensive review of SPOT, is a key
example of how S&T works to enhance the effectiveness of the Department’s oper-
ational activities. Expanding on these initial findings, we would like to conduct fur-
ther research to assess the screening accuracy of these observable indicators in simi-
lar operational screening environments, in aviation and beyond. Additionally, we
would like to work to identify other indicators that could further increase accuracy
in operational screening.
Chairman Broun, Ranking Member Edwards, I thank you again for this oppor-
tunity to discuss the Screening of Passengers by Observation Techniques program.
I am happy to answer any questions the Subcommittee may have.
Chairman BROUN. Thank you, Mr. Willis. You kept your remarks
under five minutes, and sometimes that is not done here. In fact,
most times it is not done here.
Our next witness is Mr. Peter DiDomenica of the Boston Univer-
sity Police. Thank you, Lieutenant. Appreciate it. You have five
minutes, sir.
TESTIMONY OF PETER J. DIDOMENICA, LIEUTENANT
DETECTIVE, BOSTON UNIVERSITY POLICE
Mr. DIDOMENICA. Thank you. Good morning. Chairman Broun,
Ranking Member Edwards, and Members of the Committee, I
thank you for this opportunity to address you today regarding the
future of the TSA SPOT program that I originally developed.
By way of additional background, I have trained over 3,000 po-
lice, intelligence, and security officials in over 100 federal, state,
and local agencies in the United States and U.K. in behavior as-
sessment. I have also been a lecturer or advisor on behavior assess-
ment for the FBI, CIA, Secret Service, DHS, U.S. Army Night Vi-
sion Lab, Defense Department Criminal Investigations Task Force,
and the National Science Foundation. I appear today representing
only myself and not any of the organizations I am or have been em-
ployed by.
On December 22, 2001, while assigned to Logan International
Airport as a member of the State Police, I was part of a large team
of public safety officials who responded to the airfield to meet
American Airlines flight 63, diverted to Boston while en route from
Paris, France to Miami. On board was a passenger named Richard
Reid who attempted to detonate an improvised explosive device art-
fully concealed in his footwear that, if successful, would have killed
all 197 passengers and crewmembers aboard. As I stood only a few
feet away from Reid, who was now securely in custody in the back
of a state police cruiser, it hit me that this man was the real thing,
that the threat of another terrorist attack by Al Qaeda would not
stop, and that we need to do more, much more, to properly screen
passengers than merely focusing on weapons detection. Thus began
the development of what would become the Behavior Assessment
Screening System, or BASS, and the SPOT program.
I began to explore the scientific literature in an effort to quantify
the human capacity to detect dangerous people. My research in-
cluded many disciplines including physiology, psychology, neuro-
science, as well as specific research into suicide bombers. In devel-
oping the program, specific behaviors were selected that were both
supported in the scientific literature and consistent with law en-
forcement experience.
The BASS program went on to be delivered to numerous agen-
cies, including the entire Washington, D.C., Metro Transit Police,
Amtrak Police, and the Atlanta Police officers assigned to the
world’s busiest airport, Atlanta Hartsfield-Jackson International
Airport. In 2006, two BASS trainers and I spent two weeks in Lon-
don where we set up a British version of the BASS program for the
British Transport Police as a response to the July 7, 2005, terrorist
attacks on the London Underground.
During the course of training police officers around the Nation,
the State Police BASS instructors discovered four individuals with
suspected terrorist ties. In 2004, while conducting BASS training
with the New Jersey Transit Police at Newark Penn Station, I ob-
served, using BASS techniques, three males exhibiting suspicious
behavior. One of the subjects was in the United States on a reli-
gious visa from a Middle Eastern country and was being escorted
to an Amtrak train for a claimed week-long trip with no luggage.
It was later confirmed the subject listed on the visa was on a terror
watch list. Using BASS techniques, I even intercepted a DHS in-
spector carrying a concealed weapon during a covert test of the
screening checkpoint at Logan Airport in late 2003.
Although I believe that the SPOT program is effective at identi-
fying high-risk passengers, its effectiveness is limited because prop-
er resolution of highly suspicious people discovered by the TSA
BDOs requires a law-enforcement response by police officers
trained in the same behavior detection and interview skills. I de-
signed the program so that the most dangerous people would be ei-
ther removed from the critical infrastructure or arrested by BASS-
trained police officers. I do not believe the current TSA airport
SPOT familiarization training program is enough. The airport po-
lice, in my opinion, need to be trained in the same techniques and
skill sets which would engender confidence in the program and
their own ability to detect terrorist behavior and prevent additional
devastating attacks.
Another issue I see with the SPOT program is that the TSA has
created too high an expectation for what it is able to achieve. The
original SPOT program I designed was not primarily for the appre-
hension of suspects but as a means to deny access to critical infra-
structure of high-risk persons who could be involved in terrorism
or other dangerous activity. It was to be the last and, most impor-
tantly, the best chance to prevent a tragedy when other methods
such as intelligence and traditional physical screening have failed.
Catching a terrorist through a random encounter in a public place
without any prior intelligence is extremely difficult.
By way of example, if we use the known number of terrorist sus-
pects who boarded domestic commercial flights at airports with
BDOs and the approximately four billion passenger enplanements
at U.S. commercial airports from 2004 to 2009, the base rate of ter-
rorist passengers is about 1 in 173 million. The expectation that
the SPOT program will result in the arrest of all terrorists at-
tempting to board a domestic flight in the United States is unreal-
istic and threatens its continued support. If, however, it is seen as
part of a multi-layered approach with the primary goal of pre-
venting terrorist access to critical infrastructure in conjunction
with properly trained law enforcement, the program sets reason-
able and attainable goals and should have the support of this Con-
gress.
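[The arithmetic behind the witness’s figure can be reconstructed as follows. The count of 23 known terrorist-suspect boardings is an assumption, consistent with the 23 transits through SPOT airports that GAO reported earlier in this hearing; the witness did not state his exact inputs.]

    # Rough reconstruction of the 1-in-173-million base rate (assumed inputs).

    enplanements = 4_000_000_000  # approx. U.S. enplanements, 2004 to 2009
    suspect_boardings = 23        # assumed; matches GAO's count of transits

    base_rate = suspect_boardings / enplanements
    print(f"About 1 terrorist passenger per {1 / base_rate:,.0f} enplanements")
    # -> about 1 per 173,913,043, i.e., roughly 1 in 173 million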
Thank you for this opportunity to address the program and I am
prepared to answer any questions that you may have.
[The prepared statement of Mr. DiDomenica follows:]
PREPARED STATEMENT OF MR. PETER J. DIDOMENICA,
LIEUTENANT DETECTIVE, BOSTON UNIVERSITY POLICE
Good morning. Chairman Broun, Ranking Member Edwards, and Members of the
Committee, I thank you for this opportunity to address you today regarding the fu-
ture of the TSA Screening of Passengers by Observation Techniques program that
I developed, which is more commonly referred to as the SPOT program.
I am Peter DiDomenica presently employed as a Detective Lieutenant with the
Boston University Police Department. I recently joined the Boston University force
after serving for more than 22 years with the Massachusetts State Police where I
retired as a Lieutenant. While a member of the State Police I served as an investi-
gator in the Major Crime Unit, as the Director of Legal Training for the State Police
Academy, as a staff member to five different superintendents, and as Director of Se-
curity Policy for Boston Logan International Airport in the two years after the dev-
astating 9/11 attacks. I also served the State Police for a decade as a subject matter
expert and lead trainer for Massachusetts police agencies in racial profiling and bi-
ased policing. In this capacity I designed statewide police training programs and the
State Police traffic stop data collection and analysis system created to monitor en-
forcement efforts for indications of biased policing. I am also presently a consultant
for EOIR Technologies of Fredericksburg, VA where I serve as an advisor on human
behavior detection for the U.S. Army Night Vision and Electronic Sensors Direc-
torate. I am a certified instructor in the interview, behavior assessment, and decep-
tion detection programs for The Forensic Alliance, a consulting firm of forensic psy-
chologists based in British Columbia, Canada. I am presently an adjunct instructor
for the graduate criminal justice program at Anna Maria College in Paxton, MA.
I am a licensed attorney in Massachusetts, having earned my J.D. in 1995. I have
trained over 3,000 police, intelligence, and security officials in over 100 federal,
state, and local agencies in the U.S. and U.K. in behavior assessment. I have also
been a lecturer or advisor on behavior assessment for the FBI, CIA, Secret Service,
Department of Homeland Security, Defense Department Criminal Investigations
Task Force, and National Science Foundation. I appear today representing only my-
self and not any of the organizations I am or have been employed by.
On December 22, 2001, while assigned to Logan International Airport as a mem-
ber of the State Police and as Director of Security Policy, I was part of a large team
of public safety officials who responded to the airfield to meet American Airlines
flight 63, diverted to Boston on a flight from Paris, France to Miami. On board was
a passenger named Richard Reid who attempted to detonate an improvised explo-
sive device artfully concealed in his footwear that, if successful, would have killed
all 197 passengers and crewmembers aboard. As I stood only a few feet away from
Reid, who was now securely in custody in the back of a state police cruiser, it hit
me that this man was the real thing, that the threat of another terrorist attack from
Al Qaeda would not stop, and that we needed to do more, much more, to properly
screen passengers than merely focusing on weapons detection. Over the next several
days I met with the incident commander for Reid’s arrest, Major Tom Robbins, who
was the Aviation Security Director for Logan Airport and Troop Commander for
State Police Troop F at the airport. One evening, while having dinner with Major
Robbins, he wrote the words ‘‘walk and talk’’ on a dinner napkin - a reference to
airport narcotics interdiction - and directed me to look into airport drug interdiction
programs as a model for a terrorist behavioral profiling program to augment the
weapons screening process. Thus began the development of what would become the
Behavior Assessment Screening System or BASS.
Because of my legal background and experience in training on racial profiling and
bias policing, I knew immediately what the BASS program would not be. Whatever
program we would create to identify potential terrorists, it would not include racial
profiles that target people of apparent Islamic belief or Arab, Middle Eastern, or
South and Central Asian ethnicities. As well as being illegal, such profiling could
distract security officials from detecting true threats. Moreover, the unconscious bias
against these groups would be so strong because of 9/11 that security officials would
need training to counter these biases. I began to explore the scientific literature in
an effort to quantify the human capacity to detect dangerous people. My research
included many disciplines, including physiology, psychology, and neuroscience, as well as
specific research into suicide bombers. What this literature indicated was that a per-
son who is engaged in a serious deception of consequence or otherwise engaged in
an act in which the person has much to lose by being discovered or by failing to
succeed will suffer mental stress, fear, or anxiety. Such stress, fear, or anxiety will
be manifested through involuntary physical and physiological reactions such as an
increase in heart rate, facial displays of emotion, and changes in speed and direction
of movement. In developing the program, specific behaviors were selected that were
both supported in the scientific literature and consistent with law enforcement expe-
rience. In addition to avoiding the legal prohibition on selective enforcement based
on race, ethnicity, or religion 1 the program also had to ensure that police encoun-
ters with the public not meeting the standard of reasonable suspicion were vol-
untary under the U.S. Supreme Court case of United States v. Mendenhall. 2 In addition to
behavior, the program also examines: aspects of appearance unrelated to race, eth-
nicity, or religion; responses to law enforcement presence and questioning; and, the
circumstances surrounding the presence of the person at a specific location. I cre-
ated a simple method called ‘‘A-B-C-D’’ which means Analysis of Baseline, addition
of a Catalyst, and scan for Deviations. Baselines are merely an evaluation of what
was normal for a specific environment and a catalyst is the insertion into the envi-
ronment of something that would be particularly threatening to a terrorist or crimi-
nal to provoke behavioral changes.
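As a purely illustrative sketch of the A-B-C-D logic described above, consider the following Python rendering; the function name, behavior labels, and set-based scoring are hypothetical inventions for exposition and are not drawn from the actual BASS or SPOT criteria.

# Hypothetical rendering of the A-B-C-D method; labels and scoring
# are invented for illustration and do not reflect BASS/SPOT criteria.
def abcd_assessment(baseline, before_catalyst, after_catalyst):
    """Flag behaviors that deviate from the environmental baseline,
    separating out those that appear only after a catalyst."""
    # A/B: Analysis of Baseline -- what is normal for this environment.
    deviations = sorted(set(before_catalyst) - set(baseline))
    # C: addition of a Catalyst (e.g., a visible police presence).
    # D: scan for Deviations the catalyst provokes.
    provoked = sorted(set(after_catalyst) - set(before_catalyst)
                      - set(baseline))
    return {"baseline_deviations": deviations,
            "catalyst_provoked": provoked}

# Example: hurried walking is normal in an airport concourse.
print(abcd_assessment(
    baseline={"hurried walking", "checking phone"},
    before_catalyst={"hurried walking", "no luggage"},
    after_catalyst={"no luggage", "abrupt change of direction"}))
# -> {'baseline_deviations': ['no luggage'],
#     'catalyst_provoked': ['abrupt change of direction']}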
In 2002 and 2003 I taught the BASS program to all the troopers of State Police
Troop F, the primary law enforcement agency for Logan Airport, and developed a
staff of additional instructors. We also began training other police departments in Massachusetts; in fact we
trained the entire Massachusetts Transit Police force and a group of Boston Police
officers in preparation for the 2004 Democratic National Convention. Because of the
success of the program, I created a derivative program called PASS or the Passenger
Assessment Screening System suitable for TSA screeners that eventually became
the SPOT program. Over the course of two years I worked with TSA officials at Bos-
ton, including the Federal Security Director George Naccara, and officials at TSA
headquarters including their Office of Civil Rights, Science and Technology, and
Workforce Performance and Training. In 2004 my team of State Police BASS in-
structors conducted a training program with TSA to create two pilot SPOT pro-
grams at Portland International Jetport in Maine and T.F. Green International Air-
port in Rhode Island.
One of the reasons the BASS program got the interest of TSA headquarters as
a model for a behavior detection program was an incident that occurred in the fall
of 2003 at Logan Airport while I was training members of the Boston Police in
BASS. A middle-aged male caught my attention due to an appearance and luggage
deviation as well as baseline deviation in movement. When the Boston police officer

1 Whren v. United States, 517 U.S. 806 at 813 (1996).


2 446 U.S. 544 at 554 (1980). (‘‘We conclude that a person has been ‘seized’ within the meaning
of the Fourth Amendment only if, in view of all of the circumstances surrounding the incident,
a reasonable person would have believed that he was not free to leave.’’)
and I engaged this purported passenger in conversation he immediately produced
credentials identifying himself as an official of the Department of Homeland Secu-
rity Office of Investigations and stated he was on his way to test a screening check-
point to see if they would discover a concealed weapon he was carrying.
The BASS program went on to be delivered to numerous agencies including the
entire Washington DC Metro Transit Police, Amtrak Police, and Atlanta Police offi-
cers assigned to the world’s busiest airport, Atlanta Hartsfield-Jackson Inter-
national Airport. In 2006, two BASS trainers and I spent two weeks in London
where we set up a British version of BASS for the British Transport Police as a
response to the July 7, 2005 terrorist attacks on the London Underground.
During the course of training police officers around the nation, the State Police
BASS instructors discovered four individuals with suspected terrorist ties. In 2004,
while conducting BASS training with the New Jersey Transit Police at Newark
Penn Station, I observed three males exhibiting suspicious behavior using BASS
techniques. One of the subjects was in the United States on a religious visa from
a Middle Eastern country and was being escorted to an Amtrak train for a claimed
week-long trip with no luggage. Another subject presented a non-government ID
card that was designed to look like a real government ID. There were three behavior
cues that led to the encounter followed by three non-verbal cues during the inter-
view as well as conflicting factual statements that made these individuals highly
suspicious. It was later confirmed that the subject on the visa was on a terror watch
list. In 2004, at the Metro Center rail station in Washington, D.C., a member of the
BASS training team, while conducting training with the TSA, observed a suspicious
male subject who exhibited five behavioral cues under the BASS program. The sub-
ject had a British passport with visa stamps from visits to Iraq and was in the U.S.
to learn how to fly planes. It was later confirmed that the subject was under inves-
tigation for terrorism. Back in 2002 at Logan Airport, a BASS trainer discovered
a suspicious subject exhibiting four BASS behavior cues and three non-verbal cues
during an interview who had failed to report for deportation and was connected to
Ahmed Ressam of the 1999 Millennium bombing plot of Los Angeles Airport.
Unfortunately, since the successful pilot programs in 2004, the TSA has chosen not
to continue my services despite my strong recommendation that I remain involved
in training, particularly with respect to airport police officers in BASS techniques
at airports where the SPOT program is implemented. Although I believe the SPOT
program is effective at identifying high risk passengers, its effectiveness is limited
because proper resolution of highly suspicious people discovered by the TSA Behav-
ior Detection Officers, or BDOs, requires a law enforcement response by police offi-
cers trained in the same behavior detection and interview skills. I designed the pro-
gram so that the most dangerous people would be either removed from the critical
infrastructure or arrested by BASS trained police officers. So, no matter how effec-
tive the BDOs are, the most dangerous people will tend to slip through the cracks
because of a response by non-BASS trained police officers who may discount the va-
lidity of SPOT or who may fail to follow up with BASS techniques. In most cases
where denials of access occur or arrests or detentions are made by police, it is be-
cause there are warrants for arrest or because contraband is discovered in the
screening process. I do not believe the current TSA airport police SPOT familiariza-
tion training program is enough. The airport police, in my opinion, need to be
trained in the same techniques and skill sets which will engender confidence in the
program and in their own ability to detect terrorist behavior and prevent additional
devastating attacks.
Another issue I see with the SPOT program is that the TSA has created too high
an expectation for what it is able to achieve. The original SPOT program I designed
was not primarily for the apprehension of suspects but as a means to deny access
to critical infrastructure of high risk persons who could be involved in terrorism or
other dangerous activity. It was to be the last and, most importantly, the best
chance to prevent a tragedy when other methods such as intelligence and tradi-
tional ‘‘needle in the haystack’’ screening have failed. Catching a
a random encounter in a public place without any prior intelligence is extremely dif-
ficult. By way of example, if we use the number of known terrorism suspects who
boarded domestic commercial flights at airports with BDOs, as cited in the Govern-
ment Accountability Office May 2010 report on Aviation Security 3, and the
approximately 4 billion passenger enplanements
at U.S. commercial airports from 2004 to 2009, the base rate of terrorist passengers
is about one in every 173 million, or 0.0000006 percent. The expectation that the
SPOT program will result in the arrest of all terrorists attempting to board a do-
mestic flight in the United States is unrealistic and threatens its continued support.
If, however, it is seen as part of a multi-layered approach with the primary goal
of preventing terrorist access to critical infrastructure in conjunction with properly
trained law enforcement, the program sets more reasonable and attainable goals.
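Stated as explicit arithmetic, using the 23 suspects cited in the GAO report (footnote 3) and the rounded figure of 4 billion enplanements:

\[
\frac{23}{4 \times 10^{9}} \approx 5.75 \times 10^{-9} \approx \frac{1}{174 \text{ million}} \approx 0.0000006 \text{ percent},
\]

which, allowing for rounding of the enplanement count, is the one-in-173-million base rate quoted above.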
In 2004 Major Robbins and I, as well as the Massachusetts Port Authority and
Massachusetts State Police, were sued by an African-American lawyer for the ACLU
who served as the National Coordinator of the American Civil Liberties Union’s
Campaign Against Racial Profiling. The plaintiff alleged that he was unlawfully de-
tained by the State Police at Logan Airport in October of 2003 and that this unlaw-
ful detention was based on BASS training that the troopers received. It was alleged
that the BASS training directed the troopers at the airport to detain people without
reasonable suspicion of criminal activity and condoned and encouraged racial and
ethnic profiling. After a weeklong trial in December 2008 in the Federal District
Court for Massachusetts 4, the jury found that the plaintiff was, in fact, unlawfully
detained by State Police officers but that the BASS program was not the cause of
the unlawful detention. During the trial the judge asked the plaintiff what provisions
of the BASS program on its face violated federal law. The plaintiff responded that
the following provision was unlawful: a provision that allows police, after reasonable
efforts to dispel elevated suspicion have failed, to escort away from critical infrastructure
persons who refuse to identify themselves. The plaintiff also cited the provision
allowing for the running of a records check on such persons. The judge ruled
from the bench: ‘‘I don’t see this as on its face being unconstitutional. I mean, there
is nothing unconstitutional about running a records check of a person, subjecting
a person to additional consensual searches or testing [or] preventing a person
from proceeding into the critical infrastructure or escort[ing] the person
away from the critical infrastructure.’’ (Emphasis added) One of the key components
of the BASS program is its anti-detention policy: to empower police to deny
access to critical infrastructure, such as commercial aircraft, to persons who display
elevated suspicion after reasonable attempts to dispel the suspicion fail. Elevated
suspicion consists of articulable facts and circumstances that do not necessarily rise
to the level required for a lawful detention under the U.S. Supreme Court case of
Terry v. Ohio 5. In keeping with Constitutional mandates, this denial of access in
an extremely small number of cases of unresolved suspicion may be the best we can
do, but it may be enough to prevent a tragedy, and it may also provide for the collec-
tion of crucial intelligence for an investigation and later arrest. It is important to
note that the 9th Circuit U.S. Court of Appeals in the case of Gilmore v. Gonzales
has ruled that ‘‘the Constitution does not guarantee the right to travel by any par-
ticular form of transportation.’’ 6 The Supreme Court has declined to review this de-
cision.
For SPOT to be effective, there has to be a cadre of BASS-trained police officers
to bring about an appropriate resolution from an initial TSA observation. Based on
my extensive law enforcement experience using behavioral analysis, and on that of
other police officers with similar experience, as well as a basic understanding
of psychological, neurological, and physiological processes, I know SPOT and BASS
techniques do work in identifying potential terrorists and other dangerous people.
If done correctly, the process takes only a couple of minutes and is done openly in
public areas, minimizing interference with the free flow of the public and, most
importantly, without infringing civil rights. This program specifically trains TSA
personnel and police officers to counter the effects of unconscious bias that may oth-
erwise result in undue attention on certain ethnic and religious groups and the fail-
ure to detect suspicious behavior by truly dangerous people who do not fit the
unstated but subconsciously present religious or ethnic profile. When the next shoe
bomber or underwear bomber arrives at one of our airports or train stations to blow
up one of our planes or subway trains or if they try to gain access to the Super

3 GAO-10-763. The report cites 23 suspected terrorists having passed through SPOT airports.
4 King Downing v. Massachusetts Port Authority, et al., Civil Action No. 2004-12513-RBC.
5 392 U.S. 1 (1968).
6 435 F. 3d 1125.
Bowl or other major sporting event, even when we don’t have the constitutional au-
thority to arrest, we must have the confidence to deny them access based on the
sound principles of BASS and SPOT. This is our last and best chance of preventing
another terrorist attack.
Thank you again for this opportunity to address the SPOT program and I am pre-
pared now to answer any questions you may have.
Chairman BROUN. Thank you, Lieutenant. You did not exceed
your five minutes either. Congratulations and thank you for being
here and——
Mr. DIDOMENICA. Two seconds.
Chairman BROUN. That is right. I recognize our next witness, Dr.
Paul Ekman, Professor Emeritus of Psychology, University of Cali-
fornia, San Francisco, and President and Founder of the Paul
Ekman Group. Doctor, you have five minutes for your testimony.
TESTIMONY OF PAUL EKMAN,
PROFESSOR EMERITUS OF PSYCHOLOGY,
UNIVERSITY OF CALIFORNIA, SAN FRANCISCO,
AND PRESIDENT AND FOUNDER, PAUL EKMAN GROUP, LLC
Dr. EKMAN. Thank you, Chairman Broun, Ranking Member Ed-
wards. I really appreciate this opportunity to testify on this very
important issue.
I have been working with TSA on SPOT for eight years based on
40 years of research on how demeanor—facial expression, gesture,
voice, speech, gaze and posture—can help in identifying lies and
also harmful intent. My research has examined four very different
kinds of lies: lies to conceal a very strong emotion felt at that mo-
ment, lies claiming to hold a social-political opinion the exact oppo-
site of your truly strongly held opinion, lies denying that you have
taken money that isn’t yours, and lies in which members of extrem-
ist political groups attempt to block an opposing political group
from receiving money.
Now, our research focuses on real-world lies that matter to soci-
ety in which each person decided for him or herself whether to lie
or tell the truth, just as we do in the real world. No scientist comes
out of the clouds and tells us you are supposed to lie, you are sup-
posed to tell the truth, except in experiments published in journals.
The person who tells the truth knows that if he or she is mistak-
enly judged to be lying, they will receive the same punishment as
the liar who is caught. This makes the truthful person apprehen-
sive and harder to distinguish from the liar, just as it is in the real
world. And the punishment threatened is as severe, and as highly
credible to those who participate in the research, as we could make
it and still have it approved by the university IRB.
I should mention I work in a medical school. I would never get
it passed at Berkeley, but at a medical school what I do is consid-
ered trivial.
Now, unlike any other research team, we have performed the
most precise, comprehensive measurements of face, gesture, voice,
speech, and gaze, and those measurements have yielded between
80 and 90 percent accuracy in identifying who is lying and who is
telling the truth. The clues we have found are not specific to what the lie
is about. As long as the stakes are very high, especially the threat
of punishment, the behavioral clues to lying will be the same. It
is this finding that suggested there would be no clues specific to
the terrorist hiding harmful intent, as distinct from the money smuggler, the
drug smuggler, or the wanted felon.
In my written testimony I raised three questions. First, what is
the basis for the SPOT checklist? I have explained why I believe
our findings on four very different kinds of lies provided a solid
basis for reviewing what was on the SPOT checklist.
Question two, what is the evidence for the effectiveness of SPOT?
Mr. Willis has already covered that. I won’t attempt to repeat it.
I am very eager to see that report that you are eager to see.
Question three, can SPOT be improved? That is a dangerous
question to ask a scientist. We could always think that more re-
search is necessary. But is it a wise investment compared to other
things that the government can invest in regarding airport secu-
rity? That is your decision, not mine. In my testimony I have out-
lined a couple of types of research that I think could be useful if
you decide you would want to do more research. But we do not
need to do more research now to feel confidence in this layer of se-
curity provided to the American people.
In my written testimony I attempted to answer questions that
have been raised by critics of SPOT. Would it not have been better
to base SPOT on how terrorists actually behave? Why wasn't
SPOT based on people role-playing terror-
ists? Why is SPOT catching felons and smugglers, not just terror-
ists? And aren’t people with Middle Eastern names or Middle East-
ern appearance more likely to be identified by SPOT?
I would be glad in responding to questions to provide brief an-
swers to each of these that are in my written testimony. Again, my
thanks to the Committee and the staff of the Committee for the op-
portunity to talk to you and to the men and women in TSA who
make flying a safer path than it would be without their dedicated
efforts. Thank you.
[The prepared statement of Dr. Ekman follows:]
PREPARED STATEMENT OF DR. PAUL EKMAN, PROFESSOR EMERITUS OF PSYCHOLOGY,
UNIVERSITY OF CALIFORNIA, SAN FRANCISCO, AND PRESIDENT AND FOUNDER, PAUL
EKMAN GROUP, LLC
[Dr. Ekman's prepared statement appears on pages 51 through 70 of the printed hearing record.]
Chairman BROUN. Thank you, Doctor. I appreciate your testi-
mony. I now recognize our next witness, Dr. Maria Hartwig, Asso-
ciate Professor, Department of Psychology, John Jay College of
Criminal Justice. Dr. Hartwig, your testimony for five minutes.
TESTIMONY OF MARIA HARTWIG, ASSOCIATE PROFESSOR,
DEPARTMENT OF PSYCHOLOGY,
JOHN JAY COLLEGE OF CRIMINAL JUSTICE
Dr. HARTWIG. Good morning. It is an honor to be here. Thank
you for allowing me the opportunity.
The SPOT program is based on the idea that judgments of credi-
bility can be made on the basis of observing facial cues and non-
verbal cues that indicate stress, fear, or deception. And I have been
asked to address the scientific support for this.
First of all, there are more than 30 years of research on decep-
tion that shows that people are quite poor at detecting deception
on the basis of observing behavior. In a recent meta-analysis, a sta-
tistical overview of all the research, people obtained a hit-rate of
54 percent and you should, of course, keep in mind that 50 percent
is the hit-rate you obtain by chance alone. So why are people so
poor at detecting deception on the basis of observation? And one
answer is that there are very few non-verbal demeanor-based cues
to deception and these cues of deception tend to be weak. So simply
put, there may not be much to observe. And contrary to what
laypeople and presumed lie experts such as law enforcement believe,
liars don’t display more signs of stress, fear, and arousal.
And critics of this research very often say that these findings are
due to the nature of the laboratory experiments that most research
relies on. And the claim is that when the stakes are
sufficiently high, these cues to deception will appear. Research has
addressed this concern by studying high-stake lies, such as lies told
by people suspected of serious crimes like murder and rape, and
these studies don’t show any evidence that cues to stress and anx-
iety appear as the stakes increase.
And let me turn to the issue of detecting deception from facial
cues to emotion. So this is based on the idea that liars experience
emotion or fear of detection and that observing these facial cues
can help you detect lies. I don’t have time to go into details about
the theoretical problems of that assumption, but in brief, it invites
both misses and false alarms. It may miss travelers with hostile
intentions who don't experience these emotions or who successfully
conceal them, and it may generate false alarms for travelers who
don’t have hostile intentions but experience these feelings for other
reasons.
Most people are quite surprised to hear that there is very little
evidence on the issue of these so-called micro-expressions, brief dis-
plays of an underlying emotion that are revealed automatically. I
am aware of only one study published in the peer-reviewed
literature, conducted by Steve Porter and his colleague, Leanne ten
Brinke, in the journal Psychological Science. They examined the
prevalence of micro-expressions in falsified and genuine displays of
emotion. They found no complete micro-expression in any of the
697 facial expressions they analyzed. They found 14 partial micro-
expressions occurring in either the lower or the upper half of the
face, but these micro-expressions occurred with similar frequency
in true and falsified expressions.
So this study shows that micro-expressions occur very rarely, and
to the extent that they do occur, they occur in genuine displays as
well. And the authors of this paper conclude that the occurrence of
micro-expressions in true expressions makes their usefulness in
airline security settings questionable. And they also state that the
current training that relies heavily on the identification of full-
faced micro-expressions may be misleading.
And finally, I would like to address a point of view expressed by
Dr. Ekman in a recent article in Nature on the SPOT program. He
stated that he no longer publishes all of the details of his work in
the peer-reviewed literature because those papers are closely fol-
lowed by scientists in countries such as Syria, Iran, and China,
which the United States views as potential threats. I object to a
deliberate strategy of not publishing research, for three reasons.
First, the argument that the enemy, whoever they are, whether
potential terrorists or criminals, may become aware of research
results applies to all deception research, so if we took this
argument seriously, we shouldn't publish any lie-detection research
because it may ultimately help the enemy.
And second, it is my understanding of the theory of micro-expressions
that these are automatic, involuntary displays, and if that is
the case, I fail to see how knowledge about these behaviors, or the
research on them, could help such a person avoid displaying them.
And third and most importantly, the claims about micro-expressions
as cues to deception, and about the cues included in the SPOT
program, are empirical questions that should be addressed with
data and subjected to scientific peer review. And given the amount
of resources that have already been spent on this program, I think
such validation is absolutely necessary.
So in summary, my view is that the SPOT program is out of step
with the scientific research. It relies on an outdated view of decep-
tion and there is very little support in the peer-reviewed literature.
And if I had more time, I would say a few words about what I
think may be a more productive approach to assessing credibility,
but I believe I am out of time.
[The prepared statement of Dr. Hartwig follows:]
PREPARED STATEMENT OF DR. MARIA HARTWIG, ASSOCIATE PROFESSOR, DEPARTMENT
OF PSYCHOLOGY, JOHN JAY COLLEGE OF CRIMINAL JUSTICE

The TSA has implemented the SPOT program, a security screening protocol that
relies on observation of nonverbal and facial cues to assess the credibility of trav-
elers. In particular, the program relies on behavioral indicators of ‘‘stress, fear, or
deception’’ (GAO, p. 2). A key question is whether there is a scientifically validated
basis for using behavior detection for counterterrorism purposes. This testimony will
review the relevant empirical evidence on this question. In brief, the accumulated
body of scientific work on behavioral cues to deception does not provide support for
the premise of the SPOT program. The empirical support for the underpinnings of
the program is weak at best, and the program suffers from theoretical flaws. Below,
I will elaborate on the scientific findings of relevance for this issue.
Accuracy in deception judgments
For several decades, behavioral scientists have conducted empirical research on
deception and its detection. There is now a considerable body of work in this field
(Granhag & Strömwall, 2004; Vrij, 2008). This research focuses on three primary
questions: First, how good are people at judging credibility? Second, are there be-
havioral differences between deceptive and truthful presentations? Third, how can
people’s ability to judge credibility be improved?
Most research on credibility judgments is experimental. An advantage of the ex-
perimental approach is that researchers may randomly assign participants to condi-
tions, which provides internal validity (the ability to establish causal relationships
between the variables, in this context between deception and a given behavioral in-
dicator) and control of extraneous variables. Importantly, the experimental approach
also allows for the unambiguous establishment of ground truth, that is, knowledge
about whether the statements given by research participants are in fact truthful or
deceptive. In this research, participants provide truthful or deliberately false state-
ments, for example by purposefully distorting their attitudes, opinions, or events
they have witnessed or participated in. The statements are subjected to various
analyses including codings of verbal and nonverbal behavior. This allows for the
mapping of objective cues to deception - behavioral characteristics that differ as a
function of veracity. Also, the videotaped statements are typically shown to other
participants serving as lie-catchers who are asked to make judgments about the ve-
racity of the statements they have seen. Across hundreds of such studies, people av-
erage 54% correct judgments, when guessing would yield 50% correct. Meta-anal-
yses (statistical summaries of the available research on a given topic) show that ac-
curacy rates do not vary greatly from one setting to another (Bond & DePaulo, 2006)
and that individuals barely differ from one another in the ability to detect deceit
(Bond & DePaulo, 2008). Contrary to common expectations (Garrido, Masip, &
Herrero, 2004), presumed lie experts such as police detectives and customs officers
who routinely assess credibility in their professional life do not perform better than
lay judges (Bond & DePaulo, 2006). In sum, that judging credibility is a near-chance
enterprise is a robust finding emerging from decades of systematic research.
Cues to deception
Why are credibility judgments so prone to error? Research on behavioral dif-
ferences between liars and truth tellers may provide an answer to this question. A
meta-analysis covering 1,338 estimates of 158 behaviors showed that few behaviors
are related to deception (DePaulo et al., 2003). The behaviors that do show a sys-
tematic covariation with deception are typically only weakly related to deceit. In
other words, people may fail to detect deception because the behavioral signs of de-
ception are faint.
Lie detection may fail for another reason: People report relying on invalid cues
when attempting to detect deception. Both lay people and presumed lie experts,
such as law enforcement personnel, report that gaze aversion, fidgeting, speech er-
rors (e.g., stuttering), pauses and posture shifts indicate deception (Global Deception
Research Team, 2005; Strömwall, Granhag, & Hartwig, 2004). These are cues to
stress, nervousness and discomfort. However, meta-analyses of the deception lit-
erature show that these behaviors are not systematically related to deception. For
example, in DePaulo et al. (2003), the effect size d (a statistical measure of the
strength of association between two variables) of gaze aversion as a cue to deception
across all studies is a non-significant 0.03. DePaulo et al. state: ‘‘It is notable that
none of the measures of looking behavior supported the widespread belief that liars
do not look their targets in the eye. The 32 independent estimates of eye contact
produced a combined effect that was almost exactly zero (d = 0.01)’’ (p. 93). More-
over, fidgeting with objects does not occur more frequently when lying, d = -0.12 (the
negative value suggests that object fidgeting occurs less, not more frequently when
lying, but this difference is not statistically significant), nor does self-fidgeting (d =
-0.01) and facial fidgeting (d = 0.08). Speech disturbances are not related to decep-
tion (d = 0.00), nor are pauses (silent pauses d = 0.01; filled pauses d = 0.00; mixed
pauses d = 0.03). Posture shifts are not systematically related to deception either,
d = 0.05.
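For reference, the effect size d used throughout this discussion is Cohen's standardized mean difference, a standard statistical definition rather than anything specific to DePaulo et al.:

\[
d = \frac{\bar{x}_{\text{liars}} - \bar{x}_{\text{truth tellers}}}{s_{\text{pooled}}},
\]

so values near zero, such as the d = 0.01 for eye contact quoted above, indicate essentially no difference between liars and truth tellers; by common convention, d = 0.2 is considered a small effect.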
In sum, the literature shows that people perform poorly when attempting to de-
tect deception. There are two primary reasons: First, there are few, if any, strong
cues to deception. Second, people report relying on cues to stress, anxiety and nerv-
ousness, which are not indicative of deceit.
High-stake lies. Some aspects of the deception literature have been criticized on
methodological grounds, in particular with regard to external validity (i.e., the gen-
eralizability of the findings to relevant non-laboratory settings, see Miller & Stiff,
1993). The most persistent criticism has concerned the issue of generalizing from
low-stake situations to those in which the stakes are considerably higher. Critics
have argued that when the deceit concerns serious matters, liars will experience
stronger fear of detection, leading to cues to deception. There are several bodies of
work of relevance for this concern. In a meta-analytic overview of the literature on
credibility judgments (Bond & DePaulo, 2006), the evidence on the effects of stakes
was mixed: Within studies that manipulated motivation to succeed, lies were easier
to tell from truths when there was relevant motivation. However, the effect size was
fairly small (d = 0.17). By contrast, when the comparison was made between studies
that differed in stakes, no difference in lie detection accuracy was observed. Also,
the meta-analysis revealed that as the stakes rise, both liars and truth tellers seem
more deceptive to observers. That is, lie-catchers are more prone to make false posi-
tive errors - mistaking an innocent person for a liar - when judging highly motivated
senders.
Furthermore, research on real-life high-stake lies, such as lies told by suspects of
serious crimes during police interrogations, shows that people obtain at best mod-
erate hit rates when judging such material (for a review of these studies, see Vrij,
2008). Behavioral analyses of the suspects in these studies do not support the asser-
tion that cues to deception in the form of stress, arousal and emotions appear when
senders are highly motivated. Vrij noted that the patterns from high-stake lie studies
are ‘‘in direct contrast with the view of professional lie-catchers who overwhelm-
ingly believe that liars in high-stake situations will display cues to nervousness,
particularly gaze aversion and self-adaptors’’ (2008, p. 77). Moreover, he notes that
the results ‘‘show no evidence for the occurrence of such cues’’ (2008, p. 77).
In sum, neither the research in general nor specific results on high-stake lies sup-
port the assumption that liars leak cues to stress and emotion, which can be used
for the purposes of lie detection.
Verbal vs. nonverbal cues to deception
The SPOT program seems to rely heavily on evaluation of nonverbal cues. This
emphasis on nonverbal behavior as opposed to verbal content cues runs counter to
the recommendations from research. A number of findings suggest that reliance on
nonverbal cues impairs lie detection accuracy. First, the meta-analysis on accuracy
in deception judgments investigated accuracy under four conditions: a) watching vid-
eotapes without sound b) watching tapes with sound c) listening to audiotapes and
d) reading transcripts (Bond & DePaulo, 2006). The accuracy rate in the first condi-
tion, where people based their judgments solely on nonverbal behavior, was signifi-
cantly lower than in the other three, which did not differ significantly from each
other. Thus, the combined results of hundreds of studies on lie detection suggest
that having access to only nonverbal cues impairs lie detection accuracy.
Second, a number of studies have correlated lie-catchers’ self-reported use of cues
with lie detection accuracy. The purpose of such analyses is to investigate whether
failure to detect deception coincides with the self-reported use of a particular set of
cues. The results of these studies are consistent: They show that the more fre-
quently a participant reports relying on nonverbal behavior, the less likely they are
to be accurate in detecting deception. First, Mann et al. (2004) investigated police
officers’ ability to assess the veracity of suspects accused of murder, rape and arson.
They found that successful lie detectors mentioned story cues (e.g., contradictions
in the statement, vague responses) more frequently than poor lie detectors. More-
over, the more nonverbal cues the detectives mentioned (e.g., gaze aversion, move-
ments, posture shifts), the lower their lie detection accuracy was. Second, Anderson
et al. (1999) and Feeley and Young (2000) found that the more vocal cues lie-catch-
ers mentioned, the more accurate they were in detecting deception. Third, Vrij and
Mann’s (2001) analysis of accuracy in judging the statement of a convicted murderer
showed that the participants who mentioned cues to stress and discomfort obtained
the lowest hit rates. Fourth, Porter et al. (2007) found that the more visual cues
participants reported, the poorer they were at detecting deception.
It should be noted that reliance on nonverbal cues is associated not only with
poorer lie detection accuracy, but also a more pronounced lie bias (a tendency to
judge statements as lies rather than truths). That is, paying attention to visual cues
increases the tendency for false positive errors - mistaking an innocent person for
a deceptive one. This finding was obtained in one of the meta-analyses on deception
judgments (Bond & DePaulo, 2006), as well as in a study of police officers’ judg-
ments of suspects of serious crimes (Mann et al., 2004).
The finding that reliance on nonverbal cues hampers lie detection is not sur-
prising, given the research findings on cues to deception. These findings suggest
that speech-related cues may be more diagnostic of deception than nonverbal cues
(DePaulo et al., 2003; Sporer & Schwandt, 2006, 2007; Vrij, 2008). For example,
DePaulo et al. (2003) showed that liars talk for a shorter time (d = -0.35), and in-
clude fewer details (d = -0.30). Liars’ stories are also less logically structured (d =
-0.25) and less plausible (d = -0.20). Liars and truth tellers differ in verbal and vocal
immediacy (d = -0.55), and with respect to the inclusion of particular verbal ele-
ments, such as admissions of lack of memory (d = -0.42), spontaneous corrections
(d = -0.29) and related external associations (d = 0.35). These findings are in line
with predictions from content analysis frameworks (e.g., Köhnken, 2004).
Detecting deception from facial displays of emotion
Theoretical concerns. Parts of the SPOT program seem to be predicated on the
assumption that analyses of facial displays of emotion can improve deception detec-
tion accuracy. The claims of effectiveness for such approaches are not modest. In
an interview with the New York Times, Ekman claimed that ‘‘his system of lie de-
tection can be taught to anyone, with an accuracy rate of more than 95 percent’’
(Henig, 2006). However, no such finding has ever been reported in the peer-reviewed
literature (Vrij et al., 2010). More broadly, there is no support for the assertion that
training programs focusing on identifying facial displays of emotions can improve
lie detection accuracy (Vrij, 2008).
Apart from lack of empirical support for the effectiveness of training programs fo-
cusing on the analysis of facial displays of emotion, there are theoretical problems
with the approach. The assumption behind the training program is that concealed
emotions may be revealed automatically, through brief displays sometimes referred
to as microexpressions. Implicit in this assumption is the notion that liars will expe-
rience emotions, and that leakage of emotions can betray their deceit. This seems
to equate cues to emotion with cues to deceit. But what is the evidence that lying
will entail emotions, while truth telling will not? Several scholars have noted that
the assumption that liars will experience emotion is a prescriptive view - it suggests
how liars should feel. Common moral reasoning suggests that lying is ‘‘bad’’
(Backbier et al., 1997). In line with this reasoning, Bond and DePaulo (2006) pro-
posed a double-standard hypothesis to explain the discrepancy between people’s be-
liefs about deceptive behavior (that liars will display signs of discomfort and stress)
and the actual findings on deceptive behavior (that liars typically do not display
such signs). The double-standard hypothesis suggests that people have two views
about lying: one about the lies they themselves tell, and one about the lies told by
others (a form of fundamental attribution error; Ross, 1977). In the words of the au-
thors: ‘‘As deceivers, people are pragmatic. They accommodate perceived needs by
lying. [...] [Lies] are easy to rationalize. Yes, deception may demand construction of
a convincing line and enactment of appropriate demeanor. Most strategic commu-
nications do. To the liar, there is nothing exceptional about lying’’ (p. 216). However,
people’s view of the lies told by others is markedly different: ‘‘Indignant at the pros-
pect of being duped, people project onto the deceptive a host of morally fuelled emo-
tions - anxiety, shame, and guilt. Drawing on this stereotype to assess others’ verac-
ity, people find that the stereotype seldom fits. In underestimating the liar’s capac-
ity for self-rationalization, judges’ moralistic stereotype has the unintended effect of
enabling successful deceit. Because deceptive torment resides primarily in the
judge’s imagination, many lies are mistaken for truths. When torment is perceived,
it is often not a consequence of deception but of a speaker’s motivation to be be-
lieved. High-stakes rarely make people feel guilty about lying; more often, they
allow deceit to be easily rationalized. When motivation has an impact, it is on the
speaker’s fear of being disbelieved, and it matters little whether or not the highly
motivated are lying (pp. 231-232).’’
These are important points, in that they highlight the discrepancy between the
perspective of the liar and the lie-catcher: People fall prey to an error of reasoning
when assuming that the liars are plagued by emotions. They fail to take into ac-
count the pragmatic nature of lies, as well as the liar’s ability to rationalize their
lie. Moreover, they may misinterpret the fear of a motivated innocent person as a
sign of deceit.
Beyond naïve moral reasoning about lies, is it psychologically sound to assume
that people experience stress and negative emotion about lying? Can we expect that
a criminal will experience guilt or shame about the actions he has committed, or
that a prospective terrorist is plagued by negative feelings about the actions he is
about to commit? They may, but given the double-standard hypothesis, we cannot
be certain that this is the case. Apart from guilt and shame, it could be argued that
liars may experience fear of not being able to convince. However, we must acknowl-
edge the important fact that truth tellers might also experience such fear. For ex-
ample, Ekman coined the term ‘‘Othello error’’ to describe how lie-catchers may mis-
interpret an innocent person’s fear of not being believed as a sign of deception
(Ekman, 2001). Moreover, people may react not only with fear but also anger in re-
sponse to suspicion. Indeed, one study found that truth tellers reacted with more
anger to suspicion than did liars (Hatz & Bourgeois, 2010). For an innocent person,
suspicion is obviously undeserved. An emotional reaction to such treatment fits with
a large body of social justice research suggesting that people have affective re-
sponses to violations of fairness (De Cremer & van den Bos, 2007; Mikula et al.,
1998).
Empirical support. In sum, the concern raised above is that equating arousal, fear
and stress with deception may rest on shaky theoretical grounds. If one rejects this
concern and insists that such processes accompany lying, there is yet another hurdle
to overcome. If people do experience affective processes, can they conceal them?
Given the attention to microexpressions in the media, one might assume that there
is an abundance of research published in peer-reviewed journals addressing this
question. However, this is not the case. Porter and ten Brinke (2008) noted that ‘‘to
[their] knowledge, no published empirical research has established the validity of
microexpressions, let alone their frequency during falsification of emotion’’ (p. 509).
They proceeded to conduct an analysis of people’s ability to a) fabricate expressions
of emotions they did not experience and b) conceal emotions that they did in fact
experience. Their results showed that people are not perfectly capable of fabricating
displays of emotions they do not experience: When people were asked to present a
facial expression different from the emotion they were experiencing, there were
some inconsistencies in these displays. However, the effect depended on the type of
emotion people were trying to portray. People performed better at creating con-
vincing displays of happiness compared to negative expressions. This is plausibly
due to people’s experience of creating false expressions of positive emotion in every-
day life. With regard to concealing an emotion people did in fact experience, they
performed better: There was no evidence of leakage of the felt emotion in these ex-
pressions. As for microexpression, no complete microexpression (lasting 1/5th-1/25th
of a second) involving both the upper and lower half of the face was found in any
of the 697 facial expressions analyzed in the study. However, 14 partial micro-
expressions were found, 7 in the upper and 7 in the lower half of the face. Interest-
ingly, these partial microexpressions occurred both during false and genuine facial
expressions. That is, not only those who were falsifying or concealing emotions dis-
played these expressions; true displays of emotion involved microexpressions to the
same extent. Porter and ten Brinke concluded that the ‘‘occurrence [of microexpres-
sions] in genuine expressions makes their usefulness in airline-security settings
questionable, given the implications of false-positive errors (i.e., potential human
rights violations). Certainly, current training that relies heavily on the identification
of full-face microexpressions may be misleading.’’ (p. 513).
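To put the rarity of these events in proportion:

\[
\frac{0}{697} = 0 \text{ percent complete microexpressions}, \qquad \frac{14}{697} \approx 2 \text{ percent partial microexpressions},
\]

with the partial expressions occurring at similar rates in genuine and falsified displays.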
Passive vs. active lie detection
If it is difficult, or even impossible to detect deception through analyses of leakage
of cues to affect, how can lie detection be accomplished? The research reviewed here
suggests that it is more fruitful to focus on the content of a person’s speech than
to observe their nonverbal behavior, since the latter provides little valid information
about deceit. The implication of this is that in order for lie judgments to be reason-
ably accurate, lie-catchers cannot simply observe targets. Instead, they should elicit
verbal responses from these targets, as verbal messages may be the carriers of cues
to deceit.
The proposition that lie-catchers ought to elicit verbal responses from targets fits
with an important paradigm shift in the literature on deception detection. In brief,
this paradigm shift involves moving from passive observation of behavior to the ac-
tive elicitation of cues to deception (Vrij, Granhag, & Porter, 2010). This shift in the
approach to lie detection is based on the now well-established finding that liars do
not automatically leak behavioral cues. However, that the behavioral traces of de-
ception are faint is not necessarily a universal fact: it may be possible to increase
the behavioral differences between liars and truth tellers by exploiting some of the
cognitive differences between the two. The approaches to elicit cues to deception are
thus anchored in a cognitive rather than emotional model of deception. This model
assumes that lying is a calculated, strategic enterprise that may demand cognitive
and self-regulatory resources: Liars have to suppress the truth and formulate an al-
ternative account that is sufficiently detailed to appear credible, while being mindful
of the risk of contradicting particular details or one’s own statement if one has to
repeat it later on. Liars may experience greater self-regulatory busyness than truth-
ful communicators, as a function of the efforts involved in deliberately creating a
truthful impression (DePaulo et al., 2003).
Departing from this theoretical framework, it is possible to identify several dif-
ferent approaches to elicit behavioral differences between liars and truth tellers.
First, if it is true that liars are operating under a heavier burden of cognitive load
than truth tellers, imposing further cognitive load should hamper liars more than
truth tellers. This hypothesis has been tested in several studies, in which cognitive
load was manipulated (for example, by asking targets to tell the story in reverse
order) and cues to deception were measured (e.g., Vrij et al., 2008; Vrij, Mann, Leal,
& Fisher, 2010). In support of the cognitive load framework, cues to deception were
more pronounced, and veracity judgments were more correct in the increased cog-
nitive load conditions.
A related line of research has investigated whether it is possible to elicit cues to
deception by exploiting the strategies liars employ in order to convince. For exam-
ple, this research has attempted to elicit cues to deception by asking unanticipated
questions, based on the assumption that liars plan some, but not all of their re-
sponses (Vrij et al., 2009). In line with the predictions, liars and truth tellers did
not differ with regard to anticipated questions, but when unanticipated questions
were asked, cues to deception emerged. Moreover, liars’ verbal strategies of avoid-
ance can be exploited through strategic use of background information, which elicits
inconsistencies or contradictions between the target’s statement and the background
information (Hartwig et al., 2005; 2006). For an extensive discussion on approaches
to elicit cues to deception, see Vrij et al. (2010).
Summary and directions for future research
In summary, the research reviewed above suggests that lie detection based on ob-
servations of behavior is a difficult enterprise. Hundreds of studies show that people
obtain hit rates just slightly above the level of chance. This can be explained by the
scarcity of cues to deception, as well as the finding that people report relying on
behavioral cues that have little diagnostic value. A wave of research conducted dur-
ing the last decade suggests that lie judgments can be improved by the elicitation
of cues to deception through various methods of strategic interviewing. This wave
of research has been accompanied by a theoretical shift in the literature, moving
from an emotional model of deception towards a cognitive view of deception.
The SPOT program’s focus on passive observations of behavior and its emphasis
on emotional cues is thus largely out of sync with the developments in the scientific
field. The evidence that accurate judgments of credibility can be made on the basis
of such observations is simply weak. Of course, it must be acknowledged that engag-
ing travelers in verbal interaction (ranging from casual conversations to more or
less structured interviews) is more time-consuming and effortful than simply observ-
ing behaviors from some distance. Still, the literature on elicitation of cues to decep-
tion suggests that this approach is likely to be substantially more effective than pas-
sive observations of behavior.
Evaluation of the SPOT program. At the time this testimony is written, the DHS’s
report on the validation of the SPOT program has yet to be released. Therefore, I
cannot comment on the methodological merits of this validation study. However, as
requested, I will briefly outline some methodological processes that I would expect
a validation study to follow. First, it would be necessary to establish clear oper-
ational definitions of the target(s) of the program. What is the program supposed
to accomplish? In order to evaluate the outcomes of the program, such definitions
are crucial. Moreover, I would expect analyses of the outcomes of the SPOT pro-
gram using the framework of decision theory. That is, a validation study should
minimally provide information about the frequency of hits, false alarms, misses and
correct rejections (to do this, one must have an operational definition of what a hit
is). Those values should be compared to chance expectations based upon the
base rate of the defined target condition. Then the obtained outcomes should be com-
pared to a screening protocol that does not include the key elements of the SPOT
program. For example, the outcome of a comparable sample of airports employing
a random screening method may serve as an appropriate control group.
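To make the decision-theoretic bookkeeping concrete, the following minimal Python sketch tabulates the four outcome cells; the 90 percent hit rate and 1 percent false-alarm rate are hypothetical placeholders chosen for illustration, not measured SPOT performance, and only the base rate echoes the one-in-173-million figure cited elsewhere in this hearing.

# Hypothetical illustration of a decision-theoretic evaluation; the
# hit and false-alarm rates below are invented for the example.
def screening_outcomes(n, base_rate, hit_rate, false_alarm_rate):
    """Expected hits, misses, false alarms, and correct rejections."""
    targets = n * base_rate
    non_targets = n - targets
    hits = targets * hit_rate
    misses = targets - hits
    false_alarms = non_targets * false_alarm_rate
    correct_rejections = non_targets - false_alarms
    return hits, misses, false_alarms, correct_rejections

hits, misses, fa, cr = screening_outcomes(
    n=4e9, base_rate=1 / 173e6, hit_rate=0.90, false_alarm_rate=0.01)
print(f"hits={hits:.1f}, misses={misses:.1f}, false alarms={fa:,.0f}")
# hits=20.8, misses=2.3, false alarms=40,000,000 -- even under these
# generous assumptions, false alarms outnumber hits by roughly two
# million to one, which is why outcomes must be compared to the base
# rate rather than judged by raw referral or arrest counts.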
In addition to analyzing the results using a decision theory framework, it would
be desirable to empirically examine the behavioral cues displayed by targets who
pose threats to security, and compare them to targets who do not. That is,
videotaped recordings of these targets (to the extent that they are available) should
be subjected to detailed coding to determine the behavioral indicators that indicate
deception and/or hostile intentions as these travelers move through an airport. The
behaviors displayed by such targets should be compared to an appropriate control
group, for example, a random sample of innocent travelers. The purpose of such
analyses would be twofold: First, the results would empirically establish the behav-
ioral indicators of deception and malicious intent in the airport setting. Second, the
results could be compared to the SPOT criteria to establish whether there is an
overlap between the two sets of indicators.
Moreover, it would be useful to evaluate the criteria on which Behavior Detection
Officers rely to make judgments that a target is worthy of further scrutiny. That
is, the behaviors of targets selected for scrutiny could be subjected to coding
to establish (a) whether the officers rely on valid indicators of deception and
hostile intentions and (b) whether they rely on the criteria set forth in the SPOT
training program. This would validate the SPOT program in a slightly different
manner, as it would assess to what extent the Behavior Detection Officers follow
the protocol of their training.
A problem with using field data is that important data will likely be missing. That
is, while databases may include information about hits and false alarms from trav-
elers who are subjected to further scrutiny, the data on misses and correct rejections
will be incomplete. For example, misses may not be detected for years, if ever.
For this reason it may be appropriate to subject the SPOT program to an experi-
mental test, in which the ground truth about the travelers’ status is known. The
field and experimental approaches are obviously not mutually exclusive: It is pos-
sible (and perhaps even preferable) to conduct both types of validation studies, as
the strengths and weaknesses of each approach in terms of internal and external va-
lidity complement each other. A multi-methodological approach to validating the
SPOT program may also provide convergent validity. If a concern with the labora-
tory approach is that participants in an experimental study would not be sufficiently
motivated, it may be worth mentioning that it is possible to experimentally examine
the effect of motivation on targets’ behaviors within the context of a laboratory para-
digm. Some targets could be randomly assigned to receive a weaker incentive for
successfully passing through the screening, while others receive a stronger incen-
tive. Of course, it would not be possible to create a fully realistic incentive system
due to ethical considerations. Still, such a manipulation could provide some insight
into the role of motivation in targets’ behaviors, and to what extent motivation mod-
erates the display of relevant behavioral cues.
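A toy simulation can make the missing-data problem, and the value of experiments with known ground truth, concrete. In the sketch below (all rates hypothetical, and the screen deliberately uninformative), the simulated field database records outcomes only for flagged travelers, while the full outcome table is recoverable only because the simulation itself knows each traveler's true status:

```python
import random
from collections import Counter

random.seed(1)
N, BASE_RATE, FLAG_RATE = 100_000, 0.001, 0.02  # hypothetical values

field_log = Counter()     # what a field database would actually contain
ground_truth = Counter()  # full outcome table, known only in an experiment

for _ in range(N):
    is_target = random.random() < BASE_RATE
    flagged = random.random() < FLAG_RATE  # toy screen, uncorrelated with status
    if flagged:
        field_log["hit" if is_target else "false_alarm"] += 1
    if is_target:
        ground_truth["hit" if flagged else "miss"] += 1
    else:
        ground_truth["false_alarm" if flagged else "correct_rejection"] += 1

print("field database sees:", dict(field_log))
print("ground truth:       ", dict(ground_truth))
```

The field log contains hits and false alarms but no record of the misses, which is exactly the gap that an experimental validation study with known ground truth can fill.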
In closing, I will briefly note a few areas of relevance for airport security
screening settings that I believe future research ought to focus on. First, most re-
search has examined truths and lies about past actions. In the airport setting,
truths and lies about future actions (intentions) may be of particular relevance. A
few recent studies have examined true and false statements about future actions
(Granhag & Knieps, in press; Vrij, Granhag, Mann, & Leal, in press; Vrij et al., in
press). The studies reveal some findings in line with the research on true and false
statements about past actions; for example, false statements about intentions
are less plausible (Vrij et al., in press). However, there are also some differences
in these results. While research on statements about past actions shows that lies
are less detailed than truths, this finding has not been replicated for statements
about future actions. That said, this body of work is still small, and further empirical
attention is needed. Second, and relatedly, it would be valuable to attempt to extend
the research findings on elicitation of cues to deception to airport settings. That is,
it would be useful to establish to what extent it is possible to increase cues to decep-
tion using cognitive models when the statements concern future actions. Such
knowledge could be translated into brief, standardized questioning protocols that
could be used to establish the veracity of travelers’ reports about both their past
actions and their intentions.
References
Anderson, D. E., DePaulo, B. M., Ansfield, M. E., Tickle, J. J., & Green, E.
(1999). Beliefs about cues to deception: Mindless stereotypes or untapped wis-
dom? Journal of Nonverbal Behavior, 23, 67-89.
Backbier, E., Hoogstraten, J., & Meerum Terwogt-Kouwenhoven, K. (1997). Situ-
ational determinants of the acceptability of telling lies. Journal of Applied Social
Psychology, 27, 1048-1062.
Bond, C. F., Jr., & DePaulo, B. M. (2006). Accuracy of deception judgments. Per-
sonality and Social Psychology Review, 10, 214-234.
Bond, C. F., Jr., & DePaulo, B. M. (2008). Individual differences in judging de-
ception: Accuracy and bias. Psychological Bulletin, 134, 477-492.
De Cremer, D., & van den Bos, K. (2007). Justice and feelings: Toward a new
era in justice research. Social Justice Research, 20, 1-9.
DePaulo, B. M., Lindsay, J. J., Malone, B. E., Muhlenbruck, L., Charlton, K., &
Cooper, H. (2003). Cues to deception. Psychological Bulletin, 129, 74-118.
Ekman, P. (2001). Telling lies: Clues to deceit in the marketplace, politics and
marriage. New York: Norton.
Feeley, T. H., & Young, M. J. (2000). The effects of cognitive capacity on beliefs
about deceptive communication. Communication Quarterly, 48, 101-119.
Garrido, E., Masip, J., & Herrero, C. (2004). Police officers’ credibility judgments:
Accuracy and estimated ability. International Journal of Psychology, 39, 254-275.
The Global Deception Research Team (2006). A world of lies. Journal of Cross-
Cultural Psychology, 37, 60-74.
Government Accountability Office (2010). Aviation security: Efforts to validate
TSA's passenger screening behavior detection program underway, but opportuni-
ties exist to strengthen validation and address operational challenges. GAO-10-763.
Granhag, P. A., & Knieps, M. (in press). Episodic future thought: Illuminating
the trademarks of true and false intent. Applied Cognitive Psychology.
Granhag, P. A., & Strömwall, L. A. (2004). The detection of deception in forensic
contexts. New York, NY: Cambridge University Press.
Hartwig, M., Granhag, P. A., Strömwall, L. A., & Kronkvist, O. (2006). Strategic
use of evidence during police interviews: When training to detect deception
works. Law and Human Behavior, 30, 603-619.
Hartwig, M., Granhag, P. A., Strömwall, L. A., & Vrij, A. (2005). Deception de-
tection via strategic disclosure of evidence. Law and Human Behavior, 29, 469-
484.
Hatz, J. L., & Bourgeois, M. J. (2010). Anger as a cue to truthfulness. Journal
of Experimental Social Psychology, 46, 680-683.
Henig, R. M. (2006). Looking for the lie. New York Times, Feb 5.
Köhnken, G. (2004). Statement validity analysis and the ‘detection of the truth’.
In P.A. Granhag, & L.A. Strömwall (Eds.), The detection of deception in forensic
contexts (pp. 41-63). Cambridge: Cambridge University Press.
Mann, S., Vrij, A., & Bull, R. (2004). Detecting true lies: Police officers’ ability
to detect suspects’ lies. Journal of Applied Psychology, 89, 137-149.
Mikula, G., Scherer, K. R., & Athenstaedt, U. (1998). The role of injustice in the
elicitation of differential emotional reactions. Personality and Social Psychology
Bulletin, 24, 769-783.
Miller, G. R., & Stiff, J. B. (1993). Deceptive communication. Newbury Park:
Sage Publications.
Porter, S., & ten Brinke, L. (2008). Reading between the lies: Identifying con-
cealed and falsified emotions in universal facial expressions. Psychological
Science, 19, 508-514.
Porter, S., Woodworth, M., McCabe, S., & Peace, K. A. (2007). ''Genius is 1% in-
spiration and 99% perspiration'' . . . or is it? An investigation of the impact of moti-
vation and feedback on deception detection. Legal and Criminological Psychology,
12, 297-310.
Ross, L. D. (1977). The intuitive psychologist and his shortcomings: Distortions
in the attribution process. In L. Berkowitz (Ed.), Advances in experimental social
psychology (Vol. 10), pp. 174-221. New York: Academic Press.
Sporer, S. L., & Schwandt, B. (2006). Paraverbal indicators of deception: A meta-
analytic synthesis. Applied Cognitive Psychology, 20, 421-446.
Sporer, S. L., & Schwandt, B. (2007). Moderators of nonverbal indicators of de-
ception: A meta-analytic synthesis. Psychology, Public Policy, and Law, 13, 1-34.
Strömwall, L. A., Granhag, P. A., & Hartwig, M. (2004). Practitioners’ beliefs
about deception. In P. A. Granhag & L. A. Strömwall (Eds.), The detection of de-
ception in forensic contexts (pp. 229-250). New York, NY: Cambridge University
Press.
Vrij, A. (2008). Detecting lies and deceit: Pitfalls and opportunities (2nd ed.).
New York, NY: John Wiley & Sons.
Vrij, A., Granhag, P. A., Mann, S., & Leal, S. (in press). Lying about flying: The
first experiment to detect false intent. Psychology, Crime & Law.
Vrij, A., Granhag, P. A., & Porter, S. (2010). Pitfalls and opportunities in non-
verbal and verbal lie detection. Psychological Science in the Public Interest, 11,
89-121.
Vrij, A., Leal, S., Granhag, P. A., Fisher, R. P., Sperry, K., Hillman, J., & Mann,
S. (2009). Outsmarting the liars: The benefit of asking unanticipated questions.
Law and Human Behavior, 33, 159-166.
Vrij, A., Leal, S., Mann, S., & Granhag, P. A. (in press). A comparison between
lying about intentions and past activities: Verbal cues and detection accuracy.
Applied Cognitive Psychology.
Vrij, A., & Mann, S. (2001). Telling and detecting lies in a high-stake situation:
The case of a convicted murderer. Applied Cognitive Psychology, 15, 187-203.
Vrij, A., Mann, S., Leal, S., & Fisher, R. P. (2010). ''Look into my eyes'': Can an
instruction to maintain eye contact facilitate lie detection? Psychology, Crime &
Law, 16, 327-348.
Vrij, A., Mann, S., Fisher, R. P., Leal, S., Milne, R., & Bull, R. (2008). Increasing
cognitive load to facilitate lie detection: The benefit of recalling an event in re-
verse order. Law and Human Behavior, 32, 253-265.
Chairman BROUN. Thank you, Dr. Hartwig. If you want to add
some suggestions, we would be glad to enter those in the record
and entertain those suggestions that you may have. And hopefully,
we can get those from you.
Now, I would like to recognize our final witness and that is Dr.
Philip Rubin, Chief Executive Officer of Haskins Laboratories. Dr.
Rubin, you have five minutes for your oral testimony.

TESTIMONY OF PHILIP RUBIN, CHIEF EXECUTIVE OFFICER,
HASKINS LABORATORIES
Dr. RUBIN. Chairman Broun, Ranking Member Edwards, and
distinguished Members of the Subcommittee, thank you for the op-
portunity to speak to you today. My name is Philip Rubin. I am
here as a private citizen. However, I currently serve or have served
in a number of roles, both inside and outside of government, that
might be relevant to today’s hearing.
In addition to the activities previously mentioned by Chairman
Broun, I am also a member of the Technical Advisory Committee
that was formed to provide critical input related to analyses and
methodologies used in the SPOT program.
I was invited here today to describe the current state of research
and science in the behavioral and cognitive sciences related to lab-
oratory studies and field evaluation of various tools, techniques,
and technologies used in security and the detection of deception.
My written testimony provides some brief historical background on
selected activities in the behavioral sciences related to security, and
it mentions a variety of documents and reports, some of which I
have here, including many produced by the National Academies'
National Research Council, such as consensus reports and other
documents. But the written testimony focuses on two that I was
involved with: a workshop on field evaluation in the intelligence and
counterintelligence context, and a short set of papers on threat-
ening communications and behavior. Because of time limitations, I
am not able to describe these in detail and refer you to my written
testimony.
Regarding the field evaluation workshop summary, however, a
number of the participants spoke about various obstacles to field
evaluation, obstacles they believe must be overcome if field evalua-
tion of techniques and devices derived from the behavioral sciences
is to become more common and accepted. Perhaps the most basic
obstacle is simply a lack of appreciation among many for the value
of objective field evaluations and how inaccurate informal ''lessons
learned'' approaches to field evaluation can be.
A number of people throughout the process of developing this
summary spoke about the pressures to use new devices and tech-
niques once they have become available because lives are at stake.
This sense of urgency can lead to pressure to use available tools
before they are evaluated, and it can even lead to ignoring the re-
sults of evaluations if they disagree with the user’s conviction that
the tools are useful.
As indicated earlier, I am a member of the Technical Advisory
Committee for SPOT. As the GAO report indicates, the Technical
Advisory Committee’s role is extremely limited. It focused in the
main on determining whether or not the research program success-
fully accomplished the goal of evaluating whether SPOT can iden-
tify high-risk travelers—defined as individuals who are knowingly
and intentionally attempting to defeat the airport security process.
The advisory committee has not been asked to evaluate the overall
SPOT program, nor has it been asked to evaluate the validity of
indicators used in the program, consistency across measurements,
field conditions, training issues, the scientific foundations of the
program, or behavioral detection methodologies, et cetera. In order
to evaluate a program like SPOT in an appropriately scientific
manner, all of these and more would be needed.
To summarize my written testimony, I would like to just mention
a few points as highlights. These are some recommendations of
how to move forward, so I am just going to hit some bullets.
First, create a reliable research base of studies examining many
of the issues related to security and the detection of deception.
Peer review where and when possible is particularly important.
Shining a light on the process by making information on method-
ologies and results as open as possible is necessary for determining
if these technologies and devices are performing in a known and
reliable manner.
Incorporate knowledge on the complexities, subtleties, irregular-
ities, and idiosyncrasies of human behavior.
Next, understand the interplay and differences between affect,
emotion, stress, and other factors.
Make sure that we are not distracted or misled by the tools and
toys that fascinate us.
Pay serious attention to the ethical issues and regulations re-
lated to human subjects research, including 45 C.F.R. 46, the Com-
mon Rule, where applicable, and relevant emerging areas, includ-
ing privacy concerns, neuro-ethics, and ethical implications of the
deployment of autonomous agents and devices.
Reduce conflicts of interest to the extent possible, including fi-
nancial conflicts of interest.
Develop an understanding of how urgency, organizational struc-
ture, and institutional barriers can shape program development
and assessment.
And support the need for independent evaluation of new and
controversial projects and issues, with appropriate scientific,
technical, statistical, and methodological expertise.
Thank you.
[The prepared statement of Dr. Rubin follows:]
PREPARED STATEMENT OF DR. PHILIP RUBIN
CHIEF EXECUTIVE OFFICER, HASKINS LABORATORIES

Chairman Broun, Ranking Member Edwards, and Members of the Subcommittee
on Investigations and Oversight of the Committee on Science, Space, and Tech-
nology, thank you for the opportunity to speak to you today. My name is Philip
Rubin, a resident of Fairfield, Connecticut. I am here as a private citizen. However,
I currently serve or have served in a number of roles, both inside and outside of
government, that might be relevant to today’s hearing. In addition to the separate
biography and resume that I have provided, I will mention some key positions and/
or responsibilities. I am the Chief Executive Officer and a senior scientist at
Haskins Laboratories in New Haven, Connecticut, a private, non-profit research in-
stitute affiliated with Yale University and the University of Connecticut that has
a primary focus on the science of the spoken and written word, including speech,
language, and reading, and their biological basis. I am also an adjunct professor in
the Department of Surgery, Otolaryngology at the Yale University School of Medi-
cine. My research spans a number of disciplines, combining computational, engi-
neering, linguistic, physiological, and psychological approaches to study embodied
cognition, most particularly the biological bases of speech and language.
Since 2006 I have served as the Chair of the National Academies Board on Behav-
ioral, Cognitive, and Sensory Sciences. I was also the Chair of the National Re-
search Council (NRC) Committee on Field Evaluation of Behavioral and Cognitive
Sciences-Based Methods and Tools for Intelligence and Counter-Intelligence, and a
member of the NRC Committee on Developing Metrics for Department of Homeland
Security Science and Technology Research. I am a member-at-large of the Executive
Committee of the Federation of Associations in Behavioral & Brain Sciences. The
American Institutes for Research (AIR), at the request of the Department of Home-
land Security's Science and Technology Directorate, is conducting a study to assess the validity of
the Transportation Security Administration’s (TSA) Screening of Passengers by Ob-
servation Techniques (SPOT) program’s primary instrument, the SPOT Referral Re-
port, to identify ‘‘high risk travelers.’’ I am a member of the Technical Advisory
Committee (TAC) that was formed to provide critical input related to analyses and
methodologies in this project. The final report is expected shortly. The SPOT review
is an ongoing activity and I have let this committee’s staff know that I have signed
a nondisclosure agreement about aspects of the program. Since Feb. 2011 I have
also been a member of the federal interagency High-Value Detainee Interrogation
Group (HIG) Research Committee. From 2000 through 2003 I served as the Director
of the Division of Behavioral and Cognitive Sciences at the National Science Foun-
dation (NSF). During that period I served as the co-chair of the interagency NSTC
Committee on Science Human Subjects Research Subcommittee under the auspices
of the Executive Office of the President, Office of Science and Technology Policy
(OSTP) during both the Clinton and Bush administrations. I was also a member of
the NSTC Interagency Working Group on Social, Behavioral and Economic Sciences
Task Force on Anti-Terrorism Research and Development during the Bush adminis-
tration.
I was invited here today to describe the current state of research and science in
the behavioral and cognitive sciences related to laboratory studies and field evalua-
tion of various tools, techniques, and technologies used in security and the detection
of deception. My testimony will summarize some activities in these areas, particu-
larly those with which I have personal experience, that might be of use to this sub-
committee.
Before describing some recent reports of significance, let me begin by noting some
activities of particular relevance to behavioral science and security. The significance
of the behavioral and cognitive sciences to matters of security was highlighted with-
in the intelligence community in a number of articles written from 1978 to 1986 by
Richards J. Heuer, Jr., an analyst with the Central Intelligence Agency. These were
later collected in a book, Psychology of Intelligence Analysis (Heuer, 1999), that sur-
veyed cognitive psychology literature and suggested ways to apply these research
findings to improve performance in various tasks.
On Feb. 10, 2005, The National Science and Technology Council (NSTC) released
the report ‘‘Combating Terrorism: Research Priorities in the Social, Behavioral and
Economic Sciences.’’ Produced by the Subcommittee on Social, Behavioral and Eco-
nomic Sciences, this was the first NSTC report on the role of the social and behav-
ioral sciences (which include psychology, sociology, anthropology, geography, linguis-
tics, statistics, and statistical and data mining) in helping the American public and
its leaders to understand the causes of terrorism and how to counter terrorism. As
a member of the NSTC Interagency Working Group on Social, Behavioral and Eco-
nomic Sciences Task Force on Anti-Terrorism Research and Development, I was one
of the individuals who helped to draft the initial versions of this report. The focus
of the report was on how these sciences can help us to predict, prevent, prepare for
and recover from a terrorist attack or ongoing terrorists’ threats. A revised, printed
form of the report was released in 2009. Speaking of this report, John H. Marburger
III, then science advisor to the President and director of the Office of Science and
Technology Policy, said, ‘‘Our ability to maintain our American way of life depends
on our understanding of human behavior, which is the domain of the social, behav-
ioral and economic sciences. The report describes the powerful tools and strategies
these sciences offer as we respond to the threats and actions of terrorists.’’ The re-
port goes on to say, in part, that:
‘‘Terrorism has enormous impacts beyond the immediate destruction, injury,
loss of life, and consequent fear and panic. These impacts span the personal,
organizational and societal levels and can have profound psychological, eco-
nomic and social consequences. They apply not just to terrorist activity, but
to other crises of national and/or regional import, such as natural disasters,
industrial accidents, and other extreme events. Research in the social, behav-
ioral and educational sciences has also provided the knowledge, tools, tech-
niques, and trained scientists that are needed if we are to be prepared to un-
derstand, prevent, mitigate, and intervene where required in events related to
such national crises. Lessons learned from previous research and development
efforts are diverse and numerous. For example, research on the mental health
consequences of disasters, including terrorist acts such as the Oklahoma City
bombing, has produced a better understanding of the course of disruptive and
disabling symptoms of distress, who is at risk of developing a serious mental
illness, and helpful interventions to reduce trauma-related distress including
depression and anxiety disorders. Basic economic research on how markets
work was used by government economic advisors to devise policies that would
provide the right incentives and not interfere with transitions in industries
most affected by the changed security situation after 9/11.’’
Other important work related to the behavioral sciences and security included
work by the Intelligence Science Board on the art and science of interrogation, de-
scribed in the volume Educing Information (2006). Rapid developments in cognitive
neuroimaging technologies (PET, fMRI, MEG, NIRS, EEG, etc.) and their possible
use in the detection of deception, attitude, and affect have led to the beginnings
of a cottage industry in what some have called ‘‘brain reading’’ or ‘‘brain
fingerprinting.’’ In his 2006 book, Mind Wars: Brain Research and National Defense,
Jonathan Moreno discusses current concerns related to such developments.
‘‘It’s especially hard to assess the plausibility that something such as mind read-
ing or mind control is feasible through the kinds of devices I’ve described . . . Many
of the technologies do seem hyped; just because national security agencies are
spending money on them doesn’t mean they are a sure thing . . . With brain theory
as inconclusive as it is, there are bound to be conflicting claims among
neuroscientists about what’s technically possible and what isn’t. Since neuroscience
hasn’t come close to finding the boundaries of its possibilities yet, that uncertainty
is likely to persist for a long time.’’ (112-113)
Things change rapidly in science and technology. However, as recently as this
month, one of our leading cognitive neuroscientists, Michael Gazzaniga, while enthu-
siastic about the potential of work in the area, struck a note of caution in an article
in Scientific American (April 2011) called ''Neuroscience in the Courtroom.'' Speak-
ing from a legal perspective related to the admissibility of juvenile brain scans as
evidence, he said, ‘‘In spite of the many insights pouring forth from neuroscience,
recent findings from research into the juvenile mind highlight the need to be cau-
tious when incorporating such science into the law.’’ . . . ‘‘Exciting as the advances
that neuroscience is making everyday are, all of us should look with caution at how
they may gradually become incorporated into our culture. The legal relevance of
neuroscientific discoveries is only part of the picture.’’
The National Academies, comprised of the National Academy of Sciences, the Na-
tional Academy of Engineering, the Institute of Medicine, and their operating arm,
the National Research Council, provide independent, objective advice on issues that
affect all of our citizens' lives. Often this advice takes the form of published docu-
ments known as consensus reports. A number of these are of particular relevance
to today’s hearing, and I will list or summarize the most important ones. Most of
these were produced under the supervision of the Division of Behavioral and Social
Sciences and Education (DBASSE) of the NRC and the Board on Behavioral, Cog-
nitive, and Sensory Sciences (BBCSS) that I chair. Since its founding in 1997,
BBCSS has developed and managed many major studies conducted by expert pan-
els, involving hundreds of volunteers including scientists, policymakers, government
employees, and public citizens. The goal has been to create a sustainable infrastruc-
ture for ongoing review of fundamental and translational research, to inform policy
on issues of national priority, and to facilitate interactions among scholars and pol-
icymakers. Meetings and activities of BBCSS have been sponsored, in part, by: the
National Science Foundation, Directorate for Social, Behavioral and Economic
Sciences; the National Institutes of Health, including the National Institute on
Aging, Division of Behavioral and Social Research, the National Cancer Institute;
and the Office of Behavioral and Social Science Research (OBSSR); the American
Psychological Association; the Office of the Director of National Intelligence (ODNI);
the Defense Intelligence Agency (DIA); and the U. S. Secret Service. For today’s pur-
poses, the most relevant reports include:
• The Polygraph and Lie Detection. (2003)
• Human Behavior in Military Contexts. (2008)
• Behavioral Modeling and Simulation: From Individuals to Societies. (2008)
• Emerging Cognitive Neuroscience and Related Technologies. (2008)
• Protecting Individual Privacy in the Struggle Against Terrorists. (2008)
• Field Evaluation in the Intelligence and Counterintelligence Context. (2010)
• Intelligence Analysis: Behavioral and Social Scientific Foundations. (2011)
• Intelligence Analysis for Tomorrow: Advances from the Behavioral and Social
Sciences. (2011)
• Threatening Communications and Behavior: Perspectives on the Pursuit of
Public Figures. (2011)
Time and space prevent a detailed description of these important documents. In-
stead I will focus on the Field Evaluation and Threatening Communications reports.
Field Evaluation
On September 22-23, 2009, the Board on Behavioral, Cognitive, and Sensory
Sciences of the NRC held a workshop on the field evaluation of behavioral and cog-
nitive sciences-based methods and tools for use in the areas of intelligence and coun-
terintelligence. The workshop was organized by the Planning Committee on Field
Evaluation of Behavioral and Cognitive Sciences-Based Methods and Tools for Intel-
ligence and Counterintelligence that I chaired. Its purpose was to discuss the best
ways to apply methods and tools from the behavioral sciences to work in intelligence
operations. The workshop focused on the issue of field evaluation-the testing of
these methods and tools in the context in which they will be used in order to deter-
mine if they are effective in real-world settings. The workshop was sponsored by the
DIA and the ODNI and had considerable support from Susan Brandon, then chief
for research, Behavioral Science Program DEO- Defense CI and HUMINT Center
DIA, and Steven Rieber, then research director, Office of Analytic Integrity and
Standards, ODNI.
In 2010, the NRC published a Workshop Summary called Field Evaluation in the
Intelligence and Counterintelligence Context. This short report summarized the
meeting and highlighted key issues. Following [single-spaced sections] are extracts/
adaptations of the Field Evaluation Workshop Summary, edited for continuity [attri-
bution quotes omitted], that detail some of these issues and illustrate weaknesses
in our current approaches, while also considering future opportunities.
In one of the workshop presentations, David Mandel, a senior defense scientist
at Defence Research and Development Canada (DRDC), discussed the ways in
which the behavioral sciences can benefit intelligence analysis and why it is
important for the intelligence community to build a partnership with the be-
havioral sciences community. The intelligence community has long relied on
science and technology for insights and techniques, Mandel noted, so one
might wonder why it is necessary to talk about the importance of strength-
ening the relationship between the intelligence community and the broad com-
munity of behavioral scientists. One important reason, he said, is that there
area number of factors that tend to weaken the relationship between the two
communities and make analysts less likely to take advantage of what the be-
havioral sciences can offer. First, Mandel said, there is a natural inclination
among most people- including those in the intelligence community-to react
poorly to ‘‘scholarly verdicts that deal with issues such as the quality of their
judgment and decision making, their susceptibility to irrational biases, their
use of suboptimal heuristics, and overreliance on non-diagnostic information.''
Like most people, experts have the sense that they are competent. Psycho-
logical research shows that most people believe themselves to be better than
average at what they do. Thus, Mandel said, experts are prone to challenge
conclusions offered by behavioral scientists with their own knowledge gained
from personal experience and, furthermore, to believe that such a challenge is
completely legitimate. This is a fundamental problem that behavioral scientists
face in making contributions to any practitioner community, Mandel said,
''Their research is very easily disregarded on the basis of intuition and com-
mon sense.'' A second reason that analysts tend to disregard lessons from be-
havioral science is that it is seen as being ''soft'' science. Thus its knowledge
is considered to be less objective or trustworthy than knowledge generated by
the ‘‘hard’’ sciences and technology, such as satellite imaging or electronic
eavesdropping. Although that attitude is common in the intelligence commu-
nity, Mandel cautioned, it is misguided and underestimates both the value and
the analytical power of behavioral science. ''When someone uses the term 'soft
science,' I correct them. I say 'probabilistic science' and [note that] we deal
with some very difficult problems.'' Third, Mandel said, the relationship be-
tween the intelligence community and the behavioral science community is
still relatively new, so analysts do not necessarily understand what behavioral
science has to offer. Thus, he noted, forums like this workshop are important
for exploring ways in which the partnership between the two communities can
be developed.
It is telling, Mandel noted, that no one else has come along since Heuer to con-
tinue his work of translating cognitive psychology and other areas of behav-
ioral science into tools for analysis. In cognitive psychology alone there is at
least a quarter century of new research since Heuer published Psychology of
Intelligence Analysis that is waiting to be exploited by the intelligence commu-
nity. Another way in which establishing a connection with the research com-
munity can help the intelligence community is with validation, Mandel said.
Once knowledge and insights from behavioral science are used to develop new
tools for the intelligence community, it is still necessary to validate them. Sim-
ply basing recommendations on scientific research is not the same thing as
showing scientifically that those recommendations are effective or testing to
see if they could be substantially improved. Even Heuer was unable to do
much to validate his recommendations, Mandel noted, and, more generally,
this is not something that the intelligence community is particularly well
equipped to do. It is, however, exactly what research scientists are trained to
do. Science offers a method for testing which ideas lead to good results and
which do not. Thus, partnering with the behavioral science community can
help the intelligence community zero in on the techniques that work best and
avoid those that work poorly or not at all.
In theory, Mandel said, it would be possible for the intelligence community to
build its own applied behavioral research capability, but that would draw sig-
nificant resources away from other operational areas and add an entirely new
focus and purpose to the intelligence community’s existing tasks. Furthermore,
if the intelligence community were to hire behavioral scientists, it would find
itself in competition with both academia, with its unparalleled freedoms, and
industry, with its lucrative salaries. It makes more sense, Mandel suggested,
for the intelligence community to develop partnerships with universities and
other institutions that already have the expertise and capability to perform be-
havioral science research. A final advantage of partnering with the existing be-
havioral science community, Mandel said, is the ‘‘multiplier effect.’’ By working
with scientists in academia, for example, the intelligence community is not
only drawing on the knowledge of those subject-matter experts but on all of
their contacts. ‘‘As a researcher in a research and development organization
and government,’’ Mandel said, ‘‘I am very keen on partnering with academics
because I understand that they have the ability to reach back into other areas
of academia and connect me with other experts who could be of use.’’ There
is a tremendous amount of such leverage that can be achieved by building re-
lationships rather than trying to do everything in-house.
In what ways might particular tools and techniques from the behavioral
sciences assist the intelligence and counterintelligence community? A variety
of devices and approaches derived from the behavioral sciences have been sug-
gested for use or have already been used by the intelligence community. Sev-
eral of these were described, with a particular emphasis on how the techniques
have been evaluated in the field. As Robert Fein put it, ‘‘Our spirit here is to
move forward, to figure out what kinds of new ideas, approaches, old ideas
might be useful to defense and intelligence communities as they seek to fulfill
what are often very difficult and sometimes awesome responsibilities.’’ To that
end the speakers provided case studies of various technologies with potential
application to the intelligence field. One common thread among all of these
disparate techniques, a point made throughout the workshop, is that none of
them has been subjected to a careful field evaluation.

Deception Detection
People in the military, in law enforcement, and in the intelligence community
regularly deal with people who deceive them. These people may be working for
or sympathize with an adversary, they may have done something they are try-
ing to hide, or they may simply have their own personal reasons for not telling
the truth. But no matter the reasons, an important task for anyone gathering
information in these arenas is to be able to detect deception. In Iraq or Af-
ghanistan, for example, soldiers on the front line often must decide whether
a particular local person is telling the truth about a cache of explosives or an
impending attack. And since research has shown that most individuals detect
deception at a rate that is little better than random chance, it would be useful
to have a way to improve the odds. Because of this need, a number of devices
and methods have been developed that purport to detect deception. Two in par-
ticular were described at the workshop: voice stress technologies and the Pre-
liminary Credibility Assessment Screening System (PCASS).

Voice Stress Technologies


Of the various devices that have been developed to help detect lies and decep-
tion, a great many fall in the category of voice stress technologies. I offered
a brief overview of these technologies and of how well they have performed on
objective tests. The basic idea behind all of these technologies is that a person
who answers a question deceptively will feel a heightened degree of stress, and
that stress will cause a change in voice characteristics that can be detected by
a careful analysis of the voice. The change in the voice may not be audible to
the human ear, but the claim is that it can be ascertained accurately and reli-
ably by using signal-processing techniques. More specifically, many of the voice
stress technologies are based on the assumption that micro tremors-vibrations
of such a low frequency that they cannot be detected by the human ear-are
normally present in human speech but that when a person is stressed, the
micro tremors are suppressed. Thus by monitoring the micro tremors and not-
ing when they disappear, it should be possible to determine when a person is
speaking under stress-and presumably lying or otherwise trying to deceive.
Over the years, these technologies have been tested by various researchers in
various ways. A review of these studies that was carried out by Sujeeta Bhatt
and Susan Brandon of the Defense Intelligence Agency (Bhatt and Brandon,
2009). After examining two dozen studies conducted over 30 years, the re-
searchers concluded that the various voice stress technologies were performing,
in general, at a level no better than chance-a person flipping a coin would be
equally good at detecting deception. In short, there was no evidence for the va-
lidity or the reliability of voice stress analysis for the detection of deception
in individuals. Furthermore, not only is there no evidence that voice stress
technologies are effective in detecting stress, but also the hypothesis under-
lying their use has been shown to be false. If indeed there are micro tremors
in the voice, then they must result from tremors in some part of the vocal
tract-the larynx, perhaps, or the supralaryngeal vocal tract, which is every-
thing above the larynx, including the oral and nasal cavities. Using a tech-
nique called electromyography to measure the electrical signals of muscle ac-
tivities, physiologists have found that there are indeed micro tremors of the
correct frequency-about 8 to 12 hertz-in some muscles, including those of the
arm. So it would seem reasonable to think that there might also be such micro
tremors in the vocal tract, which would produce micro tremors in the voice.
However, research has found no such micro tremors, either in the muscles of
the vocal tract or in the voice itself. So the basic idea underlying voice stress
technologies-that stress causes the normal micro tremors in the voice to be
suppressed-is not supported by the evidence.
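For illustration only, the following sketch shows the kind of measurement the micro-tremor hypothesis implies: band-pass filtering a voice amplitude envelope to the putative 8-12 hertz band and measuring that band's share of the signal's power. The envelope here is synthetic, and, as just noted, real voice data have not shown any such tremor component:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Synthetic stand-in for a voice amplitude envelope, sampled at 200 Hz,
# with an artificial 10 Hz "tremor" added so the filter has something to find.
fs = 200.0
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
envelope = 1 + 0.05 * np.sin(2 * np.pi * 10 * t) + 0.02 * rng.standard_normal(t.size)

# Isolate the putative 8-12 Hz micro-tremor band.
sos = butter(4, [8, 12], btype="bandpass", fs=fs, output="sos")
tremor_band = sosfiltfilt(sos, envelope - envelope.mean())

tremor_power = np.mean(tremor_band ** 2)
total_power = np.mean((envelope - envelope.mean()) ** 2)
print(f"8-12 Hz share of envelope power: {tremor_power / total_power:.2%}")
```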
The claim is not that voice stress technologies do not work, only that there has
been extensive testing with very little evidence that such technologies do work.
It is possible that some of the technologies do work under certain conditions
and in certain circumstances, but if that is so, more careful testing will be
needed to determine what those conditions and circumstances are. And only
when such testing has been carried out and the appropriate conditions and cir-
cumstances identified will it make sense to carry out field evaluations of such
technologies. At this point, voice stress technologies are not ready for field
evaluation. For the most part the intelligence community has now stayed away
from voice stress technologies mainly because of the absence of any evidence
supporting their accuracy. But the law enforcement community has taken a
different approach. Despite the lack of evidence that the various voice stress
technologies work, and despite the absence of any field evaluations of them,
the technologies have been put to work by a number of law enforcement agen-
cies around the country and around the world. It is not difficult to understand
the reasons. The devices are inexpensive. They are small and do not require
that sensors be attached to the person being questioned; indeed, they can even
be used in recorded sessions. And they require much less training to operate
than a polygraph. Many people in law enforcement believe that the voice stress
technologies do work; even among those who are convinced that the results of
the technologies are unreliable, many still believe that the devices can be use-
ful in interrogations. They contend that simply questioning a person with such
a device present can, if the person believes that it can tell the difference be-
tween the truth and a lie, induce that person to tell the truth.

Preliminary Credibility Assessment Screening System


With the reliability of voice stress technologies called into question, the intel-
ligence community needed another way to screen for deception. Donald
Krapohl, special assistant to the director of the Defense Academy for Credi-
bility Assessment (DACA), described to the workshop how, several years ago, the Pen-
tagon asked DACA for a summary of the research on voice stress technologies.
DACA, which is part of the Defense Intelligence Agency in the Department of
Defense, provided a review of what was known about voice stress analysis,
and, as Krapohl put it, ‘‘it was rather scary to them, and they decided to pull
those technologies back.’’
The need for deception detection remained, however, and DACA’s head-
quarters organization, the Counterintelligence Field Activity (CIFA) (CIFA was
shut down in 2008 and its responsibilities were taken over by a new agency,
the Defense Counterintelligence and Human Intelligence Center), was given
the job of finding a new technology that would do the same job that voice
stress technologies were supposed to perform, but with significantly more accu-
racy. There were a number of requirements in order for a device to be effective
in the field: it had to have low training requirements, as it would be used by
soldiers on the front line rather than interrogation specialists; ideally it would
require no more than a week of training. It needed to be highly portable and
easy to use for the average soldier. It needed to be rugged, as inevitably it
would be dropped, get wet, and get dirty.
And it had to be a deception test, not a recognition test. That is, instead of
recognizing when someone knows something that they are trying to hide-the
so-called guilty knowledge test-it should be able to detect when someone was
giving a deceptive answer to a direct question. There is a great deal of re-
search concerning the guilty knowledge test, Krapohl explained, but the test
is not particularly useful in the field because the interviewers must know
something about the ‘‘ground truth.’’ Deception tests, by contrast, are not as
well understood by the scientific community, but they are far more useful in
the field, where interviewers may not know the ground truth.
The final requirement for the device was that it needed to be relatively accu-
rate as an initial screening tool. It was never intended to provide a final an-
swer of whether someone was telling the truth. Its purpose instead was to pro-
vide a sort of triage: when soldiers in the field question someone who claims
to have some information, they need to weed out those who are lying. The ones
who are not weeded out at this initial stage would be questioned further and
in more detail. There are polygraph examiners who can perform extensive ex-
aminations, Krapohl explained, but their numbers are limited. ‘‘So if you could
use a screening tool up front to decide who gets the interview, who gets the
interrogation, who gets the polygraph examination, the commanders thought
that would be very useful,’’ he said. ‘‘It was not designed to be a standalone
tool. It was designed only as an initial assessment.’’
One of the key facts about PCASS is that it was designed specifically to detect
deception, which made it possible, Krapohl said, to create an algorithm that
considers all of the response data and provides a straightforward answer to the
question of whether a person is being deceptive: yes, no, or maybe. It does not
provide nearly as much information as a polygraph can, but that is not its pur-
pose. The main use for PCASS is on the front lines where soldiers need help
in determining who seems trustworthy and who seems to have something to
hide. But the technique is not assumed to give a definite answer, only a condi-
tional one. Because PCASS is used on the front lines, it has never been field
tested. Still, it has proved its value in various ways, he said. In a recent oper-
ation in Iraq, for example, it allowed U.S. forces to identify a number of indi-
viduals who were working for foreign intelligence services and others who were
working for violent extremist organizations.
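The actual PCASS scoring algorithm is not public; the following is only a schematic sketch, with arbitrary placeholder thresholds, of the general kind of three-zone screening rule described above, in which a combined score is mapped to yes, no, or maybe:

```python
def triage(score: float, lower: float = -1.0, upper: float = 1.0) -> str:
    """Toy three-zone screening rule. Scores below `lower` read as deceptive,
    scores above `upper` as non-deceptive, and the band between as
    inconclusive. Thresholds are placeholders, not PCASS values."""
    if score < lower:
        return "deceptive: refer for full examination"
    if score > upper:
        return "no deception indicated"
    return "inconclusive: re-test or refer"

# Hypothetical combined physiological scores for two screening sessions.
print(triage(-1.7))   # deceptive: refer for full examination
print(triage(0.3))    # inconclusive: re-test or refer
```

The design point is triage rather than adjudication: the inconclusive band deliberately routes borderline cases to the scarcer, more capable examination resources described above.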
Still, Krapohl said, there is more work to be done. The group at DACA thinks,
for example, that by taking advantage of some of the state-of-the-art tech-
nologies for deception detection, it should be possible to develop more accurate
versions of PCASS. In particular, by using the so-called directed lie approach-
in which those being questioned are instructed to provide false answers to cer-
tain comparison questions-it should be possible to get greater standardization
and less intrusiveness, he said. Still, the issue of field evaluation remains,
Krapohl said. Although the technique has been tested in the laboratory, there
are no data on its performance in the field. ‘‘Doing validation studies of the
credibility assessment technology in a war zone has a number of problems that
we have not been able to figure out,’’ he said. Nonetheless, DACA researchers
would like to come up with ideas for how PCASS and other credibility assess-
ment technologies might be evaluated in the field.
In later discussions at the workshop, it became clear that a number of partici-
pants had serious doubts about the effectiveness of PCASS in the field, despite
the fact that it is in widespread use and popular among at least some of the
troops in the field. ‘‘Everybody in this room knows that there are real limita-
tions to it,’’ Fein said. ‘‘I think we can do better than put something out there
that has such limitations.’’ And Brandon commented that ‘‘if we were doing
really good field validation with the PCASS’’ then it might well become obvious
that other, less expensive methods could do at least as good a job as PCASS
at detecting deception. There are a number of important questions concerning
the validity and reliability of PCASS that can be addressed only by field eval-
uation, and until such validation is done, the troops in the field are relying
on what is essentially an unproved technology.

Obstacles To Field Evaluation


A number of the workshop presenters and participants spoke about various ob-
stacles to field evaluation inside the intelligence community- obstacles they be-
lieve must be overcome if field evaluation of techniques and devices derived
from the behavioral sciences is to become more common and accepted.
Lack of Appreciation of the Value of Field Evaluations
Perhaps the most basic obstacle is simply a lack of appreciation among many
of those in the intelligence community for the value of objective field evalua-
tions and how inaccurate informal ‘‘lessons learned’’ approaches to field eval-
uation can be. Paul Lehner of the MITRE Corporation made this point, for in-
stance, when he noted that after the 9/11 attacks on the World Trade Center
there was a great sense of urgency to develop new and better ways to gather
and analyze intelligence information-but there was no corresponding urgency
to evaluate the various approaches to determine what really works and what
doesn’t.
David Mandel commented that this is simply not a way of thinking that the
intelligence community is familiar with. People in the intelligence and defense
communities are accustomed to investing in devices, like a voice stress ana-
lyzer, or other techniques, but the idea of field evaluation as a deliverable is
foreign to most of them. Mandel described conversations he had with a mili-
tary research board in which he explained the idea of doing research on meth-
ods in order to determine their effectiveness. ''The ideas had never been pre-
sented to the board,’’ he said. ‘‘They use [various techniques], but they had
never heard of such a thing as research on the effectiveness of [them].’’ The
money was there, however, and once the leaders of the organization under-
stood the value of the sort of research that Mandel does, he was given ample
funding to pursue his studies.
One of the audience members, Hal Arkes of Ohio State University, made a
similar point when he said that the lack of a scientific background among
many of the staff of executive agencies is a serious problem. ‘‘If we have rec-
ommendations that we think are scientifically valid or if there are tests done
that show method A is better than method B, a big communication need is still
at hand,’’ he said. ‘‘We have to convince the people who make the decisions
that the recommendations that we make are scientific and therefore are based
on things that are better than their intuition, or better than the anecdote that
they heard last Thursday evening over a cocktail.’’
A Sense of Urgency to Use Applications and Institutional Biases
A number of people throughout the meeting spoke about the pressures to use
new devices and techniques once they become available because lives are at
stake. For example, Anthony Veney, chief of counterintelligence investigation
and functional services at U.S. Central Command, spoke passionately about
the people on the front lines in Iraq and Afghanistan who need help now to
prevent the violence and killings that are going on. But, as other speakers
noted, this sense of urgency can lead to pressure to use available tools before
they are evaluated-and even to ignoring the results of evaluations if they dis-
agree with the users’ conviction that the tools are useful.
Robert Fein described a relevant experience with polygraphs. The NRC had
completed its study on polygraphs, which basically concluded that the ma-
chines have very limited usefulness for personnel security evaluations, and the
findings were being presented in a briefing (National Research Council, 2003).
It was obvious, Fein said, that a number of the audience members were becom-
ing increasingly upset. ‘‘Finally, one gentleman raised his hand in some degree
of agitation, got up and said, ‘Listen, the research suggests that psychological
tests don’t work, the research suggests that background investigations don’t
work, the research suggests interviews don’t work. If you take the polygraph
away, we’ve got nothing.’’ A year and a half later, Fein said, he attended a
meeting of persons and organizations concerned with credibility assessment, at
which one security agency after another described how they were still using
polygraph testing for personnel security evaluations as often as ever. It seemed
likely, Fein concluded, that the meticulously performed study by the NRC had
had essentially no effect on how often polygraphs were used for personnel se-
curity.
The reason, suggested Susan Brandon, is that people want to have some meth-
od or device that they can use, and they are not likely to be willing to give
up a tool that they perceive as useful and that is already in hand if there is
nothing to replace it. This was probably the case, she said, when the U.S. De-
partment of Defense decided to stop using voice stress analysis-based tech-
nologies because the data showed that they were ineffective. The user commu-
nity had thought they were useful, and when they were taken away, a vacuum
was left. The users of these technologies then looked around for replacement
tools. The problem, Brandon said, is that the things that get sucked into this
vacuum may be worse than what they were replacing. So those doing field
evaluations must think carefully about what options they can offer the user
community to replace a tool that is found ineffective.
I offered a similar thought. The people in the field often do not want to wait
for further research and evaluation once a technology is available and there
are those out there that will exploit some of these gray areas and faults and
will try to sell snake oil to us. The question is, How to push back? How to
prevent the use of technology that has not been validated, given the sense of
urgency in the intelligence field? And how does one get people in the field to
understand the importance of validation in the first place? These are major
concerns. Some of the most intractable obstacles to performing field evalua-
tions of intelligence methods are institutional biases. Because these can arise
even when everyone is trying to do the right thing, such biases can be particu-
larly difficult to overcome.

Threatening Communications
In March 2011, the NRC released a small collection of papers on the subject of
threatening communications and behavior. In my introduction (along with Barbara
A. Wanchisen) to the volume, we say:
‘‘Today’s world of rapid social, technological, and behavioral change provides
new opportunities for communications with few limitations of time or space.
The ease by which communications can be made without personal proximity
has dramatically affected the volume, types, and topics of communications be-
tween individuals and groups. Through these communications, people leave be-
hind an ever-growing collection of traces of their daily activities, including dig-
ital footprints provided by text, voice, and other modes of communication.
Many personal communications now take place in public forums, and social
groups form between individuals who previously might have acted in isolation.
Ideas are shared and behaviors encouraged, including threatening or violent
ideas and behaviors. Meanwhile, new techniques for aggregating and evalu-
ating diverse and multimodal information sources are available to security
services that must reliably identify communications indicating a high likeli-
hood of future violence.’’
The papers reviewed the behavioral and social sciences research on the likelihood
that someone who engages in abnormal and/or threatening communications would
actually then try to do harm. They focused on ‘‘how scientific knowledge can inform
and advance future research on threat assessments, in part by considering the ap-
proaches and techniques used to analyze communications and behavior in the dy-
namic context of today’s world. Authors were asked to present and assess scientific
research on the correlation between communication-relevant factors and the likeli-
hood that an individual who poses a threat will act on it. The authors were encour-
aged to consider not only communications containing direct threats, but also odd
and inappropriate communications that could display evidence of fixation, obsession,
grandiosity, entitled reciprocity, and mental illness.’’
‘‘The papers in this collection were written within the context of protecting high-
profile public figures from potential attack or harm. The research, however, is
broadly applicable to U.S. national security including potential applications for anal-
ysis of communications from leaders of hostile nations and public threats from ter-
rorist groups. This work highlights the complex psychology of threatening commu-
nications and behavior, and it offers knowledge and perspectives from multiple do-
mains that can contribute to a deeper understanding of the value of communications
in predicting and preventing violent behaviors.’’
This volume focused on communication, forensic psychology, and the analysis of
language-based datasets (corpora) to help identify and understand threatening com-
munications and responses to them through text analysis. It serves as an example
of the kind of synthesis of current knowledge that is useful for generating ideas for
potential new research directions (Chung & Pennebaker, 2011; Meloy, 2011;
O'Hair et al., 2011).
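As a crude illustration of the kind of corpus-based text analysis this work involves, the sketch below tallies the rate of words from simple hand-built categories in a message; the category names and word lists are invented for the example and are far simpler than the validated dictionaries used in the research described above:

```python
import re
from collections import Counter

# Hypothetical word categories, loosely in the spirit of dictionary-based
# text analysis; real instruments use large, validated lexicons.
CATEGORIES = {
    "first_person": {"i", "me", "my", "mine"},
    "fixation": {"always", "never", "must", "owe", "deserve"},
}

def category_rates(text: str) -> dict:
    """Share of a text's words falling in each category."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {name: sum(counts[w] for w in vocab) / max(len(words), 1)
            for name, vocab in CATEGORIES.items()}

sample = "I always said you owe me. I never forget what I deserve."
print(category_rates(sample))
```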

TSA’s SPOT program


The United States Government Accountability Office’s (GAO) May 2010 report,
‘‘Aviation Security: Efforts to Validate TSA’s Passenger Screening Behavior Detec-
tion Program Underway, but Opportunities Exist to Strengthen Validation and Ad-
dress Operational Challenges,’’ questioned whether there was a scientifically valid
basis for using behavior and appearance indicators as a means for reliably identi-
fying passengers who may pose a risk to the U.S. aviation system. The report said
that, ‘‘According to TSA, SPOT was deployed before a scientific validation of the pro-
gram was completed in response to the need to address potential threats, but was
based upon scientific research available at the time regarding human behaviors.
TSA officials also stated that no other large-scale U.S. or international screening
program incorporating behavior- and appearance-based indicators has ever been rig-
orously scientifically validated.’’ The GAO report also mentioned a separate report
by the JASON group (‘‘The Quest for Truth: Deception and Intent Detection’’) that
had significant concerns about the SPOT program.
The GAO pointed out that a 2008 NRC report indicated that information-based
programs, such as behavior detection programs, should first determine if a scientific
foundation exists and use scientifically valid criteria to evaluate its effectiveness be-
fore going forward. ‘‘The report added that programs should have a sound experi-
mental basis and that the documentation on the program’s effectiveness should be
reviewed by an independent entity capable of evaluating the supporting scientific
evidence. Thus, and as recommended in GAO’s May 2010 report, an independent
panel of experts could help DHS develop a comprehensive methodology to determine
if the SPOT program is based on valid scientific principles that can be effectively
applied in an airport environment for counterterrorism purposes. Specifically, GAO’s
May 2010 report recommended that the Secretary of Homeland Security convene an
independent panel of experts to review the methodology of a validation study on the
SPOT program being conducted by DHS’s Science and Technology Directorate to de-
termine whether the study’s methodology is sufficiently comprehensive to validate
the SPOT program. GAO recommended that this assessment include appropriate
input from other federal agencies with expertise in behavior detection and relevant
subject matter experts. DHS concurred and stated that its current validation study
includes an independent review of the program that will include input from other
federal agencies and relevant experts.’’ According to DHS, this independent review
is expected to be completed soon.
90
As indicated above, I am a member of the Technical Advisory Committee (TAC)
for SPOT. As the GAO report indicates, TAC’s role is extremely limited, focusing
in the main on determining whether or not the research program successfully ac-
complished the goal of evaluating whether SPOT can identify ‘‘high-risk travelers’’
(i.e., individuals who are knowingly and intentionally attempting to defeat the air-
port security process). TAC has not been asked to evaluate the overall SPOT pro-
gram, the validity of indicators used in the program, consistency across measure-
ment, field conditions, training issues, scientific foundations of the program and/or
behavioral detection methodologies, etc. A scientifically appropriate evaluation of
a program like SPOT would require all of these and more.

How to Move Forward: Some Recommendations


• Create a reliable research base of studies examining many of the issues related
to security and the detection of deception. Peer review, where and when pos-
sible, is particularly important. Shining a light on the process by making in-
formation on methodologies and results as open as possible (such as with de-
vices like the polygraph, PCASS, voice-stress analysis, and neuroimaging) is
necessary for determining if these technologies and devices are performing in
a known and reliable manner. Clearly establishing the scientific validity of
underlying premises, foundations, and primitives is essential. The larger the base
of comparable scientific studies, the easier it is to establish the validity of
techniques and approaches. A good example of this is the Bhatt and Brandon
(2009) meta-analysis of the outcomes of studies in the literature related to
voice stress analysis technologies (see the pooling sketch following this
list). Similarly, the NRC Threatening Commu-
nications paper collection (2011) is an initial small step at establishing a body
of literature on scientific approaches to understanding threatening commu-
nications and behavior.
• Develop model systems, simulations, etc. The use of model organisms in biol-
ogy, such as Drosophila (a small fly) for helping to understand genetics and
development, and Aplysia (the sea slug), for understanding neurons and mem-
ory, has spurred considerable scientific progress in these areas. Different
kinds of model systems are needed for understanding behavior at the level
of issues such as deception. Here we should look to the law enforcement com-
munity, the criminal justice system, and possibly border security, for models,
approaches, analogies, data, and scientific guidance. Examples of advances re-
lated to the complexity of behavior include well-known work on eyewitness
identification (Loftus, 1996; Wells & Quinlivan, 2009).
• Incorporate knowledge on the complexity, subtleties, and idiosyncrasies of
human behavior. Progress has been made on understanding how cognitive in-
fluences (Heuer, 1999; Pohl, 2004), psychological biases, and language use af-
fect judgment, decision making, and risk assessment (Kahneman & Tversky,
1972; Thompson, 1999; Barrett, 2007). Also consider cultural and social con-
texts (Nisbett, 2003; Gordon, et al., in press).
• Understand the interplay and differences between affect, emotion, stress, and
other factors. We have a tendency to oversimplify, categorize, and label com-
plex behavior. The issues related to such matters can be seen in the conten-
tious scientific debates on emotion and deception, discussed by other partici-
pants in today’s hearing and summarized in part in a Nature article by Shar-
on Weinberger (2010). (See, also: Aviezer, et al., 2008; Barrett, 2006; Barrett,
et al., 2007; Ekman, 1972; Ekman & Friesen, 1978; Ekman & O’Sullivan,
1991; Ekman, et al., 1999; Ekman, 2009; Hartwig, et al., 2006; Russell, et al.,
2003; Widen, et al., in press.)
• Make sure that we are not distracted or misled by the tools and toys that fas-
cinate us. While technological developments often hold considerable promise,
they can be seductive and can sometimes even be counterproductive. The de-
sire for automaticity and scale, coupled with urgent exigencies, should not re-
duce our need to attend to human aspects of the process and to the impor-
tance of devoting sufficient time to adequately understand behavior and man-
age interpersonal interactions.
• Pay serious attention to the ethical issues and regulations related to human
subjects research, including 45 CFR 46 (‘‘The Common Rule’’), where applica-
ble. Emerging areas include neuroethics (Farah, 2010) and autonomous
agents (Wallach and Allen, 2010).
• Reduce conflicts of interest to the extent possible, particularly financial con-
flict of interest. The opportunity to profit from new and emerging technologies
that have not been carefully and clearly scientifically validated and/or field
evaluated, if necessary and possible, potentially puts our citizens, soldiers,
and intelligence community at risk and could undermine our national secu-
rity. We should have a clear understanding of both the strengths and weak-
nesses of tools, techniques, and technologies that are either being deployed or
considered for future use.
• Develop an understanding of how urgency, organizational structure, and insti-
tutional barriers can shape program development and assessment. A detailed
discussion of these issues is provided in the NRC Field Evaluation Workshop
Summary (2010), summarized above in the Field Evaluation section. We
should also strive to avoid the tendency to view results of the latest study
as instantly confirming or falsifying controversial, new, or untested tech-
nologies (Mayew & Venkatachalam, in press). Consistency across multiple
studies is essential.
• Support the importance of and need for independent evaluation of new and
controversial projects and issues with appropriate scientific, technical, statis-
tical, and methodological expertise. The NRC Polygraph and Lie Detection re-
port (2003) provides a good case study for the importance of this point and
the preceding bullet. Other examples of such independent evaluations include
many of the NRC reports listed in the References section, below. Another pos-
sible example is the JASON report on the SPOT program. Such reports
should be seen as part of an iterative process that requires periodic modifica-
tion and updating.
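As a concrete illustration of the first recommendation above, the following minimal
sketch pools hypothetical effect sizes with inverse-variance weights, the standard
fixed-effect meta-analytic calculation; the numbers are invented for exposition and
are not taken from Bhatt and Brandon (2009) or any other study.

    # Fixed-effect meta-analysis: pool effect sizes with inverse-variance weights.
    # The (effect size, variance) pairs below are hypothetical placeholders.
    studies = [(0.05, 0.04), (-0.02, 0.02), (0.10, 0.05)]

    weights = [1.0 / var for _, var in studies]
    pooled = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)
    se = (1.0 / sum(weights)) ** 0.5
    print(f"pooled effect = {pooled:.3f}, 95% CI = +/- {1.96 * se:.3f}")
    # A pooled interval that straddles zero is consistent with chance-level
    # performance across the literature, as reported for voice stress analysis.

The value of such pooling is that a conclusion rests on the weight of comparable
studies rather than on any single result, which is the point of the recommendation.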
In our desire to protect our citizens from those who intend to harm us, we must
make sure that our own behavior is not unnecessarily shaped by things like fear,
urgency, institutional incentives or pressures, financial considerations, career and
personal goals, the selling of snake oil, etc., that lead to the adoption of approaches
that have not been sufficiently and appropriately scientifically vetted. To do so
might ultimately end up being costly and counterproductive. We must not be dis-
tracted from the need for careful, well-considered, and well-established approaches
for evaluating programs and technologies. We must be careful and thoughtful before
investing in speculative or premature technologies that may be used out of despera-
tion or because of potential commercial benefit. Where and when new technologies
appear to be promising, we should obtain truly independent scientific expertise and
assistance to provide context and guidance for the development possibilities and, if
needed, for the consideration of appropriate metrics and methodologies for assess-
ment and use. We should also keep in mind human costs and unintended con-
sequences. As we all know, freedom and privacy must be considered in the context
of safety and security. These values and goals are not incompatible. Sacrificing free-
dom and privacy to purchase illusory safety and security benefits only those who
hope to harm us.
Chairman Broun, Ranking Member Edwards, and members of the Committee, I
appreciate the opportunity to testify today. I would be happy to answer any ques-
tions that you might have about my testimony or related issues. Thank you.

REFERENCES

Aviezer, Hillel, Hassin, Ran R., Ryan, Jennifer, Grady, Cheryl, Susskind, Josh,
Anderson, Adam, Moscovitch, Morris, and Bentin, Shlomo. (2008). Angry, dis-
gusted or afraid? Studies on the malleability of emotion perception. Psycho-
logical Science, Vol. 19, No. 7, 724-732.
Barrett, Lisa Feldman. (2006). Are emotions natural kinds? Perspectives on
Psychological Science, Vol. 1, #1, 28-58.
Barrett, Lisa Feldman, Lindquist, Kristen A., and Gendron, Maria. (2007).
Language as context for the perception of emotion. TRENDS in Cognitive
Sciences, Vol. 11, No. 8, 327-332.
Bhatt, S., and Brandon, S. E. (2009). Review of voice stress-based technologies
for the detection of deception. Unpublished manuscript, Washington, DC.
Chung, Cindy K. and Pennebaker, James W. (2011). Using computerized
text analysis to assess threatening communications and behavior. In National
Research Council, Threatening Communications and Behavior: Perspectives on
the Pursuit of Public Figures. National Academies Press, Washington, DC, 3-
32.
Damphousse, Kelly R. (2011). Voice Stress Analysis: Only 15 percent of lies
about drug use detected in field test. National Institute of Justice (NIJ) Jour-
nal, 259, 8-12.
Ekman, Paul. (1972). Universals and Cultural Differences in Facial Expres-
sions of Emotions. In J. Cole (ed.), Nebraska Symposium on Motivation, 1971,
University of Nebraska Press, Lincoln, Nebraska, 1972, 207-283.
Ekman, P. and Friesen, W. (1978). Facial Action Coding System: A Technique
for the Measurement of Facial Movement. Consulting Psychologists Press, Palo
Alto.
Ekman, Paul. (2009). Lie catching and micro expressions. In Clancy Martin
(ed.), The Philosophy of Deception. Oxford University Press.
Ekman, Paul and O’Sullivan, Maureen. (1991). Who can catch a liar? American
Psychologist, 46(9), Sep. 1991, 913-920.
Ekman, Paul, O’Sullivan, Maureen, and Frank, Mark G. (1999). A few can
catch a liar. Psychological Science, 10(3), May 1999, 263-266.
Farah, Martha J. (ed.). (2010). Neuroethics: An introduction with readings. The
MIT Press, Cambridge, MA.
Gazzaniga, Michael S. (2011). Neuroscience in the courtroom. Scientific Amer-
ican, April 2011, 54-59.
Gordon, J. B., Levine, R. J., Mazure, C. M., Rubin, P. E., Schaller, B. R., and
Young, J. L. (in press). Social contexts influence ethical considerations of re-
search. American Journal of Bioethics, 2011.
Hartwig, Maria, Granhag, Pär Anders, Strömwall, Leif A., and Kronkvist, Ola.
(2006). Strategic use of evidence during police interviews: When training to de-
tect deception works. Law and Human Behavior, 30(5), 603-619.
Heuer, Richards J., Jr. (1999). Psychology of intelligence analysis. Center for
the Study of Intelligence, Central Intelligence Agency, Washington, DC.
Intelligence Science Board. (2006). Educing Information: Interrogation: Science
and Art. The National Defense Intelligence College.
Kahneman, D. and Tversky, A. (1972). Subjective probability: A judgment of
representativeness. Cognitive Psychology, 3, 430-454.
Loftus, Elizabeth F. (1996). Eyewitness Testimony. Harvard University Press,
Cambridge, MA.
Mayew, William J. and Venkatachalam, Mohan. (in press). The power of voice:
Managerial affective states and future firm performance. Journal of Finance,
forthcoming.
Meloy, J. Reid. (2011). Approaching and attacking public figures: A contem-
porary analysis of communications and behavior. In National Research Coun-
cil, Threatening Communications and Behavior: Perspectives on the Pursuit of
Public Figures. National Academies Press, Washington, DC, 75-101.
Moreno, Jonathan D. (2006). Mind Wars: Brain Research and National De-
fense. The Dana Foundation, New York and Washington, DC.
O’Hair, H. Dan, Bernard, Daniel Rex, and Roper, Randy R. (2011). Commu-
nications-based research related to threats and ensuing behavior. In National
Research Council, Threatening Communications and Behavior: Perspectives on
the Pursuit of Public Figures. National Academies Press, Washington, DC, 33-
74.
National Research Council. (2003). The Polygraph and Lie Detection. Com-
mittee to Review the Scientific Evidence on the Polygraph. Board on Behav-
ioral, Cognitive, and Sensory Sciences and Committee on National Statistics,
Division of Behavioral and Social Sciences and Education. National Academies
Press, Washington, DC.
National Research Council. (2008). Behavioral Modeling and Simulation: From
Individuals to Societies. Committee on Organizational Modeling: From Individ-
uals to Societies. Board on Behavioral, Cognitive, and Sensory Sciences, Divi-
sion of Behavioral and Social Sciences and Education. National Academies
Press, Washington, DC.
National Research Council. (2008). Emerging Cognitive Neuroscience and Re-
lated Technologies. Committee on Military and Intelligence Methodology for
Emergent Neurophysiological and Cognitive/Neural Science Research in the
Next Two Decades. Standing Committee for Technology Insight - Gauge,
Evaluate, and Review. Division on Engineering and Physical Sciences. Board
on Behavioral, Cognitive, and Sensory Sciences, Division of Behavioral and So-
cial Sciences and Education. National Academies Press, Washington, DC.
National Research Council. (2008). Human Behavior in Military Contexts.
Committee on Opportunities in Basic Research in the Behavioral and
Social Sciences for the U.S. Military. Board on Behavioral, Cognitive, and
Sensory Sciences, Division of Behavioral and Social Sciences and Education.
National Academies Press, Washington, DC.
National Research Council. (2008). Protecting Individual Privacy in the Strug-
gle Against Terrorists. Committee on Technical and Privacy Dimensions
of Information for Terrorism Prevention and Other National Goals; Committee
on Law and Justice (DBASSE); Committee on National Statistics (DBASSE);
Computer Science and Telecommunications Board (DEPS). National Academies
Press, Washington, DC.
National Research Council. (2010). Field Evaluation in the Intelligence and
Counterintelligence Context. Workshop Summary. Planning Committee on
Field Evaluation of Behavioral and Cognitive Sciences-Based Methods and
Tools for Intelligence and Counterintelligence. Board on Behavioral, Cognitive,
and Sensory Sciences, Division of Behavioral and Social Sciences and Edu-
cation. National Academies Press, Washington, DC.
National Research Council. (2011). Intelligence Analysis: Behavioral and Social
Scientific Foundations. Committee on Behavioral and Social Science Research
to Improve Intelligence Analysis for National Security. Board on Behavioral,
Cognitive, and Sensory Sciences, Division of Behavioral and Social Sciences
and Education. National Academies Press, Washington, DC.
National Research Council. (2011). Intelligence Analysis for Tomorrow: Ad-
vances from the Behavioral and Social Sciences. Committee on Behavioral and
Social Science Research to Improve Intelligence Analysis for National Security.
Board on Behavioral, Cognitive, and Sensory Sciences, Division of Behavioral
and Social Sciences and Education. National Academies Press, Washington,
DC.
National Research Council. (2011). Threatening Communications and Behav-
ior: Perspectives on the Pursuit of Public Figures. Board on Behavioral, Cog-
nitive, and Sensory Sciences, Division of Behavioral and Social Sciences and
Education. National Academies Press, Washington, DC.
National Science and Technology Council, Subcommittee on Social, Behavioral
and Economic Sciences. Executive Office of the President of the United States.
(2009). Social, Behavioral and Economic Research in the Federal Context. Jan-
uary 2009.
Nisbett, Richard E. (2003). The Geography of Thought: How Asians and West-
erners Think Differently... And Why. Free Press.
Pohl, Rüdiger F. (2004). Cognitive Illusions: A Handbook on Fallacies and Bi-
ases in Thinking, Judgement and Memory, Psychology Press, Hove, UK, 215-
234.
Rubin, P. (2003). ‘‘Introduction.’’ In S. L. Cutter, D. B. Richardson, & T. J.
Wilbanks (Eds.), The Geographical Dimensions of Terrorism. Routledge, New
York.
Rubin, P. and Wanchisen, B. (2011). ‘‘Introduction.’’ In National Research
Council, Threatening Communications and Behavior: Perspectives on the Pur-
suit of Public Figures. National Academies Press, Washington, DC.
Russell, James A., Bachorowski, Jo-Anne, and Fernández-Dols, José-Miguel.
(2003). Facial and vocal expressions of emotion. Annual Review of Psychology,
54, 329-349.
Thompson, Suzanne C. (1999). Illusions of control: How we overestimate our
personal influence. Current Directions in Psychological Science, 8(6), 187-190.
United States Department of Health and Human Services (HHS). (2009). Code
of Federal Regulations. Human Subjects Research (45 CFR 46). (See: http://
www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.html )
United States Government Accountability Office (GAO). (2010). Aviation Secu-
rity: Efforts to Validate TSA’s Passenger Screening Behavior Detection Pro-
gram Underway, but Opportunities Exist to Strengthen Validation and Ad-
dress Operational Challenges. GAO-10-763, May 2010, Washington, DC.
Wallach, Wendell and Allen, Colin. (2010). Moral Machines: Teaching robots
right from wrong. Oxford University Press, New York.
Weinberger, Sharon. (2010). Airport security: Intent to deceive? Nature, 465,
412-415.
Wells, Gary L. & Quinlivan, Deah S. (2009). Suggestive Eyewitness Identifica-
tion Procedures and the Supreme Court’s Reliability Test in Light of Eye-
witness Science: 30 Years Later. Law & Human Behavior, 33, 1-24.
Widen, S. C., Christy, A. M., Hewett, K., and Russell, J. A. (in press). Do pro-
posed facial expressions of contempt, shame, embarrassment, and compassion
communicate the predicted emotion? Cognition & Emotion, in press, 1-9.
Chairman BROUN. Thank you, Dr. Rubin. And I want to express
my appreciation for your being here. I know you have had some re-
cent challenges and I greatly appreciate you being here in spite of
those. So thank you very much.
Dr. RUBIN. Thank you.
Chairman BROUN. And I want to thank all the panel for your tes-
timony. Reminding Members that the Committee rules limit ques-
tioning to five minutes. The Chair at this point will open the round
of questions and the Chair recognizes himself for five minutes.
Mr. Willis, when can we expect the SPOT validation report?
Mr. WILLIS. The report was delivered to me by AIR last night.
It is being submitted through DHS’s review and release distribu-
tion process. I am not exactly sure what that time is or when it
is ultimately disseminated. I can certainly get that information for
you, sir.
Chairman BROUN. I would appreciate getting that report to us as
quickly as possible.
Mr. WILLIS. Yes, sir.
Chairman BROUN. What additional steps have to be taken before
we get the report?
Mr. WILLIS. I don’t know what DHS’s distribution process en-
tails. I know that I will be submitting it this morning following my
participation here.
Chairman BROUN. Do you have any problems in releasing the
preliminary results?
Mr. WILLIS. I don’t know what DHS’s policy is on that, but I am
happy to provide whatever is consistent with DHS’s S&T’s policy
on release.
Chairman BROUN. I understand that the results, I assume, are
still preliminary. There appears to be a discrepancy in SPOT’s
success rate. In your testimony you state ‘‘the study did indicate
that a high-risk traveler is nine times more likely to be identified
using Operational SPOT versus random screening.’’ Yet when you
met with the staff from the I&O Subcommittee on March 3 you
said that the SPOT program was 50 times more effective than ran-
dom screening. One of our other witnesses, Dr. Ekman, also makes
a similar claim in his testimony saying ‘‘malfeasance, felons, smug-
glers, et cetera, identified more than 50 times as often by those se-
lected by SPOT.’’ Can you please explain the discrepancy?
Mr. WILLIS. Well, there shouldn’t be a discrepancy. We use four
metrics by which to evaluate SPOT. The first one was the posses-
sion of illegal or prohibited items. The second one was possession
of fraudulent documents. The third was LEO arrest, law enforce-
ment arrest. And the fourth was a combination thereof. The LEO
arrest has the higher number that you referred to in your question,
sir.
Chairman BROUN. The 50 times?
Mr. WILLIS. Yes, sir. The possession of prohibited items and
fraudulent documents is approximately four and a half times, and
if one combines all of them, it is nine times.
Chairman BROUN. Are those that were identified—how many of
those were actually convicted?
Mr. WILLIS. Sir, I would have no idea. Our effort stops at wheth-
er an arrest is recorded or not, and that is the information that
is available through the SPOT database. It doesn’t go beyond that.
Chairman BROUN. Do you have any data about false negatives?
I mean false positives?
Mr. WILLIS. On?
Chairman BROUN. On the people that have been identified at the
50 times or 9 times or 4–1/2 times?
Mr. WILLIS. Are you talking about the false positive associated
with arrests?
Chairman BROUN. No, with arrest or—yes, sir, with arrest and
with prosecution—the ultimate prosecution, et cetera.
Mr. WILLIS. Yes, sir. We do have information available on that.
So for example, if one looks at the false positive index, which is for
every person that you correctly classify as a high-risk traveler,
what is the number of travelers you misclassify? We have that in-
formation on any of the four metrics that we discussed. And so for
example, combined outcome for every person that you correctly
identify using Operational SPOT, 86 were misidentified. For the
base rate or random study, for every person that you correctly iden-
tify, 794 were misidentified.
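[The arithmetic behind the figures cited in this exchange can be reconstructed
as follows; the passenger counts in this minimal sketch are hypothetical place-
holders chosen only to reproduce the quoted ratios, not the validation study’s
data:]

    # Minimal sketch of the arithmetic behind the quoted figures. All counts
    # are hypothetical placeholders chosen to reproduce the ratios, NOT the
    # actual validation study data.

    def hit_rate(true_pos, screened):
        # Correctly identified high-risk travelers per passenger screened.
        return true_pos / screened

    def false_positive_index(false_pos, true_pos):
        # Misidentified travelers per correctly identified high-risk traveler.
        return false_pos / true_pos

    spot = {"screened": 50_000, "true_pos": 10, "false_pos": 860}     # hypothetical
    rand = {"screened": 450_000, "true_pos": 10, "false_pos": 7_940}  # hypothetical

    print(hit_rate(spot["true_pos"], spot["screened"]) /
          hit_rate(rand["true_pos"], rand["screened"]))               # 9.0x
    print(false_positive_index(spot["false_pos"], spot["true_pos"]))  # 86.0
    print(false_positive_index(rand["false_pos"], rand["true_pos"]))  # 794.0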
Chairman BROUN. Wow. SPOT was initially developed and in-
tended to stop terrorism. That is the whole point of it. Now, we see
that the program has expanded to include criminal activity. Why
was this done?
Mr. WILLIS. You are asking a question about the mission. I am
from Science and Technology, sir. I am unable to answer that. May
I refer you to TSA?
Chairman BROUN. Well, that is the reason TSA should be here
and the reason that I think Ms. Edwards and I are both extremely
disappointed that they are not here.
Mr. WILLIS. I could, sir, talk to you about why we use metrics
that deal more with criminal activity than with terrorism.
Chairman BROUN. That would be sufficient—or helpful.
Mr. WILLIS. Sure.
Chairman BROUN. You have got a few seconds, so go ahead.
Mr. WILLIS. Okay, sir.
Chairman BROUN. My time is out.
Mr. WILLIS. The reason we use those metrics that we had just
listed, sir, was because they were available to us through the data
in sufficient numbers to analyze, even though they themselves are
low base rate or extremely rare. And data directly dealing with ter-
rorism is unavailable and, thus, can’t be used as a metric.
Chairman BROUN. Okay. My time is up. Ms. Edwards.
Ms. EDWARDS. Thank you, Mr. Chairman. And as I mentioned
earlier, I am disappointed that TSA isn’t here because I think that
there are a number of questions that actually go to things like
training protocols and other aspects of the SPOT program that they
would have, you know, really useful information to share and so I
look forward to working with the Chairman and the Committee.
This question about who needs to appear or not is not a decision,
really, for the Administration. Congress determines, under its Con-
stitutional authority, who appears before the Committees and what
the jurisdiction is. So I do share that concern.
I want to go to this question, though, of profiling——
Chairman BROUN. Does the gentlelady yield?
Ms. EDWARDS. Yes.
Chairman BROUN. I appreciate your comment. You took up about
almost a minute with that and I would like to give you an extra
minute on top of that, so I don’t want to charge you that time.
Ms. EDWARDS. I appreciate that, Mr. Chairman.
Chairman BROUN. So I will give you the extra minute. So if you
all would start her clock again, please.
Ms. EDWARDS. Thank you. Thank you again, Mr. Chairman. I
have a question, really, that goes to this issue of profiling. I mean,
as an African American woman who sometimes, because I have
short hair and I get cold, I wear a scarf on my head and that is
true in the airports especially. I have had the experience of actu-
ally being pulled over, questioned, and it hasn’t just happened once
or twice. It has actually happened multiple times. And, you know,
I don’t want to make any speculation about that, but it does raise
the question of who is identifying me and how and what I am send-
ing off.
I am also reminded in Dr. Hartwig’s testimony that, you know,
I remember when I broke a lamp and I tried to glue it together and
my mother walked in and she said what did you do? And I suspect
that part of the reason that she could say that and she knew—and
then I proceeded to tell her a lie, but I suspect that part of the rea-
son that she knew I was lying is because she knew me and because
she had had experience with me and because she had read my both
verbal and nonverbal cues many times over, which gave her a
much better indication of when I was doing truth-telling and when
I wasn’t.
We don’t have that experience in our airports, and so I have a
question for Lieutenant DiDomenica, and that is whether it is pos-
sible to train officers of all kinds not to engage in profiling? And
I have done police training, law enforcement training as well, and
I think it is tough to train out culture, culture in the sense of a
police culture and a law enforcement culture where you have to
train against type when it comes to these issues. And so I am curi-
ous, Lieutenant DiDomenica, if you can share with us whether it
is possible to train officers not to engage in profiling?
Mr. DIDOMENICA. I believe it is so and I have been training in
biased policing and racial profiling for over a decade now. Prin-
cipally, with the state police I designed statewide programs for the
Massachusetts police community on racial profiling, biased polic-
ing, and it is possible to make people aware of their own uncon-
scious bias and tendency to want to make snap decisions about peo-
ple based on very superficial things. We all have this hardware, it
is a survival instinct, and when we look at somebody, we are auto-
matically making an opinion about them. And a lot of it has to do
with our background and cultural influence, and a lot of those are
negative. But, you know, this part of your brain is about survival,
and it wants to understand what is going on very quickly. And it
actually gets a jump on your conscious awareness. So right away
when I walked in here and you saw me and I saw you, we made
a decision about each other before we were even consciously aware
of who we were and what we are. And that is going on all the time.
And this is the source of bias.
Now, knowing that I can’t stop my feelings about someone based
on how they look, that initial survival reaction about whether the
person might be dangerous or not, but I can take a few seconds,
maybe minutes, to think about, you know, what is going on, what
do I know objectively, and maybe even do some race transposition.
If this person was another race, you know, how would I feel about
the situation? And then I can make a decision. So it takes self-
awareness. It takes training. It takes the ability—the willingness—to change
and monitor yourself. But it can be done.
One of the foundations of the behavior assessment training I
have done and what I initially gave to TSA for the SPOT program
is you have to address bias and racial profiling. In fact, I call it—
you know, it was—to me it was an antidote to racial profiling——
Ms. EDWARDS. Lieutenant DiDomenica, I would love to hear but
I just have just a minute and a half left and I wanted to get to—
I appreciate your answer. I wanted to get to Dr. Ekman because
I have to tell you, you have been unnerving me the entire time I
have been in here and I am sure we have been reading those cues.
And I wonder if you have something to share with us on this issue
of whether you can train against those kind of—what could be neg-
ative instincts in one context but train them to be positive factors
in recognizing behavior?
Dr. EKMAN. Yes. And thanks for the opportunity to respond to
that. I wanted to quickly put in that we did research years ago that
show that the better you knew someone, the worse you were in
identifying when they lied to you because you are biased. If they
are your friend, your spouse, et cetera, you don’t want to discover
that they are lying. Strangers do better than close people.
But the issue is monitoring—building into the SPOT program
some monitoring to discover the actual incidents of racial profiling.
And my bet is that some people show a lot more of it than others.
Not everybody can learn everything. Not everybody can unlearn ev-
erything. What we want as BDOs are the people who have the
flexibility of mind to benefit from that training and not be suscep-
tible to racial profiling. How can we find out? It is not rocket science.
It is by having unannounced observers checking on who it is they
pay attention to and finding out whether there are some people
who are repeatedly showing racial profiling. And you either reedu-
cate or you reassign them to a different job.
Ms. EDWARDS. Thank you, Dr. Ekman, and thanks for your in-
dulgence, Mr. Chairman.
Chairman BROUN. You know, we will always be friends and I will
always give you some variances on the time so I am not going to
be worried about that at all.
Dr. Benishek, you are up next for your questions. Go ahead, sir.
Mr. BENISHEK. Thank you, Mr. Chairman. Thanks to the panel,
as well, for being here.
It is our job here to try to spend the money of the taxpayer the
most efficacious way and listening to the testimony here, it is real-
ly difficult for me to determine whether this SPOT process is accu-
rate or not. But I would like to address Mr. DiDomenica about the
process a little bit more. From your comments today it seems as
if there is some doubt, I mean, even after the BDO sees some kind
of behavior, then what is the process after that? If there is someone
there, it sounds as if you have some doubt as to the next step as
to what is happening, the next screening step. Are those people not
trained in the same thing? I mean I would hate to see somebody
get missed. So I would like to know more about the exact process
from the moment that the person gets taken out of the queue. Is
that effective? Is it—are we doing any good? Are we missing peo-
ple? I mean, this is the kind of thing I think you brought up in
your testimony.
Mr. DIDOMENICA. I think it is effective and I also think we are
missing people, but I think that could be improved. The process ac-
tually starts with an observation that may indicate a person that
is high-risk, that maybe should not get on that airplane or get onto
that train or into that government building, whatever the critical
infrastructure is. And based on the evaluation, this SPOT scoring,
which I really can’t go into because that is, you know, that is sen-
sitive information.
But there are two levels, and one is more screening, and one is
a law enforcement response. So for the people deemed to be the
most high-risk, the protocol is to invite or call a law enforcement
officer to do a follow-up interview. Now, this follow-up interview is
the opportunity to address the false positives, because a lot of peo-
ple that exhibit the behaviors that may indicate possible terrorist
intent or criminal intent are just people that are upset or dis-
tracted or late for work or going to a funeral, whatever it is, that
maybe a lot of people just get on the radar. And this interview,
which really only takes a couple of minutes to do, is the oppor-
tunity to resolve that so you are not creating false positives. And
it is also an opportunity to determine if you have got the real
thing, that this person is high-risk. And so that is another skill. I
mean that is the interview skill, which is another part of this proc-
ess. So there are——
Mr. BENISHEK. Are those people skilled enough in your opinion?
Mr. DIDOMENICA. When you say ‘‘those people’’——
Mr. BENISHEK. The people—the secondary person. Are there
enough of those people?
Mr. DIDOMENICA. I think the responsibility ultimately falls on
police officers when there is a high-risk person. I think they are ca-
pable. Every day they are making decisions around this country
whether to arrest somebody, not to arrest somebody, use lethal
force in some cases, deny people their freedoms, and so I don’t
think it is too much to ask them to make a decision, is this person
a high-risk person and do we need to slow down the process to fig-
ure out what is going on? I think they are capable of doing it. We
are doing it—whether this program gets funded or not, cops are
making these decisions every day. But I would like to see them get
more training and more support to make them better at what they
do. And this program has that potential.
Mr. BENISHEK. All right. Thank you. I don’t know where we are
at with the time, but I will yield back the remainder of my time,
if any.
Chairman BROUN. Thank you, Doctor. I just want to say your
questioning just shows further why TSA should be here so that we
could answer those questions, because if they were, then you could
direct it to the TSA individuals and it would be very instructive to
the whole Committee, Democrats and Republicans alike, and help
us to go forward.
The next person on the agenda is my friend, Mr. McNerney. You
are recognized for five minutes.
Mr. MCNERNEY. Thank you. And I appreciate you calling this
hearing. It is interesting. I have watched ‘‘Lie to Me’’ on occasion
and I find it is compelling but not too scientific in my opinion. But
it is good for us to examine this issue and see how much utility
there can be from it and how much money should be expended to
find that utility.
Dr. Hartwig, I think I heard you say—and you can correct me
if I am wrong—that you fail to see how knowledge of the indicators
could be useful.
Dr. HARTWIG. I think that is, again, an empirical question. There
isn’t enough research on—well, there is a lot of research on de-
meanor cues, but as far as I know, there is no study that tests
whether knowledge about, for example, micro-expressions helps peo-
ple not display them. But that would be a second step. It would be
a good first step to establish that these expressions occur reliably.
Mr. MCNERNEY. Okay, and I was——
Dr. HARTWIG. So countermeasures come second.
Mr. MCNERNEY. Okay. Thank you, Dr. Hartwig. And I was going
to follow up with you, Dr. Ekman, to basically say would you agree
that knowledge of those indicators would also be useful to potential
wrongdoers?
Dr. EKMAN. We don’t know. I mean you are basically asking the
question in polygraph terms is could you develop countermeasures?
Mr. MCNERNEY. Right. Right.
Dr. EKMAN. A proposal I put in to the government to find out—
I mean I have reason to believe that the Chinese know the answer
because they were sending me questions that you would want to
prepare on if you were going to do a training study to see whether
you could inhibit people from showing not just micro-expressions
but there are dozens of items on that checklist. The—our govern-
ment has not decided that it is worth finding out whether you can
beat the system. Other governments are finding out and may be se-
lecting people who can and training them so they can. We just
don’t know. We know about the polygraph. We know counter-
measures are quite successful. We know about some verbal means
and we know they are quite successful.
If I can have a moment more, sir.
Mr. MCNERNEY. Yeah, go ahead.
Dr. EKMAN. You heard some complete contradictions between Dr.
Hartwig and myself. I think if you look carefully at the literature,
you would find that it comes out supporting me. But how can you
know? And what I think you need to do, when you get a disagreement
among scientists, is establish an advisory panel of ex-
perts, who have no vested interest and no connections, to hear from
the people who disagree and look at the literature and resolve it
because you are really being given, in this testimony, advice that
is 180 degrees opposite in terms of is there a scientific basis for
what is being done?
But you could argue—and I don’t know whether Mr. Willis
would—that if this validity study holds up to scientific scrutiny, to
everyone who has looked at it, to this Committee, if it is as success-
ful as the report is, you have got to be doing something right to
get that kind of success. So maybe it——
Mr. MCNERNEY. It——
Dr. EKMAN. —is of scientific interest to find out.
Mr. MCNERNEY. Thank you, Dr. Ekman. Mr. Lord is chomping
at the bit here. Go ahead.
Mr. LORD. I would like to respond to Dr. Ekman’s point. In fact,
that was the key recommendation of our May 2010 report was to
have an independent panel review the results of this current AIR
validation effort. We think it is very important for a panel to be
established that has no ties to the current program, that is not an
advocate of the current program, to help weigh in on this very
issue. I think it is very interesting that the panel today shows a
lack of consensus, which was the basic point I made in my earlier
statement. There is no scientific consensus——
Mr. MCNERNEY. Well, a subject like this you would expect to
be—a broad range of disagreements. Has a panel—like what you
are recommending—been suggested in one of the budgets or lined
out somewhere or is this something——
Mr. LORD. Yeah, DHS agreed to establish an independent panel
to review the methodology of the AIR validation effort, as well as
to review the final results, but as Mr. Willis indicated, the final re-
sults of this latest validation effort have only recently been sub-
mitted. I believe he said as of last night.
Mr. MCNERNEY. I think I have run out of time so I am going to
yield back.
Chairman BROUN. Mr. Hultgren, five minutes.
Mr. HULTGREN. Thank you, Doctor. Thank you all for being here.
I share the frustration with some of the others that TSA is not here
today. I am a new Member here at Congress, along with quite a
few others, and so have been traveling much more in the last 3
months than I have ever traveled in my life. In fact, just on Mon-
day, the trip out here, I had my first experience of the full treat-
ment by TSA out of O’Hare and it was interesting. Didn’t realize
that it involved turning your head and coughing, but I now know
that that does—is what it is. But, you know, it is important for us
to have these discussions again to protect our liberty and freedom,
while at the same time making sure that we have security. So I
do thank you for your role. What I am learning is that we have got
a lot more work to do and a lot more discussion that needs to take
place.
I just have a couple questions. Dr. Rubin, if I can address my
questions to you if that would be all right. Much has been made
about the science and research behind the ability for an indi-
vidual—or in this case, BDO—to detect emotion, deceit, and intent
in another individual based on a combination of verbal and non-
verbal and micro-facial expressions. I wonder, speaking broadly
and keeping it as simple as you can for those of us laymen, could
you just tell us the state of the science as it relates to the detection
of emotion, deceit, and intent of behavioral cues?
Dr. RUBIN. Yes. In general I guess I would agree with Dr. Ekman
in the sense that we are at the point where there are two things
going on. If you look at something like voice stress analysis and
look at the meta-analysis done by Sujeeta Bhatt and Susan Bran-
don coming out of the Defense Department, what you basically see
in most of these studies is that the results are no different than
chance. Agreeing with both Dr. Hartwig and Ekman, there is a lot
of controversy here and there is very little real science and valida-
tion.
And it is not just that field evaluation when you can’t do it.
Again, there has been a committee established on the SPOT Pro-
gram regarding the report. I am on that committee. And we have
not been asked to do any overall scientific validation for the pro-
gram, just to look at one particular thing, are the results different
than chance? So I am agreeing here that what is really needed on
these issues, before we continue to invest more money, is to really
establish, without putting any information at risk, a baseline about
what is doable, what is not doable, what is known, and what is not.
So this is the classic issue of do you test first and then field a
product or project? Or field it and test? And in this particular in-
stance, considering the investment, considering the intrusion on
people’s privacy, I think it is absolutely time to be testing, vali-
dating, and scientifically exploring these things now before we con-
tinue to do significant investment. I am not saying we shouldn’t
continue the program. I think it is important. But right now we
need to establish on some of the known kind of things that we are
doing without giving anything away. Is there good science behind
it? Otherwise, we are simply throwing money down the drain.
Mr. HULTGREN. I think kind of following up on that, one of the
concerns that operators have is that behavioral science is not dis-
missed because there are issues dealing with the validation of spe-
cific cues. Can you speak for a moment on the importance of behav-
ioral science in counterterrorism context and then what its limita-
tions are, what its strengths are as far as our work for
counterterrorism?
Dr. RUBIN. Okay. Well, we are changing the topic a little bit be-
cause we are moving to counterterrorism. I think that the behav-
ioral work is broad in counterterrorism. I think it is extremely im-
portant. Again, when we get to counterterrorism, you are broad-
ening your argument out because you get to analysts. There has
been an excellent report from an NRC Committee chaired by Baruch
Fischhoff. There is a lot that is known.
And again, we touched on some of this and a number of the pan-
elists did. You are starting to get involved in behavioral issues of
attitude, of biases. Some of this was described in the original intel-
ligence work of Richards Heuer on cognitive biases. There is a lot
that we know. The issue becomes structural and organizational.
Consider two things. What do we know? And what don’t we
know? With the stuff that we do know, how do we make sure it
is being most effectively used by the intelligence community and by
whomever else needs to use it on those issues where we are not en-
tirely clear? Where things are uncertain or controversial, how can
we move ahead? And then there are emerging technologies that we
are going to start to be seeing used. We see some of them in terms
of the kind of devices like x-ray, but things like neuroimaging, re-
mote imaging, and sensing of other things. That is where I was
speaking of the seduction of technology. I support that stuff great-
ly, but we need to make sure on stuff that is new and emerging
that we also get a handle on it.
So I think the behavioral tools and technologies are growing rap-
idly and are extremely important, but I think we are not
developing a comprehensive approach to appropriately evaluating
them before deploying them in the field.
Mr. HULTGREN. I see my time is up. I do want to thank you all
for being here. I do feel like this is a start of a discussion that we
need to continue, so I appreciate so much all of you being here. I
also would ask for any advice on any micro-facial expressions I might
have so I don’t have to go through that examination again. That
would be helpful. So pass that along to me. Thank you.
Chairman BROUN. Thank you, Mr. Hultgren. I ask unanimous
consent that the gentleman from Florida, Mr. Mica, be allowed to
sit on the dais with the Committee and participate in the hearing.
Hearing none, so ordered. Mr. Mica, you are recognized for five
minutes.
Mr. MICA. Well, thank you. And first of all, thank you, Mr.
Chairman, Mr. Broun, and Ranking Member Edwards and other
Members of the panel.
I have great interest in the subject that you have before you. As
you may know, I was involved in the creation of TSA when I
chaired the Aviation Subcommittee in 2001, and for some six years
after that watched its evolution.
First, I might say that I am absolutely distraught that your Sub-
committee would be denied by TSA the opportunity for them to be
here and possibly learn something or participate. I don’t want you
to feel like they are just ignoring you. They have ignored our Com-
mittee and others, so they have a history of this. And I will work
with you and others. In fact, I think we need to convene a panel
of Chairs of various Committees and somehow rein this Agency in.
And it has an important mission. I am just stunned, again, that
they would not have someone at least to hear from the excellent
panel of witnesses you have had here today, particularly when they
come and ask for more money.
Let me just tell you my involvement with the SPOT program,
again, as Chair of the Committee that created it. I followed TSA
in its successes and failures and we have deployed a lot of expen-
sive technology out there, and unfortunately, the technology does
not do a very good job and the personnel failure performance rate
is just off the charts.
And if you haven’t had the classified briefing on the latest tech-
nology, which are both the backscatter and the millimeter wave, I
urge you to do that. I had GAO review that in December of last
year and then the pat-down, which was sort of their backup new
procedure, which they put in place the end of last year. And then
I had that reviewed by GAO in January. But that failure rate is
totally unacceptable.
The way we got started on SPOT is I found the technology lack-
ing in reports of performance both by screeners and the equipment
they used as leaving us vulnerable, particularly after the Hench-
men bombers. And I think we bought some puffer machines at the
time. I remember going up, having those tested. They didn’t work
but they promised me they would. They deployed them and they
didn’t work. So we needed something in place. We encouraged look-
ing at the Israeli model and you can’t really adopt the Israeli model
because they have a much smaller amount of traffic. We have 2/
3 to 3/4 of all the passenger traffic in the world and that is part
of America. You know, you get on a plane, you go where you want.
People just have a magic carpet through aviation in this country.
That is how we started this. I have observed their operations and
I can’t evaluate them. We had GAO evaluate them and you have
some representatives here to tell you that the failure rate is unac-
ceptable. It is almost a total failure. If it wasn’t for the money and per-
sonnel, maybe it wouldn’t matter, but they have got 3,300 SPOT
officers, I believe, in the program and they have got a quarter of
a billion dollars in expenditures and asking for more.
What I heard today is that, again, it doesn’t work. I had to leave
before I heard all the suggestions and I would look for——. Some of
the suggestions on the amount of time to do a verbal interview
would improve it, but maybe finding some way to get us to a num-
ber that we could have some exchange.
Ms. Edwards made some excellent points in her opening com-
ments, too, that we have got to have some way to improve this and
that unless there is some verbal exchange, I think that with
this standoff observation, we are wasting time, money, and re-
sources. So I don’t have a specific recommendation for the replace-
ment. I do know what is in place does not work. But I can’t tell
you how much I appreciate your Subcommittee taking time to re-
view this matter and try to seek a better approach, a better
science, and better application of something that is so important.
Because we are at risk. These people are determined to take us
out.
I just came from another meeting, the folks that developed both
backscatter and millimeter wave, which is two technologies we are
using, and the scary thing there is we had witnesses in one of the
other hearings that said that both of those technologies will not be
able to detect either body cavity or surgical implants. And we al-
ready see that they are always going one step ahead of whatever
we put in place. So we have got a failed system, we are spending
a lot of money on it, it is supposed to provide us with a backup.
The information we have and the review of the performance shows
that it is not doing that and it needs to be replaced or dramatically
revised if it is going to be effective in keeping us from this next set
of threats.
So those are my comments. I would ask that if you have sugges-
tions, we do have an FAA bill in which we can include some positive
suggestions. We couldn’t do that in the House side because of juris-
diction, but we can do it in conference and the door has already
been opened by the Senate. And I would love to hear recommenda-
tions from you and from those who participated today how we can
do it better. So thank you for allowing me to participate.
Chairman BROUN. Well, thank you, Chairman Mica. I appreciate
your being here and appreciate your comments. I can speak for Ms.
Edwards. We both are very concerned about national security. We
both are concerned about civil liberties. We both are concerned
about making sure that the flying public is safe and I ap-
preciate her input. And I hope that you will find some way that
maybe we will have those terrorist subjects that we can put in a
study so that maybe some kind of behavioral science could be de-
veloped to try to identify these folks.
We will go to our next round of questioning. So I will recognize
myself for five minutes for questioning. Even if SPOT is more than
nine times more effective than random, we still are talking about
very low base rates. Lieutenant DiDomenica states in his testi-
mony that the base rate for terrorism is .000000—I think one more
0—6. I hope I didn’t get too many zeros and did not leave one out.
Can any of the panelists help put that into perspective? Anybody?
Mr. Lord?
Mr. LORD. Sure. That statistic implies that acts of terrorism are
very rare events. That makes it very difficult to test the efficacy
of the program and develop, as we recommended in our report, per-
formance metrics to allow you to better judge whether the program
works as designed. But we don’t think that should deter you from
trying to craft what we would call proxy measures, other measures
that help you get at this at least indirectly. And we made that very
important recommendation, and TSA and DHS agreed to try to de-
velop these indicators.
There is one step we think they could take that would make this
exercise a lot more useful. Currently, they use a very long list of be-
haviors; the exact number and the characteristics are considered
sensitive security information. But we posed a question, how do
you know this is the right number? And they also assign point
scores to each of these behaviors. Again, the details are sensitive
security information. But there is one approach that we think
would make the program more useful in identifying potential acts
of terrorism: validate the point system, scrub the list of behaviors,
cull the list, and try to come up with something that is more re-
lated to an eventual arrest or a hostile act. And there are ways to
do that statistically.
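[One statistical route to the validation described above is sketched below: regress
recorded outcomes on the individual behavior indicators and cull those whose
weights carry no predictive value. The indicator names and data are synthetic
placeholders, not the SPOT behavior list, and scikit-learn is used only as one
convenient tool:]

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 5000
    # Synthetic 0/1 observations for three hypothetical behavior indicators.
    X = rng.integers(0, 2, size=(n, 3))
    # Synthetic outcome: only indicator_A is built to relate to an arrest here.
    logit = -4.0 + 1.5 * X[:, 0]
    y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

    model = LogisticRegression().fit(X, y)
    for name, coef in zip(["indicator_A", "indicator_B", "indicator_C"],
                          model.coef_[0]):
        print(f"{name}: weight {coef:+.2f}")
    # Indicators whose weights sit near zero add points without predicting the
    # outcome; a validation study could drop or re-weight such items.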
Chairman BROUN. Thank you, Mr. Lord. Anybody else? Mr. Wil-
lis, yes?
Mr. WILLIS. Thank you, Mr. Chairman. So first off, proxy meas-
ures are a standard part of research, especially in the area of ter-
rorism, because again, there are no direct measures in sufficient
quantities, typically, to use for terrorism. Criminal activity is often
used as a proxy measure. It is an accepted practice mainly because
when one is looking for terrorism or acts of terrorism in a lot of
transit areas, you are looking for somebody who is coming in to try
to use some false identification or you are looking for somebody
who is smuggling. And both of these things are represented in
higher numbers, even though they are still low base rate numbers
in criminal activity. And so that is why that is typically used and
used by other organizations as proxy measures. So I want to make
sure that we were comfortable that we had given forethought to
that and used what is a best practice for proxy measures, sir.
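[The base rate quoted earlier dominates any such screening arithmetic. The fol-
lowing minimal sketch applies Bayes’ rule using deliberately optimistic, invented
accuracy figures; the sensitivity and specificity are illustrative assumptions, not
measured SPOT values:]

    base_rate = 6e-7     # prevalence of threats among travelers (from testimony)
    sensitivity = 0.90   # hypothetical: fraction of true threats flagged
    specificity = 0.99   # hypothetical: only 1% of ordinary travelers flagged

    # Bayes' rule: P(threat | flagged)
    p_flag = sensitivity * base_rate + (1 - specificity) * (1 - base_rate)
    ppv = sensitivity * base_rate / p_flag
    print(f"P(threat | flagged) = {ppv:.1e}")                 # ~5.4e-05
    print(f"false alarms per true hit ~ {1 / ppv - 1:,.0f}")  # ~18,500

Even with these generous assumptions, nearly every flagged traveler is a false
alarm, which is why proxy measures and rigorous metrics matter so much here.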
Chairman BROUN. Dr. Ekman?
Dr. EKMAN. There are a number of organizations. I work with
airport security in England. I have seen the videos of the bombers
before they bombed. I have worked in Israel where they do a lot
of, of course, security. But even within our own government, the
different parts of DOD that deal with counterterrorism and the at-
tempts to identify terrorists in field military situations, there is no
sharing of information. There is a lot of information out there that
hasn’t been brought together. It is sensitive, but it needs to be
brought together and then with that database, take a look at what
is on the SPOT list. I haven’t seen what is on the SPOT list for
four years so I don’t know how it has changed and I don’t know
how it has been informed by research findings from our group and
other groups and from observations by Special Forces, by our coun-
terintelligence, by NYPD. There is a lot of information in this coun-
try in separate little pockets that hasn’t been brought together.
Chairman BROUN. Thank you. My time has expired for my
questioning. Now, I recognize the Ranking Member, Ms. Edwards,
for five minutes.
Ms. EDWARDS. Thank you, Mr. Chairman. I want to go to a ques-
tion that was raised by Mr. Mica’s comments when he was here.
And I just want to be clear that from the perspective of GAO and
the report and analysis that you have done, Mr. Lord, we don’t yet
know if the SPOT program is ‘‘a fiasco.’’ Isn’t that correct?
Mr. LORD. Yes, that is absolutely correct. Those were his words.
That is not in our vocabulary. Thank you.
Ms. EDWARDS. And just to be clear again, what metrics would you
use to determine its success or failure as an operational program?
Mr. LORD. Since we have identified several instances of terrorists
transiting through the U.S. aviation system, one option is to study
the videotapes of their movements. Are they, in fact, exhibiting signs
of stress? Or, as some literature suggests, do they not typically emote
much because they believe they are going on to a more blissful state?
It is unclear to us at this juncture whether there would be discernible
signs of stress or fear. But there is videotape evidence that would
allow you to get at that, and we think that would be invaluable in
fine-tuning the program.
Ms. EDWARDS. Yeah, I highlighted that in your testimony because
there are a number of examples that we have. And I wonder, Mr.
Willis, has DHS made an attempt to pull together video evidence,
not just here in the United States but with our international partners,
and to do some kind of assessment stacked up against the screening
techniques that have been identified, to see whether we are on target?
It is an awful lot of money to spend without, you know, putting it
up against real-time data.
Mr. WILLIS. Thank you. Again, I represent DHS Science and
Technology, not the operational community. From a——
Ms. EDWARDS. This is a science question.
Mr. WILLIS. Yeah, from a Science and Technology perspective, we
are attempting to locate video of terrorist threats in other coun-
tries, as well as within the United States. And it is very difficult
to try to get access to that information or to successfully get access
to that video. And so if——
Ms. EDWARDS. Well, part of the reason that we pulled DHS together
is because it is, you know, a collection of all of our security and
investigative interests under one house, to work with our international
partners. And so it is a little staggering to me to know that you have
not had the capacity, in now a decade, to look at video and use it to
make an analysis about whether the techniques that you seem to be
employing would be successful. I mean, that seems to me a basic
scientific question, and DHS should be in a position, with our partners
internationally and here in the United States, to get that video and,
you know, conduct some real scientific analysis of it. So I would urge
DHS to consider that.
I want to go to Dr. Hartwig for a minute, because in your testimony
you indicated that there are some other recommendations that you
might make, and I wonder if you could describe those to us very
briefly, because I don't think you had an opportunity here in your
testimony.
Dr. HARTWIG. Right. I think it is roughly captured by what Mr.
Mica said before he left: that it is important to engage a person in
conversation to elicit cues to deception. Overall, the research shows
that statements carry some cues to deception. And there is also an
emerging wave of new research that focuses on how to create and
elicit cues to deception, because there is such an abundance of
research showing that people don't just automatically leak. So my
basic answer is some form of questioning protocol, some kind of brief
interview protocol, that is based on the scientific research on how to
elicit cues to deception and how to ask questions so that liars and
truth-tellers respond differently. I think that would be a worthwhile
enterprise.
Ms. EDWARDS. So you are not really saying—and this is a yes
or no—that we should scrap the program; you are saying that there
are areas where we need to significantly improve the techniques that
we are using to take us down a track of really being able to identify
potential terrorists?
Dr. HARTWIG. Yes. I think if effort were spent on the questioning
part of the program, that would put it much more in line with the
scientific research.
Ms. EDWARDS. Thank you. Thank you, Mr. Chairman.
Chairman BROUN. Thank you, Ms. Edwards. We have been
joined by the Congresswoman from Florida, Ms. Adams. You are
recognized for five minutes.
Mrs. ADAMS. Thank you, Mr. Chair. Mr. Willis, earlier you said
that there had been 71,000 referrals, and you made a distinction
about the behavior leading to arrest. How many of those were
arrested?
Mr. WILLIS. Of the 71,000?
Mrs. ADAMS. Yes.
Mr. WILLIS. That is the random selection method.
Mrs. ADAMS. Correct.
Mr. WILLIS. 71,000 were referred in the random selection. Nine
arrests were made.
Mrs. ADAMS. Nine?
Mr. WILLIS. Yes.
Mrs. ADAMS. And in the other method?
Mr. WILLIS. Using SPOT, a little over 23,000 were referred, and
151 were arrested.
Mrs. ADAMS. And the types of arrests?
Mr. WILLIS. I don’t have the nature of the arrests in the data
that we looked at, ma’am.
Mrs. ADAMS. So it could have been belligerency or any other
thing for that matter?
Mr. WILLIS. Some of them were for prohibited items that were
on them at the time. Others could have been for outstanding
warrants or something of that nature, ma'am.
Mrs. ADAMS. Do you think that I have an appearance that would
make me a target for SPOT? I mean, every time I go through the
airport I get pulled aside and searched. And the reason I ask that is
because, you know, as a past law enforcement officer with training,
I have some concerns about the way you are identifying and pulling
people aside. Dr. Hartwig, you said you wanted—you thought the
program would work if more tools were available. Would it be better
to use a validated system as opposed to one that is untested and
invalidated?
Dr. HARTWIG. Well, first of all, I didn't say that the program
would work. I was talking about where I think more emphasis should
be put.
Mrs. ADAMS. So even with more emphasis, do you believe that
it would work?
Dr. HARTWIG. I don't know. I think we would need a properly
conducted study to find that out. And I think it would be important
to go beyond examining the arrest rates, to look at the actual
behaviors that are displayed by the people who are arrested, and to
compare those behaviors with those on the list of cues. I don't know
what those cues are, because the list is not available. And we should
look at whether the SPOT criteria are actual indicators. So I think
we definitely need to know whether it works or not.
Mrs. ADAMS. Mr. DiDomenica, you are a law enforcement officer.
I am a past law enforcement officer. Do you believe that the TSA
employees have enough training, and the skill sets based on the
training they are receiving, to provide this type of screening at this
level?
Mr. DIDOMENICA. I think with proper follow-up by trained law
enforcement, they do. But we need the proper follow-up by police
officers to figure out what is going on, because a behavior referral is
just like an alarm. It is like going through the magnetometer and it
beeps. Well, what does that mean? So someone comes over and pats
you down. The police follow-up is like the pat-down: all right, why
did this beep? And so if you have that level of follow-up by trained
law enforcement, I am comfortable with the training they receive.
But without that level of follow-up, I am not comfortable.
Mrs. ADAMS. So would it be your opinion that there needs to be
more training?
Mr. DIDOMENICA. Yes.
Mrs. ADAMS. I yield back.
Chairman BROUN. Thank you, Ms. Adams. Mr. Willis, I have got
another question for you. Does TSA plan to use R and D to improve
the SPOT program or does it believe the program cannot be im-
proved upon?
Mr. WILLIS. We do have some ongoing research with them, and,
if I may say, this is one of the first research elements that we have
with TSA, sir; in fact, it was started in 2007, prior to GAO's interest.
Its focus is specific: it is not to evaluate absolutely everything going
on with SPOT, which is a huge tasking that we are not tasked or
resourced to do, but to look at the indicators, the checklist itself,
the existing checklist.
The first question that needs to be asked from a scientific perspective
is whether the checklist, as it is currently put together and as it is
currently deployed, accomplishes its mission. You would like to be
able to compare that against random selection and against something
else that has been shown to be valid, but the fact is that there isn't
another behavior-based screening program out there, employed by any
other group that we are aware of, either in the United States or
abroad, that has been statistically validated. And so we have not been
able to address that. So we compared this against random selection,
which is the first scientific baseline.
Chairman BROUN. So TSA is doing research?
Mr. WILLIS. We are doing research that supports TSA.
Chairman BROUN. Ms. Edwards, do you have another question?
Ms. EDWARDS. I do, thank you, Mr. Chairman. I just want to
follow up with you, Mr. Willis, because I am confused. My understanding
is that you shared with our staff that there is a pool of video available
of suicide bombers and the like that could be used for study. And I
would expect that, if TSA were operating the right kind of way, that
would also be used for training. And so I am a little confused by your
answer, and I just want to be clear: do we have video, both our own
and perhaps our international partners', that we could use to assess
the techniques that have been developed and the assessment questions
that have been developed, so that we can make sure we have a program
that is working as effectively as we know it can work?
Mr. WILLIS. We don't presently have a sufficient number of videos
to conduct scientific analysis on. S&T is attempting to work with our
partners in the United States and internationally to gather these,
but, being a resource organization, we do not have the ability to
compel operational organizations, much less international ones, to
provide us with that video. What we are doing is attempting to
continue to collect that as best we can, as well as to conduct other
kinds of supporting work, such as interviews of direct eyewitnesses
to suicide bombings and of international subject matter experts in the
area. The aim is to go beyond what the current validation study
covered, which is the existing indicators, and to try to help establish
from a scientific perspective what is being used operationally abroad
and, in fact, what is being witnessed by, again, eyewitnesses and
subject matter experts, so that we may be able to then bring that
information back and test it to see——
Ms. EDWARDS. Is S&T doing that or TSA? Who——
Mr. WILLIS. That is S&T research, ma’am.
Ms. EDWARDS. Okay. And so I guess, for Drs. Hartwig and Ekman,
it would be useful, wouldn't it, to have a real data pool to be able
to assess that and to develop a research protocol that enabled us to
stack our assessment tools against it? And so my question, though,
for Mr. Willis: what agency do you think would be the responsible
one to get this pool together? Is it DHS? Is it TSA? Mr. Lord?
Mr. WILLIS. I don’t know the right organization for that.
Mr. LORD. In our report, we made 11 recommendations. One of
the recommendations was to use and study available video recordings
to help refine the SPOT program. In their formal Agency comments,
the Department indicated they agreed and were taking steps to do
that, so the Department is already on record as saying they agreed:
it is a good idea, and they are going to do it. They bought into this
idea. To what extent they have actually implemented it, we will have
to follow up and see. But just to clarify, DHS has bought into this
idea. They have already agreed to do it.
Ms. EDWARDS. Thank you. And then finally, Mr. Lord, since you
already have the microphone, DHS hasn’t done a cost/benefit anal-
ysis on the program or a risk assessment. And it is my under-
standing that they don’t do a great job actually—and I apologize
for the critique—of either conducting cost/benefit analyses or risk
assessments for many of their programs. How do we know if we
even need the program?
Mr. LORD. Well, typically, as part of our analysis, we would look
at the cost/benefit analysis or the risk assessment to study, number
one, how they made their decisions. For example, you need a risk
assessment, we would assume, to show where you needed to deploy
the program. It is at 161 airports, so our question was: how did you
establish this number? Did you have a risk assessment? And the
answer was no. They are in the process of ramping up the program
now. Every year, you know, the funding has increased. We assumed
that would be justified by a cost/benefit analysis. They don't have
one yet, although to their credit they have agreed to complete both
a risk assessment and a cost/benefit analysis. But traditionally, we
would expect to find those at program inception, not four or five
years after a program is deployed.
Ms. EDWARDS. Well, thank you all for your testimony. And Mr.
Chairman, I would just say for the record, it would be good to get
a cost/benefit analysis and risk assessment before we spend an-
other, you know, $20 million, $2 million, or $2 on the program.
Thank you very much.
Chairman BROUN. And I agree with you, Ms. Edwards. Ms.
Adams, you are recognized.
Mrs. ADAMS. Thank you, Mr. Chair. The program, Mr. Willis, has
been ongoing since 2007? Is that what I heard?
Mr. WILLIS. The validation research study has been ongoing
since 2007.
Mrs. ADAMS. A validation research study since 2007. And I heard
you say there was no system out there that you could use that was
validated or available, is that correct?
Mr. WILLIS. We are unaware of any behavior-based screening
program in use that has been rigorously validated, yes.
Mrs. ADAMS. What about Israel’s program?
Mr. WILLIS. We have not located any study that rigorously tests
that.
Mrs. ADAMS. Did they study it?
Mr. WILLIS. We are not provided any information——
Mrs. ADAMS. Did you ask?
Mr. WILLIS. Yes.
Mrs. ADAMS. And they have said they would not provide it?
Mr. WILLIS. We have not been—they didn’t say they wouldn’t
provide it.
Mrs. ADAMS. Okay. So maybe it is the way you asked for it? I am
trying to determine: since '07 you have been doing a study, we don't
have anything validated, you can't give us a cost/benefit analysis, and
we are four years out. And when you say there are no other programs
out there, there are some out there, I believe. Mr. DiDomenica, are
there programs out there?
Mr. DIDOMENICA. There are similar programs—excuse me—there
are similar programs for behavior assessment, principally for law
enforcement. I have been teaching BASS. There is a program approved
by DHS called Patriot. I have another training course called HIDE,
Hostile Intent Detection Evaluation. But these programs are given as,
perhaps, a few days of training, and then people go off and do their
thing. There is no follow-up on, in other words, how successful it is.
People, I think, are getting good ideas and good techniques, but it is
not done in a way where it can be measured and followed up on, and
I think that needs to be done.
Mrs. ADAMS. And these programs are all from DHS also?
Mr. DIDOMENICA. There is one that is approved. In other words,
it is approved for funding. And—but they are not DHS programs.
Mrs. ADAMS. Okay. So they are funded, but then they are kind of
sent out and there is no true follow-up. Is that what you are saying?
Mr. DIDOMENICA. Yeah, there is no collection of data about
successes or failures or effectiveness. It is like a lot of law enforcement
training, and you are probably aware of this: you go in for a class,
you sit there for a week, you get a certificate, and you walk out the
door, and that is the end of it. So I think, unfortunately, this just
falls in line with a lot of the training that is done. And I think for
this program, given what is at stake, we need to be better at how we
follow up on this.
Mrs. ADAMS. I know with my certification we had to go back for
training every so often or else we lost our certificate. So I can relate
to having to keep your training and your skills honed. I appreciate
that. No more questions, Mr. Chair.
Chairman BROUN. Thank you, Ms. Adams. I want to thank the
witnesses for being here today. I appreciate your testimony, and I
appreciate the Members and all the questions that we have had. This
is a very interesting topic. I am, again, very disappointed that TSA
has refused to come, because there are a lot of questions that I know
Ms. Edwards and I both would like to have asked TSA if they had
graced us with their presence. Hopefully we do not have to go down
the road of requiring them to be here in the future. But we will look
into that, and they will be here at some point, I hope voluntarily.
And I hope you will pass that along to the folks who are in the
position to make that decision.
Members of the Subcommittee may have additional questions for
the witnesses, and we ask that you respond to those in writing. The
record will remain open for two weeks for additional comments by
Members. The witnesses are excused, and the hearing is now
adjourned.
[Whereupon, at 12:00 p.m., the Subcommittee was adjourned.]
Appendix I

ANSWERS TO POST-HEARING QUESTIONS

Responses by Mr. Stephen Lord, Director, Homeland Security and Justice Issues,
Government Accountability Office
Responses by Mr. Larry Willis, Program Manager, Homeland Security Advanced
Research Projects Agency, Science and Technology Directorate,
Department of Homeland Security

Questions submitted by Chairman Paul C. Broun


Q1. Question: Does S&T’s evaluation seek to validate the underlying behavioral indi-
cators that form the basis of the SPOT program?
A1. Response: The scope of the study was to conduct an operational examination
of the existing indicators contained within the Screening of Passengers by Observation
Techniques (SPOT) Referral Report. The results of the study provide evidence
to support the criterion-related validity (classification accuracy) of the SPOT Referral
Report. In a comparison of Operational SPOT and random screening selection
outcomes, Operational SPOT was significantly more accurate in identifying high-risk
travelers, as defined by possession of serious prohibited and illegal items (weapons,
fraudulent documents, etc.) and law enforcement arrests. This finding was based upon
a comparison of Operational SPOT and random screening at 43 airports over a period
of nine months, covering over 23,000 Operational SPOT screenings and 70,000 random
screenings.
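[As a rough illustration of what "significantly more accurate" means on these base rates, the Python sketch below runs a two-proportion z-test on the arrest counts cited in the hearing testimony (151 arrests out of roughly 23,000 SPOT referrals versus 9 out of roughly 71,000 random selections). The choice of test and the rounded counts are illustrative assumptions, not the study's actual statistical methodology.]

    from math import sqrt

    # Arrest counts as cited in the hearing testimony (approximate).
    spot_arrests, spot_n = 151, 23_000    # SPOT-based referrals
    rand_arrests, rand_n = 9, 71_000      # random-selection referrals

    p1, p2 = spot_arrests / spot_n, rand_arrests / rand_n
    print(f"SPOT arrest rate:   {p1:.4%}")        # ~0.66%
    print(f"random arrest rate: {p2:.4%}")        # ~0.013%
    print(f"rate ratio:         {p1 / p2:.0f}x")  # ~52x

    # Two-proportion z-test under the pooled null hypothesis that both
    # selection methods share the same underlying arrest rate.
    pooled = (spot_arrests + rand_arrests) / (spot_n + rand_n)
    se = sqrt(pooled * (1 - pooled) * (1 / spot_n + 1 / rand_n))
    z = (p1 - p2) / se
    print(f"z statistic:        {z:.1f}")         # ~21, far beyond chance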
Q2. Question: For the purpose of the S&T study, you describe ‘high risk travelers’
as ‘‘those passengers in possession of serious prohibited and/or illegal items or
individuals engaging in conduct leading to an arrest.’’
a. Why is ‘terrorism’ not included in the definition of high risk travelers?
A2 (a.) Terrorists are identified traveling through airports too infrequently to support
the inclusion of terrorists as high-risk passengers in an empirical comparative analysis
of screening methodologies. In keeping with the best practice of developing proxy
measures, the Science and Technology Directorate's study defined high-risk travelers
using behaviors common to both terrorists and criminals, such as attempting to conceal
identity and smuggling potentially dangerous materials.
b. Has the definition of high risk travelers changed from when SPOT was first im-
plemented? If so, how?
A2 (b.) The definition has not changed.
Q3. At a recent Oversight and Government Reform hearing, TSA stated that it was
introducing training for screeners to put travelers at ease while going through
screening.
a. What impact would this, and other countermeasures employed by travelers, such
as training to hide indicators or anti-anxiety drugs, have on a BDO's ability to
identify an individual intending to cause harm?
A3 (a.) Screening of Passengers by Observation Techniques (SPOT) indicators are
based on the involuntary physical and physiological behaviors that occur when a
person has a fear of discovery. Research supports the conclusion that these behaviors
are difficult to counter. First, involuntary behaviors originate in an area of the brain
that individuals do not have control over; people cannot stop these behaviors from
occurring, but rather must try to mask or suppress them once they are triggered.
Second, nonverbal behavior is more complex and more difficult to control than
verbal communication, because there are many areas of nonverbal behavior an individual
needs to control, such as facial expression, posture, etc. Third, deception is
a cognitively demanding state, and this makes body movements even more difficult
to control, because people have less spare cognitive capacity when they are trying to lie.
Research has not yet examined how medication, surgery, disguise, or drugs affect
human behavior in these situations, and this research is needed by the scientific
community. Even though medication or drugs may suppress some behaviors and
body movements, they may produce other signals suggesting that the person has
taken such medication.
Q4. How does TSA ensure that BDOs are using indicators to screen passengers rath-
er than something more troublesome like profiling or racial bias?
A4. Behavior Detection Officers (BDOs) and candidates are trained to identify behaviors
and to work to resolve any suspicions based on the training protocols. The BDO
training distinguishes between subjective profiling and proven scientific methods.
They are specifically trained not to consider ethnicity, race, or other traits
that are not associated with behavior. Additionally, BDOs work in teams, which aids
integrity. Furthermore, the program office regularly performs Standardization
Visits with refresher training. Finally, the Screening of Passengers by Observation
Techniques (SPOT) Transportation Security Managers, who are the first-line supervisors
of the BDOs, are required to spend time on the floor monitoring the BDOs
to ensure they are applying the behaviors in accordance with the SPOT standard
operating procedures.
Q5 a. On what basis was the SPOT checklist of indicators selected?
A5 (a.) The behavioral indicators incorporated within Screening of Passengers by
Observation Techniques (SPOT) are based on both law enforcement experience and
the most recent scientific findings.
Additionally, the work of Dr. David Givens, Director of the Center for Nonverbal
Studies, was utilized in selecting the SPOT behaviors. Dr. Givens is recognized as
an expert in nonverbal behavior. Behaviors outlined in his Nonverbal Dictionary
were selected based on their relationship to stress, fear, and deception cues associ-
ated with the fear of discovery and integrated into the SPOT program.
Q5 b. Why doesn’t the S&T study evaluate the validity of the indicator list? Do you
believe this would be helpful?
A5 (b.) The Science and Technology Directorate's (S&T) study did directly evaluate
the indicator list as executed through the existing Screening of Passengers by
Observation Techniques (SPOT) Standard Operating Procedure (SOP).
Q6. According to the GAO report, S&T officials ‘‘agreed that SPOT was deployed be-
fore its scientific underpinnings were fully validated.’’ (p. 15). Additionally, in
discussing the S&T study, the GAO report states, ‘‘S&T’s current research plan
is not designed to fully validate whether behavior detection and appearances can
be effectively used to reliably identify individuals in an airport terminal environ-
ment who pose a risk to the aviation system.’’ (p. 20). Additionally, in the first
paragraph of Dr. Maria Hartwig’s written testimony, she says, ‘‘In brief, the ac-
cumulated body of scientific work on behavioral cues to deception does not pro-
vide support for the premise of the SPOT program. The empirical support for
the underpinnings of the program is weak at best, and the program suffers from
theoretical flaws.’’
a. Prior to implementing SPOT, why did TSA not validate the science behind the
program?
A6 (a.) Prior to the Transportation Security Administration’s Screening of Pas-
sengers by Observation Techniques (SPOT) program, no behavior-based program
had ever been rigorously scientifically validated. The program was established on
widely accepted principles supported by leading experts in the field of behavioral
science and law enforcement.
b. Why did the S&T validation study not validate ‘‘whether behavior detection and
appearances can be effectively used to reliably identify individuals in an airport
terminal environment who pose a risk to the aviation system?’’
A6 (b.) The Science and Technology Directorate (S&T) sponsored study did directly
examine the extent to which ‘‘behavior detection and appearances,’’ as represented
in the existing Screening of Passengers by Observation Techniques (SPOT) indicators,
can be effectively used to identify high-risk travelers, which is an examination
of classification accuracy (criterion-related validity). Results of the study found support
for criterion-related validity; that is, there is evidence that the SPOT indicators
are accurate in identifying outcomes and are significantly more accurate in doing so
than random screening.
c. How do you respond to Dr. Hartwig’s comment?
A6 (c.) During the recent testimony, Dr. Rubin responded to a similar question by
stating that the published research literature on the link between behavioral, phys-
iological, and verbal cues to deception and general suspicious behaviors is mixed,
rather than non-supportive as represented by Dr. Hartwig. The Science and Tech-
nology Directorate (S&T) agrees with Dr. Rubin’s assessment.
Q7. Who originated the SPOT program? Was it Carl Maccario, as Dr. Ekman states
in his written testimony, or was it Lieutenant DiDomenica, who says his BASS
program was the basis for SPOT?
A7. Response: After the terrorist attacks of 9/11, behavior recognition and analysis
concepts were adapted and modified by Massachusetts State Police (MSP) Troop F
(Lieutenant DiDomenica), assigned to Boston Logan International Airport (BOS).
Their program was modified to meet the legal, social, political, financial, and resource
limitations of the United States and was merged with drug interdiction techniques
used by United States law enforcement. MSP named this program the Behavior
Assessment Screening System (BASS) and trained all law enforcement officers
assigned to BOS in its use as an enhanced security measure complementing the newly
instituted security checkpoint screening system of the Transportation Security
Administration (TSA).
The Screening of Passengers by Observation Techniques (SPOT) program was developed
by TSA (Carl Maccario), with assistance from MSP, to meet TSA-specific security
and public service needs, with particular emphasis on protecting individual civil
rights and privacy and on mitigating possible complaints of racial profiling.
a. What role did the Israeli model play?
A7 (a.) The SPOT subject matter expert was initially trained in Israeli Behavior
Pattern Recognition (BPR). Many of the BPR concepts are contained in SPOT, such
as informally interacting with passengers waiting in line at the security checkpoint
queue.
b. What aspects of the Israeli model are based on behavioral science?
A7 (b.) TSA defers to the Government of Israel to respond as appropriate, as they
are the subject matter experts on their security model.
Q8. Dr. Ekman distinguishes his experiments from those of his critics by empha-
sizing that his focus is on ‘‘high stake lies, in which the person lying has a lot
to gain or lose by success or failure.’’ He specifically addresses the work con-
ducted by Dr. Hartwig, stating, ‘‘She has dealt with low-not-high-stake lies
which have little relevance to my work or to the situation faced in SPOT.’’ Con-
versely, Dr. Hartwig states, ‘‘Neither the research in general nor specific results
on high-stake lies support the assumption that liars leak cues to stress and emo-
tion, which can be used for the purposes of lie detection.’’
a. Given these opposing views, what is your assessment?
A8. As Dr. Rubin stated during his testimony, the published research literature is
mixed on the topic of behavioral, physiological, and verbal cues to deception and
general suspicious behaviors. Ideally, one might expect greater consensus and support
from the academic research base prior to fielding a screening program; however,
academic research alone is insufficient. Once a screening program is fielded,
regardless of how supportive the academic research base may be, prudent research
requires conducting operational experiments to validate the effectiveness of the
screening program and, if it is effective, then conducting additional research to
optimize its effectiveness. The reality is that behavior-based screening is currently
used operationally by DHS, the U.S. Department of Defense, the U.S. intelligence
community, law enforcement, and numerous other countries. Increased focus should
be applied to conducting field research on these programs.
Q9. Please indicate each and every research effort that the DHS Science & Tech-
nology Directorate (S&T) is conducting on behalf of the Transportation Security
Administration (TSA). This should include all efforts the S&T Directorate is
taking on behalf of TSA and not simply be limited to work that S&T is per-
forming regarding the TSA SPOT program.
Please include in this list the following information:
• The name of the TSA effort DHS S&T is supporting.
• The purpose of the S&T research or task.
• The amount of financial reimbursement S&T is receiving from TSA for each ef-
fort.
A9. The Science and Technology Directorate (S&T) partners with the Transpor-
tation Security Administration (TSA) on several research and development tasks.
Below are the projects and associated funding from FY 2010 reimbursed by TSA:
(NOTE: * indicates projects are funded by TSA and do not appear in S&T budget
documents)
Project Name: Secure Carton
Financial Reimbursement from TSA: N/A
Description: Develop (at the request of TSA and DHS Policy) a shipping carton
embedded with security sensors that detects tampering or opening of the carton
once closed. It is scalable and applicable across various shipping modalities, in-
cluding maritime and air cargo, and can communicate a tamper event of the in-
ternal cargo to a radio frequency identification reader, when interrogated. The
interaction with TSA has been to keep them informed of the project. S&T in-
tends to test the product for inclusion on the TSA qualified products list. Secure
Carton is a Phase III Small Business Innovation Research (SBIR) project; Phases I
and II were funded by the S&T SBIR Program, and Phase III was funded with S&T
Borders and Maritime Security Division FY09/10 project funds.
Project Name: Secure Wrap
Financial Reimbursement from TSA: N/A
Description: Secure Wrap is being developed for TSA and DHS Policy. It is a
flexible wrapping material that provides a visible indication of tamper evidence
and can be deployed with little to no change to current supply chain logistics and
processes. The interaction with TSA has been to keep them informed of the
project. S&T intends to test the product for inclusion on the TSA qualified prod-
ucts list. Secure Wrap is a Phase-II SBIR with all funding provided by DHS S&T
SBIR Program.
Project Name: Autonomous Rapid Facility Chemical Agent Monitor Project
Financial Reimbursement from TSA: N/A
Description: Develop a low-cost, fully autonomous chemical vapor monitor that
is intended to ‘‘detect-to-warn’’ of the presence of up to 17 chemical warfare
agents and high-priority toxic industrial chemicals within a single device, at both
immediately-dangerous-to-life-and-health and permissible-exposure-limit concentrations.
The monitor will be able to operate continuously in a closed or partially
enclosed facility 24 hours a day, 365 days a year.
Project Name: Chemical Security Analysis Center (CSAC) Project
Financial Reimbursement from TSA: N/A
Description: Develop and sustain expert reach-back capabilities to provide
rapid support in domestic emergencies. The CSAC serves as the Nation’s first
centralized repository of chemical threat information (hazard and characteriza-
tion data) for analysis of the Nation’s vulnerabilities to chemical agent attacks.
To ensure a cohesive effort to evaluate threats and countermeasures, CSAC con-
ducts key analytical assessments, such as material threat assessments (MTAs),
hazard assessments, and the Chemical Terrorism Risk Assessment (CTRA). The
DHS Office of Infrastructure Protection, Office of Health Affairs, TSA, and Intel-
ligence & Analysis are the primary DHS customers for CSAC products. CSAC
provides completed MTAs to Health and Human Services to fulfill BioShield re-
quirements.
Project Name: Model Large-Scale Toxic Chem Transport Release Project
Financial Reimbursement from TSA: $800,000
Description: Focus on developing an improved understanding of large-scale re-
leases of toxic inhalation hazards. Aspects of the project include improved mod-
eling, first responder procedures, and industrial safety in addition to the develop-
ment of enhanced mitigation strategies.
Project Name: Canine Detection R&D Project (FY10)
Financial Reimbursement from TSA: N/A
Description: Assess the performance of TSA certified explosive detection canine
teams when screening air cargo. This effort is in support of the TSA National
Explosives Detection Canine Team Program (NEDCTP) effort to independently
test performance measures in operational environments in order to make deci-
sions on concepts of operations. Independent experts collect and present the data
from canine operational assessments and make recommendations on canine
training or deployment to optimize canine explosives detection.
Project Name: Homemade Explosives (HMEs) Stand Alone Detection Project
(FY10)
Financial Reimbursement from TSA: N/A
Description: Identify, evaluate, and improve HME detection technologies and
screening methods through the collection and analysis of detection data and im-
ages from a wide variety of commercial off-the-shelf (COTS) explosive detection
systems (EDS), computed tomography, and x-ray diffraction equipment. This
helps TSA determine how to improve screening system performance through
hardware and software (image processing) upgrades. In addition, this project
evaluates COTS explosives detection equipment in laboratory settings to deter-
mine detection limits, false-alarm rates, and documents unique homemade explo-
sive (HME) properties for detection exploitation.
Project Name: Air Cargo Project (FY10/FY11)
Financial Reimbursement from TSA: FY 10 $1.1 million
Description: Identify and develop next generation screening systems to mitigate
the threat of explosives placed in air cargo containers. Activities include devel-
oping technologies to enable more effective and efficient air cargo screening (in-
cluding break-bulk, palletized, and containerized configurations screening) with
reduced operational costs and false-alarm rates.
Project Name: Algorithm and Analysis of Raw Images (FY10/FY11)
Financial Reimbursement from TSA: N/A
Description: Develop a non-proprietary database of explosive-detection images
which will be provided to all detection-program participants. Collect and consoli-
date images, including those of novel explosives, from commercial vendors and
coordinates the purchase of additional images and data from computed tomog-
raphy, explosive detection systems, trace, emerging devices and other tech-
nologies. The evaluation of these images will help determine the causes of false
alarms over many types of scanning systems.
Project Name: Automated Carry-On Detection (FY10/FY11)
Financial Reimbursement from TSA: N/A
Description: Develop advanced capabilities to detect explosives and concealed
weapons in carry-on luggage. This project also will introduce new standalone or
adjunct imaging technologies, such as computed tomography, to continue the im-
provement of checkpoint detection performance and the detection of novel explo-
sives.
Project Name: Automated Threat Recognition (FY10/FY11)
Financial Reimbursement from TSA: N/A
Description: Develop and evaluate automated target recognition algorithms for
advanced imaging technology in a test bed with the goal to automatically and
reliably detect threats on passengers, eliminating the need for human interpreta-
tion in order to improve detection and false alarm performance and reduce pri-
vacy concerns. The December 25, 2009 incident clearly shows the importance of
detecting threats hidden on passengers’ bodies. This research will guide further
enhancements necessary to reach full-scale development and deployment.
Project Name: Detection Technology and Material Science (FY10/FY11)
Financial Reimbursement from TSA: N/A
Description: Evaluate advanced detection algorithms, improve explosives detection,
and develop and test advanced materials for trace sample collection.
Project Name: Explosives Trace Detection (FY10/FY11)
Financial Reimbursement from TSA: N/A
Description: Develop advanced capabilities to detect explosives (including
homemade explosives) through improved trace sampling and detection technologies.
Develop trace detection standard materials that can be used as field
performance standards for deployed trace detection systems. Characterize the
chemical and physical signature properties of trace explosives to inform advanced
trace detector system design.
Project Name: Checked Baggage (FY10/FY11)
Financial Reimbursement from TSA: FY 10 $5.5 million
Description: Drive commercial development of next-generation systems that
will substantially improve performance and affordability of checked baggage
screening. Commercial development is driven when the test results referred to
below are incorporated into TSA’s increased performance requirements for
screening systems. Vendors must then meet these requirements for consideration
during TSA acquisition. Test and evaluation of these systems will focus on prob-
ability of detection, number of false alarms, and throughput. The project also
measures affordability of these systems by evaluating initial purchasing cost, op-
erating costs, maintainability, and other elements of the full life-cycle costs.
Project Name: Mass Transit (formerly Suicide Bomber) (FY10/FY11)
Financial Reimbursement from TSA: N/A
Description: Identify the infrastructure characteristics and security concept of
operations for surface transportation systems in order to drive a security tech-
nology development strategy designed to combat the explosive threat within the
operational requirements of the transportation systems. Assessments will be con-
ducted at transit authorities to frame the technology development solution space.
Currently fielded technologies will be evaluated for potential enhancement.
Project Name: Next Generation Passenger Checkpoint (FY10/FY11)
Financial Reimbursement from TSA: FY 10 $2.1 million
Description: Develop the next-generation detection system architecture to
screen passengers for explosives at aviation checkpoints. This project also inves-
tigates new emerging liquid- and gel-based explosive threats and includes them
in a comprehensive detection system.
Project Name: Predictive Screening Project
Financial Reimbursement from TSA: N/A
Description: Derive observable behavioral indicators and develop technologies
to automatically identify, alert authorities to, and track suspicious behaviors
that precede suicide bombing attacks. The Science and Technology Directorate
will test technologies at ports-of-entry, transit portals, and special events.
Project Name: Aircraft Vulnerability Tests (FY10/FY11)
Financial Reimbursement from TSA: FY10 $6.6 million
Description: Assess the vulnerability of narrow- and wide-body aircraft pas-
senger cabins and cargo holds to explosives. These vulnerability assessments will
analyze blast/damage effects of explosives and determine the minimum threat
mass required to cause catastrophic damage to various aircraft types. The as-
sessments will also identify the detection limits for bulk screening systems. De-
velop and assess hardened unit load devices (HULDs) for blast mitigation in air
cargo. These HULD development efforts will provide reduced weight air cargo
containers for blast protection while minimizing impact on commerce.
Project Name: Homemade Explosives (HME) Characterization (FY10/FY11)
Financial Reimbursement from TSA: N/A
Description: Determine the impact, friction, and electrostatic-discharge sen-
sitivities of HME threats. This data facilitates the safe handling and storage of
HME materials during research and development activities. Technology efforts
to identify, evaluate, and improve HME detection technologies and screening
methods through the collection of raw data and images from a wide variety of
commercial off-the-shelf (COTS) explosive detection systems (EDS), computed to-
mography, and x-ray diffraction equipment are also conducted. This helps TSA
determine how to improve EDS performance through hardware and software
(image processing) upgrades. In addition, this project evaluates COTS equipment
in laboratories to determine detection limits, false-alarm rates, and documents
unique HME properties for detection exploitation.
Project Name: Facility Restoration Demonstration Project
Financial Reimbursement from TSA: N/A
Description: Develop a systems approach to response and recovery of critical
transportation facilities following a chemical agent release. This project develops
remediation guidance and efficient pre-planning tools, identifies decontamination
and sampling methods, and develops decision analysis tools.
Project Name: Operational Tools for Response and Restoration Project
Financial Reimbursement from TSA: N/A
Description: Develop a suite of state-of-the-science indoor-outdoor predictive
tools to characterize the extent and degree of biological contamination, incor-
porating the best-available deposition, degradation, and surface viability data.
This project will provide validated interagency sampling plans and improved sta-
tistical sampling design to support characterization and decontamination plan-
ning.
Project Name: Bridge Vulnerability Project
Financial Reimbursement from TSA: None
Description: Develop an understanding of the vulnerabilities of different types
of bridges to terrorist threats. This project will evaluate vintage bridge compo-
nents to improve understanding of explosives effects and to refine blast modeling
tools. The approach is unique in that it examines actual bridge sections exposed
to wear or aging instead of fabricated specimens. As a result, it will provide more
accurate vulnerability information for aging bridges and allow for refinement of
existing numerical models that predict failure of bridge components. The project
is using the Golden Gate Bridge, the Crown Point Bridge (New York State, Lake
Champlain), the Manhattan Bridge (New York City, East River), and the Fort
Steuben Bridge (Ohio) for homeland security research on potential effects of an
improvised explosive device (IED) attack and other plausible threats against a
bridge. These efforts are in partnership with the Maine Department of Transpor-
tation (DOT), NY DOT, NYC DOT, Ohio DOT, Golden Gate Bridge Authority,
and the Federal Highway Administration.
Project Name: Blast/Projectile – Protective Measures and Design Tools
Financial Reimbursement from TSA: None
Description: Identify and evaluate protective measures and design guidance for
protecting the Nation’s most critical infrastructure assets. The project considers
novel materials, design procedures, and innovative construction methods to aid
in constructing or retrofitting infrastructure. The project will numerically analyze
protective designs against blast and projectile threats and conduct physical
demonstrations to assess effectiveness.
Project Name: Advanced Incident Management Enterprise System (AIMES)
Financial Reimbursement from TSA: None
Description: Develop the next-generation incident-management enterprise system,
building upon the Unified Incident Command and Decision Support architecture
and the Training, Exercise & Lessons Learned framework. This will integrate
all elements of the incident management enterprise to provide secure,
scalable, interoperable, and unified situational awareness to the responder
community.
Project Name: Rapid Mitigation and Recovery Project
Financial Reimbursement from TSA: None
Description: Investigate, assess, and develop candidate technologies and meth-
odologies that will reduce or eliminate the release of toxic inhalation hazard
(TIH) from the two threat scenarios of interest (.50 caliber AP and small IED).
Assess potential TIH mitigation technologies, including the development of interface
documentation to ensure that identified technologies can be integrated into any
existing and/or future rail car design efforts. Mitigation technologies and approaches
to be assessed include self-sealing technologies and blast- and fragment-penetration-resistant
materials.
Project Name: Blast Projectile-Advanced Materials Design
Financial Reimbursement from TSA: None
Description: Assess the risk to a tunnel or mass transit station due to a ter-
rorist attack that has the potential of causing catastrophic losses (fatalities, inju-
ries, damage, and business interruption). Information from Integrated Rapid Vis-
ual Screening Tool (IRVS) can be used to support higher level assessments and
mitigation options by experts. In coordination with TSA, IRVS for Mass Transit
Stations and Tunnels was tested in various cities, including Boston (Massachusetts
Bay Transportation Authority, MBTA), Cleveland, and St. Louis.
TSA will use the tool to enhance risk assessments of transportation hubs around
the country. In addition to TSA, potential users include Office of Infrastructure
Protection, Federal Emergency Management Agency, Commercial and Govern-
ment Facilities, State and local governments, code officials, associations of engi-
neers and architects, the design and construction industry.
Project Name: Community Based CIP Institute
Financial Reimbursement from TSA: FY11 $1 million
Description: The shipment of hazardous materials provides a significant target
for terrorists. The ability to track hazardous materials (HAZMAT) shipments on
a real-time basis is essential for providing an early warning of an impending ter-
rorist threat. The University of Kentucky (UK) will design and organize a func-
tional prototype of a HAZMAT truck tracking center. This project supports a
Transportation Security Administration (TSA) program that tracks motor carrier
shipments of security-sensitive materials. Collaborating with UK on the project
are Morehead State University, Coldstream Digital and General Dynamics Ad-
vanced Information Systems. The prototype software is integrated with ‘‘smart
truck’’ technology and will contain operational components that will integrate re-
porting and shipping information with a real-time tracking and situation display
capability.
Project Name: Suspicious Activity Reporting Project
Financial Reimbursement from TSA: None
Description: S&T is developing an enhanced analytical tool prototype for the
Federal Air Marshal Service (FAMS), Investigations Division. This application,
now named iConnex, is a suite of analytical tools that allows investigators to
search, find, explore, link, visualize and understand relationships within Sus-
picious Activity Reports and other law enforcement data sets. The iConnex appli-
cation is under development using predominantly open-source technologies. The
application’s architecture targets the technical needs of the law enforcement
community by being able to work with an array of structured and unstructured
data. The system is designed to be user friendly, and does not require extensive
training or support to reach operational capabilities. Once completed, iConnex
will be made available to any DHS component or law enforcement agency as a
cost-free Government Open Source solution.
Project Name: Law Enforcement Data Fusion
Financial Reimbursement from TSA: None
Description: The Science and Technology Directorate is working with the Federal
Air Marshal Service (FAMS) Investigations Division to develop a geospatial
predictive analytics product that will detect, forecast, and disrupt future terrorist
attacks and criminal activity, leveraging predictive analytic algorithms and software
developed for the Department of Defense community that successfully ‘forecast’
improvised explosive device locations in Iraq and Afghanistan. This capability will
provide FAMS with actionable guidance on the most effective location and allocation
of agents to place on high-risk flights, as well as increased knowledge of the tactics
and procedures of the adversary. This effort utilizes a cloud-computing environment
in which national data (Homeland Security Infrastructure Protection Gold, among
others) are brought together and analyzed to support the FAMS mission to discern
threats and forecast the location of attacks. As this technology matures at FAMS,
the final product will be made available to any DHS component or law enforcement
agency as a cost-free Government Open Source solution.
Project Name: Cross-Cultural Validation of Screening of Passengers by Obser-
vation Techniques (SPOT)
Financial Reimbursement from TSA: N/A
Description: Provide empirical validation of existing behavioral indicators em-
ployed by DHS’ operational components to screen passengers at air, land, and
maritime ports, including those indicators contained within TSA’s SPOT. This ef-
fort complements the automated prototype work and supports development of an
enhanced capability to detect behavioral indicators of hostile intent at a distance.
The project will integrate these validated behavioral indicators into the screen-
ing concept of operations through each component’s existing training programs.
Project Name: Future Attribute Screening Technologies Mobile Module (FAST
M2)
Financial Reimbursement from TSA: N/A
Description: Develop a prototype screening facility containing a suite of real-
time, non-invasive sensor technologies to detect behavior indicative of malintent
(the intent or desire to cause harm) rapidly, reliably, and remotely. The system
will measure both physiological and behavioral signals to make probabilistic as-
sessments of malintent based on sensor outputs and advanced fusion algorithms.
Federal, state, and local authorities may use the fully developed FAST system
in primary screening environments to increase the accuracy and validity of peo-
ple screening at special events, airports, and other secure areas. FAST will
measure indicators using culturally independent and non-invasive sensors. FAST
will use an ongoing, independent peer review process to ensure objectivity and
thoroughness in addressing all aspects of the program.
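[The FAST description above mentions probabilistic assessments fused from multiple sensor outputs. Below is a minimal sketch of one standard fusion approach, naive-Bayes combination on the odds scale, in Python; the prior, sensor channels, and likelihood ratios are hypothetical, and FAST's actual fusion algorithms are not described in this record.]

    def fuse(prior: float, likelihood_ratios: list[float]) -> float:
        """Combine independent sensor evidence into a posterior
        probability via naive-Bayes fusion on the odds scale."""
        odds = prior / (1.0 - prior)
        for lr in likelihood_ratios:
            odds *= lr
        return odds / (1.0 + odds)

    # Hypothetical inputs: a low prior (malintent is rare) and per-sensor
    # likelihood ratios P(signal | malintent) / P(signal | benign).
    prior = 0.001
    sensor_lrs = [3.0, 1.2, 5.5]   # e.g., cardiac, gaze, thermal channels
    print(f"posterior: {fuse(prior, sensor_lrs):.4f}")   # ~0.0194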
Project Name: Hostile Intent Detection - Automated Prototype
Financial Reimbursement from TSA: N/A
Description: Develop real-time, non-invasive, culturally independent hostile-intent
detection video extraction algorithms to identify unknown or potential terrorists
through an interactive process.
Project Name: Human Systems Research
Financial Reimbursement from TSA: FY10 $1.7 million
Description: Examine ways to maximize human performance across DHS end-
user tasks and activities. Activities under this project include research on excep-
tionally performing (EP) screeners, development of a human factors research
roadmap, a study of airport dynamics and the development of a cognitive assess-
ment tool.
*Project Name: Aviation Security Enhancement Partnership (ASEP) Evalu-
ating TSA’s Comprehensive Airport Security Strategy
Financial Reimbursement from TSA: FY10 $1 million
Description: The project will deliver an evidence-based assessment and a re-
search design for a comprehensive evaluation of the efficacy of the Transpor-
tation Security Administration’s Playbook to ensure that it has the intended pre-
vention and deterrent effects in and around U.S. airports.
*Project Name: Intelligent Closed Circuit Television (iCCTV) Project
Financial Reimbursement from TSA: FY10 $400,000
Description: Design and construct a data video collection, storage, and distribu-
tion capability to support off-line behavioral analysis. The resulting analysis will
support an inter- and intra-rater reliability assessment of the SPOT indicators.
*Project Name: Behavior Detection Officer (BDO) Selection Instrument Valida-
tion Project
Financial Reimbursement from TSA: FY09 $1.25 million (still being com-
pleted)
Description: Design and validate a personnel selection instrument to support
the hiring of TSA BDOs.
Responses by Dr. Paul Ekman, Professor Emeritus of Psychology,
University of California, San Francisco,
and President and Founder, Paul Ekman Group, LLC

Questions submitted by Chairman Paul C. Broun


Q1. A Nature article from May 2010 states that you no longer publish all of the
details of your work in the peer-reviewed literature because those papers are closely
followed by scientists in countries such as Syria, Iran, and China, which the
United States views as potential threats. A great deal of security-related research
is conducted in this country in a manner that follows both the principles
of peer review and the security classification system. Is your work unique
in this regard?
A1. I have not done classified research, and I don’t know how those who do such
research handle the matter of publishing their findings, or any part of their find-
ings. I have been told that classified research is not published, but that is hearsay.
Regarding our own research findings, 95% of what we call hot spots -- behaviors
which indicate that full disclosure has not occurred -- has already been published
in scientific journals or book chapters. We have chosen not to publish a few new
findings on hot spots in an attempt not to disclose to potential and actual enemies
of our country everything we have found. If we choose to publish a study and it con-
tains these undisclosed hot spots, then we exclude those undisclosed hot spots from
the statistical analyses that we do report. Since the incidence of these undisclosed
hot spots is quite low, it has not changed the overall findings. Thus we are able to
publish on the incidence of 95% of hot spots, and keep to ourselves and those we
teach in law enforcement and national security, knowledge of the new unpublished
hot spots.
Q2. On pages five and six of your written testimony, you reference a couple of un-
published studies spearheaded by Dr. Mark Frank, one of which you claim
shows ‘‘behavioral markers can be useful even in situations where the person has
yet to commit an illegal act.’’ Did you share any preliminary results from these
studies with either TSA or S&T?
A2. The TSA was fully informed of Dr. Frank’s study that showed it was possible
to detect from hot spots whether or not a person had decided to lie. Past research
had focused on identifying lies about behavior that already had occurred. This study
showed it was also possible to detect lies about the future intent to engage in a
malfeasant action.
Q3. On page seven of Dr. Hartwig’s testimony, she responds to your claim from a
New York Times interview of being able to teach lie detection ‘‘to anyone with
an accuracy rate of more than 95 percent.’’ She goes on to say, ‘‘However, no
such finding has ever been reported in the peer-reviewed literature. More broad-
ly, there is no support for the assertion that training programs focusing on iden-
tifying facial displays of emotions can improve lie detection accuracy.’’ How do
you respond to those observations?
A3. Dr. Hartwig has made a mistake in what she claims I said, one of many mistakes in her testimony. What I said was that through time-consuming, careful behavioral measurement we have been able to determine who is lying with up to 95% accuracy, but this included combining some physiological measures as well. I also said that we teach law enforcement and national security personnel about our findings, attempting to train them to be able to use our findings in their evaluations without doing the actual time-consuming research. We have not claimed that those we train reach a 95% accuracy level of correct judgments in their workplace after our training. We receive reports that they have benefited, and we have a paper under review by a scientific journal showing that teaching individuals to recognize micro expressions improves their ability to judge the true emotional state of people who are lying. This finding is consistent with a number of published studies (once again not cited or not known by Dr. Hartwig) -- Ekman & O'Sullivan, 1991; Frank & Ekman, 1997; Warren, Schertler & Bull, 2008 -- which show a correlation between accuracy at detecting micro expressions and accuracy at detecting lies. But this correlation is found only when the lie is about something the person cares about and there is a threat of considerable punishment if detected.
A meta-analysis by Frank & Feeley (2003), later updated by O'Sullivan, Frank & Hurley (2011), of all the published research examining whether training improves the ability to detect lies found significant improvements as a result of training. Dr. Hartwig either did not know of or chose not to mention these studies, which directly contradict her testimony.
The only study which evaluated training in actual real-world, high-stakes security contexts is the new American Institutes for Research (AIR) report. The SPOT personnel whose decisions were found to be highly accurate in the AIR study had received training that included our training materials, and some of the SPOT personnel were trained by us. Our training is not limited to the face, but includes all aspects of demeanor -- gesture, gaze, voice, and speech -- as well as facial actions.
Q4. You claim SPOT needs more funding and BDOs need more training.
a. How much funding is enough for SPOT?
b. How much training time would you devote to BDOs?
A4 a. I believe SPOT needs to have its personnel observing lines of traffic at all major airports. I believe our country would be safer if there were also SPOT personnel at all feeder airports, as the 9/11 hijackers boarded and went through security at feeder airports. The information I have received is that there are no SPOT personnel at feeder airports, and only enough personnel to conduct surveillance at half the lines of traffic at our major airports. I believe this is a terrible mistake, especially given that recruiting and training enough SPOT personnel to have this layer of security in place at all airports would cost less than 1% of last year's DHS budget.
Although I am not fully informed of the changes in the program now underway, I believe they include increased training time and more selective recruitment.
A4 b. Regarding training time, since the costs of training are low and the costs of just one terrorist being missed are very high, I believe it merits overkill. I expect that 40 hours of training, spread over a few weeks, would be of benefit. But that is a guess, as there is no research available to determine when adding training time stops producing benefits.
There are many questions that could be answered by doing research: how many BDOs are needed to cover a given area, what breaks are needed and when in order to optimize performance, and whether people who show many of the behaviors on the SPOT checklist are being missed.
Q5. What steps should TSA have taken prior to implementing the SPOT program
nationwide?
A5. I believe TSA took the appropriate steps: it found out what the Israelis were doing, and it obtained help and advice from those scientists who had done research relevant to its objectives, not just my work. By the time TSA consulted with Israel about their training, we had already provided training to the Israelis. It should be clear that the training included but was not limited to micro expressions. In our research we measure, and find useful, hot spots shown in gesture, voice, and speech itself. These too are included in TSA's behavioral profiling.
I believe TSA made the right judgment in adding this layer of security prior to
research about how effective it would turn out to be in catching malfeasants. The
recent AIR study showed it is effective, but it would have been a mistake, in my
judgment, not to have provided the American people with this layer of security be-
fore that study was performed.
I regret that the American people are not now being provided with all the layers
of security which are available in England and Israel, because there simply are not
enough trained Behavior Detection Officers.
*Professor Mark Frank, SUNY Buffalo, contributed to some of these responses.

References

Ekman, P., & O'Sullivan, M. (1991). Who can catch a liar? American Psychologist, 46(9), 913–920.
Frank, M. G., & Ekman, P. (1997). The ability to detect deceit generalizes across different types of high-stake lies. Journal of Personality and Social Psychology, 72, 1429–1439.
Frank, M. G., Feeley, T. H., Paolantonio, N., & Servoss, T. J. (2004). Individual and small group accuracy in judging truthful and deceptive communication. Group Decision and Negotiation, 13(1), 45–59.
O'Sullivan, M., Frank, M. G., Hurley, C. M., & Tiwana, J. (2009). Police lie detection accuracy: The effect of lie scenario. Law and Human Behavior, 33(6), 542–543.
Warren, G., Schertler, E., & Bull, P. (2008). Detecting deception from emotional and unemotional cues. Journal of Nonverbal Behavior, 33, 59–69.
Responses by Dr. Maria Hartwig, Associate Professor, Department of Psychology,
John Jay College of Criminal Justice

Questions submitted by Chairman Paul Broun


Q1. Are there any differences in the behavioral cues associated with a liar being de-
ceitful and the behavioral cues associated with a truth-teller stressed about
being perceived as a liar? In other words, how would one distinguish a liar from
a truthful person who’s afraid of not being believed?
A1. In a situation where liars fear detection and truth tellers fear not being believed, the behavioral patterns of the two are likely to be very similar. Research supports this by showing that when liars and truth tellers are highly motivated to be believed, they both display patterns of behavior that are likely to attract deception judgments. That is, they may both show signs of stress and fear; signs which an observer may interpret as indicative of deception. Simply put, it is very difficult, if not impossible, to distinguish between the behavioral signs of stress of a liar who fears exposure and those displayed by a truth teller who fears misjudgment.
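To make this concrete, consider a minimal signal-detection sketch (in Python, using entirely hypothetical numbers of my choosing): if the stress cues of motivated liars and of anxious truth tellers are drawn from heavily overlapping distributions, an observer who flags high-stress travelers performs at roughly chance level.

    # Hypothetical illustration: liars and anxious truth tellers produce
    # "stress" scores from nearly identical distributions (the 0.1 standard
    # deviation separation is an assumption, not an empirical estimate).
    import numpy as np

    rng = np.random.default_rng(0)
    liars = rng.normal(loc=1.0, scale=1.0, size=100_000)     # stress scores of liars
    truthful = rng.normal(loc=0.9, scale=1.0, size=100_000)  # anxious truth tellers

    threshold = 0.95                                # flag anyone above this as deceptive
    hits = (liars > threshold).mean()               # liars correctly flagged
    false_alarms = (truthful > threshold).mean()    # truth tellers wrongly flagged
    accuracy = 0.5 * (hits + (1.0 - false_alarms))  # balanced accuracy

    print(f"hits={hits:.2f} false alarms={false_alarms:.2f} accuracy={accuracy:.2f}")
    # Prints roughly: hits=0.52 false alarms=0.48 accuracy=0.52 -- barely
    # above the 0.50 expected from guessing.

In other words, when the two distributions nearly coincide, no choice of threshold can meaningfully separate them.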
Q2. Your testimony talks about a paradigm shift in the approach to lie detection that
involves, ‘‘moving from passive observation of behavior to the active elicitation
of cues to deception.’’ Unlike the Israeli process, BDOs in the U.S. can’t realisti-
cally stop and interview each passenger several times prior to boarding - how
do you propose TSA incorporate this mentality into SPOT? Should it? Is it prac-
tical?
A2. It is true that it may not be feasible to interview every single passenger due
to the high volume of travelers in the U.S. My suggestion is that the TSA, with the
help of an independent panel of experts, should review theories and empirical find-
ings on the elicitation of cues to deception, and entertain the possibility of incor-
porating some of these methods in their protocol for verbal interactions with trav-
elers. Some form of screening is most likely necessary in order to select passengers
for additional scrutiny in the form of questioning. Whether the SPOT method should
be used for this screening ultimately depends on the findings of the validation
study, which, to my knowledge, has yet to be released.
Q3. What steps should TSA have taken prior to implementing the SPOT program
nationwide?
A3. It would have been beneficial to create and consult with a panel of independent
experts in the relevant areas, in order to ensure that the procedures are in line with
the scientific evidence. Moreover, it is my view that the TSA should have carried
out a validation study prior to implementing the program nationwide. Again, a
panel of experts could have been of assistance in designing and executing such a
validation study.
Responses by Dr. Philip Rubin, Chief Executive Officer, Haskins Laboratories

Questions submitted by Chairman Paul Broun


Q1. What are the challenges that scientists need to address in order to conduct re-
search in an operational setting? 1b. Can these hurdles be overcome?
A1. There are numerous challenges related to conducting research in operational
settings. I would like to focus on two of these.
1. Evaluation and analysis both in the laboratory and in the field must be based
on specific, testable hypotheses that derive from premises that are established
in some sort of orderly and/or rational manner. For example, using voice stress
analysis (VSA) to illustrate this, it is essential to first understand what is
being measured (that is, what is the specific definition of ‘‘voice stress’’) and
understand how these measures might related to outcome measures. In addi-
tion, in order to isolate critical variables so that then can ultimately be vali-
dated (in the lab or in the field), we also need to consider potential interactions
of variables that might affect results and other factors that could bias or shape
experimental results, including any critical contextual considerations. In the
case of approaches like VSA, field tests should not be conducted prior to dem-
onstrating a valid and reliable approach for characterizing and quantifying, if
possible, the underlying variables. Once these have been established, it is then
possible to move to the field. If the premises are weak or cannot be established,
there is little point in moving to field evaluation.
2. Laboratory studies have the advantage that they often provide for the ability
to precisely control experimental conditions. The disadvantage is that they
often lack what is sometimes called ‘‘ecological validity.’’ That is, what is being
measured in the laboratory may not accurately capture the phenomena that
you are trying to study, often because critical contexts have been removed.
Field evaluation lets you study events in their natural environment. This has
been standard in the ethological approach and in many other instances includ-
ing primate research, research on children, and research in organizational and
institutional settings. Unfortunately, with this greater realism sometimes
comes a consequent loss of experimental control.
Overall, the best approach would be to first clearly nail down a good, concrete un-
derstanding of critical variables and the premises that give rise to them. These
should be experimentally evaluated and understood prior to field evaluation. An as-
sessment of potentially critical contextual variables is also essential. At that point
(but not until then), field evaluation is possible and can provide a rich and realistic
approach for evaluating data and programs. Although there are often limitations in
the field, clever and informed experimental design can go a long way toward producing studies that have great utility. Even if such field studies cannot be used to fully study a system, they can often be informative and useful as they relate to aspects of the problem.
Q2. (Regarding the comments of Dr. Ekman and Dr. Hartwig). Given these opposing
observations, what is your analysis?
A2. There appears to be very little in the peer-reviewed scientific literature to help differentiate high-risk versus low-risk lying and their relationship. As both Dr. Ekman and Dr. Hartwig have indicated, research is needed in this area. Peer-reviewed research would be useful to establish and solidify the scientific validity of results. Such work can be done without jeopardizing security.
Q3. . . . what thoughts do you have on the manner in which the SPOT program was implemented?
A3. As you have noted, I agree with Dr. David Mandel’s comments from the sum-
mary of the NRC workshop that I chaired, called ‘‘Field Evaluation in the Intel-
ligence and Counterintelligence Context: Workshop Summary’’.
‘‘Another way in which establishing a connection with the research community
can help the intelligence community is with validation, Mandel said. Once
knowledge and insights from behavioral science are used to develop new tools
for the intelligence community, it is still necessary to validate them. Simply
basing recommendations on scientific research is not the same thing as show-
ing scientifically that those recommendations are effective or testing to see if
they could be substantially improved. Even Heuer was unable to do much to
validate his recommendations, Mandel noted, and, more generally, this is not
something that the intelligence community is particularly well equipped to
do.’’
‘‘It is, however, exactly what research scientists are trained to do. Science of-
fers a method for testing which ideas lead to good results and which do not.
Thus partnering with the behavioral science community can help the intel-
ligence community zero in on the techniques that work best and avoid those
that work poorly or not at all.’’
Unfortunately, it appears that the SPOT program was implemented before its un-
derlying premises, measures, indicators, etc., could be adequately scientifically eval-
uated and, if necessary, validated in even a remotely meaningful way. Instead, they
appear to have been rushed into the field due to a combination of fear, zeal, passion,
folklore, intuition, and enthusiasm about controversial scientific results, such as
‘‘micro-expressions.’’ As of the time of the April 6, 2011 hearing, and the end of my
contribution to the TAC report, I had not been provided with information about the
‘‘indicators’’ used in the SPOT program, so I can only speculate about them. How-
ever, if they were things like facial micro-expressions, behavioral indicators such as
gaze direction or head tapping, etc., then they should all be subject to scientific scru-
tiny. Why are such measures being selected? What is the current state of scientific
knowledge regarding their validity? If little is known about them, can they then be evaluated scientifically? If not, then they should not be used. For other possible measures, such as excessive sweating, aberrant behavior, etc., it would be useful to understand the science on how these behaviors relate to outcome measures. For example, in voice stress analysis (which does not appear to be a reliable measure), which is supposedly related to changes in voice ''micro-tremors,'' is the appropriate indicator a greater or a smaller magnitude of micro-tremor?
Given the enormous stakes related to national security in transportation, and also
to work done by our intelligence and counter-intelligence communities, my strongest
recommendation for the Committee would be that the money currently being de-
voted to (and in my opinion wasted on) this program should immediately be redi-
rected to a large-scale effort to solicit the best possible scientific and technical guid-
ance related to the detection of deception using behavioral indicators. The end prod-
uct should include a clear statement of what works, what does not, what remains
controversial, and how to move ahead. The TAC did not have the independence, expertise, breadth of knowledge, or latitude to take on this challenge, nor was it asked to do so. Such a study should be broader than SPOT and should include con-
siderations of approaches like voice stress analysis, facial expression, remote physio-
logical monitoring, and neuroimaging. Members of such a group should have exper-
tise in physiology, behavioral science, psychology, neuroscience, linguistics, statistics
and methodological design, and related areas. It is essential that any group working
on such a project be independent of DHS and TSA. Scientific evaluation of programs
like SPOT and other programs related to the detection of deception can be done in
a manner that does not provide unique knowledge to those who would wish to harm
us.
Q4. How do you respond to DHS’ preliminary assertion that SPOT is significantly
more effective than random screening?
A4. As a member of the Technical Advisory Committee I would have to say that
this assertion on the part of DHS is not a meaningful or useful one. The base rate
for outcomes is too small to be statistically reliable and/or meaningful. If DHS is
making an assertion of this sort, then they need to more clearly define and quantify
what ‘‘significantly more effective than random screening’’ means. In a population
of 100,000 events are 2 observations significantly different than 1? How about 3
versus 1? Or 100 versus 1? What does significance mean as DHS is using the term
and what do they mean by ‘‘effective’’? Small numbers in large populations can be
meaningless and simply part of the randomness and background noise that nor-
mally occur in most systems. Given the controversial and costly nature of this pro-
gram, scientific and statistical rigor should be essential. I find such a statement to
be misleading and potentially dangerous. Politicians, policymakers and the lay pub-
lic, will hear something like ‘‘SPOT is significantly more effective than random
screening’’ and may assume that this program is effective, useful, and has been ade-
quately scientifically evaluated. To this point the effectiveness and usefulness have
not been established. The scientific evaluation has been inadequate and has not
been approached in a manner that would lead to greater knowledge regarding the
program. Establishing scientific credibility has the potential to be helpful to pro-
grams of this sort, but that requires full, well thought out, independent, credible,
and open scientific review.
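To make the base-rate concern concrete, consider a minimal sketch (in Python, with entirely hypothetical counts, since the actual SPOT outcome data were not provided in usable form): a standard significance test shows how little can be concluded from a handful of ''hits'' in a large population of screening events.

    # Hypothetical illustration: compare small hit counts in two groups of
    # 100,000 screening events each, using a two-sided Fisher exact test.
    from scipy.stats import fisher_exact

    N = 100_000  # assumed number of screening events per condition

    for spot_hits, random_hits in [(2, 1), (3, 1), (100, 1)]:
        table = [[spot_hits, N - spot_hits],
                 [random_hits, N - random_hits]]
        _, p = fisher_exact(table)  # returns (odds ratio, p-value)
        print(f"{spot_hits} vs. {random_hits} hits in {N:,}: p = {p:.3g}")

    # Roughly: 2 vs. 1 gives p = 1.0 and 3 vs. 1 gives p = 0.62, i.e., the
    # counts are statistically indistinguishable; only a large disparity
    # such as 100 vs. 1 yields p << 0.001.

Until observed differences are large relative to this background noise, calling one screening method ''significantly more effective'' than another is not statistically meaningful.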
Outcomes, which apparently are based on a combination of indicators, could result
simply from the fact that, according to information described by CNN in a report
on April 15, 2011, individuals are singled out for behaving arrogantly. Arrogant in-
dividuals stand a greater chance of being referred to a law enforcement official
(LEO) than do those who do not behave arrogantly. LEO referrals are related to 2 of the 4 outcome measures (either by occurring individually or in combination with
another indicator). Thus, almost by definition, the SPOT program has a higher prob-
ability of producing increases in outcome when compared with totally random selec-
tion. Positive SPOT outcomes are mostly due to observations that result in LEO
interaction. These could be strongly related to things like ‘‘arrogant’’ behavior and
be telling us little more than that, which is kind of a ‘‘duh?’’ result for such a serious
investment of time and money. TAC had not been provided with enough information
by the time of the April 6 hearing (when Mr. Willis indicated that the report had
already been finalized) to determine significance and/or potential interaction with
other variables. In summary, it is unclear what ‘‘effective’’ means in this context.
The most significant outcomes in SPOT were related to LEO referrals. It is possible
that the outcome of this program is no more than the observation that individuals
who act like jerks might get arrested. What does that have to do with an effective,
useful program?
Responses by Mr. Peter J. DiDomenica, Lieutenant Detective,
Boston University Police

Questions submitted by Chairman Paul Broun


Q1. In your written testimony, you talk about your desire to see some sort of SPOT
training provided for law enforcement personnel so that they can better coordi-
nate and understand a situation when approached by a BDO who has sus-
picions about a traveler. Keeping in mind the limited resources we have in terms
of federal dollars, can you expand on how critical such training would be?
Would we be better off having fewer BDOs with more SPOT-trained LEOs?
A1. I believe that SPOT-trained police officers working in conjunction with the TSA
are critical to the success of the SPOT program not only because of the ability of
law enforcement to coordinate and understand the program but, most importantly,
because of the absolute need for effective resolution of the suspicion. The BDOs are
not empowered to detain, arrest, or deny access and lack law enforcement training
and experience in questioning suspicious persons. Moreover, the BDOs do not have direct access to the criminal databases to which law enforcement officers have access. The success of the program relies upon law enforcement officers (LEOs) who understand and use behavioral screening and who follow through with denial of access, detention, or arrest when appropriate; otherwise, terrorists or other dangerous people will likely pass through the system because there will be nothing obvious to justify denial of access or arrest, such as a pre-existing arrest warrant or possession of contraband.
The dilemma is that the most dangerous people, such as the 16 suspected terrorists
who passed through SPOT airports, are generally not actively involved in a terrorist
operation when boarding planes so that, short of finding an arrest warrant or con-
traband, there will be no basis for arrest. Even if they are operational and possess
a weapon or explosive, there are still major gaps in weapon and explosive detection
systems that present the significant risk of such weapon or explosive getting
through the physical screening process. In my opinion it is absolutely critical that
behavior assessment trained LEOs are present who are in a position to develop
probable cause to arrest and who, absent such probable cause, are in a position to
deny access when sufficient reasonable suspicion exists, allowing time for a more
thorough investigation. Effective and reasonable security to prevent massive casual-
ties from a terrorist attack on venues such as airports and mass transit significantly
depends, in my opinion, upon behavior assessment trained LEOs who have the
knowledge, ability, and confidence to deny access, in most cases temporarily, to such
venues.
I believe the limited federal dollars available for SPOT screening would be better
spent on training LEOs in behavior assessment and for providing federal support
for overtime costs of deploying local and state LEOs for specific behavior assessment
duties at airports. It seems to me that the American public will get ‘‘more bang for
the buck’’ by enhancing the abilities of already trained and experienced law enforce-
ment officers who can combine both the functions of being the ‘‘spotters’’ of sus-
picious behavior and being the ‘‘resolvers’’ of suspicious behavior. This would reduce
the communication and understanding issues between TSA and LEOs that presently
impede the success of the program. Moreover, the federal government would not be
saddled with the costs of additional federal employees by contracting out the func-
tion to employees of state and local government. Such an approach would also reduce the civil liability exposure of the federal government. With this ap-
proach I believe there would be more effective prevention of terrorism with less ex-
penditure of federal dollars.
Q2. I get the impression from your testimony that after the events of 9/11, particu-
larly in light of your closeness to the situation, you felt the nation had to do
something to prevent terrorism in the aviation sector. Your experience with Rich-
ard Reid appears to provide further evidence of that mentality.
a. Is that an accurate assessment of your mindset as you set about creating the program?
b. In the NRC’s 2008 Report: Protecting Individual Privacy in the Struggle Against
Terrorists - A Framework for Program Assessment, one of the conclusions reached
by the 21-member Committee that published the report is:
''In the aftermath of a disaster or terrorist incident, policy makers come under
intense political pressure to respond with measures intended to prevent the
event from occurring again. The policy impulse to do something (by which is
usually meant something new) under these circumstances is understandable,
but it is simply not true that doing something new is always better than doing
nothing.’’
c. How do you respond to that conclusion?
A2 (a.) I am not comfortable with the word ''mentality'' as used in the question, as it implies, in my opinion, a certain rigidity and unwillingness to consider differing opinions, perhaps to the point of being a zealot. I do not believe I had a ''mentality'' about having to do something to prevent terrorism, construing the word ''mentality''
as I have explained. I did believe that our ability to screen passengers at airports
was deficient and that it could be improved and that the Richard Reid example
showed how reliance on physical screening without use of behavioral screening cre-
ated a gap in security. I knew from my personal experience and from other police
officers I worked with that persons who are engaged in dangerous or high risk activ-
ity tend to behave differently than persons not so engaged, particularly in the pres-
ence of a police officer or other official who could intercept them. I also learned
through scientific literature that people’s behavior changes when engaged in dan-
gerous or high risk activity and that body language, mental state and paralinguistic
attributes can be affected. It seemed reasonable to me then as it does today to use
the ability of trained professionals to detect a person engaged in dangerous or high
risk activity as another layer of security at our airports, provided the training was
proper and the public’s civil rights were protected through adhering to limitations
on detentions and profiling based on the 4th Amendment and the Equal Protection
Clause of the 14th Amendment. I do not believe I was under an impulse to do something for the sake of doing something, but was motivated by addressing a gap in our
security through reasonable, effective, and lawful means.
A2 (b-c.) I agree 100% with the danger presented by catastrophic events that can
compel governments to respond without due deliberation and in haste, sometimes
with troubling and even devastating consequences. I have been an instructor in ra-
cial profiling and biased policing for over a decade and have included discussion of
excesses by the government to respond to a serious incident or crisis. For example,
the internment of more than 100,000 Japanese Americans on the West Coast, mostly U.S.
citizens, simply based on ancestry during World War II because of fears of an inva-
sion or sabotage represents such an overreaction to a real threat. In fact, the U.S.
Congress formally apologized to the survivors in 1988. The divisive issue of police
racial profiling was spawned by overreaction to the real danger of drugs being trans-
ported on our highways. Well-intentioned efforts to make communities safer re-
sulted in those very communities feeling disenfranchised from law enforcement
through the unlawful use of selective enforcement based on race. I was well aware
of the danger to the American public from overreaction to the real threat of Islamic
Extremist terrorism and made efforts to ensure our response was lawful and effec-
tive and consistent with our nation’s values. I, like many security and law enforce-
ment officials, found a gap in our aviation security and sought and found a means
to address the gap, not because something had to be done but because something
could be done. I would also like to point out that I was not a policy maker but a
policy advisor and was not personally under any political pressure to do something.
I was not an elected official nor did I directly serve elected officials. I could have
simply carried out my duties as a police officer without having attempted to address
the issue of passenger screening, but chose to help because I felt I was the type of
person who could balance the need for response to terrorism with the ability to do
it effectively, lawfully, and ethically without undue haste and with proper delibera-
tion.
Q3. Did you consult with any scientists before implementing the BASS program?
What scientific literature did you research prior to the program?
a. Do you consider this review exhaustive or comprehensive?
b. Have you ever submitted the BASS system for outside review by Behavioral Sci-
entists?
c. Did you encounter any criticisms -- either through your research or by talking to people -- about the validity of the BASS program?
A3. I consulted with co-panelist Dr. Paul Ekman and Dr. Mark Frank of the State
University of New York at Buffalo. Then Massachusetts State Police Major Thomas
Robbins and I went to Quantico, VA and spoke with the FBI Behavioral Sciences
Unit (Eugene Ragala and Stephen Etter). We also spoke with Dr. Jessica Stern of
the Harvard Kennedy School of Government.
Literature consulted included:
• Atran, Scott, University of Michigan, The Surprises of Suicide Terrorism, Dis-
cover Magazine, Vol. 24 No. 10 (October 2003)
• Lewis, Bernard, What Went Wrong
• The 9/11 Commission Report: Final Report of the National Commission on Ter-
rorist Attacks Upon the United States.
• Stern, Jessica, Harvard University John F. Kennedy School of Government, The
Protean Enemy, Foreign Affairs, Volume 82 No. 4, July/August 2003, p. 27.
• Stern, Jessica, Terror in the Name of God
• Richardson, Louise, Harvard University professor, What Terrorists Want
• Pape, Robert, University of Chicago, Dying to Win, Database of every suicide
attack from 1980 to 2003, 315 attacks
• Knapp, Mark, and Hall, Judith, Nonverbal Communication in Human Inter-
action
• Miller, Arthur G., editor, The Social Psychology of Good and Evil
• McDermott, Terry, Perfect Soldiers
• Grossman, Dave, On Killing, On Combat
• Dozier Jr., Rush, Why We Hate
• Barber, Benjamin, Jihad vs. McWorld
• Who Becomes a Terrorist and Why (US Government Report)
• Zimbardo, Philip, Stanford Prison Experiment (1971)
• Milgram, Stanley, Obedience Experiments (1974)
• Givens, David B., Center for Nonverbal Studies, The Nonverbal Dictionary of
Gestures, Signs & Body Language Cues (2003).
• Sageman, Marc, Former CIA caseworker and forensic psychologist, Study of 400
terrorists
• DePaulo, Bella, et al., Cues to Deception, Psychological Bulletin, 129(1):74-118, 2003 (meta-analysis on deception cues)
• Mehrabian, Albert, and Ferris, Susan R. ‘‘Inference of Attitudes from Nonverbal
Communication in Two Channels,’’ Journal of Consulting Psychology, Vol. 31,
No. 3, June 1967, pp. 248-258
• Mehrabian, A. (1971). Silent messages, Wadsworth, California: Belmont
• Mehrabian, A. (1972). Nonverbal communication. Aldine-Atherton, Illinois: Chi-
cago
• Facial expression of emotion; seven universal expressions of emotion. Ekman,
Friesen, & O’Sullivan, 1988.
• Darwin, Charles, The Expression of Emotion in Man and Animals
• Testimony of Professor Jonathan Turley, Shapiro Professor of Public Interest,
George Washington University Law School, before the U.S. House of Represent-
atives Subcommittee on Aviation, February 27, 2002. Available on the Internet
at http://www.house.gov/transportation/aviation/02-27-02/turley.html
• Ekman, Paul, Telling Lies and Emotions Revealed
A3 (a.) I do not believe this review to be exhaustive but I do believe it was com-
prehensive.
A3 (b.) I asked Dr. Ekman, Dr. Frank, and the FBI Behavioral Sciences Unit to look
at the program but this was not in the nature of a formal scientific review.
A3 (c.) I participated as a briefer for the JASON (Mitre Corporation) Summer Study ''Badguyology'' in June 2008, in which I presented information on BASS techniques. Their findings were that anecdotal evidence exists that police interviewing meth-
odologies work at detecting deception and may be able to be validated and developed
further. However, they also found that no scientific evidence exists to support the
detection or inference of future behavior including intent. My discussions with Dr.
Ekman, Dr. Frank and the FBI Behavioral Sciences Unit generally indicated the
same assessment of BASS: that there was a general scientific foundation for
changes in behavior related to persons engaged in high risk activity who did not
want to be detected, but that specific studies would be needed to validate the use of spe-
cific behaviors and their significance.
Q4. What does the BASS/PASS training consist of? What behavior/cues/deviations
did you look for?
A4. The following is the training outline of the BASS program, showing all the com-
ponents of the training:
INTRODUCTION
• War in the Homeland
• Policing in the Post 9/11 Environment
• Rationale for BASS
• What is BASS
• Is BASS Profiling?
• Benefits of BASS
BASS POLICY AND LEGAL CONSIDERATIONS
• Definitions
• Prohibition on Racial Profiling
• Voluntary Encounters
BASS GENERAL GUIDELINES AND PROCEDURES
• Methods of Contact
• Guidelines for Elevated and Reasonable Suspicion
UNDERSTANDING THE TERROR THREAT
• Islamic Fundamentalist Terror
• History of Conflict
• The Current Threat
STEP (1) OBSERVATION OF BEHAVIOR
• Theory of Behavioral Analysis
• Understanding Baselines
• Baseline Field Exercise
• Low Level Behavioral Indicators
• High Level Behavioral Indicators
• Surveillance Indicators
• Unusual Items in Baggage
• Explosive Components
• Suicide Bomber Indicators
• Detecting Bomb Activity in Vehicles and Buildings
• London Bombings
• 9/11 hijackers
• Evolving Suicide Bomber
• High and Low Risk Passengers
STEP (2) EXAMINATION OF TRAVEL DOCUMENTS
• Resident Alien
• Passport
• Visa
• I-94 and I-94W forms
• Elevated Suspicion Factors
• Terrorist Sponsoring and Terrorist Suspicious Countries
STEP (3) INTERVIEW
• Purpose of Interview
• Format of Questions
• Travel/Visit Questions
• Vehicle Stop Questions
• Question Form and Technique
• Two-Step Baseline Approach to Resolving Elevated Suspicion
• Signs of Deception
• Analysis of Interview Videos
• Classroom Interview Exercise
STEP (4) RESOLUTION
• Three Dispositions of Person
• Case Studies
FIELD INTERVIEW EXERCISES
COURSE CONCLUSION
• Summary of Course
• Q&A
• Evaluations
The specific behaviors/cues/deviations may be protected under TSA regulations as Sensitive Security Information, so I cannot answer this question without further guidance from legal counsel.
Q5. Page two of Dr. Hartwig's testimony states . . . How do you respond to Dr. Hartwig's and Dr. Rubin's testimony?
A5. BASS is not a lie detection program: BASS is a program designed to detect be-
havioral changes associated with a person who is engaged in high risk or dangerous
activity and to prevent such persons from entering critical infrastructure until the
status of the person is resolved. Detection of deception constitutes one factor of
many as part of an overall assessment of dangerousness and this factor, while use-
ful, is not required for identification of potentially dangerous people. I have attended
the following courses on interviewing that include detection of deception components
and this training indicates that with such interviewing training, police officers can
improve their ability to detect deception:
Paul Ekman Group Training Division
Evaluating Truthfulness Train-the Trainer Workshop, February 16-18, 2006.
Institute of Analytic Interviewing
Interviewing, Credibility, and Emotion, January 10-14, 2005.
Department of the Treasury, Bureau of Alcohol, Tobacco, and Firearms
Analytic Interview School, April 19-23, 1999 at State Police New Braintree.
Wicklander-Zulawski & Associates
The Reid Method of Criminal Interviews and Interrogation, April 16-18, 1996 at
State Police New Braintree.

Moreover, I am certified as a trainer in deception detection by the Paul Ekman Group Training Division and have conducted this training for the TSA and the De-
partment of State. From my understanding of the research, there are techniques considered fairly reliable in the detection of deception which, if used as part of an integrated approach that considers both emotional and cognitive aspects of deception and memory, the seriousness of the potential deception, alternative explanations for perceived cues, and evaluation of the subject's baseline, can allow police officers to be more effective and accurate in the assessment of credibility. I believe the DHS
SPOT validation study provides striking evidence for the effectiveness of the SPOT/
BASS techniques I designed: A high-risk traveler is nine times more likely to be
identified using operational SPOT versus random screening, and this result was
achieved by BDOs engaging 50,000 fewer passengers than the random selection
process. When it came to arrests in this study, the SPOT program was found to be
50 times more effective than random screening. Moreover, the research by Dr.
Frank cited in Dr. Ekman’s testimony indicates that, ‘‘In a situation set up to re-
semble an airport security context, we could predict at 90% accuracy who intended
to lie about an action which s/he had not yet taken. This was accomplished by anal-
ysis solely on their emotional reaction, eye contact, and nervous body behaviors.
These are the types of actions security officers look for in behavioral observation
programs. These results are the first study to show that intentions can be detected
from behavior.'' Combining my training and experience with this recent research, I
am confident that properly trained LEOs have a significantly better than chance
ability to detect potential terrorists and other dangerous people.
I agree with Dr. Rubin’s testimony that shows there is an inclination by those
who are involved in evaluations in the criminal and homeland/national security
arena to be dismissive of scholarly research that may contradict their views. This
is an aspect of basic human nature that we all tend to become defensive when our
basic assumptions are challenged and this includes police officers, scientists, and
congressmen. Nobody likes being told they are wrong. I have always tried to keep
an open mind in my professional work and my work in developing SPOT/BASS was
done in this way to the best of my ability. Most of what I learned and experienced
pointed to the programs going in the right direction and I always welcomed review
and advice. I welcome continued research and testing and know there is a great deal
more to be learned. I agree with the GAO report 10-763 of May 2010 that called
for more scientific validation of SPOT and I am personally disappointed that TSA
did not do more to validate the program after I left in 2004. To be blunt, in my opinion TSA dropped the ball in its efforts to validate SPOT and, as a result, has put many people and entities on the ''spot'' to defend it and to question it, including my-
self, DHS, and this Subcommittee. But as Chairman Broun stated at the April 6,
2011 hearing, ‘‘The goal is not to throw out the proverbial baby with the bath
water.’’ I believe SPOT/BASS programs provide a critical layer in our multifaceted
approach to aviation security and the effort to validate the programs, however be-
lated, is worth our time and expense.
Thank you for this additional opportunity to address the Subcommittee.
Appendix II

ADDITIONAL MATERIALS SUBMITTED FOR THE RECORD

MATERIAL SUBMITTED BY MR. STEPHEN LORD, DIRECTOR, HOMELAND SECURITY AND
JUSTICE ISSUES, GOVERNMENT ACCOUNTABILITY OFFICE