HEARING
BEFORE THE
SUBCOMMITTEE ON INVESTIGATIONS AND OVERSIGHT
COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY
HOUSE OF REPRESENTATIVES
ONE HUNDRED TWELFTH CONGRESS
FIRST SESSION
APRIL 6, 2011
Printed for the use of the Committee on Science, Space, and Technology
CONTENTS
Witness List ............................................................................................................. 2
Hearing Charter ...................................................................................................... 3
Opening Statements
Witnesses:
Mr. Stephen Lord, Director, Homeland Security and Justice Issues, Government Accountability Office
Oral Statement ................................................................................................. 24
Written Statement ............................................................................................ 26
Mr. Larry Willis, Program Manager, Homeland Security Advanced Research Projects Agency, Science and Technology Directorate, Department of Homeland Security
Oral Statement ................................................................................................. 39
Written Statement ............................................................................................ 40
Peter J. DiDomenica, Lieutenant Detective, Boston University Police
Oral Statement ................................................................................................. 42
Written Statement ............................................................................................ 44
Dr. Paul Ekman, Professor Emeritus of Psychology, University of California,
San Francisco, and President and Founder, Paul Ekman Group, LLC
Oral Statement ................................................................................................. 48
Written Statement ............................................................................................ 50
Dr. Maria Hartwig, Associate Professor, Department of Psychology, John
Jay College of Criminal Justice
Oral Statement ................................................................................................. 70
Written Statement ............................................................................................ 71
Dr. Philip Rubin, Chief Executive Officer, Haskins Laboratories
Oral Statement ................................................................................................. 79
Written Statement ............................................................................................ 80
Mr. Stephen Lord, Director, Homeland Security and Justice Issues, Government Accountability Office .................................................................................. 114
Mr. Larry Willis, Program Manager, Homeland Security Advanced Research Projects Agency, Science and Technology Directorate, Department of Homeland Security ........................................................................................................ 118
Dr. Paul Ekman, Professor Emeritus of Psychology, University of California, San Francisco, and President and Founder, Paul Ekman Group, LLC .......... 127
Dr. Maria Hartwig, Associate Professor, Department of Psychology, John Jay College of Criminal Justice .......................................................................... 130
Dr. Philip Rubin, Chief Executive Officer, Haskins Laboratories ....................... 131
Peter J. DiDomenica, Lieutenant Detective, Boston University Police ............... 134
Mr. Stephen Lord, Director, Homeland Security and Justice Issues, Government Accountability Office .................................................................................. 140
BEHAVIORAL SCIENCE AND SECURITY:
EVALUATING TSA’S SPOT PROGRAM
HOUSE OF REPRESENTATIVES,
SUBCOMMITTEE ON INVESTIGATIONS AND OVERSIGHT,
COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY,
Washington, DC.
The Subcommittee met, pursuant to call, at 10:03 a.m., in Room
2318 of the Rayburn House Office Building, Hon. Paul C. Broun
[Chairman of the Subcommittee] presiding.
HEARING CHARTER
Purpose
The Subcommittee on Investigations and Oversight meets on April 6, 2011 to ex-
amine the Transportation Security Administration’s (TSA) efforts to incorporate be-
havioral science into its transportation security architecture. The Department of
Homeland Security (DHS) has been criticized for failing to scientifically validate the
Screening of Passengers by Observational Techniques (SPOT) program before oper-
ationally deploying it. SPOT is a TSA program that employs Behavioral Detection
Officers (BDO) at airport terminals for the purpose of detecting behavioral based in-
dicators of threats to aviation security.
The hearing will examine the state of behavioral science as it relates to the detec-
tion of terrorist threats to the air transportation system, as well as its utility to
identify criminal offenses more broadly. The hearing will examine several inde-
pendent reports-one by the Government Accountability Office (GAO), two by the Na-
tional Research Council, and a number of Defense and Intelligence Community advi-
sory board reports on the state of behavioral science relative to the detection of emo-
tion, deceit, and intent in controlled laboratory settings, as well as in an operational
environment. The Subcommittee will evaluate the initial development of the SPOT
program, the steps taken to validate the science that form the foundation of the pro-
gram, as well as the capabilities and limitations of using behavioral science in a
transportation setting. More broadly, the hearing will also explore the behavioral
science research efforts throughout DHS.
Background
The terrorist attacks on September 11, 2001 exposed a vulnerability in the na-
tion’s air transportation system. In order to augment other screening processes and
procedures, TSA conducted operational testing of behavior detection techniques at
a limited number of airports in October 2003. 1 In 2007, TSA created new BDO posi-
tions as part of the SPOT program with the goal of identifying persons who may
pose a potential security risk by using behavioral indicators such as stress, fear, or
deception. 2
The indicators BDOs use form a checklist with corresponding values and thresh-
olds. These indicators, values, and thresholds are used to assess passengers while
in line awaiting security screening. When an individual displays behaviors or an ap-
pearance that exceeds a predetermined threshold, they are referred for additional
screening. If, during the course of this secondary screening, individuals display be-
haviors that exceed another threshold, they are referred to law enforcement officers
for further investigation.
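The charter describes SPOT's referral logic only at this level of generality; the actual checklist of indicators, their point values, and the thresholds are not public. Purely as an illustration of the two-stage, threshold-based process described above (all indicator names, point values, and threshold numbers below are hypothetical, not SPOT's), the logic can be sketched as:

```python
# Illustrative sketch of a two-stage, threshold-based referral process like the
# one the charter describes. All indicator names, point values, and thresholds
# here are hypothetical; SPOT's real checklist is not public.

PRIMARY_THRESHOLD = 4    # hypothetical: exceeding it triggers additional screening
SECONDARY_THRESHOLD = 6  # hypothetical: exceeding it triggers a law enforcement referral

# Hypothetical checklist: each observed indicator carries a point value.
INDICATOR_VALUES = {
    "avoids_eye_contact": 1,
    "excessive_sweating": 2,
    "inconsistent_answers": 3,
}

def score(observed_indicators):
    """Sum the point values of the indicators recorded for a passenger."""
    return sum(INDICATOR_VALUES.get(i, 0) for i in observed_indicators)

def refer(primary_obs, secondary_obs=()):
    """Return the passenger's disposition under the two-stage process."""
    if score(primary_obs) <= PRIMARY_THRESHOLD:
        return "no referral"
    # Behaviors exceeding the first threshold prompt additional screening...
    if score(primary_obs) + score(secondary_obs) <= SECONDARY_THRESHOLD:
        return "secondary screening only"
    # ...and exceeding a second threshold prompts a law enforcement referral.
    return "law enforcement referral"
```

The sketch only captures the structure of the process (a cumulative score checked against two successive thresholds); it says nothing about whether the underlying indicators are valid, which is the question the hearing examines.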
Initially established to detect terrorist threats to the aviation transportation sys-
tem, 3 the program’s mission has since broadened to include the identification of be-
haviors indicative of criminal activity. 4 Critics of the program have argued that this
expansion reflects the failure of the program to identify any terrorists, and therefore
program success could only be quantified by broadening the goals to include
criminal activity, which has a higher rate of occurrence. 5 This critique may or
may not be fair, given the extremely small sample that terrorists would
represent. Regardless of the rationale for the program’s expanded scope,
questions remain about whether the indicators for terrorism are the same as
those for criminal behavior.

1 Aviation Security: Efforts to Validate TSA’s Passenger Screening Behavior Detection Program Underway, but Opportunities Exist to Strengthen Validation and Address Operational Challenges, Government Accountability Office, May 2010. Available at http://www.gao.gov/new.items/d10763.pdf
2 Ibid.
3 Ibid.
4 Congressional Budget Justification FY2012, Department of Homeland Security.
As of March 2010, TSA employed roughly 3,000 BDOs at approximately 161 airports
at a cost of $212 million a year. 6 In the President’s fiscal year 2012 budget
request, the Department seeks to add 175 more BDOs with an increase of $21
million - a 9.5% increase over current funding levels. 7 In total, the five-year
budget profile for the SPOT program accounts for roughly $1.2 billion. 8
Relevant Reviews
U.S. Government Accountability Office (GAO)
Aviation Security: Efforts to Validate TSA’s Passenger Screening Behavior Detection Program Underway, but Opportunities Exist to Strengthen Validation and Address Operational Challenges
In May 2010, GAO issued a report titled ‘‘Efforts to Validate TSA’s Passenger
Screening Behavior Detection Program Underway, but Opportunities Exist to
Strengthen Validation and Address Operational Challenges’’ in response to a Con-
gressional request to review the SPOT program. In preparing the report, GAO ana-
lyzed ‘‘(1) the extent to which TSA validated the SPOT program before deployment,
(2) implementation challenges, and (3) the extent to which TSA measures SPOT’s
effect on aviation security.’’ 9
GAO issued the following findings associated with its review:
Although the Department of Homeland Security (DHS) is in the process of vali-
dating some aspects of the SPOT program, TSA deployed SPOT nationwide
without first validating the scientific basis for identifying suspicious passengers
in an airport environment. A scientific consensus does not exist on whether be-
havior detection principles can be reliably used for counterterrorism purposes,
according to the National Research Council of the National Academy of
Sciences. According to TSA, no other large-scale security screening program
based on behavioral indicators has ever been rigorously scientifically validated.
DHS plans to review aspects of SPOT, such as whether the program is more
effective at identifying threats than random screening. Nonetheless, DHS’s cur-
rent plan to assess SPOT is not designed to fully validate whether behavior de-
tection can be used to reliably identify individuals in an airport environment
who pose a security risk. For example, factors such as the length of time BDOs
can observe passengers without becoming fatigued are not part of the plan and
could provide additional information on the extent to which SPOT can be effec-
tively implemented. Prior GAO work has found that independent expert review
panels can provide comprehensive, objective reviews of complex issues. Use of
such a panel to review DHS’s methodology could help ensure a rigorous, sci-
entific validation of SPOT, helping provide more assurance that SPOT is ful-
filling its mission to strengthen aviation security. 10
GAO also found issues relating to performance metrics, data integrity, and reach-back capabilities.
TSA is experiencing implementation challenges, including not fully utilizing the
resources it has available to systematically collect and analyze the information
obtained by BDOs on passengers who may pose a threat to the aviation system.
TSA’s Transportation System Operations Center has the resources to investigate
aviation threats but generally does not check all law enforcement and intel-
ligence databases available to it to identify persons referred by BDOs. Utilizing
existing resources would enhance TSA’s ability to quickly verify passenger iden-
tity and could help TSA to more reliably ‘‘connect the dots.’’ Further, most
BDOs lack a mechanism to input data on suspicious passengers into a database
used by TSA analysts and also lack a means to obtain information from the
Transportation System Operations Center on a timely basis. TSA states that it
is in the process of providing input capabilities, but does not have a time frame
5 Weinberger, Sharon, ‘‘Intent to Deceive: Can the Science of Deception Detection Help to Catch Terrorists?’’ Nature, Vol. 465, May 26, 2010, available at: http://www.nature.com/news/2010/100526/pdf/465412a.pdf
6 Supra n.1.
7 Supra n.4.
8 Supra n.1.
9 Ibid.
10 Ibid.
for when this will occur at all SPOT airports. Providing BDOs, or other TSA
personnel, with these capabilities could help TSA ‘‘connect the dots’’ to identify
potential threats.
Although TSA has some performance measures related to SPOT, it lacks out-
come-oriented measures to evaluate the program’s progress toward reaching its
goals. Establishing a plan to develop these measures could better position TSA
to determine if SPOT is contributing to TSA’s strategic goals for aviation secu-
rity. TSA is planning to enhance its evaluation capabilities in 2010 to more
readily assess the program’s effectiveness by conducting statistical analysis of
data related to SPOT referrals to law enforcement and associated arrests. 11
Opportunities to Reduce Potential Duplication in Government Programs, Save
Tax Dollars, and Enhance Revenue
In March of 2011, GAO issued a report to Congress in response to a new
statutory requirement that GAO identify federal programs, agencies, offices, and
initiatives, either within departments or governmentwide, which have duplicative
goals or activities. The report contained a section on SPOT and stated:
Congress may wish to consider limiting program funding pending receipt of an
independent assessment of TSA’s SPOT program. GAO identified potential
budget savings of about $20 million per year if funding were frozen at current
levels until validation efforts are complete. Specifically, in the near term, Con-
gress could consider freezing appropriation levels for the SPOT program at the
2010 level until the validation effort is completed. Assuming that TSA is plan-
ning to expand the program at a similar rate each year, this action could result
in possible savings of about $20 million per year, since TSA is seeking about
a $20 million increase for SPOT in fiscal year 2011. Upon completion of the
validation effort, Congress may also wish to consider the study’s results,
including the program’s effectiveness in using behavior-based screening
techniques to detect terrorists in the aviation environment, in making future
funding decisions regarding the program. 12
Credibility Assessment at Portals Report
In April 2009, the Portals Committee issued a report for the Defense Academy
for Credibility Assessment titled: ‘‘Credibility Assessment at Portals.’’ 13 The com-
mittee recognized the need for ‘‘advanced and accurate credibility assessment,’’ 14
which is described as ‘‘a decision making process whereby a communication is as-
sessed as to its veracity.’’ The Portals Committee had the following to say about
SPOT:
‘‘The adoption of SPOT occurred despite the fact that no study in the peer-re-
viewed scientific literature suggests that accurate credibility assessments can be
made from unstructured observations. Within SPOT it appears that the observ-
ers are attempting to assess airline passengers by casual observation of facial
micro-expressions (Wilber & Nakashima, 2007). There are several problems
with this. First, scientific research does not support the notion that microexpres-
sions reliably betray concealed emotion (Porter & ten Brinke, 2008). Second,
whereas brief facial activity may reveal the purposeful manipulation of a felt
emotion (Porter & ten Brinke, 2008), the problems of interpretation of such ma-
nipulation renders the approach useless for practical purposes. Third, the
microexpression approach equates deception with manipulated emotion. This
conceptual confusion obscures the fact that most forensically relevant lies are
not lies about feelings but about actions in the past, present or future. In con-
clusion, the use of microexpressions to establish credibility is theoretically
flawed and has not been supported by sound scientific research (Vrij, 2008).’’ 15
JASON
Composed of world-renowned scientists, JASON advises the federal government
on science and technology issues. The vast majority of its work is done at the
request of the Department of Defense and the intelligence community, so its
reports are typically classified.

11 Ibid.
12 Opportunities to Reduce Potential Duplication in Government Programs, Save Tax Dollars, and Enhance Revenue, Government Accountability Office, March 2011, available at: http://www.gao.gov/new.items/d11318sp.pdf
13 ‘‘Credibility Assessment at Portals,’’ Portals Committee Report, April 17, 2009, available at: http://truth.boisestate.edu/eyesonly/Portals/PortalsCommitteeReport.pdf
14 Ibid.
15 Ibid.
However, a 2010 Nature article that discusses the SPOT program in a piece on
deception detection provides the following: ‘‘ ‘No scientific evidence exists to
support the detection or inference of future behaviour, including intent,’
declares a 2008 report prepared by the JASON defense advisory group.’’ 16
National Research Council (NRC) of the National Academies
Workshop Summary on Field Evaluation in the Intelligence and Counterintel-
ligence Context
On September 22-23, 2009, the NRC’s Board on Behavioral, Cognitive, and Sen-
sory Sciences held a workshop on ‘‘the field evaluation of behavioral and cognitive
sciences-based methods and tools for use in the areas of intelligence and counter in-
telligence.’’ 17 The workshop was sponsored by the Defense Intelligence Agency and
the Office of the Director of National Intelligence. The purpose of the workshop was
to ‘‘discuss the best ways to take methods and tools from behavioral science and
apply them to work in intelligence operations. More specifically, the workshop fo-
cused on the issue of field evaluation - the testing of these methods and tools in
the context in which they will be used in order to determine if they are effective
in real world settings.’’ 18
The NRC published a report in 2010 summarizing the presentations and discus-
sions over the 2-day period. Participants of the workshop included NRC members
and experts in the behavioral sciences and intelligence community. The goal of the
workshop was ‘‘not to provide specific recommendations but to offer some insight -
in large part through specific examples taken from other fields - into the sorts of
issues that surround the area of field evaluations. The discussions covered such
ground as the obstacles to field evaluation of behavioral science tools and methods,
the importance of field evaluation, and various lessons learned from experience with
field evaluation in other areas.’’ 19
While the report identified several obstacles, one of interest to this Subcommittee
hearing is ‘‘the pressure to use new devices and techniques as soon as they become
available, without waiting for rigorous validation. Because lives are at stake, those
in the field often push to adopt new methods and tools as quickly as possible and
before there has been time to evaluate them adequately. Once a method is in wide-
spread use, anecdotal evidence can lead its users to believe in its effectiveness and
to resist rigorous testing, which may show that it’s not as effective as they think.’’ 20
Protecting Individual Privacy in the Struggle Against Terrorists - A Framework for
Program Assessment
From 2005 to 2007, the NRC’s 21-member Committee on Technical and Privacy
Dimensions of Information for Terrorism Prevention and Other National Goals held
several meetings to ‘‘examine the role of data mining and behavioral surveillance
technologies in counterterrorism programs.’’ 21 The ensuing NRC report provides ‘‘a
framework for making decisions about deploying and evaluating those [programs]
and other information based programs on the basis of their effectiveness and associ-
ated risks to personal privacy.’’ 22
The report presented 13 conclusions and 2 broad recommendations. Of interest to
this Subcommittee hearing are the following conclusions:
• ‘‘Conclusion 3: Inferences about intent and/or state of mind implicate privacy
issues to a much greater degree than do assessments or determinations of capa-
bility.
Although it is true that capability and intent are both needed to pose a real
threat, determining intent on the basis of external indicators is inherently a much
more subjective enterprise than determining capability. Determining intent or
16 Supra n.5.
17 ‘‘Field Evaluation in the Intelligence and Counterintelligence Context,’’ National Research Council of the National Academies, 2010, available at: http://books.nap.edu/openbook.php?record_id=12854&page=R1
18 Ibid.
19 Ibid.
20 ‘‘Field Evaluation in the Intelligence and Counterintelligence Context,’’ National Research Council of the National Academies, March 2010, available at: http://www7.nationalacademies.org/bbcss/Highlights-Field%20Evaluation%20in%20the%20Intelligence%20and%20Counterintelligence%20Context.pdf
21 ‘‘Protecting Individual Privacy in the Struggle against Terrorists - A Framework for Program Assessment,’’ National Research Council of the National Academies, 2008, available at: http://books.nap.edu/openbook.php?record_id=12452&page=1
22 Ibid.
state of mind is inherently an inferential process, usually based on indicators
such as whom one talks to, what organizations one belongs to or supports, or
what one reads or searches for online. Assessing capability is based on such indi-
cators as purchase or other acquisition of suspect items, training, and so on. Rec-
ognizing that the distinction between capability and intent is sometimes unclear,
it is nevertheless true that placing people under suspicion because of their associa-
tions and intellectual explorations is a step toward abhorrent government behav-
ior, such as guilt by association and thought crime. This does not mean that gov-
ernment authorities should be categorically proscribed from examining indicators
of intent under all circumstances-only that special precautions should be taken
when such examination is deemed necessary.’’
• ‘‘Conclusion 4: Program deployment and use must be based on criteria more demanding than ‘it’s better than doing nothing.’ ’’
In the aftermath of a disaster or terrorist incident, policy makers come under in-
tense political pressure to respond with measures intended to prevent the event
from occurring again. The policy impulse to do something (by which is usually
meant something new) under these circumstances is understandable, but it is sim-
ply not true that doing something new is always better than doing nothing. In-
deed, policy makers may deploy new information-based programs hastily, without
a full consideration of (a) the actual usefulness of the program in distinguishing
people or characteristic patterns of interest for follow-up from those not of interest,
(b) an assessment of the potential privacy impacts resulting from the use of the
program, (c) the procedures and processes of the organization that will use the
program, and (d) countermeasures that terrorists might use to foil the program.
• ‘‘Conclusion 10: Behavioral and physiological monitoring techniques might be
able to play an important role in counterterrorism efforts when used to detect
(a) anomalous states (individuals whose behavior and physiological states devi-
ate from norms for a particular situation) and (b) patterns of activity with well-
established links to underlying psychological states.
Scientific support for linkages between behavioral and physiological markers and
mental state is strongest for elementary states (simple emotions, attentional proc-
esses, states of arousal, and cognitive processes), weak for more complex states
(deception), and nonexistent for highly complex states (terrorist intent and beliefs).
The status of the scientific evidence, the risk of false positives, and vulnerability
to countermeasures argue for behavioral observation and physiological monitoring
to be used at most as a preliminary screening method for identifying individuals
who merit additional follow-up investigation. Indeed, there is no consensus in the
relevant scientific community nor on the committee regarding whether any behav-
ioral surveillance or physiological monitoring techniques are ready for use at all
in the counterterrorist context given the present state of the science.’’
• ‘‘Conclusion 11: Further research is warranted for the laboratory development
and refinement of methods for automated, remote, and rapid assessment of be-
havioral and physiological states that are anomalous for particular situations
and for those that have well-established links to psychological states relevant to
terrorist intent.
A number of techniques have been proposed for the machine-assisted detection of
certain behavioral and physiological states. For example, advances in magnetic
resonance imaging (MRI), electroencephalography (EEG), and other modern tech-
niques have enabled measures of changes in brain activity associated with
thoughts, feelings, and behaviors. Research in image analysis has yielded im-
provements in machine recognition of faces under a variety of circumstances (e.g.,
when a face is smiling or when it is frowning) and environments (e.g., in some
nonlaboratory settings).
However, most of the work is still in the basic research stage, with much of the
underlying science still to be validated or determined. If real-world utility of these
techniques is to be realized, a number of issues- practical, technical, and funda-
mental-will have to be addressed, such as the limits to understanding, the largely
unknown measurement validity of new technologies, the lack of standardization
in the field, and the vulnerability to countermeasures. Public acceptability regard-
ing the privacy implications of such techniques also remains to be demonstrated,
especially if the resulting data are stored for unknown future uses or undefined
lengths of time.
For example, the current state-of-the-art of functional MRI technology can identify
changes in the hemodynamics in certain regions of the brain, thus signaling
activity in those regions. But such results are not necessarily consistent across individ-
uals (i.e., different areas in the brains of different individuals may be active
under the same stimulus) or even in the same individual (i.e., a slightly different
part of the brain may become active even in the same individual under the same
stimulus). Certain regions of the brain may be active under a variety of different
stimuli.
In short, understanding of what these regions do is still primitive. Furthermore,
even if simple associations can be made reliably in laboratory settings, this does
not necessarily translate into usable technology in less controlled situations. Be-
havior of interest to detect, such as terrorist intent, occurs in an environment that
is very different from the highly controlled behavioral science laboratory.’’
• ‘‘Conclusion 12: Technologies and techniques for behavioral observation have
enormous potential for violating the reasonable expectations of privacy of indi-
viduals.
Because the inferential chain from behavioral observation to possible adverse
judgment is both probabilistic and long, behavioral observation has enormous
potential for violating the reasonable expectations of privacy of individuals.
It would not be unreasonable to suppose that most individuals would be far less
bothered and concerned by searches aimed at finding tangible objects that might
be weapons or by queries aimed at authenticating their identity than by
technologies and techniques whose use will inevitably force targeted individuals
to explain and justify their mental and emotional states. Even if behavioral
observation and physiological monitoring are used only as a preliminary
screening method for identifying individuals who merit additional follow-up
investigation, these individuals will be subject to suspicion that would not
fall on others not so identified.’’ 23
Issues
Detection of Emotion
The state of the science relative to the detection of emotion, deceit, and intent
varies vastly among the three. Decades of research have been devoted to the
detection of emotion using verbal, nonverbal, and microfacial expressions. Each
of these observational techniques has shown varying degrees of success at
determining an individual’s emotion, but generally speaking, a scientific
foundation does exist to support the assertion that emotion can be determined
through behavioral cues.
Detection of Deceit
The foundation of research for detecting an expression of deceit is rooted in that
of emotion. For example, it is posited that a deceitful person would express emotions
such as stress, and that stress can be attributed to concealing a lie. The state of
the science in this regard is less solid. Witnesses at the hearing will testify to the
current strengths and weaknesses of this field.
Detection of Intent
Even less certainty exists regarding the ability to determine intent. The claim
rests on a chain of assumptions: a person who intends to do harm will be
concealing that fact, and will therefore express deceitful behaviors; deceitful
behavioral cues are founded in stress, which in turn is displayed in emotion.
This chain of reasoning takes the underlying assumption that behavioral
indicators exist for detecting emotion and infers that those indicators can
therefore be used to detect deceit, and from deceit, intent. Very little, if
any, evidence exists in the scientific literature to support this hypothesis,
yet this is the goal of the SPOT program - to identify individuals who may pose
a threat to aviation security.
23 Ibid.
Laboratory vs. Operational Settings
The vast preponderance of behavioral science research conducted relative to the
detection of emotion, deceit, and intent has been done in a laboratory setting. As
the National Research Council noted in its 2008 report, ‘‘Behavior of interest to de-
tect, such as terrorist intent, occurs in an environment that is very different from
the highly controlled behavioral science laboratory.’’ 24
Utility for Counterterrorism
Even if one were to stipulate that a body of evidence existed to support the
claim that one could detect intent using behavioral indicators, it remains to be
seen how useful this would be in a counterterrorism context. In all likelihood,
anyone seeking to cause harm would employ countermeasures designed to conceal
their emotions. The impact of countermeasures on the ability to detect emotions,
deception, or intent is unknown, but if other deception detection tools (such as
the polygraph) are any indication, countermeasures could severely degrade the
capability.
Utility in a U.S. Aviation Transportation Setting
The SPOT program is loosely based on the Israeli model employed by El Al
Airlines. That highly successful program deploys more agents in more locations
throughout the airport, conducts multiple face-to-face interviews, actively
profiles passengers, and operates in smaller and fewer airports. El Al also
handles far fewer passengers and far fewer flights than the U.S. air
transportation system, and Israeli screeners receive far more training than the
four days of classroom training and three days of on-the-job training that BDOs
receive. Scaling up such an enterprise to accommodate the U.S. aviation
transportation sector would severely restrict the flow of commerce and
passengers.
DHS S&T Validation
In its report, GAO states that ‘‘TSA deployed SPOT nationwide without first vali-
dating the scientific basis for the program.’’ 25 To its credit, DHS S&T initiated a
review two and a half years ago to ‘‘determine whether SPOT is more effective at
identifying passengers who may be threats to the aviation system than random
screening.’’ 26 GAO goes on to point out in its report, ‘‘However, S&T’s current re-
search plan is not designed to fully validate whether behavior detection and appear-
ances can be effectively used to reliably identify individuals in an airport terminal
environment who pose a risk to the aviation system.’’ 27 The report further states
that, according to the National Research Council, ‘‘an independent panel could pro-
vide an objective assessment of the methodologies and findings of DHS’s study to
better ensure that SPOT is based on valid science.’’ 28
These are two important points. First, the S&T review is not designed to validate
the underlying behavioral cues, but rather to simply demonstrate whether the pro-
gram, as a whole, is more successful than random sampling. As GAO stated in its
recent ‘‘Duplication’’ report, ‘‘DHS’s response to GAO’s report did not describe how
the review currently planned is designed to determine whether the study’s method-
ology is sufficiently comprehensive to validate the SPOT program.’’ 29 Second, based
on the Statement of Work associated with S&T’s review, questions remain as to
whether or not the review is truly independent.
The Statement of Work affirms that S&T had a direct role in selecting peer re-
viewers, as well as planning and structuring workshops that informed the method-
ology to validate the program. The Statement of Work also afforded DHS the ability
to review and provide revision recommendations at numerous points in the process.
Finally, the Statement of Work indicates that deliverables are to be provided to S&T
directly. 30 Whether or not this affected the outcome is uncertain. The validation
work was conducted by the American Institutes for Research, a highly respected and
reputable firm, but that firm is ultimately bound by the parameters and scope defined
by the Statement of Work negotiated with DHS. It remains to be seen whether the
review was an independent assessment, as recommended by the National Research
Council, or more of a collaboration.
24 Supra n.21.
25 Supra n.1.
26 Ibid.
27 Ibid.
28 Ibid.
29 Supra n.12.
30 Statement of Work for the Naval Research Laboratory, Project Hostile Intent: Behavioral-
Based Screening Indicators Validation, U.S. Department of Homeland Security, Science and
Technology Directorate, Human Factors and Behavioral Sciences Division, PR# RSHF-11-00007.
Nevertheless, S&T's two-and-a-half-year review (at a cost of $2.5 million) was ini-
tially planned to be delivered in Fiscal Year 2011, 31 then February 2011, 32 and then
the end of March 2011. Its current release date is April 8th, two days after our
hearing. The Subcommittee postponed this hearing, initially scheduled for March
17th, for a number of reasons, including allowing S&T more time to produce the re-
port.
Witnesses
• Mr. Stephen Lord, Director, Homeland Security and Justice Issues, Govern-
ment Accountability Office
• Transportation Security Administration (Invited)
• Mr. Larry Willis, Program Manager, Homeland Security Advanced Research
Projects Agency, Science and Technology Directorate, Department of Homeland
Security
• Dr. Paul Ekman, Professor Emeritus of Psychology, University of California,
San Francisco, and President and Founder, Paul Ekman Group, LLC
• Dr. Maria Hartwig, Associate Professor, Department of Psychology, John Jay
College of Criminal Justice
• Dr. Philip Rubin, Chief Executive Officer, Haskins Laboratories
• Lieutenant Detective Peter J. DiDomenica, Boston University Police
31 Supra n.1.
32 Supra n.12.
Appendix 1
These projects advance national security by developing and applying the social,
behavioral, and physical sciences to improve identification and analysis of threats,
to enhance societal resilience, and to integrate human capabilities into the develop-
ment of technology.
Commercial Data Sources Project
Project Manager: Patty Wolfhope
Project Overview: The Science and Technology (S&T) Directorate Human Factors
Behavior Sciences Division (HFD) Commercial Data Sources Project will quan-
titatively assess the utility of commercial data sources to augment governmentally
available information about people, foreign and domestic, being screened, inves-
tigated, or vetted by the Department. The use of commercial data sources may pro-
vide a valuable source of corroborating information to ensure that an individual’s
identity and eligibility for a particular license, privilege, or status is correctly evalu-
ated during screening. This project is part of the Personal Identification Systems
Thrust Area and Credentialing Program within HFD.
Conclusion:
In conclusion, these results indicate that the SPOT program is significantly more
effective than random screening: a high-risk traveler is nine times more likely to
be identified using Operational SPOT versus random screening. Our validation proc-
ess, which included an independent and comprehensive review of SPOT, is a key
example of how S&T works to enhance the effectiveness of the Department’s oper-
ational activities. Expanding on these initial findings, we would like to conduct fur-
ther research to assess the screening accuracy of these observable indicators in simi-
lar operational screening environments, in aviation and beyond. Additionally, we
would like to work to identify other indicators that could further increase accuracy
in operational screening.
Chairman Broun, Ranking Member Edwards, I thank you again for this oppor-
tunity to discuss the Screening of Passengers by Observation Techniques program.
I am happy to answer any questions the Subcommittee may have.
Chairman BROUN. Thank you, Mr. Willis. You kept your remarks
under five minutes, and sometimes that is not done here. In fact,
most times it is not done here.
Our next witness is Mr. Peter DiDomenica of the Boston Univer-
sity Police. Thank you, Lieutenant. Appreciate it. You have five
minutes, sir.
TESTIMONY OF PETER J. DIDOMENICA, LIEUTENANT
DETECTIVE, BOSTON UNIVERSITY POLICE
Mr. DIDOMENICA. Thank you. Good morning. Chairman Broun,
Ranking Member Edwards, and Members of the Committee, I
thank you for this opportunity to address you today regarding the
future of the TSA SPOT program that I originally developed.
By way of additional background, I have trained over 3,000 po-
lice, intelligence, and security officials in over 100 federal, state,
and local agencies in the United States and U.K. in behavior as-
sessment. I have also been a lecturer or advisor on behavior assess-
ment for the FBI, CIA, Secret Service, DHS, U.S. Army Night Vi-
sion Lab, Defense Department Criminal Investigations Task Force,
and the National Science Foundation. I appear today representing
only myself and not any of the organizations I am or have been em-
ployed by.
On December 22, 2001, while assigned to Logan International
Airport as a member of the State Police, I was part of a large team
of public safety officials who responded to the airfield to meet
original SPOT program I designed was not primarily for the appre-
hension of suspects but as a means to deny high-risk persons who
could be involved in terrorism or other dangerous activity access to
critical infrastructure. It was to be the last and, most importantly,
the best chance to prevent a tragedy when other methods such as
intelligence and traditional physical screening have failed. Catching
a terrorist through a random encounter in a public place without
any prior intelligence is extremely difficult.
By way of example, if we use the known number of terrorist sus-
pects who boarded domestic commercial flights at airports with
BDOs and the approximately four billion passenger enplanements
at U.S. commercial airports from 2004 to 2009, the base rate of ter-
rorist passengers is about 1 in 173 million. The expectation that
the SPOT program will result in the arrest of all terrorists at-
tempting to board a domestic flight in the United States is unreal-
istic and threatens its continued support. If, however, it is seen as
part of a multi-layered approach with the primary goal of pre-
venting terrorist access to critical infrastructure in conjunction
with properly trained law enforcement, the program sets reason-
able and attainable goals and should have the support of this Con-
gress.
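The base-rate arithmetic behind the figure above can be sketched as follows. This is a minimal illustration only: the 23 suspected terrorists is the figure from the GAO report cited elsewhere in this hearing, and the four billion enplanements is the witness's approximation.

```python
# Base-rate sketch for the figures cited in the testimony (assumed values):
# ~23 known terrorist suspects boarded flights at airports with BDOs, out of
# ~4 billion passenger enplanements at U.S. commercial airports, 2004-2009.
suspects = 23
enplanements = 4_000_000_000

base_rate = suspects / enplanements   # probability a random passenger is a suspect
one_in = enplanements / suspects      # the same rate in "1 in N" form

print(f"base rate ~= 1 in {one_in / 1e6:.0f} million passengers")
```

At roughly 1 in 174 million (the testimony rounds down to 173 million), even a highly accurate screening method would flag overwhelmingly non-terrorist passengers, which is why the witness frames arrests of terrorists as an unrealistic primary goal for the program.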
Thank you for this opportunity to address the program and I am
prepared to answer any questions that you may have.
[The prepared statement of Mr. DiDomenica follows:]
PREPARED STATEMENT OF MR. PETER J. DIDOMENICA,
LIEUTENANT DETECTIVE, BOSTON UNIVERSITY POLICE
Good morning. Chairman Broun, Ranking Member Edwards, and Members of the
Committee, I thank you for this opportunity to address you today regarding the fu-
ture of the TSA Screening of Passengers by Observation Techniques program that
I developed, which is more commonly referred to as the SPOT program.
I am Peter DiDomenica presently employed as a Detective Lieutenant with the
Boston University Police Department. I recently joined the Boston University force
after serving for more than 22 years with the Massachusetts State Police where I
retired as a Lieutenant. While a member of the State Police I served as an investi-
gator in the Major Crime Unit, as the Director of Legal Training for the State Police
Academy, as a staff member to five different superintendents, and as Director of Se-
curity Policy for Boston Logan International Airport in the two years after the dev-
astating 9/11 attacks. I also served the State Police for a decade as a subject matter
expert and lead trainer for Massachusetts police agencies in racial profiling and bi-
ased policing. In this capacity I designed statewide police training programs and the
State Police traffic stop data collection and analysis system created to monitor en-
forcement efforts for indications of biased policing. I am also presently a consultant
for EOIR Technologies of Fredericksburg, VA where I serve as an advisor on human
behavior detection for the U.S. Army Night Vision and Electronic Sensors Direc-
torate. I am a certified instructor in the interview, behavior assessment, and decep-
tion detection programs for The Forensic Alliance, a consulting firm of forensic psy-
chologists based in British Columbia, Canada. I am presently an adjunct instructor
for the graduate criminal justice program at Anna Maria College in Paxton, MA.
I am a licensed attorney in Massachusetts having earned my J.D. in 1995. I have
trained over 3,000 police, intelligence, and security officials in over 100 federal,
state, and local agencies in the U.S. and U.K. in behavior assessment. I have also
been a lecturer or advisor on behavior assessment for the FBI, CIA, Secret Service,
Department of Homeland Security, Defense Department Criminal Investigations
Task Force, and National Science Foundation. I appear today representing only my-
self and not any of the organizations I am or have been employed by.
On December 22, 2001, while assigned to Logan International Airport as a mem-
ber of the State Police and as Director of Security Policy, I was part of a large team
of public safety officials who responded to the airfield to meet American Airlines
flight 63, diverted to Boston on a flight from Paris, France to Miami. On board was
a passenger named Richard Reid who attempted to detonate an improvised explo-
sive device artfully concealed in his footwear that, if successful, would have killed
all 197 passengers and crewmembers aboard. As I stood only a few feet away from
Reid, who was now securely in custody in the back of a state police cruiser, it hit
me that this man was the real thing, that the threat of another terrorist attack from
Al Qaeda would not stop, and that we needed to do more, much more, to properly
screen passengers than merely focusing on weapons detection. Over the next several
days I met with the incident commander for Reid’s arrest, Major Tom Robbins, who
was the Aviation Security Director for Logan Airport and Troop Commander for
State Police Troop F at the airport. One evening, while having dinner with Major
Robbins, he wrote the words ‘‘walk and talk’’ on a dinner napkin - a reference to
airport narcotics interdiction - and directed me to look into airport drug interdiction
programs as a model for a terrorist behavioral profiling program to augment the
weapons screening process. Thus began the development of what would become the
Behavior Assessment Screening System or BASS.
Because of my legal background and experience in training on racial profiling and
bias policing, I knew immediately what the BASS program would not be. Whatever
program we would create to identify potential terrorists, it would not include racial
profiles that target people of apparent Islamic belief or Arab, Middle Eastern, or
South and Central Asian ethnicities. As well as being illegal, such profiling could
distract security officials from detecting true threats. Moreover, the unconscious bias
against these groups would be so strong because of 9/11 that security officials would
need training to counter these biases. I began to explore the scientific literature in
an effort to quantify the human capacity to detect dangerous people. My research
included many disciplines, including physiology, psychology, and neuroscience, as well as
specific research into suicide bombers. What this literature indicated was that a per-
son who is engaged in a serious deception of consequence or otherwise engaged in
an act in which the person has much to lose by being discovered or by failing to
succeed will suffer mental stress, fear, or anxiety. Such stress, fear, or anxiety will
be manifested through involuntary physical and physiological reactions such as an
increase in heart rate, facial displays of emotion, and changes in speed and direction
of movement. In developing the program, specific behaviors were selected that were
both supported in the scientific literature and consistent with law enforcement expe-
rience. In addition to avoiding the legal prohibition on selective enforcement based
on race, ethnicity, or religion, 1 the program also had to ensure that police encoun-
ters with the public not meeting the standard of reasonable suspicion were vol-
untary under the U.S. Supreme Court case of U.S. v. Mendenhall. 2 In addition to
behavior, the program also examines: aspects of appearance unrelated to race, eth-
nicity, or religion; responses to law enforcement presence and questioning; and, the
circumstances surrounding the presence of the person at a specific location. I cre-
ated a simple method called ‘‘A-B-C-D’’ which means Analysis of Baseline, addition
of a Catalyst, and scan for Deviations. Baselines are merely an evaluation of what
was normal for a specific environment and a catalyst is the insertion into the envi-
ronment of something that would be particularly threatening to a terrorist or crimi-
nal to provoke behavioral changes.
In 2002 and 2003 I taught the BASS program to all the troopers, the primary law
enforcement agency for Logan Airport, and developed a staff of additional instruc-
tors. We also began training other police departments in Massachusetts; in fact, we
trained the entire Massachusetts Transit Police force and a group of Boston Police
officers in preparation for the 2004 Democratic National Convention. Because of the
success of the program, I created a derivative program called PASS or the Passenger
Assessment Screening System suitable for TSA screeners that eventually became
the SPOT program. Over the course of two years I worked with TSA officials at Bos-
ton, including the Federal Security Director George Niccara, and officials at TSA
headquarters including their Office of Civil Rights, Science and Technology, and
Workforce Performance and Training. In 2004 my team of State Police BASS in-
structors conducted a training program with TSA to create two pilot SPOT pro-
grams at Portland International Jetport in Maine and T.F. Green International Air-
port in Rhode Island.
One of the reasons the BASS program got the interest of TSA headquarters as
a model for a behavior detection program was an incident that occurred in the fall
of 2003 at Logan Airport while I was training members of the Boston Police in
BASS. A middle-aged male caught my attention due to an appearance and luggage
deviation as well as baseline deviation in movement. When the Boston police officer
3 GAO-10-763. The report cites 23 suspected terrorists having passed through SPOT airports.
4 King Downing v. Massachusetts Port Authority, et al, Civil Action No. 2004-12513-RBC.
5 392 U.S. 1 (1968).
6 435 F. 3d 1125.
Bowl or other major sporting event, even when we don’t have the constitutional au-
thority to arrest we must have the confidence to deny them access based on the
sound principles of BASS and SPOT. This is our last and best chance of preventing
another terrorist attack.
Thank you again for this opportunity to address the SPOT program and I am pre-
pared now to answer any questions you may have.
Chairman BROUN. Thank you, Lieutenant. You did not exceed
your five minutes either. Congratulations and thank you for being
here and——
Mr. DIDOMENICA. Two seconds.
Chairman BROUN. That is right. I recognize our next witness, Dr.
Paul Ekman, Professor Emeritus of Psychology, University of Cali-
fornia, San Francisco, and President and Founder of the Paul
Ekman Group. Doctor, you have five minutes for your testimony.
TESTIMONY OF PAUL EKMAN,
PROFESSOR EMERITUS OF PSYCHOLOGY,
UNIVERSITY OF CALIFORNIA, SAN FRANCISCO,
AND PRESIDENT AND FOUNDER, PAUL EKMAN GROUP, LLC
Dr. EKMAN. Thank you, Chairman Broun, Ranking Member Ed-
wards. I really appreciate this opportunity to testify on this very
important issue.
I have been working with TSA on SPOT for eight years based on
40 years of research on how demeanor—facial expression, gesture,
voice, speech, gaze and posture—can help in identifying lies and
also harmful intent. My research has examined four very different
kinds of lies: lies to conceal a very strong emotion felt at that mo-
ment, lies claiming to hold a social political opinion the exact oppo-
site of your truly strongly held opinion, lies denying that you have
taken money that isn’t yours, and lies in which members of extrem-
ist political groups attempt to block an opposing political group
from receiving money.
Now, our research focuses on real-world lies that matter to soci-
ety in which each person decided for him or herself whether to lie
or tell the truth, just as we do in the real world. No scientist comes
out of the clouds and tells us you are supposed to lie, you are sup-
posed to tell the truth, except in experiments published in journals.
The person who tells the truth knows that if he or she is mistak-
enly judged to be lying, they will receive the same punishment as
the liar who is caught. This makes the truthful person apprehen-
sive and harder to distinguish from the liar, just as it is in the real
world. And the punishment threatened is as severe and highly
credible to those who participate in the research as we could make
it, passed by the University IRB.
I should mention I work in a medical school. I would never get
it passed at Berkeley, but at a medical school what I do is consid-
ered trivial.
Now, unlike any other research team, we have performed the
most precise comprehensive measurements of face, gesture, voice,
speech, and gaze, and those measurements have yielded between
80 and 90 percent identification of who is lying and who is telling
the truth. The clues we have found are not specific to what the lie
is about. As long as the stakes are very high, especially the threat
of punishment, the behavioral clues to lying will be the same. It
is this finding that suggested there would be no clues that distinguish
the terrorist hiding harmful intent from the money smuggler, the
drug smuggler, or the wanted felon.
In my written testimony I raised three questions. First, what is
the basis for the SPOT checklist? I have explained why I believe
our findings on four very different kinds of lies provided a solid
basis for reviewing what was on the SPOT checklist.
Question two, what is the evidence for the effectiveness of SPOT?
Mr. Willis has already covered that. I won’t attempt to repeat it.
I am very eager to see that report that you are eager to see.
Question three, can SPOT be improved? That is a dangerous
question to ask a scientist. We could always think that more re-
search is necessary. But is it a wise investment compared to other
things that the government can invest in regarding airport secu-
rity? That is your decision, not mine. In my testimony I have out-
lined a couple of types of research that I think could be useful if
you decide you would want to do more research. But we do not
need to do more research now to feel confidence in this layer of se-
curity provided to the American people.
In my written testimony I attempted to answer questions that
have been raised by critics of SPOT. Would it not have been better
to base SPOT on how terrorists actually behave? Why wasn't SPOT
based on people role-playing terrorists? Why is SPOT catching
felons and smugglers, not just terror-
ists? And aren’t people with Middle Eastern names or Middle East-
ern appearance more likely to be identified by SPOT?
I would be glad in responding to questions to provide brief an-
swers to each of these that are in my written testimony. Again, my
thanks to the Committee and the staff of the Committee for the op-
portunity to talk to you and to the men and women in TSA who
make flying a safer path than it would be without their dedicated
efforts. Thank you.
[The prepared statement of Dr. Ekman follows:]
PREPARED STATEMENT OF DR. PAUL EKMAN, PROFESSOR EMERITUS OF PSYCHOLOGY,
UNIVERSITY OF CALIFORNIA, SAN FRANCISCO, AND PRESIDENT AND FOUNDER, PAUL
EKMAN GROUP, LLC
The TSA has implemented the SPOT program, a security screening protocol that
relies on observation of nonverbal and facial cues to assess the credibility of trav-
elers. In particular, the program relies on behavioral indicators of ‘‘stress, fear, or
deception’’ (GAO, p. 2). A key question is whether there is a scientifically validated
basis for using behavior detection for counterterrorism purposes. This testimony will
review the relevant empirical evidence on this question. In brief, the accumulated
body of scientific work on behavioral cues to deception does not provide support for
the premise of the SPOT program. The empirical support for the underpinnings of
the program is weak at best, and the program suffers from theoretical flaws. Below,
I will elaborate on the scientific findings of relevance for this issue.
Accuracy in deception judgments
For several decades, behavioral scientists have conducted empirical research on
deception and its detection. There is now a considerable body of work in this field
(Granhag & Strömwall, 2004; Vrij, 2008). This research focuses on three primary
questions: First, how good are people at judging credibility? Second, are there be-
havioral differences between deceptive and truthful presentations? Third, how can
people’s ability to judge credibility be improved?
Most research on credibility judgments is experimental. An advantage of the ex-
perimental approach is that researchers may randomly assign participants to condi-
tions, which provides internal validity (the ability to establish causal relationships
between the variables, in this context between deception and a given behavioral in-
dicator) and control of extraneous variables. Importantly, the experimental approach
also allows for the unambiguous establishment of ground truth, that is, knowledge
about whether the statements given by research participants are in fact truthful or
deceptive. In this research, participants provide truthful or deliberately false state-
ments, for example by purposefully distorting their attitudes, opinions, or events
they have witnessed or participated in. The statements are subjected to various
analyses including codings of verbal and nonverbal behavior. This allows for the
mapping of objective cues to deception–behavioral characteristics that differ as a
function of veracity. Also, the videotaped statements are typically shown to other
participants serving as lie-catchers who are asked to make judgments about the ve-
racity of the statements they have seen. Across hundreds of such studies, people av-
erage 54% correct judgments, when guessing would yield 50% correct. Meta-anal-
yses (statistical summaries of the available research on a given topic) show that ac-
curacy rates do not vary greatly from one setting to another (Bond & DePaulo, 2006)
and that individuals barely differ from one another in the ability to detect deceit
(Bond & DePaulo, 2008). Contrary to common expectations (Garrido, Masip, &
Herrero, 2004), presumed lie experts such as police detectives and customs officers
who routinely assess credibility in their professional life do not perform better than
lay judges (Bond & DePaulo, 2006). In sum, that judging credibility is a near-chance
enterprise is a robust finding emerging from decades of systematic research.
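To give a sense of how close the 54% figure is to the 50% chance level for any single observer, a small sketch may help (illustrative only; the judgment count n is an assumption, not a figure from the meta-analysis):

```python
import math

def z_score(accuracy: float, n: int, chance: float = 0.5) -> float:
    """Standard-normal z for an observed accuracy rate vs. chance guessing
    over n independent truth/lie judgments."""
    se = math.sqrt(chance * (1 - chance) / n)   # standard error under chance
    return (accuracy - chance) / se

# A single judge at the meta-analytic mean of 54% over 100 judgments:
print(round(z_score(0.54, 100), 2))   # 0.8 -> well within chance variation
```

For one observer, 54% correct over a hundred judgments is statistically indistinguishable from coin-flipping; the meta-analytic estimate is reliable only because it aggregates results across hundreds of studies.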
Cues to deception
Why are credibility judgments so prone to error? Research on behavioral dif-
ferences between liars and truth tellers may provide an answer to this question. A
meta-analysis covering 1,338 estimates of 158 behaviors showed that few behaviors
are related to deception (DePaulo et al., 2003). The behaviors that do show a sys-
tematic covariation with deception are typically only weakly related to deceit. In
other words, people may fail to detect deception because the behavioral signs of de-
ception are faint.
Lie detection may fail for another reason: People report relying on invalid cues
when attempting to detect deception. Both lay people and presumed lie experts,
such as law enforcement personnel, report that gaze aversion, fidgeting, speech er-
rors (e.g., stuttering), pauses and posture shifts indicate deception (Global Deception
Research Team, 2005; Strömwall, Granhag, & Hartwig, 2004). These are cues to
stress, nervousness and discomfort. However, meta-analyses of the deception lit-
erature show that these behaviors are not systematically related to deception. For
example, in DePaulo et al. (2003), the effect size d (a statistical measure of the
strength of association between two variables) of gaze aversion as a cue to deception
across all studies is a non-significant 0.03. DePaulo et al. state: ‘‘It is notable that
none of the measures of looking behavior supported the widespread belief that liars
do not look their targets in the eye. The 32 independent estimates of eye contact
produced a combined effect that was almost exactly zero (d = 0.01)’’ (p. 93). More-
over, fidgeting with objects does not occur more frequently when lying, d = -0.12 (the
negative value suggests that object fidgeting occurs less, not more, frequently when
lying, but this difference is not statistically significant), nor do self-fidgeting (d =
-0.01) or facial fidgeting (d = 0.08). Speech disturbances are not related to decep-
tion (d = 0.00), nor are pauses (silent pauses d = 0.01; filled pauses d = 0.00; mixed
pauses d = 0.03). Posture shifts are not systematically related to deception either,
d = 0.05.
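The statistic d used throughout this statement is Cohen's d, the difference between two group means divided by their pooled standard deviation. A minimal sketch of the computation follows; the gaze-aversion counts are invented solely to illustrate a near-zero effect like those reported above:

```python
import math

def cohens_d(group1: list[float], group2: list[float]) -> float:
    """Cohen's d: standardized mean difference between two groups."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    # Sample variances, pooled across the two groups:
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical gaze-aversion counts per minute for liars vs. truth tellers:
liars = [3.0, 2.0, 4.0, 3.0]
truth_tellers = [3.0, 3.0, 2.0, 4.0]
print(round(cohens_d(liars, truth_tellers), 2))   # 0.0 - no group difference
```

A d of 0.03 or 0.05, as reported for gaze aversion and posture shifts, means the liar and truth-teller distributions overlap almost completely, so the behavior carries essentially no diagnostic value.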
In sum, the literature shows that people perform poorly when attempting to de-
tect deception. There are two primary reasons: First, there are few, if any, strong
cues to deception. Second, people report relying on cues to stress, anxiety and nerv-
ousness, which are not indicative of deceit.
High-stake lies. Some aspects of the deception literature have been criticized on
methodological grounds, in particular with regard to external validity (i.e., the gen-
eralizability of the findings to relevant non-laboratory settings, see Miller & Stiff,
1993). The most persistent criticism has concerned the issue of generalizing from
low-stake situations to those in which the stakes are considerably higher. Critics
have argued that when the deceit concerns serious matters, liars will experience
stronger fear of detection, leading to cues to deception. There are several bodies of
work of relevance for this concern. In a meta-analytic overview of the literature on
credibility judgments (Bond & DePaulo, 2006), the evidence on the effects of stakes
was mixed: Within studies that manipulated motivation to succeed, lies were easier
to tell from truths when motivation was high, although the effect size was fairly
small (d = 0.17). However, when the comparison was made between studies that
differed in stakes, no difference in lie detection accuracy was observed. Also, the
meta-analysis revealed that as the stakes rise, both liars and truth tellers seem
more deceptive to observers. That is, lie-catchers are more prone to make false posi-
tive errors - mistaking an innocent person for a liar - when judging highly motivated
senders.
Furthermore, research on real-life high-stake lies, such as lies told by suspects of
serious crimes during police interrogations, shows that people obtain at best mod-
erate hit rates when judging such material (for a review of these studies, see Vrij,
2008). Behavioral analyses of the suspects in these studies do not support the asser-
tion that cues to deception in the form of stress, arousal and emotions appear when
senders are highly motivated. Vrij noted that the pattern from high-stake lie stud-
ies is ‘‘in direct contrast with the view of professional lie-catchers who overwhelm-
ingly believe that liars in high-stake situations will display cues to nervousness,
particularly gaze aversion and self-adaptors’’ (2008, p. 77). Moreover, he notes that
the results ‘‘show no evidence for the occurrence of such cues’’ (2008, p. 77).
In sum, neither the research in general nor specific results on high-stake lies sup-
port the assumption that liars leak cues to stress and emotion, which can be used
for the purposes of lie detection.
Verbal vs. nonverbal cues to deception
The SPOT program seems to rely heavily on evaluation of nonverbal cues. This
emphasis on nonverbal behavior as opposed to verbal content cues runs counter to
the recommendations from research. A number of findings suggest that reliance on
nonverbal cues impairs lie detection accuracy. First, the meta-analysis on accuracy
in deception judgments investigated accuracy under four conditions: a) watching vid-
eotapes without sound, b) watching tapes with sound, c) listening to audiotapes, and
d) reading transcripts (Bond & DePaulo, 2006). The accuracy rate in the first condi-
tion, where people based their judgments solely on nonverbal behavior, was signifi-
cantly lower than in the other three, which did not differ significantly from each
other. Thus, the combined results of hundreds of studies on lie detection suggest
that having access to only nonverbal cues impairs lie detection accuracy.
Second, a number of studies have correlated lie-catchers’ self-reported use of cues
with lie detection accuracy. The purpose of such analyses is to investigate whether
failure to detect deception coincides with the self-reported use of a particular set of
cues. The results of these studies are consistent: They show that the more fre-
quently a participant reports relying on nonverbal behavior, the less likely they are
to be accurate in detecting deception. First, Mann et al. (2004) investigated police
officers’ ability to assess the veracity of suspects accused of murder, rape and arson.
They found that successful lie detectors mentioned story cues (e.g., contradictions
in the statement, vague responses) more frequently than poor lie detectors. More-
over, the more nonverbal cues the detectives mentioned (e.g., gaze aversion, move-
ments, posture shifts), the lower their lie detection accuracy was. Second, Anderson
et al. (1999) and Feeley and Young (2000) found that the more vocal cues lie-catch-
ers mentioned, the more accurate they were in detecting deception. Third, Vrij and
Mann’s (2001) analysis of accuracy in judging the statement of a convicted murderer
showed that the participants who mentioned cues to stress and discomfort obtained
the lowest hit rates. Fourth, Porter et al. (2007) found that the more visual cues
participants reported, the poorer they were at detecting deception.
It should be noted that reliance on nonverbal cues is associated not only with
poorer lie detection accuracy, but also a more pronounced lie bias (a tendency to
judge statements as lies rather than truths). That is, paying attention to visual cues
increases the tendency for false positive errors - mistaking an innocent person for
a deceptive one. This finding was obtained in one of the meta-analyses on deception
judgments (Bond & DePaulo, 2006), as well as in a study of police officers’ judg-
ments of suspects of serious crimes (Mann et al., 2004).
The finding that reliance on nonverbal cues hampers lie detection is not sur-
prising, given the research findings on cues to deception. These findings suggest
that speech-related cues may be more diagnostic of deception than nonverbal cues
(DePaulo et al., 2003; Sporer & Schwandt, 2006, 2007; Vrij, 2008). For example,
DePaulo et al. (2003) showed that liars talk for a shorter time (d = -0.35), and in-
clude fewer details (d = -0.30). Liars’ stories are also less logically structured (d =
-0.25) and less plausible (d = -0.20). Liars and truth tellers differ in verbal and vocal
immediacy (d = -0.55), and with respect to the inclusion of particular verbal ele-
ments, such as admissions of lack of memory (d = -0.42), spontaneous corrections
(d = -0.29) and related external associations (d = 0.35). These findings are in line
with predictions from content analysis frameworks (e.g., Köhnken, 2004).
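The d values cited above are standardized mean differences (Cohen's d), where negative values indicate that liars score lower than truth tellers on the cue in question. As an illustration of how such an effect size is computed, the sketch below uses invented data; the groups and numbers are hypothetical and do not come from any study cited here:

```python
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups, using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    ma, mb = statistics.fmean(group_a), statistics.fmean(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)  # sample variances
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return (ma - mb) / pooled_sd

# Invented example: number of details in liars' vs. truth tellers' statements.
liars = [12, 9, 11, 10, 8, 13]
truth_tellers = [14, 12, 15, 11, 13, 16]
d = cohens_d(liars, truth_tellers)  # negative: liars include fewer details
```

An effect of the magnitudes reported by DePaulo et al. (roughly d = 0.2 to 0.5 in absolute value) corresponds to heavily overlapping distributions, which is why even the more diagnostic verbal cues support only probabilistic, not conclusive, judgments.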
Detecting deceptions from facial displays of emotion
Theoretical concerns. Parts of the SPOT program seem to be predicated on the
assumption that analyses of facial displays of emotion can improve deception detec-
tion accuracy. The claims of effectiveness for such approaches are not modest. In
an interview with the New York Times, Ekman claimed that ‘‘his system of lie de-
tection can be taught to anyone, with an accuracy rate of more than 95 percent’’
(Henig, 2006). However, no such finding has ever been reported in the peer-reviewed
literature (Vrij et al., 2010). More broadly, there is no support for the assertion that
training programs focusing on identifying facial displays of emotions can improve
lie detection accuracy (Vrij, 2008).
Apart from lack of empirical support for the effectiveness of training programs fo-
cusing on the analysis of facial displays of emotion, there are theoretical problems
with the approach. The assumption behind the training program is that concealed
emotions may be revealed automatically, through brief displays sometimes referred
to as microexpressions. Implicit in this assumption is the notion that liars will expe-
rience emotions, and that leakage of emotions can betray their deceit. This seems
to equate cues to emotion with cues to deceit. But what is the evidence that lying
will entail emotions, while truth telling will not? Several scholars have noted that
the assumption that liars will experience emotion is a prescriptive view - it suggests
how liars should feel. Common moral reasoning suggests that lying is ‘‘bad’’
(Backbier et al., 1997). In line with this reasoning, Bond and DePaulo (2006) pro-
posed a double-standard hypothesis to explain the discrepancy between people’s be-
liefs about deceptive behavior (that liars will display signs of discomfort and stress)
and the actual findings on deceptive behavior (that liars typically do not display
such signs). The double-standard hypothesis suggests that people have two views
about lying: one about the lies they themselves tell, and one about the lies told by
others (a form of fundamental attribution error; Ross, 1977). In the words of the au-
thors: ‘‘As deceivers, people are pragmatic. They accommodate perceived needs by
lying. […] [Lies] are easy to rationalize. Yes, deception may demand construction of
a convincing line and enactment of appropriate demeanor. Most strategic commu-
nications do. To the liar, there is nothing exceptional about lying’’ (p. 216). However,
people’s view of the lies told by others is markedly different: ‘‘Indignant at the pros-
pect of being duped, people project onto the deceptive a host of morally fuelled emo-
tions - anxiety, shame, and guilt. Drawing on this stereotype to assess others’ verac-
ity, people find that the stereotype seldom fits. In underestimating the liar’s capac-
ity for self-rationalization, judges’ moralistic stereotype has the unintended effect of
enabling successful deceit. Because deceptive torment resides primarily in the
judge’s imagination, many lies are mistaken for truths. When torment is perceived,
it is often not a consequence of deception but of a speaker’s motivation to be be-
lieved. High-stakes rarely make people feel guilty about lying; more often, they
allow deceit to be easily rationalized. When motivation has an impact, it is on the
speaker’s fear of being disbelieved, and it matters little whether or not the highly
motivated are lying (pp. 231-232).’’
These are important points, in that they highlight the discrepancy between the
perspective of the liar and the lie-catcher: People fall prey to an error of reasoning
when assuming that liars are plagued by emotions. They fail to take into ac-
count the pragmatic nature of lies, as well as the liar’s ability to rationalize their
lie. Moreover, they may misinterpret the fear of a motivated innocent person as a
sign of deceit.
Beyond naïve moral reasoning about lies, is it psychologically sound to assume
that people experience stress and negative emotion about lying? Can we expect that
a criminal will experience guilt or shame about the actions he has committed, or
that a prospective terrorist is plagued by negative feelings about the actions he is
about to commit? They may, but given the double-standard hypothesis, we cannot
be certain that this is the case. Apart from guilt and shame, it could be argued that
liars may experience fear of not being able to convince. However, we must acknowl-
edge the important fact that truth tellers might also experience such fear. For ex-
ample, Ekman coined the term ‘‘Othello error’’ to describe how lie-catchers may mis-
interpret an innocent person’s fear of not being believed as a sign of deception
(Ekman, 2001). Moreover, people may react not only with fear but also anger in re-
sponse to suspicion. Indeed, one study found that truth tellers reacted with more
anger to suspicion than did liars (Hatz & Bourgeois, 2010). For an innocent person,
suspicion is obviously undeserved. An emotional reaction to such treatment fits with
a large body of social justice research suggesting that people have affective re-
sponses to violations of fairness (De Cremer & van den Bos, 2007; Mikula et al.,
1998).
Empirical support. In sum, the concern raised above is that equating arousal, fear
and stress with deception may rest on shaky theoretical grounds. If one rejects this
concern and insists that such processes accompany lying, there is yet another hurdle
to overcome. If people do experience affective processes, can they conceal them?
Given the attention to microexpressions in the media, one might assume that there
is an abundance of research published in peer-reviewed journals addressing this
question. However, this is not the case. Porter and ten Brinke (2008) noted that ‘‘to
[their] knowledge, no published empirical research has established the validity of
microexpressions, let alone their frequency during falsification of emotion’’ (p. 509).
They proceeded to conduct an analysis of people’s ability to a) fabricate expressions
of emotions they did not experience and b) conceal emotions that they did in fact
experience. Their results showed that people are not perfectly capable of fabricating
displays of emotions they do not experience: When people were asked to present a
facial expression different from the emotion they were experiencing, there were
some inconsistencies in these displays. However, the effect depended on the type of
emotion people were trying to portray. People performed better at creating con-
vincing displays of happiness compared to negative expressions. This is plausibly
due to people’s experience of creating false expressions of positive emotion in every-
day life. With regard to concealing an emotion people did in fact experience, they
performed better: There was no evidence of leakage of the felt emotion in these ex-
pressions. As for microexpressions, no complete microexpression (lasting 1/5th-1/25th
of a second) involving both the upper and lower half of the face was found in any
of the 697 facial expressions analyzed in the study. However, 14 partial micro-
expressions were found, 7 in the upper and 7 in the lower half of the face. Interest-
ingly, these partial microexpressions occurred both during false and genuine facial
expressions. That is, these expressions were not displayed only by those who were
falsifying or concealing emotions; true displays of emotion involved microexpressions to the
same extent. Porter and ten Brinke concluded that the ‘‘occurrence [of microexpres-
sions] in genuine expressions makes their usefulness in airline-security settings
questionable, given the implications of false-positive errors (i.e., potential human
rights violations). Certainly, current training that relies heavily on the identification
of full-face microexpressions may be misleading.’’ (p. 513).
Passive vs. active lie detection
If it is difficult, or even impossible to detect deception through analyses of leakage
of cues to affect, how can lie detection be accomplished? The research reviewed here
suggests that it is more fruitful to focus on the content of a person’s speech than
to observe their nonverbal behavior, since the latter provides little valid information
about deceit. The implication of this is that in order for lie judgments to be reason-
ably accurate, lie-catchers cannot simply observe targets. Instead, they should elicit
verbal responses from these targets, as verbal messages may be the carriers of cues
to deceit.
The proposition that lie-catchers ought to elicit verbal responses from targets fits
with an important paradigm shift in the literature on deception detection. In brief,
this paradigm shift involves moving from passive observation of behavior to the ac-
tive elicitation of cues to deception (Vrij, Granhag, & Porter, 2010). This shift in the
approach to lie detection is based on the now well-established finding that liars do
not automatically leak behavioral cues. However, that the behavioral traces of de-
ception are faint is not necessarily a universal fact: it may be possible to increase
the behavioral differences between liars and truth tellers by exploiting some of the
cognitive differences between the two. The approaches to elicit cues to deception are
thus anchored in a cognitive rather than emotional model of deception. This model
assumes that lying is a calculated, strategic enterprise that may demand cognitive
and self-regulatory resources: Liars have to suppress the truth and formulate an al-
ternative account that is sufficiently detailed to appear credible, while being mindful
of the risk of contradicting particular details or one’s own statement if one has to
repeat it later on. Liars may experience greater self-regulatory busyness than truth-
ful communicators, as a function of the efforts involved in deliberately creating a
truthful impression (DePaulo et al., 2003).
Departing from this theoretical framework, it is possible to identify several dif-
ferent approaches to elicit behavioral differences between liars and truth tellers.
First, if it is true that liars are operating under a heavier burden of cognitive load
than truth tellers, imposing further cognitive load should hamper liars more than
truth tellers. This hypothesis has been tested in several studies, in which cognitive
load was manipulated (for example, by asking targets to tell the story in reverse
order) and cues to deception were measured (e.g., Vrij et al., 2008; Vrij, Mann, Leal,
& Fisher, 2010). In support of the cognitive load framework, cues to deception were
more pronounced, and veracity judgments were more accurate in the increased cog-
nitive load conditions.
A related line of research has investigated whether it is possible to elicit cues to
deception by exploiting the strategies liars employ in order to convince. For exam-
ple, this research has attempted to elicit cues to deception by asking unanticipated
questions, based on the assumption that liars plan some, but not all of their re-
sponses (Vrij et al., 2009). In line with the predictions, liars and truth tellers did
not differ with regard to anticipated questions, but when unanticipated questions
were asked, cues to deception emerged. Moreover, liars’ verbal strategies of avoid-
ance can be exploited through strategic use of background information, which elicits
inconsistencies or contradictions between the target’s statement and the background
information (Hartwig et al., 2005; 2006). For an extensive discussion on approaches
to elicit cues to deception, see Vrij et al. (2010).
Summary and directions for future research
In summary, the research reviewed above suggests that lie detection based on ob-
servations of behavior is a difficult enterprise. Hundreds of studies show that people
obtain hit rates just slightly above the level of chance. This can be explained by the
scarcity of cues to deception, as well as the finding that people report relying on
behavioral cues that have little diagnostic value. A wave of research conducted dur-
ing the last decade suggests that lie judgments can be improved by the elicitation
of cues to deception through various methods of strategic interviewing. This wave
of research has been accompanied by a theoretical shift in the literature, moving
from an emotional model of deception towards a cognitive view of deception.
The SPOT program’s focus on passive observations of behavior and its emphasis
on emotional cues is thus largely out of sync with the developments in the scientific
field. The evidence that accurate judgments of credibility can be made on the basis
of such observations is simply weak. Of course, it must be acknowledged that engag-
ing travelers in verbal interaction (ranging from casual conversations to more or
less structured interviews) is more time-consuming and effortful than simply observ-
ing behaviors from some distance. Still, the literature on elicitation of cues to decep-
tion suggests that this approach is likely to be substantially more effective than pas-
sive observations of behavior.
Evaluation of the SPOT program. At the time this testimony is written, the DHS’s
report on the validation of the SPOT program has yet to be released. Therefore, I
cannot comment on the methodological merits of this validation study. However, as
requested, I will briefly outline some methodological processes that I would expect
a validation study to follow. First, it would be necessary to establish clear oper-
ational definitions of the target(s) of the program. What is the program supposed
to accomplish? In order to evaluate the outcomes of the program, such definitions
are crucial. Moreover, I would expect analyses of the outcomes of the SPOT pro-
gram using the framework of decision theory. That is, a validation study should
minimally provide information about the frequency of hits, false alarms, misses and
correct rejections (to do this, one must have an operational definition of what a hit
is). Those values should be compared to chance expectations based upon the
base rate of the defined target condition. Then the obtained outcomes should be com-
pared to a screening protocol that does not include the key elements of the SPOT
program. For example, the outcome of a comparable sample of airports employing
a random screening method may serve as an appropriate control group.
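The decision-theoretic bookkeeping described above can be sketched as follows. All counts and rates here are invented for illustration only; they are not SPOT data, and a real validation study would require an operational definition of a hit before any such numbers could be computed:

```python
def screening_outcomes(hits, misses, false_alarms, correct_rejections):
    """Summarize a screening program's outcomes in decision-theoretic terms."""
    total = hits + misses + false_alarms + correct_rejections
    targets = hits + misses                # travelers meeting the target definition
    flagged = hits + false_alarms          # travelers selected for further scrutiny
    return {
        "base_rate": targets / total,      # prevalence of the target condition
        "hit_rate": hits / targets,        # P(flagged | target)
        "false_alarm_rate": false_alarms / (false_alarms + correct_rejections),
        # Under purely random selection, the expected hit rate equals the
        # overall selection rate, so this serves as the chance benchmark.
        "chance_hit_rate": flagged / total,
    }

# Invented example: 10,000 travelers, 10 true targets, 1,000 selected for scrutiny.
results = screening_outcomes(hits=4, misses=6,
                             false_alarms=996, correct_rejections=8994)
```

A program adds value only to the extent that its hit rate exceeds this chance benchmark, and with very low base rates even a small false-alarm rate implies that the overwhelming majority of flagged travelers are innocent.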
In addition to analyzing the results using a decision theory framework, it would
be desirable to empirically examine the behavioral cues displayed by targets who
pose threats to security, and compare them to targets who do not. That is,
videotaped recordings of these targets (to the extent that they are available) should
be subjected to detailed coding to determine the behaviors that indicate
deception and/or hostile intentions as these travelers move through an airport. The
behaviors displayed by such targets should be compared to an appropriate control
group, for example, a random sample of innocent travelers. The purpose of such
analyses would be twofold: First, the results would empirically establish the behav-
ioral indicators of deception and malicious intent in the airport setting. Second, the
results could be compared to the SPOT criteria to establish whether there is an
overlap between the two sets of indicators.
Moreover, it would be useful to evaluate the criteria on which Behavior Detection
Officers rely to make judgments that a target is worthy of further scrutiny. That
is, analyses of the behaviors of targets selected for scrutiny could be subjected to
coding, to establish a) whether the officers rely on valid indicators of deception and
hostile intentions and b) whether they rely on the criteria set forth in the SPOT
training program. This would validate the SPOT program in a slightly different
manner, as it would assess to what extent the Behavior Detection Officers follow
the protocol of their training.
A problem with using field data is that important data will likely be missing. That
is, while databases may include information about hits and false alarms from trav-
elers who are subjected to further scrutiny, the data on misses and correct rejections
will likely be incomplete. For example, misses may not be detected for years, if ever.
For this reason it may be appropriate to subject the SPOT program to an experi-
mental test, in which the ground truth about the travelers’ status is known. The
field and experimental approaches are obviously not mutually exclusive: It is pos-
sible (and perhaps even preferable) to conduct both types of validation studies, as
the strength and weaknesses of each approach in terms of internal and external va-
lidity complement each other. A multi-methodological approach to validating the
SPOT program may also provide convergent validity. If a concern with the labora-
tory approach is that participants in an experimental study would not be sufficiently
motivated, it may be worth mentioning that it is possible to experimentally examine
the effect of motivation on targets’ behaviors within the context of a laboratory para-
digm. Some targets could be randomly assigned to receive a weaker incentive for
successfully passing through the screening, while others receive a stronger incen-
tive. Of course, it would not be possible to create a fully realistic incentive system
due to ethical considerations. Still, such a manipulation could provide some insight
into the role of motivation in targets’ behaviors, and to what extent motivation mod-
erates the display of relevant behavioral cues.
In closing, I will briefly note a few areas of relevance for the airport security
screening settings that I believe future research ought to focus on. First, most re-
search has examined truths and lies about past actions. In the airport setting,
truths and lies about future actions (intentions) may be of particular relevance. A
few recent studies have examined true and false statements about future actions
(Granhag & Knieps, in press; Vrij, Granhag, Mann, & Leal, in press; Vrij et al., in
press). The studies reveal some findings in line with the research on true and false
statements about past actions, for example in that false statements about intentions
are less plausible (Vrij et al., in press). However, there are also some differences
in these results. While research on statements about past actions shows that lies
are less detailed than truths, this finding has not been replicated for statements
about future actions. However, this body of work is still small, and further empirical
attention is needed. Second, and relatedly, it would be valuable to attempt to extend
the research findings on elicitation of cues to deception to airport settings. That is,
it would be useful to establish to what extent it is possible to increase cues to decep-
tion using cognitive models when the statements concern future actions. Such
knowledge could be translated into brief, standardized questioning protocols that
could be used to establish the veracity of travelers’ reports about both their past
actions and their intentions.
References
Anderson, D. E., DePaulo, B. M., Ansfield, M. E., Tickle, J. J., & Green, E.
(1999). Beliefs about cues to deception: Mindless stereotypes or untapped wis-
dom? Journal of Nonverbal Behavior, 23, 67-89.
Backbier, E., Hoogstraten, J., & Meerum Terwogt-Kouweenhove, K. (1997). Situ-
ational determinants of the acceptability of telling lies. Journal of Applied Social
Psychology, 27, 1048-1062.
Bond, C. F., Jr., & DePaulo, B. M. (2006). Accuracy of deception judgments. Per-
sonality and Social Psychology Review, 10, 214-234.
Bond, C. F., Jr., & DePaulo, B. M. (2008). Individual differences in judging de-
ception: Accuracy and bias. Psychological Bulletin, 134, 477-492.
De Cremer, D., & van den Bos, K. (2007). Justice and feelings: Toward a new
era in justice research. Social Justice Research, 20, 1-9.
DePaulo, B. M., Lindsay, J. J., Malone, B. E., Muhlenbruck, L., Charlton, K., &
Cooper, H. (2003). Cues to deception. Psychological Bulletin, 129, 74-118.
Ekman, P. (2001). Telling lies: Clues to deceit in the marketplace, politics and
marriage. New York: Norton.
Feeley, T. H., & Young, M. J. (2000). The effects of cognitive capacity on beliefs
about deceptive communication. Communication Quarterly, 48, 101-119.
Garrido, E., Masip, J., & Herrero, C. (2004). Police officers’ credibility judgments:
Accuracy and estimated ability. International Journal of Psychology, 39, 254-275.
The Global Deception Research Team (2006). A world of lies. Journal of Cross-
Cultural Psychology, 37, 60-74.
Government Accountability Office (2010). Aviation security. GAO-10-763.
Granhag, P. A., & Knieps, M. (in press). Episodic future thought: Illuminating
the trademarks of true and false intent. Applied Cognitive Psychology.
Granhag, P. A., & Strömwall, L. A. (2004). The detection of deception in forensic
contexts. New York, NY: Cambridge University Press.
Hartwig, M., Granhag, P. A., Strömwall, L. A., & Kronkvist, O. (2006). Strategic
use of evidence during police interviews: When training to detect deception
works. Law and Human Behavior, 30, 603-619.
Hartwig, M., Granhag, P. A., Strömwall, L. A., & Vrij, A. (2005). Deception de-
tection via strategic disclosure of evidence. Law and Human Behavior, 29, 469-
484.
Hatz, J. L., & Bourgeois, M. J. (2010). Anger as a cue to truthfulness. Journal
of Experimental Social Psychology, 46, 680-683.
Henig, R. M. (2006). Looking for the lie. New York Times, Feb 5.
Köhnken, G. (2004). Statement validity analysis and the ‘detection of the truth’.
In P.A. Granhag, & L.A. Strömwall (Eds.), The detection of deception in forensic
contexts (pp. 41-63). Cambridge: Cambridge University Press.
Mann, S., Vrij, A., & Bull, R. (2004). Detecting true lies: Police officers’ ability
to detect suspects’ lies. Journal of Applied Psychology, 89, 137-149.
Mikula, G., Scherer, K. R., & Athenstaedt, U. (1998). The role of injustice in the
elicitation of differential emotional reactions. Personality and Social Psychology
Bulletin, 24, 769-783.
Miller, G. R., & Stiff, J. B. (1993). Deceptive communication. Newbury Park:
Sage Publications.
Porter, S., & ten Brinke, L. (2008). Reading between the lies: Identifying con-
cealed and falsified emotions in universal facial expressions. Psychological
Science, 19, 508-514.
Porter, S., Woodworth, M., McCabe, S., & Peace, K. A. (2007). ‘‘Genius is 1% in-
spiration and 99% perspiration’’ … or is it? An investigation of the impact of moti-
vation and feedback on deception detection. Legal and Criminological Psychology,
12, 297-310.
Ross, L. D. (1977). The intuitive psychologist and his shortcomings: Distortions
in the attribution process. In L. Berkowitz (Ed.), Advances in experimental social
psychology (Vol. 10), pp. 174-221. New York: Academic Press.
Sporer, S. L., & Schwandt, B. (2006). Paraverbal indicators of deception: A meta-
analytic synthesis. Applied Cognitive Psychology, 20, 421-446.
Sporer, S. L., & Schwandt, B. (2007). Moderators of nonverbal indicators of de-
ception: A meta-analytic synthesis. Psychology, Public Policy, and Law, 13, 1-34.
Strömwall, L. A., Granhag, P. A., & Hartwig, M. (2004). Practitioners’ beliefs
about deception. In P. A. Granhag & L. A. Strömwall (Eds.), The detection of de-
ception in forensic contexts (pp. 229-250). New York, NY: Cambridge University
Press.
Vrij, A. (2008). Detecting lies and deceit: Pitfalls and opportunities (2nd ed.).
New York, NY: John Wiley & Sons.
Vrij, A., Granhag, P. A., Mann, S., & Leal, S. (in press). Lying about flying: The
first experiment to detect false intent. Psychology, Crime & Law.
Vrij, A., Granhag, P. A., & Porter, S. (2010). Pitfalls and opportunities in non-
verbal and verbal lie detection. Psychological Science in the Public Interest, 11,
89-121.
Vrij, A., Leal, S., Granhag, P. A., Fisher, R. P., Sperry, K., Hillman, J., & Mann,
S. (2009). Outsmarting the liars: The benefit of asking unanticipated questions.
Law and Human Behavior, 33, 159-166.
Vrij, A., Leal, S., Mann, S., & Granhag, P. A. (in press). A comparison between
lying about intentions and past activities: Verbal cues and detection accuracy.
Applied Cognitive Psychology.
Vrij, A., & Mann, S. (2001). Telling and detecting lies in a high-stake situation:
The case of a convicted murderer. Applied Cognitive Psychology, 15, 187-203.
Vrij, A., Mann, S., Leal, S., & Fisher, R. P. (2010). ‘‘Look into my eyes’’: Can an
instruction to maintain eye contact facilitate lie detection? Psychology, Crime &
Law, 16, 327-348.
Vrij, A., Mann, S., Fisher, R. P., Leal, S., Milne, R., & Bull, R. (2008). Increasing
cognitive load to facilitate lie detection: The benefit of recalling an event in re-
verse order. Law and Human Behavior, 32, 253-265.
Chairman BROUN. Thank you, Dr. Hartwig. If you want to add
some suggestions, we would be glad to enter those in the record
and entertain those suggestions that you may have. And hopefully,
we can get those from you.
Now, I would like to recognize our final witness and that is Dr.
Philip Rubin, Chief Executive Officer of Haskins Laboratories. Dr.
Rubin, you have five minutes for your oral testimony.
Deception Detection
People in the military, in law enforcement, and in the intelligence community
regularly deal with people who deceive them. These people may be working for
or sympathize with an adversary, they may have done something they are try-
ing to hide, or they may simply have their own personal reasons for not telling
the truth. But no matter the reasons, an important task for anyone gathering
information in these arenas is to be able to detect deception. In Iraq or Af-
ghanistan, for example, soldiers on the front line often must decide whether
a particular local person is telling the truth about a cache of explosives or an
impending attack. And since research has shown that most individuals detect
deception at a rate that is little better than random chance, it would be useful
to have a way to improve the odds. Because of this need, a number of devices
and methods have been developed that purport to detect deception. Two in par-
ticular were described at the workshop: voice stress technologies and the Pre-
liminary Credibility Assessment Screening System (PCASS).
Threatening Communications
In March 2011, the NRC released a small collection of papers on the subject of
threatening communications and behavior. In my introduction (along with Barbara
A. Wanchisen) to the volume, we say:
‘‘Today’s world of rapid social, technological, and behavioral change provides
new opportunities for communications with few limitations of time or space.
The ease by which communications can be made without personal proximity
has dramatically affected the volume, types, and topics of communications be-
tween individuals and groups. Through these communications, people leave be-
hind an ever-growing collection of traces of their daily activities, including dig-
ital footprints provided by text, voice, and other modes of communication.
Many personal communications now take place in public forums, and social
groups form between individuals who previously might have acted in isolation.
Ideas are shared and behaviors encouraged, including threatening or violent
ideas and behaviors. Meanwhile, new techniques for aggregating and evalu-
ating diverse and multimodal information sources are available to security
services that must reliably identify communications indicating a high likeli-
hood of future violence.’’
The papers reviewed the behavioral and social sciences research on the likelihood
that someone who engages in abnormal and/or threatening communications would
actually then try to do harm. They focused on ‘‘how scientific knowledge can inform
and advance future research on threat assessments, in part by considering the ap-
proaches and techniques used to analyze communications and behavior in the dy-
namic context of today’s world. Authors were asked to present and assess scientific
research on the correlation between communication-relevant factors and the likeli-
hood that an individual who poses a threat will act on it. The authors were encour-
aged to consider not only communications containing direct threats, but also odd
and inappropriate communications that could display evidence of fixation, obsession,
grandiosity, entitled reciprocity, and mental illness.’’
‘‘The papers in this collection were written within the context of protecting high-
profile public figures from potential attack or harm. The research, however, is
broadly applicable to U.S. national security including potential applications for anal-
ysis of communications from leaders of hostile nations and public threats from ter-
rorist groups. This work highlights the complex psychology of threatening commu-
nications and behavior, and it offers knowledge and perspectives from multiple do-
mains that can contribute to a deeper understanding of the value of communications
in predicting and preventing violent behaviors.’’
This volume focused on communication, forensic psychology, and the analysis of
language-based datasets (corpora) to help identify and understand threatening com-
munications and responses to them through text analysis. It serves as an example
of the kind of synthesis of current knowledge that is useful for generating ideas for
potential new research directions (Chung & Pennebaker, 2011; Meloy, 2011; O’Hair
et al., 2011).
REFERENCES
Aviezer, Hillel, Hassin, Ran R., Ryan, Jennifer, Grady, Cheryl, Susskind, Josh,
Anderson, Adam, Moscovitch, Morris, and Bentin, Shlomo. (2008). Angry, dis-
gusted or afraid? Studies on the malleability of emotion perception. Psycho-
logical Science, Vol. 19, No. 7, 724-732.
Barrett, Lisa Feldman. (2006). Are emotions natural kinds? Perspectives on
Psychological Science, Vol. 1, #1, 28-58.
Barrett, Lisa Feldman, Lindquist, Kristen A., and Gendron, Maria. (2007).
Language as context for the perception of emotion. TRENDS in Cognitive
Sciences, Vol. 11, No. 8, 327-332.
Bhatt, S., and Brandon, S. E. (2009). Review of voice stress-based technologies
for the detection of deception. Unpublished manuscript, Washington, DC.
Chung, Cindy K. and Pennebaker, James W. (2011). Using computerized
text analysis to assess threatening communications and behavior. In National
Research Council, Threatening Communications and Behavior: Perspectives on
the Pursuit of Public Figures. National Academies Press, Washington, DC, 3-
32.
Damphousse, Kelly R. (2011). Voice stress analysis: Only 15 percent of lies
about drug use detected in field test. National Institute of Justice (NIJ) Journal,
259, 8-12.
Ekman, Paul. (1972). Universals and Cultural Differences in Facial Expres-
sions of Emotions. In J. Cole (ed.), Nebraska Symposium on Motivation, 1971,
University of Nebraska Press, Lincoln, Nebraska, 1972, 207-283.
Ekman, P. and Friesen, W. (1978). Facial Action Coding System: A Technique
for the Measurement of Facial Movement. Consulting Psychologists Press, Palo
Alto.
Ekman, Paul. (2009). Lie catching and micro expressions. In Clancy Martin
(ed.), The Philosophy of Deception. Oxford University Press.
Ekman, Paul and O’Sullivan, Maureen. (1991). Who can catch a liar? American
Psychologist, 46(9), Sep. 1991, 913-920.
Ekman, Paul, O’Sullivan, Maureen, and Frank, Mark G. (1999). A few can
catch a liar. Psychological Science, 10(3), May 1999, 263-266.
Farah, Martha J. (ed.). (2010). Neuroethics: An introduction with readings. The
MIT Press, Cambridge, MA.
Gazzaniga, Michael S. (2011). Neuroscience in the courtroom. Scientific Amer-
ican, April 2011, 54-59.
Gordon, J. B., Levine, R. J., Mazure, C. M., Rubin, P. E., Schaller, B. R., and
Young, J. L. (in press). Social contexts influence ethical considerations of re-
search. American Journal of Bioethics, 2011.
Hartwig, Maria, Granhag, Pär Anders, Strömwall, Leif A., and Kronkvist, Ola.
(2006). Strategic use of evidence during police interviews: When training to de-
tect deception works. Law and Human Behavior, 30(5), 603-619.
Heuer, Richards J., Jr. (1999). Psychology of intelligence analysis. Center for
the Study of Intelligence, Central Intelligence Agency, Washington, DC.
Intelligence Science Board. (2006). Educing Information: Interrogation: Science
and Art. The National Defense Intelligence College.
Kahneman, D. and Tversky, A. (1972). Subjective probability: A judgment of
representativeness. Cognitive Psychology, 3, 430-454.
Loftus, Elizabeth F. (1996). Eyewitness Testimony. Harvard University Press,
Cambridge, MA.
Mayew, William J. and Venkatachalam, Mohan. (in press). The power of voice:
Managerial affective states and future firm performance. Journal of Finance,
forthcoming.
Meloy, J. Reid. (2011). Approaching and attacking public figures: A contem-
porary analysis of communications and behavior. In National Research Coun-
cil, Threatening Communications and Behavior: Perspectives on the Pursuit of
Public Figures. National Academies Press, Washington, DC, 75-101.
Moreno, Jonathan D. (2006). Mind Wars: Brain Research and National De-
fense. The Dana Foundation, New York and Washington, DC.
O’Hair, H. Dan, Bernard, Daniel Rex, and Roper, Randy R. (2011). Commu-
nications-based research related to threats and ensuing behavior. In National
Research Council, Threatening Communications and Behavior: Perspectives on
the Pursuit of Public Figures. National Academies Press, Washington, DC, 33-
74.
National Research Council. (2003). The Polygraph and Lie Detection. Com-
mittee to Review the Scientific Evidence on the Polygraph. Board on Behav-
ioral, Cognitive, and Sensory Sciences and Committee on National Statistics,
Division of Behavioral and Social Sciences and Education. National Academies
Press, Washington, DC.
National Research Council. (2008). Behavioral Modeling and Simulation: From
Individuals to Societies. Committee on Organizational Modeling: From Individ-
uals to Societies. Board on Behavioral, Cognitive, and Sensory Sciences, Divi-
sion of Behavioral and Social Sciences and Education. National Academies
Press, Washington, DC.
National Research Council. (2008). Emerging Cognitive Neuroscience and Re-
lated Technologies. Committee on Military and Intelligence Methodology for
Emergent Neurophysiological and Cognitive/Neural Science Research in the
Next Two Decades. Standing Committee for Technology Insight - Gauge,
Evaluate, and Review, Division on Engineering and Physical Sciences; Board
on Behavioral, Cognitive, and Sensory Sciences, Division of Behavioral and So-
cial Sciences and Education. National Academies Press, Washington, DC.
National Research Council. (2008). Human Behavior in Military Contexts.
Committee on Opportunities in Basic Research in the Behavioral and
Social Sciences for the U.S. Military. Board on Behavioral, Cognitive, and
Sensory Sciences, Division of Behavioral and Social Sciences and Education.
National Academies Press, Washington, DC.
National Research Council. (2008). Protecting Individual Privacy in the Strug-
gle Against Terrorists. Committee on Technical and Privacy Dimensions
of Information for Terrorism Prevention and Other National Goals; Committee
on Law and Justice (DBASSE); Committee on National Statistics (DBASSE);
Computer Science and Telecommunications Board (DEPS). National Academies
Press, Washington, DC.
National Research Council. (2010). Field Evaluation in the Intelligence and
Counterintelligence Context. Workshop Summary. Planning Committee on
Field Evaluation of Behavioral and Cognitive Sciences-Based Methods and
Tools for Intelligence and Counterintelligence. Board on Behavioral, Cognitive,
and Sensory Sciences, Division of Behavioral and Social Sciences and Edu-
cation. National Academies Press, Washington, DC.
National Research Council. (2011). Intelligence Analysis: Behavioral and Social
Scientific Foundations. Committee on Behavioral and Social Science Research
to Improve Intelligence Analysis for National Security. Board on Behavioral,
Cognitive, and Sensory Sciences, Division of Behavioral and Social Sciences
and Education. National Academies Press, Washington, DC.
National Research Council. (2011). Intelligence Analysis for Tomorrow: Ad-
vances from the Behavioral and Social Sciences. Committee on Behavioral and
Social Science Research to Improve Intelligence Analysis for National Security.
Board on Behavioral, Cognitive, and Sensory Sciences, Division of Behavioral
and Social Sciences and Education. National Academies Press, Washington,
DC.
National Research Council. (2011). Threatening Communications and Behav-
ior: Perspectives on the Pursuit of Public Figures. Board on Behavioral, Cog-
nitive, and Sensory Sciences, Division of Behavioral and Social Sciences and
Education. National Academies Press, Washington, DC.
National Science and Technology Council, Subcommittee on Social, Behavioral
and Economic Sciences. Executive Office of the President of the United States.
(2009). Social, Behavioral and Economic Research in the Federal Context. Jan-
uary 2009.
Nisbett, Richard E. (2003). The Geography of Thought: How Asians and West-
erners Think Differently... And Why. Free Press.
Pohl, Rüdiger F. (2004). Cognitive Illusions: A Handbook on Fallacies and Bi-
ases in Thinking, Judgement and Memory, Psychology Press, Hove, UK, 215-
234.
Rubin, P. (2003). ‘‘Introduction.’’ In S. L. Cutter, D. B. Richardson, & T. J.
Wilbanks (Eds.), The Geographical Dimensions of Terrorism. Routledge, New
York.
Rubin, P. and Wanchisen, B. (2011). ‘‘Introduction.’’ In National Research
Council, Threatening Communications and Behavior: Perspectives on the Pur-
suit of Public Figures. National Academies Press, Washington, DC.
Russell, James A., Bachorowski, Jo-Anne, and Fernández-Dols, José-Miguel.
(2003). Facial and vocal expressions of emotion. Annual Review of Psychology,
54, 329-349.
Thompson, Suzanne C. (1999). Illusions of control: How we overestimate our
personal influence. Current Directions in Psychological Science, 8(6), 187-190.
United States Department of Health and Human Services (HHS). (2009). Code
of Federal Regulations. Human Subjects Research (45 CFR 46). (See: http://
www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.html)
United States Government Accountability Office (GAO). (2010). Aviation Secu-
rity: Efforts to Validate TSA’s Passenger Screening Behavior Detection Pro-
gram Underway, but Opportunities Exist to Strengthen Validation and Ad-
dress Operational Challenges. GAO-10-763, May 2010, Washington, DC.
Wallach, Wendell and Allen, Colin. (2010). Moral Machines: Teaching robots
right from wrong. Oxford University Press, New York.
Weinberger, Sharon. (2010). Airport security: Intent to deceive? Nature, 465,
412-415.
Wells, Gary L. & Quinlivan, Deah S. (2009). Suggestive Eyewitness Identifica-
tion Procedures and the Supreme Court’s Reliability Test in Light of Eye-
witness Science: 30 Years Later. Law & Human Behavior, 33, 1-24.
Widen, S. C., Christy, A. M., Hewett, K., and Russell, J. A. (in press). Do pro-
posed facial expressions of contempt, shame, embarrassment, and compassion
communicate the predicted emotion? Cognition & Emotion, in press, 1-9.
Chairman BROUN. Thank you, Dr. Rubin. And I want to express
my appreciation for your being here. I know you have had some re-
cent challenges and I greatly appreciate you being here in spite of
those. So thank you very much.
Dr. RUBIN. Thank you.
Chairman BROUN. And I want to thank all the panel for your tes-
timony. Reminding Members that the Committee rules limit ques-
tioning to five minutes. The Chair at this point will open the round
of questions and the Chair recognizes himself for five minutes.
Mr. Willis, when can we expect the SPOT validation report?
Mr. WILLIS. The report was delivered to me by AIR last night.
It is being submitted through DHS’s review and release distribu-
tion process. I am not exactly sure what that time is or when it
is ultimately disseminated. I can certainly get that information for
you, sir.
Chairman BROUN. I would appreciate getting that report to us as
quickly as possible.
Mr. WILLIS. Yes, sir.
Chairman BROUN. What additional steps have to be taken before
we get the report?
Mr. WILLIS. I don’t know what DHS’s distribution process en-
tails. I know that I will be submitting it this morning following my
participation here.
Chairman BROUN. Do you have any problems in releasing the
preliminary results?
Mr. WILLIS. I don’t know what DHS’s policy is on that, but I am
happy to provide whatever is consistent with DHS’s S&T’s policy
on release.
Chairman BROUN. I understand that the results, I assume, are
still preliminary. There appears to be a discrepancy in the SPOT’s
success rate. In your testimony you state ‘‘the study did indicate
that a high-risk traveler is nine times more likely to be identified
using Operational SPOT versus random screening.’’ Yet when you
met with the staff from the I&O Subcommittee on March 3 you
said that the SPOT program was 50 times more effective than ran-
dom screening. One of our other witnesses, Dr. Ekman, also makes
a similar claim in his testimony saying ‘‘malfeasance, felons, smug-
Mr. WILLIS. The reason we use those metrics that we had just
listed, sir, was because they were available to us through the data
in sufficient numbers to analyze, even though they themselves are
low base rate or extremely rare. And data directly dealing with ter-
rorism is unavailable and, thus, can’t be used as a metric.
Chairman BROUN. Okay. My time is up. Ms. Edwards.
Ms. EDWARDS. Thank you, Mr. Chairman. And as I mentioned
earlier, I am disappointed that TSA isn’t here because I think that
there are a number of questions that actually go to things like
training protocols and other aspects of the SPOT program that they
would have, you know, really useful information to share and so I
look forward to working with the Chairman and the Committee.
This question about who needs to appear or not is not a decision,
really, for the Administration. Congress determines, under its Con-
stitutional authority, who appears before the Committees and what
the jurisdiction is. So I do share that concern.
I want to go to this question, though, of profiling——
Chairman BROUN. Does the gentlelady yield?
Ms. EDWARDS. Yes.
Chairman BROUN. I appreciate your comment. You took up about
almost a minute with that and I would like to give you an extra
minute on top of that, so I don’t want to charge you that time.
Ms. EDWARDS. I appreciate that, Mr. Chairman.
Chairman BROUN. So I will give you the extra minute. So if you
all would start her clock again, please.
Ms. EDWARDS. Thank you. Thank you again, Mr. Chairman. I
have a question, really, that goes to this issue of profiling. I mean,
as an African American woman who sometimes, because I have
short hair and I get cold, I wear a scarf on my head and that is
true in the airports especially. I have had the experience of actu-
ally being pulled over, questioned, and it hasn’t just happened once
or twice. It has actually happened multiple times. And, you know,
I don’t want to make any speculation about that, but it does raise
the question of who is identifying me and how and what I am send-
ing off.
I am also reminded in Dr. Hartwig’s testimony that, you know,
I remember when I broke a lamp and I tried to glue it together and
my mother walked in and she said what did you do? And I suspect
that part of the reason that she could say that and she knew—and
then I proceeded to tell her a lie, but I suspect that part of the rea-
son that she knew I was lying is because she knew me and because
she had had experience with me and because she had read my both
verbal and nonverbal cues many times over, which gave her a
much better indication of when I was doing truth-telling and when
I wasn’t.
We don’t have that experience in our airports, and so I have a
question for Lieutenant DiDomenica, and that is whether it is pos-
sible to train officers of all kinds not to engage in profiling? And
I have done police training, law enforcement training as well, and
I think it is tough to train out culture, culture in the sense of a
police culture and a law enforcement culture where you have to
train against type when it comes to these issues. And so I am curi-
ous, Lieutenant DiDomenica, if you can share with us whether it
is possible to train officers not to engage in profiling?
who are repeatedly showing racial profiling. And you either reedu-
cate or you reassign them to a different job.
Ms. EDWARDS. Thank you, Dr. Ekman, and thanks for your in-
dulgence, Mr. Chairman.
Chairman BROUN. You know, we will always be friends and I will
always give you some variances on the time so I am not going to
be worried about that at all.
Dr. Benishek, you are up next for your questions. Go ahead, sir.
Mr. BENISHEK. Thank you, Mr. Chairman. Thanks to the panel,
as well, for being here.
It is our job here to try to spend the money of the taxpayer the
most efficacious way and listening to the testimony here, it is real-
ly difficult for me to determine whether this SPOT process is accu-
rate or not. But I would like to address Mr. DiDomenica about the
process a little bit more. From your comments today it seems as
if there is some doubt, I mean, even after the BDO sees some kind
of behavior, then what is the process after that? If there is someone
there, it sounds as if you have some doubt as to the next step as
to what is happening, the next screening step. Are those people not
trained in the same thing? I mean I would hate to see somebody
get missed. So I would like to know more about the exact process
from the moment that the person gets taken out of the queue. Is
that effective? Is it—are we doing any good? Are we missing peo-
ple? I mean, this is the kind of thing I think you brought up in
your testimony.
Mr. DIDOMENICA. I think it is effective and I also think we are
missing people, but I think that could be improved. The process ac-
tually starts with an observation that may indicate a person that
is high-risk, that maybe should not get on that airplane or get onto
that train or into that government building, whatever the critical
infrastructure is. And based on the evaluation, this SPOT scoring,
which I really can’t go into because that is, you know, that is sen-
sitive information.
But there are two levels, and one is more screening, and one is
a law enforcement response. So for the people deemed to be the
most high-risk, the protocol is to invite or call a law enforcement
officer to do a follow-up interview. Now, this follow-up interview is
the opportunity to address the false positives, because a lot of peo-
ple that exhibit the behaviors that may indicate possible terrorist
intent or criminal intent are just people that are upset or dis-
tracted or late for work or going to a funeral, whatever it is, that
maybe a lot of people just get on the radar. And this interview,
which really only takes a couple of minutes to do, is the oppor-
tunity to resolve that so you are not creating false positives. And
it is also an opportunity to determine if you have got the real
thing, that this person is high-risk. And so that is another skill. I
mean that is the interview skill, which is another part of this proc-
ess. So there are——
Mr. BENISHEK. Are those people skilled enough in your opinion?
Mr. DIDOMENICA. When you say ‘‘those people’’——
Mr. BENISHEK. The people—the secondary person. Are there
enough of those people?
Mr. DIDOMENICA. I think the responsibility ultimately falls on
police officers when there is a high-risk person. I think they are ca-
pable. Every day they are making decisions around this country
whether to arrest somebody, not to arrest somebody, use lethal
force in some cases, deny people their freedoms, and so I don’t
think it is too much to ask them to make a decision, is this person
a high-risk person and do we need to slow down the process to fig-
ure out what is going on? I think they are capable of doing it. We
are doing it—whether this program gets funded or not, cops are
making these decisions every day. But I would like to see them get
more training and more support to make them better at what they
do. And this program has that potential.
Mr. BENISHEK. All right. Thank you. I don’t know where we are
at with the time, but I will yield back the remainder of my time,
if any.
Chairman BROUN. Thank you, Doctor. I just want to say your
questioning just shows further why TSA should be here so that we
could answer those questions, because if they were, then you could
direct it to the TSA individuals and it would be very instructive to
the whole Committee, Democrats and Republicans alike, and help
us to go forward.
The next person on the agenda is my friend, Mr. McNerney. You
are recognized for five minutes.
Mr. MCNERNEY. Thank you. And I appreciate you calling this
hearing. It is interesting. I have watched ‘‘Lie to Me’’ on occasion
and I find it is compelling but not too scientific in my opinion. But
it is good for us to examine this issue and see how much utility
there can be from it and how much money should be expended to
find that utility.
Dr. Hartwig, I think I heard you say—and you can correct me
if I am wrong—that you fail to see how knowledge of the indicators
could be useful.
Dr. HARTWIG. I think that is, again, an empirical question. There
isn’t enough research on—well, there is a lot of research on de-
meanor cues, but as far as I know, there is no study that tests
whether knowledge about, for example, micro-expressions helps peo-
ple not display them. But that would be a second step. It would be
a good first step to establish that these expressions occur reliably.
Mr. MCNERNEY. Okay, and I was——
Dr. HARTWIG. So countermeasures come second.
Mr. MCNERNEY. Okay. Thank you, Dr. Hartwig. And I was going
to follow up with you, Dr. Ekman, to basically say would you agree
that knowledge of those indicators would also be useful to potential
wrongdoers?
Dr. EKMAN. We don’t know. I mean you are basically asking the
question in polygraph terms is could you develop countermeasures?
Mr. MCNERNEY. Right. Right.
Dr. EKMAN. A proposal I put in to the government to find out—
I mean I have reason to believe that the Chinese know the answer
because they were sending me questions that you would want to
prepare on if you were going to do a training study to see whether
you could inhibit people from showing not just micro-expressions
but there are dozens of items on that checklist. The—our govern-
ment has not decided that it is worth finding out whether you can
beat the system. Other governments are finding out and may be se-
lecting people who can and training them so they can. We just
that, the behavior leading to arrest. How many of those were ar-
rested?
Mr. WILLIS. Of the 71,000?
Mrs. ADAMS. Yes.
Mr. WILLIS. That is the random selection method.
Mrs. ADAMS. Correct.
Mr. WILLIS. 71,000 were referred in the random selection. Nine
arrests were made.
Mrs. ADAMS. Nine?
Mr. WILLIS. Yes.
Mrs. ADAMS. And in the other method?
Mr. WILLIS. Using SPOT, 23,000 and a little bit were referred and
151 were arrested.
Mrs. ADAMS. And the types of arrests?
Mr. WILLIS. I don’t have the nature of the arrests in the data
that we looked at, ma’am.
Mrs. ADAMS. So it could have been belligerency or any other
thing for that matter?
Mr. WILLIS. Some of them were for prohibited items that were
on them at the time. Others could have been through outstanding
warrants or something of that nature, ma’am.
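[Taken at face value, the referral and arrest counts quoted in this exchange imply very different arrest rates per referral for the two methods. The following is only a back-of-envelope sketch using the round numbers from the transcript (the SPOT referral count is approximate, quoted as "23,000 and a little bit"), but it shows how a per-referral comparison can produce the roughly 50-to-1 ratio mentioned earlier in the hearing:

```python
# Back-of-envelope arrest rates per referral, using only the round
# figures quoted in the exchange above. The SPOT referral count is
# approximate ("23,000 and a little bit").
random_referrals, random_arrests = 71_000, 9
spot_referrals, spot_arrests = 23_000, 151

random_rate = random_arrests / random_referrals  # arrests per random referral
spot_rate = spot_arrests / spot_referrals        # arrests per SPOT referral

ratio = spot_rate / random_rate  # roughly 52 with these round numbers
print(f"random: {random_rate:.4%}, SPOT: {spot_rate:.4%}, ratio: {ratio:.0f}x")
```

The nine-times figure in Mr. Willis's written testimony presumably rests on a different denominator, which the transcript does not specify.]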
Mrs. ADAMS. Do you think that I have an appearance or would
I be a target for SPOT? I mean every time I go through the airport
I get pulled aside and searched. And the reason I ask that is be-
cause, you know, being a past law enforcement officer and trained,
I have some concerns about the way you are identifying pulling
people aside. Dr. Hartwig, you said you wanted—you thought the
program would work if more tools were available. Would it be bet-
ter to use a validated system as opposed to one that is untested
and invalidated?
Dr. HARTWIG. Well, first of all, I didn't say that the
program would work. I was talking about where I think more em-
phasis should be spent or put.
Mrs. ADAMS. So even with the more emphasis do you believe that
it would work?
Dr. HARTWIG. I don’t know. I think we would need a properly
conducted study to find that out. And I think it would be important
to go beyond examining the arrest rates and to look at what are
the actual behaviors that are displayed by these people who are ar-
rested and to compare those behaviors with those that are in the
list of cues. I don't know what those cues are because it is not
available. And to look at are the SPOT criteria actual indicators.
So I think that—it is definitely—we need to know whether it works
or not.
Mrs. ADAMS. Mr. DiDomenica, you are a law enforcement officer.
I am a past law enforcement officer. Do you believe that the TSA
employees have enough training and the skills sets based on the
training they are receiving to—you know, to provide this type of
screening at this level?
Mr. DIDOMENICA. I think with a proper follow up by trained law
enforcement that they do. But if we don’t have the proper follow
up by the police officers to figure out what is going on because this
is just like an alarm. It is like going through the magnetometer
and beeps. Well, what does that mean? So someone comes over and
pats you down. Well, the cops are like the pat-downs. All right.
Why did this beep? And so if you have that level of follow up by
trained law enforcement, I am comfortable with the training they
receive. But without that level of follow up, I am not comfortable.
Mrs. ADAMS. So would it be your opinion that there needs to be
more training?
Mr. DIDOMENICA. Yes.
Mrs. ADAMS. I yield back.
Chairman BROUN. Thank you, Ms. Adams. Mr. Willis, I have got
another question for you. Does TSA plan to use R and D to improve
the SPOT program or does it believe the program cannot be im-
proved upon?
Mr. WILLIS. We do have some ongoing research with them and
if I may say this is one of the beginning research elements that we
have with TSA, sir, and in fact it was started in 2007 prior to
GAO’s interests. Its focus is specific, not to evaluate absolutely ev-
erything going on with SPOT. That is a huge task, which we are not
tasked or resourced to do. This is looking at the indicators,
the checklist itself, the existing checklist.
The first question that needs to be asked from a scientific per-
spective is does the checklist as it is currently put together and as
it is currently deployed accomplish its mission. You would like to
be able to compare that against random and against something else
that has been shown to be out there and valid, but the fact is that
there isn’t another behavioral-based screening out there employed
by any other group that we are aware of, either in the United
States or abroad, that has been statistically validated. And so we
have not been able to address that. So we compared this against
random, which is the first scientific basis.
Chairman BROUN. So TSA is doing research?
Mr. WILLIS. We are doing research that supports TSA.
Chairman BROUN. Ms. Edwards, do you have another question?
Ms. EDWARDS. I do, thank you, Mr. Chairman. I just want to fol-
low up with you, Mr. Willis, because I am confused. My under-
standing is that you shared with our staff that there is a pool video
available of suicide bombers and the like that could be used to
study. And I mean I would expect that if TSA were operating the
right kind of way that would also be used for training. And so I
am a little confused by your answer and I just want to be clear.
Do we have video both from ourselves and perhaps from our inter-
national partners that we could use to assess the techniques that
have been developed and the questions that—the assessment ques-
tions that have been developed so that we can make sure that we
have a program that is working as effectively as we know it can
work?
Mr. WILLIS. We don’t presently have a sufficient number of vid-
eos to conduct scientific analysis on. S&T is attempting to work
with our partners in the United States and internationally to gath-
er these, but being a resource organization, we do not have the
ability to compel operational organizations, much less international
ones to provide us with that video. What we are doing is attempt-
ing to continue to collect that at—the best we can, as well as to
conduct other kinds of supporting things such as interviews of di-
rect eyewitnesses to suicide bombings, international subject matter
Mrs. ADAMS. Thank you, Mr. Chair. The program, Mr. Willis, has
been ongoing since 2007? Is that what I heard?
Mr. WILLIS. The validation research study has been ongoing
since 2007.
Mrs. ADAMS. A validation research study since 2007. And I heard
you say there was no system out there that you could use that was
validated or available, is that correct?
Mr. WILLIS. We are unaware of any behavioral-based screening
program that is used that has been rigorously validated, yes.
Mrs. ADAMS. What about Israel’s program?
Mr. WILLIS. We have not located any study that rigorously tests
that.
Mrs. ADAMS. Did they study it?
Mr. WILLIS. We are not provided any information——
Mrs. ADAMS. Did you ask?
Mr. WILLIS. Yes.
Mrs. ADAMS. And they have said they would not provide it?
Mr. WILLIS. We have not been—they didn’t say they wouldn’t
provide it.
Mrs. ADAMS. Okay. So it is maybe the way you were—you asked
for it maybe? I am trying to determine, since ’07 you have been
doing a study. We don’t have anything validated. You can’t give us
a cost/benefit analysis. We are four years out and when you say
there are no other programs out there, there are some out there, I
believe. Mr. DiDomenica, are there programs out there?
Mr. DIDOMENICA. There are similar programs—excuse me. There
are similar programs for behavior assessment, principally for law
enforcement. I mean I have been teaching BASS. There is a DHS
program called—it is approved by DHS—called Patriot. I have another
training course called HIDE, Hostile Intent Detection Evaluation.
But these programs are given, it may be a few days of training,
and then people go off and do their thing. There is no follow up,
in other words, how successful it is. I mean people, I think, are get-
ting good ideas, they are getting good techniques, but it is not done
in a way where it can be measured and followed up on, and I think
that needs to be done.
Mrs. ADAMS. And these programs are all from DHS also?
Mr. DIDOMENICA. There is one that is approved. In other words,
it is approved for funding. And—but they are not DHS programs.
Mrs. ADAMS. Okay. So they are funded but they are trying to
then—they are kind of sent out and there is no true follow up. Is
that what you are saying?
Mr. DIDOMENICA. Yeah, there is no collection of data about suc-
cess or failures or effectiveness. It is like a lot of law enforcement
training, and you are probably aware of this, that you go in for a
class, you sit there for a week, you get a certificate, and you walk
out the door and that is the end of it. So I think, unfortunately,
that just falls in line with a lot of the training that is done. And
I think for this program, it is—you know, what is at—for what is
at stake, we need to be better at how we follow up on this.
Mrs. ADAMS. I know in my certificate we had to go back for train-
ing every so often or else we lost our certificate. So I can relate to
having to keep your training and your skills honed. I appreciate
that. No more questions, Mr. Chair.
ANSWERS TO POST-HEARING QUESTIONS
Responses by Mr. Stephen Lord, Director, Homeland Security and Justice Issues,
Government Accountability Office
Responses by Mr. Larry Willis, Program Manager, Homeland Security Advanced
Research Projects Agency, Science and Technology Directorate,
Department of Homeland Security
References
Ekman, P. & O’Sullivan, M. (1991) Who can catch a liar? American Psychologist,
46(9), 913-920.
Frank, M.G., & Ekman, P. (1997) The ability to detect deceit generalizes across
different types of high-stake lies. Journal of Personality and Social Psychology,
72, 1429-1439.
Frank, M. G., Feeley, T. H., Paolantonio, N., & Servoss, T. J. (2004). Individual and
Small Group Accuracy in Judging Truthful and Deceptive Communication.
Group Decision and Negotiation, 13(1), 45-59.
O’Sullivan, M., Frank, M. G., Hurley, C. M., & Tiwana, J. (2009). Police lie detec-
tion accuracy: The effect of lie scenario. Law and Human Behavior, 33(6),
542-543.
Warren, G., Schertler, E., & Bull, P. (2008). Detecting Deception from Emotional
and Unemotional Cues. Journal of Nonverbal Behavior, 33, 59-69.
Responses by Dr. Maria Hartwig, Associate Professor, Department of Psychology,
John Jay College of Criminal Justice
MATERIAL SUBMITTED BY MR. STEPHEN LORD, DIRECTOR, HOMELAND SECURITY AND
JUSTICE ISSUES, GOVERNMENT ACCOUNTABILITY OFFICE