
A Study on Evolution of Data Mining Techniques Post 9/11

A Study on Evolution of Data Mining Techniques in FBI Post 9/11


Avinandita Sarkar, MBA
Neha Harit, MBA

Abstract

This paper introduces the ways data mining techniques are used to counter terrorism. It concentrates on the effects of the 9/11 attack on the data mining techniques used by the FBI, throwing light on the measures and initiatives taken up by the Federal Bureau of Investigation to counter terrorism. The paper concludes with a detailed description of how the FBI has incorporated and transitioned to the IDW system.

After giving an introduction to data mining techniques for counter-terrorism, this paper throws some light on the transition in the FBI after the 9/11 attack in the U.S. Post 9/11, certain actions were taken and new programs were introduced. This paper also throws some light on the Investigative Data Warehouse architecture deployed by the FBI, discussing the details of the datasets acquired along with the structure of the program followed from then to now.

Introduction

Data, and the information gathered from that data, are an extremely valuable organizational resource. As defined by Thuraisingham, data mining is the process of posing queries and extracting useful patterns or trends, often previously unknown, from large amounts of data using various techniques such as those from pattern recognition and machine learning. There have been several developments over time in the use of data mining techniques for counter-terrorism applications. This paper provides an overview of the use of data mining for counter-terrorism. It also discusses data mining solutions that attempt to detect or prevent terrorism while maintaining some level of privacy, throwing light on both non real-time threats and real-time threats, and examines how data mining in general and web data mining in particular could handle such threats. Along with this, an introduction to link analysis discusses how it is useful for detecting abnormal patterns.

Survey of Literature

Data mining is the process of extracting patterns from data. Data mining has become an increasingly important tool to transform data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, scientific discovery, fraud detection and combating terrorism. Jesús Mena [1] wrote the first book to outline how data mining technologies can be used to combat crime in the 21st century. It introduces security managers, law enforcement investigators, counter-intelligence agents, fraud specialists, and information security analysts to the latest data mining techniques and shows how they can be used as investigative tools. It uses clear, understandable language for novice readers and provides instructions on how to search public and private databases and networks to flag potential security threats and root out criminal activities even before they occur. Hsinchun Chen [2] in his book gives a detailed analysis of advanced techniques and

methodologies which are used to acquire, integrate, process, analyze, and manage the diversity of terrorism-related information for international and homeland security-related applications. It also discusses social and technical areas of terrorism research, including social, privacy, data confidentiality, and legal challenges. Bhavani M. Thuraisingham’s [3] book describes how both opportunities and dangers on the Web can be identified and managed.

Cyberspace terrorism became much more apparent after the tragic events of September 11, 2001, and hence the focus on the internet as a communication and propaganda tool for terrorism strengthened [4]. Cecilia S. Gal’s [5] book highlights several aspects of patrolling the Web and is targeted towards counter-terrorism experts and professionals, since it gives practical points of view from experts in related industries describing lessons learned from practical efforts to tackle these problems.

Assistant Attorneys General Viet Dinh and Michael Chertoff both said information is a key weapon in combating terrorism [6]. Chertoff, head of the criminal division and a key drafter of last year's major anti-terrorism law, defended data mining as an anti-terror tool against accusations of invasion of privacy by several parties. Out of the different sources of information used for data mining, Bryan D. Kreykes [7] focuses on the usage of telephone records as an investigatory tool to combat terrorism.

But there has been speculation that data mining might not be used successfully as a tool against terrorism [8]. Most people first learned about data mining in November 2002, when news broke about a massive government data mining program called ‘Total Information Awareness’, which would suck up as much data as possible about everyone, sift through it with massive computers, and investigate patterns that might indicate terrorist plots. This idea of trading privacy for security has met with more resistance than expected.

The FBI has built a database with more than 659 million records, including terrorist watch lists, intelligence cables and financial transactions. The system is one of the most powerful data analysis tools available. Ellen Nakashima [9] reports on how manifold the FBI's database has grown. [10] is a report on the MATRIX data mining system, which has been shut down because of much political attention and the loss of federal funding. This pilot project leveraged proven technology to assist criminal investigations by implementing factual data analysis from existing data sources and integrating disparate data from many types of Web-enabled storage systems. According to [11], since 9/11 the FBI has made many expensive upgrades, from giving employees access to the internet to deploying a one-stop shop called the Investigative Data Warehouse. In August 2006, the Electronic Frontier Foundation (EFF) sought government records concerning the Federal Bureau of Investigation (FBI)'s Investigative Data Warehouse (IDW) pursuant to the Freedom of Information Act (FOIA). The report [12] is based upon the records provided by the FBI, along with public information about the datasets included in the data warehouse. The study [13] throws light on how data mining could have an

impact on the privacy of personal data. It presents a research-in-progress study that investigates the need for an expanded role of ethics in data mining. [14] is titled Countering Terrorism: Integration of Practice and Theory. The article [15] throws light on how a fast-growing FBI data mining system, billed as a tool for hunting terrorists, is being used in hacker and domestic criminal investigations, and now contains tens of thousands of records from private corporate databases, including car-rental companies, large hotel chains and at least one national department store, as declassified documents obtained by Wired.com show. [16] is a report to the National Commission on Terrorist Attacks upon the United States explaining the FBI’s counterterrorism program. In her book [17], Bhavani Thuraisingham introduces how data mining has become a useful tool for detecting and preventing terrorism, explaining the technical challenges for data mining, the various types of terrorist threats, and how these techniques can provide solutions to counter terrorism.

Data mining applications in Counter terrorism

In this section we give a high-level overview of how web data mining as well as data mining could help counter terrorism. Web data mining goes beyond just mining structured data. We will throw some light on mining unstructured data, mining for business intelligence, web usage mining and web structure mining as forms of web data mining. Data mining could contribute towards counter-terrorism because extracting hidden patterns and trends from large quantities of data is very important for detecting and preventing terrorist attacks. We will examine both non real-time threats and real-time threats and observe how data mining could handle them. Web data mining, in our terminology, encompasses data mining, as it deals with data mining on the web as well as mining structured and unstructured data.

Data Mining for Handling Threats

Data used for mining purposes in the handling of threats are grouped in different ways. One example is information-related and non-information-related groups of data. Another way of grouping is real-time and non real-time threats. These groupings are somewhat arbitrary in nature; e.g., a non real-time threat becomes a real-time threat when a suspected terrorist decides to attack on a certain date.

Non Real-time Threats:
Non real-time threats are threats that do not have any time constraints. Data might be collected over months and analyzed, and even then a conclusion may not be reached. For data mining to work effectively, many examples and patterns are needed. Patterns and historical data are used to make predictions. The prime requisite is good data to carry out data mining and obtain useful results. Examples of barriers here are incomplete data and the unwillingness of organizations to share data. Hence mining tools have to make a lot of assumptions regarding incomplete and unavailable data. An alternative is to carry out federated data mining under some federated administrator.

The next step is to decide what data needs to be collected. Mostly, data regarding various people, such as where they come from, what they

are doing, who their relatives are, etc., are gathered, and then groups are formed of individuals having similar patterns. Individuals with criminal records are kept under high vigilance.

Once the data is collected, it is formatted and organized. Data may be structured or unstructured. Also, there might be data that is not of much use. Therefore, the data is segmented into critical data and non-critical data. Once the outcomes are determined, the mining tools are used to start the mining process.

After that comes the most complex part: the usefulness of the mining results has to be decided. The chances of getting a false positive or a false negative are pretty high, and either result could be disastrous. At present, human specialists are needed to work with the mining tools. If the tool states that a certain person is a terrorist, the specialist will have to do some more checking before arresting or detaining.

A non real-time threat could become a real-time threat. The challenge will then be to find out exactly what the attack will be. Then, data mining tools that can continue the reasoning as new information comes in are needed; i.e., as new information comes in, the warehouse needs to get updated, and the mining tools should be dynamic and take the new data and information into consideration in the mining process.

Real-time Threats:
In the case of real-time threats there are time constraints. That is, such threats may occur within a certain time, and therefore an immediate response is required. There are several types of data mining techniques for real-time threats. Both hypothetical data and simulated data need to be used. As many similar examples as possible should be gathered from counter-terrorism specialists. Once the examples are gathered and training of the neural networks and other data mining tools is initiated, the next task is deciding what sort of models are to be built. To handle real-time threats, dynamically changing models are needed. This is the biggest challenge faced.

Real-time data mining is a controversial topic, as many people opine that it is an impossible task. Hence the challenge is to redefine data mining and figure out ways to handle real-time threats.

Sensors are a common source of data, e.g. surveillance cameras placed in various places such as shopping centres, in front of embassies and in other public places. The data emanating from these sensors has to be analyzed in real time to detect or prevent attacks. Hence arise issues that raise questions of privacy and civil liberties. But the real dilemma is what the alternatives really are. Should privacy be sacrificed to protect the lives of millions of people? Policy makers and lawyers need to work together to come up with viable solutions.

Analyzing the Techniques:
The goal of data mining is to analyze data, make predictions and identify trends. This section examines various data mining outcomes and discusses how they could be applied to counter-terrorism. The outcomes of these analyses are arrived at by making associations, link

analysis, forming clusters, classification and anomaly detection. The techniques that result in these outcomes are based on neural networks, decision trees, market basket analysis, inductive logic programming, rough sets, link analysis based on graph theory, and nearest neighbour techniques. The methods used for data mining are top-down reasoning, which starts with a hypothesis and then determines whether the hypothesis is true, and bottom-up reasoning, which starts with examples and then works up to a hypothesis.

Several data mining techniques are:
Association techniques: An example of an association technique is market basket analysis. The goal of market basket analysis is to find which items go together.
Clustering techniques: Clustering is a technique where data is grouped into clusters and then analyzed.
Anomaly detection: Anomaly detection is the technique of observing and analyzing deviations from the general pattern.

Link Analysis:
Link analysis is a particular data mining technique that is especially useful for detecting abnormal patterns. Link analysis uses various graph theory techniques to reduce the graphs into manageable chunks. The objective is to find interesting associations and then determine how to reduce the graphs to manageable and not combinatorially explosive results.

A Note on Privacy

The issue of privacy has been a topic of recent debate among counter-terrorism experts, civil liberties unions and human rights lawyers, who point out that gathering information about people, mining information about people, conducting surveillance activities and examining, say, e-mail messages and phone conversations are all threats to privacy and civil liberties. So the objective of data mining is taking a turn towards enhancing national security while at the same time ensuring the privacy of individuals. A proposed approach is to process privacy constraints in a database management system. It should consist of levels of privacy such as fully private, semi-private, etc. But several sources have said that privacy-enhanced data mining may be time consuming and may not be scalable. More investigation is required in this area to come up with viable solutions.

FBI Counter Terrorism Program post 9/11

Since the attacks of September 11, 2001, the Federal Bureau of Investigation (FBI) has implemented a comprehensive plan that fundamentally transforms the organization to enhance its ability to predict and prevent terrorism. With this, it developed a three-step plan that provided immediate support to counterterrorism investigators and analysts. This plan transitions away from separate systems containing separate data (ACS, TelApps) towards an Investigative Data Warehouse (IDW) containing all the data that can legally be stored together.

SCOPE:
The initial step towards the IDW was the implementation of the Secure Counterterrorist Operational Prototype Environment (SCOPE) program. This program quickly consolidates counterterrorism information from various data sources, providing

analysts at headquarters with access to more information in far less time than with other FBI investigative systems. The SCOPE database also gave an opportunity to test new capabilities in a controlled environment; it has now been replaced by the IDW.

Investigative Data Warehouse

The IDW, delivered in its first phase in January 2004, now provides analysts with full access to investigative information within FBI files, including ACS and VGTOF data, open source news feeds, and the files of other federal agencies such as DHS. The IDW provides physical storage for the data and allows users to access it without needing to know its physical location or format. The data in the IDW is at the Secret level, and the addition of TS/SCI level data is in the planning stages.

The FBI planned to enhance the IDW by adding additional data sources, such as Suspicious Activity Reports, and by making it easier to search. With this, agents and analysts using new analytical tools will be able to search rapidly for pictures of known terrorists and match or compare the pictures with other individuals in minutes rather than days. This will help in identifying relationships across cases. The major advantage of this deployment is that it will take seconds to search up to 100 million pages of international terrorism-related documents.

Master Data Warehouse

The plan was to turn the IDW into a Master Data Warehouse (MDW) that will include the administrative data required by the FBI to manage its internal business processes in addition to the investigative data. The MDW will grow to eventually provide physical data storage for, and become the system of record for, all FBI electronic files.

Analytical Tools

To make the most of the data stored in the IDW, advanced analytical tools were planned. These tools allow FBI agents and analysts to look across multiple cases and data sources, identifying relationships and other pieces of information that were not readily available using older FBI systems. These tools will make database searches simple and effective, give analysts new visualization, geomapping, link-charting and reporting capabilities, and allow analysts to request automatic updates to their query results whenever new, relevant data is downloaded into the database. Please refer to Illustrations 1 to 3, which give fictional examples of how some of these tools can assist in drawing connections between discrete pieces of information.

FBI IDW Systems

In August 2006, the Electronic Frontier Foundation (EFF) sought government records concerning the FBI IDW pursuant to the Freedom of Information Act (FOIA). EFF filed a lawsuit on October 17, 2006. The following description is based upon the records provided by 2009, along with public information about the IDW and the datasets included in the data warehouse.

Overview of IDW

IDW is a centralized, web-enabled, closed-system repository for intelligence and investigative data. According to the documents, the FBI began spending on IDW in 2002 and system implementation was completed in 2005. IDW 1.1 was released in July 2004 with enhanced functionality, including batch processing capabilities.
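
A batch job, as opposed to an interactive query, runs one comparison over whole datasets at once. The following is a minimal Python sketch of such a job, finding person names that occur in two datasets. The record layout, the `normalize` and `batch_match` names, and the sample records are all hypothetical and chosen for illustration; nothing here reflects the IDW's actual implementation or schema.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Lower-case a name and collapse whitespace (a deliberately crude matcher)."""
    return " ".join(name.lower().split())


def batch_match(reports, case_files):
    """Return, per normalized name, the report ids and case ids that share it.

    Both inputs are iterables of (record_id, person_name) pairs; this layout
    is an assumption made for the sketch.
    """
    # Index the case files once so each report is checked in constant time.
    cases_by_name = defaultdict(list)
    for case_id, name in case_files:
        cases_by_name[normalize(name)].append(case_id)

    matches = {}
    for report_id, name in reports:
        key = normalize(name)
        if key in cases_by_name:
            # setdefault keeps one entry per name and accumulates report ids.
            report_ids, _ = matches.setdefault(key, ([], cases_by_name[key]))
            report_ids.append(report_id)
    return matches


# Fictional records, for illustration only.
reports = [("R1", "John  Doe"), ("R2", "Jane Roe"), ("R3", "JOHN DOE")]
case_files = [("C7", "john doe"), ("C9", "Richard Miles")]
result = batch_match(reports, case_files)
print(result)
```

Indexing one dataset by name before scanning the other keeps the job linear in the total number of records, which matters when both sets are large. A real system would need far more careful name matching than the whitespace and case folding assumed here.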

The FBI worked with Science Applications International Corporation (SAIC), Convera and Chiliad to develop the project. By March 2006, the IDW had 53 data sources and over half a billion records. By September 2008, the IDW had grown to nearly one billion.

IDW System Architecture

According to the FBI project description, the IDW system environment consists of a collection of UNIX and NT servers providing secure access to a cohort of very large-scale storage devices. The servers include application servers, web servers, relational database servers, and security filtering servers. The IDW web application can be accessed through FBINet from user desktop units, providing browser-based access to the central database and its access control units. The entire configuration is designed to be scalable to enable expansion as more data sources and capabilities are added.

A DOJ Inspector General report explained: "Data processing is conducted by a combination of Commercial-Off-the-Shelf (COTS) applications, interpreted scripts, and open-source software applications. Data storage is provided by several Oracle Relational Database Management Systems (DBMS) and in proprietary data formats. Physical storage is contained in Network Attached Storage (NAS) devices and component hard disks. Ethernet switches provide connectivity between components and to FBI LAN/WAN. An integrated firewall appliance in the switch provides network filtering."

IDW Subsystems

According to the IDW Concept of Operations, the IDW has two main subsystems, the IDW Secret (IDW-S) and the IDW-Special Project Team (IDW-SPT). It also consists of a development platform (IDW-D) and a subsystem for maintenance and testing (IDW-I).

IDW Secret
This system is the main subsystem of the IDW authorized to process classified national security data up to, and including, information designated Secret. However, neither Top Secret data nor any Sensitive Compartmented Information (SCI) is authorized to be processed by this system. The IDW Top Secret/Sensitive Compartmented Information level datamart appears to be in the planning stage. This system is the successor of the Secure Counter-Terrorism/collaboration Operation Prototype Environment.

IDW-Special Project Team
In November 2003, the Counterterrorism Division, along with the Terrorist Financing Operations Section (TFOS), started a special project to augment the existing IDW system with new capabilities for use by FBI and non-FBI agents on the Joint Terrorism Task Forces (JTTFs). The FBI Office of Intelligence is the executive sponsor of the IDW. The IDW Special Projects Team was originally initiated for the 2004 Threat Task Force. By May 2006, the "Special Project Team provided services to 5 task forces or operations."

As described by the FBI, “The Special Projects Team (SPT) Subsystem allows for the rapid import of new specialized data sources. These data sources are not made available to the general IDW users but instead are provided to a small group of users who have a demonstrated "need-to-know". The SPT System is similar in

function to the IDW-S system, with the main difference being a different set of data sources. The SPT System allows its users to access not only the standard IDW Data Store but the specialized SPT Data Store.”

IDW Features

Deputy Assistant Director Hulon also asserted that "when the IDW is complete, Agents, JTTF [Joint Terrorism Task Force] members and analysts, using new analytical tools, will be able to search rapidly for pictures of known terrorists and match or compare the pictures with other individuals in minutes rather than days. They will be able to extract subjects' addresses, phone numbers, and other data in seconds, rather than searching for it manually. They will have the ability to identify relationships across cases. They will be able to search up to 100 million pages of international terrorism-related documents in seconds." Since then, the number of records has grown tenfold.

At the FBI, the Office of the Chief Technology Officer (OCTO) developed an alert capability that allowed users of IDW to create up to 10 queries of the system and be automatically notified when a new document that meets their search criteria is uploaded to the database. Users can search for terms within defined parameters. For example, “the search: 'flight school' NEAR/10 'lessons' would return all documents where the phrase 'flight school' occurred within 10 words of the word "lessons." Users can also specify whether they want exact searches, or if they want the search tool to include other synonyms and spelling variants for words and names.”

"IDW includes the ability to search across spelling variants for common words, synonyms and meaning variants for words, as well as common misspellings of words. If a user misspells a common word, IDW will run the search as specified, but will prompt the user to ask if they intended to run the search with the correct spelling."

By 2006, the IDW was processing between 40,000 and 60,000 "interactive transactions" in any given week, along with between 50 and 150 batch jobs. An example of a batch process is where "the complete set of Suspicious Activity Reports is compared to the complete set of FBI terrorism files to identify individuals in common between them."

Dataset in IDW

According to various FBI documents, 38 data sources were included in the IDW on or before August 2004.

Automated Case System (ACS), Electronic Case File (ECF)
The dataset consists of ASCII flat files (metadata and document text) and WordPerfect documents consisting of the ECs, FD-302s, Facsimiles, FD-542s, Inserts, Transcriptions, Teletypes, Letter Head Memorandums (LHM), Memorandums and other FBI documents contained within ACS. The ACS system, the FBI’s centralized electronic case management system, consists of the Investigative Case Management component, the Electronic Case File component and the Universal Index component.

Secure Automated Messaging Network (SAMNet)
“ASCII files in standard cable traffic message format (all capitals with specific header),

consisting of all messaging traffic sent either from the FBI to other government agencies, or sent from other government agencies to the FBI through the Automated Digital Information Network (AutoDIN), including Intelligence Information Reports (IIRs) and Technical Disseminations (TD) from the FBI, Central Intelligence Agency (CIA), Defense Intelligence Agency (DIA), and others from November of 2002 to present.”

Joint Intelligence Committee Inquiry (JICI) Documents
“Scanned copies (TIFF images and ASCII OCR text) of all FBI documents related to extremist Islamic terrorism between 1993 and 2002.” These are counterterrorism files that were scanned into a database to accommodate the JICI's investigation into the attacks of September 11th.

Open Source News
The open source data collected for the FBI comes from the MiTAP system run by San Diego State University. “MiTAP is a system that collects raw data from the internet, standardizes the format, extracts named entities, and routes documents into appropriate newsgroups. This dataset is part of the Defense Advanced Research Projects Agency (DARPA) Translingual Information Detection, Extraction and Summarization (TIDES) Open Source Data project.”

Violent Gang and Terrorist Organization File (VGTOF)
Lists of individuals and organizations who the FBI believes to be associated with violent gangs and terrorism, provided by the FBI National Crime Information Center (NCIC). “It includes biographical data and photos pertaining to members of the identified groups in the form of ASCII flat files (data/metadata) and JPEG image binaries (none, one or multiple per subject). The biographical data includes the individual's name, sex, race, and group affiliation, and, if possible, such optional information as height and weight; eye and hair colors; date and place of birth; and marks, scars, and tattoos.”

CIA Intelligence Information Reports (IIR) and Technical Disseminations (TD)
“A copy of all IIRs and TDs at the Secret security classification or below that were sent to the FBI from 1978 to at least May 2004.” Intelligence Information Reports are designed to provide the FBI with the specific results of classified intelligence collected on internationally-based terrorist suspects and activities, chiefly abroad.

IntelPlus scanned document libraries
“Copies of millions of scanned TIFF format documents and their corresponding OCR ASCII text related to FBI's major terrorism-related cases.” IntelPlus is an application that allows users to view "Table of Contents" lists from large collections of records. The user is able to display the document whether it is in text form or one

of several graphic formats and then print, copy or store the information. The application allows tracking of associated documents on related topics and provides a search capability.

Financial Crimes Enforcement Network (FinCEN) Databases
Data related to terrorist financing. "FinCEN requires financial institutions to preserve financial paper trails behind transactions and to report suspicious transactions to FinCEN for its database. FinCEN matches its database with commercial databases such as Lexis/Nexis and the government's law enforcement databases, allowing it to search for links among individuals, banks, and bank accounts." At least one of these databases includes all currency transaction report (CTR) forms on bank customers' cash transactions of more than $10,000: "In 2004, FinCEN first provided the FBI with bulk transfer of [CTRs]." Over 37 million CTRs were filed between 2004 and 2006.

Terrorist Financing Operations Section Databases
According to Dennis Lormel, Section Chief of the Terrorist Financing Operations Section, TFOS has a "centralized terrorist financial database which the TFOS developed in connection with its coordination of financial investigation of individuals and groups who are suspects of FBI terrorism investigations. The TFOS has cataloged and reviewed financial documents obtained as a result of numerous financial subpoenas pertaining to individuals and accounts. These documents have been verified as being of investigatory interest and have been entered into the terrorist financial database for linkage analysis. The TFOS has obtained financial information from FBI Field Divisions and Legal Attaché Offices, and has reviewed and documented financial transactions. These records include foreign bank accounts and foreign wire transfers."

Foreign Financial List
Copies of information concerning terrorism-related persons, addresses, and other biographical data submitted to U.S. financial institutions from foreign financial institutions.

Selectee List
Copies of a Transportation Security Administration (TSA) list of individuals that the TSA believes warrant additional security attention prior to boarding a commercial airliner. According to Michael Chertoff, "fewer than" 16,000 people were designated "selectees" as of October 2008.

Terrorist Watch List (TWL)
The FBI Terrorist Watch and Warning Unit (TWWU) list of names, aliases, and biographical information regarding individuals submitted to the Terrorist Screening Center (TSC) for inclusion into VGTOF and TIPOFF watch lists. Also called the Terrorist Screening Database (TSDB),

the database "contained a total of 724,442 records as of April 30, 2007."

No Fly List
A copy of a TSA list of individuals barred from boarding a commercial airplane. According to Michael Chertoff, 2,500 people were on the "no fly" list as of October 2008.

Universal Name Index (UNI) Mains
A copy of index records for all main subjects of FBI investigations, except certain records that might reveal people in witness protection or informants. "A main file name is that of an individual who is, himself/herself, the subject of an FBI investigation."

Universal Name Index (UNI) Refs
A copy of index records for all individuals referenced in FBI investigations, except certain records that might reveal people in witness protection or informants. A "reference is someone whose name appears in an FBI investigation. References may be associates, conspirators, or witnesses."

Department of State Lost and Stolen Passports
A copy of records pertaining to lost and stolen passports. "The Consular Lost and Stolen Passports (CLASP) database includes over 1.3 million records concerning U.S. passports. All passport applications are checked against CLASP, PIERS [Passport Information Electronic Records System], the Social Security Administration's database, and the Consular Lookout and Support System (CLASS), which includes information provided by the Department of Health and Human Services (HHS) and law enforcement agencies such as the Federal Bureau of Investigation (FBI) and U.S. Marshals Service." "The overall CLASS database of names has risen to over 20 million records in recent years, including millions of names of criminals from FBI records provided to the State Department under the terms of the USA PATRIOT Act." "The Online Passport Lost & Stolen System permits citizens to report a lost or stolen passport." It includes "Name, date of birth (DOB), social security number (SSN), address, telephone number, and e-mail address," as reported by the citizen.

Department of State Diplomatic Security Service
A copy of past and current passport fraud investigations from the "DOS DDS RAMS database." The Records Analysis Management System (RAMS) Database "allows all Field Offices, Resident Agent Offices (RAO) and the Bureau of Diplomatic Security to track, maintain, and efficiently share law enforcement investigative case information. RAMS contains CLASSIFIED information." By September 2005, the Department of State was "developing a 'Knowledge Base' on-line library that will be a 'gateway' to passport information, anti-fraud information, and relevant databases. All passport field
agencies and centers can use this system to submit anti-fraud information such as exemplars of genuine and malafide documents, fraud trends in their respective regions, and other information that will be instantly available throughout the department."

Conclusion

This paper gives an overview of the uses of data mining applications in counter-terrorism. It briefly describes the different types of perceived threats (real-time and non-real-time threats) and analyzes the techniques used to handle them. It also touches on the ethical dilemma of individual privacy.

Next comes a detailed study of the FBI Counter Terrorism Program post 9/11. After 9/11, the FBI moved from separate systems holding separate data (ACS, TelApps) toward the Investigative Data Warehouse (IDW), which contained all the data that could legally be stored together. The paper discusses the IDW in detail. The final topics are the analytical tools used to analyze the data stored in the IDW and the various sources from which these data were gathered.

References

1. Investigative data mining for security and criminal detection By Jesús Mena

2. Terrorism Informatics: Knowledge Management and Data Mining for Homeland Security By Hsinchun Chen, Edna Reid, Joshua Sinai, Andrew Silke, Boaz Ganor

3. Web data mining and applications in business intelligence and counter-terrorism By Bhavani M. Thuraisingham

4. Security Informatics and Terrorism: Patrolling the Web By Cecilia S. Gal, Paul B. Kantor, Bracha Shapira

5. Fighting terror in cyberspace By Mark Last, Abraham Kandel

6. “Justice officials defend data mining as anti-terror tool” By Drew Clark, National Journal's Technology Daily, November 15, 2002

7. Data Mining And Counter-Terrorism: The Use Of Telephone Records As An Investigatory Tool In The “War On Terror” by Bryan D. Kreykes

8. Commentary by Bruce Schneier, “Why Data Mining Won't Stop Terror”, 03.09.06

9. Ellen Nakashima, FBI Shows Off Counterterrorism Database, 2006 http://www.washingtonpost.com/wp-dyn/content/article/2006/08/29/AR2006082901520.html

10. MATRIX data mining system is unplugged, 2005 http://www.privacyinternational.org/article.shtml?cmd[347]=x-347-205261

11. Robb S Todd, FBI's New Data Warehouse A Powerhouse, 2006 http://www.cbsnews.com/stories/2006/08/30/terror/main1949643.shtml

12. Report on the Investigative Data Warehouse, 2009 http://www.eff.org/issues/foia/investigative-data-warehouse-report

13. James Lawler, A Study of Data Mining and Information Ethics in Information Systems Curricula.

14. February 28, 2002, Countering Terrorism: Integration of Practice and Theory, An Invitational Conference, FBI Academy, Quantico, Virginia

15. Ryan Singel, Newly Declassified Files Detail Massive FBI Data-Mining Project, 2009 http://www.wired.com/threatlevel/2009/09/fbi-nsac/

16. A Report to the National Commission on Terrorist Attacks upon the United States, The FBI’s Counterterrorism Program, 2001

17. Bhavani Thuraisingham, Data Mining for Counter-Terrorism

Illustrations

Illustration 1
Illustration 2

Illustration 3
