
INTRUSION DETECTION THROUGH

“DNA CHARACTERIZATION”
ABSTRACT

With growing Internet connectivity come growing opportunities for attackers to illicitly access computers over the network. Research continues toward a completely dependable “Intrusion Detection” system, with most of it channeled towards developing new systems rather than adapting pre-existing ones, and focused on strengthening security as opposed to recovery.

In this paper we present a new method in the field of Intrusion Detection, namely “DNA Characterization”, which draws its inspiration from the human genome, to detect an intrusion as early as possible. The method is based on the “Teiresias” algorithm, which is used to generate a DNA sequence against which newly generated sequences are compared; if there are any conflicts, the user is informed that a possible intrusion is happening. This is illustrated by several examples. The long-term goal of our paper is not only to develop a system with more “dependability” but also to enable “survivability”, so as to hold its place in the ever-challenging world of network intrusion and security.

1. INTRODUCTION

With the increasing use of personal computers in business and at home, the topic of computer security must be continually addressed. Only with adequate computer security can users be certain that their computers are not infected by malicious programs or being used for malicious purposes. For a computer to have a dangerous program installed on it, or to be used inappropriately, it must first be intruded upon. Computer intrusion may occur via an e-mail attachment, a download from a website, physically via a disc, or by unauthorized access. Two defense techniques are therefore employed against computer intrusion:

• Intrusion Prevention: This scheme is used to prevent viruses, malicious software and unauthorized users from entering our computer. Protection of this nature can be accomplished with the use of
  > Password-controlled access.
  > Software signature recognition.
  > Firewalls.
  But inevitably even the best intrusion prevention systems fail, due to the continuous evolution of malicious software and the persistence of unauthorized users.

• Intrusion Detection: Due to the shortcomings of the above method, this defence scheme gains importance. In this paper, we define a unique method called COMPUTER DNA CHARACTERIZATION for determining whether our computer is being intruded upon or attacked through networks.
1.1 Why we use COMPUTER DNA CHARACTERIZATION?

Decoding the human genome gives a better understanding of diseases and allows more efficient development of drugs and techniques to treat them. Thus, early detection of diseases becomes possible. In the same way, computer characteristics such as network traffic, modification of system files, and modification of data files are described by the computer’s DNA sequence. As with humans, the initial DNA sequences are predetermined on inception and ideally do not cause the system any harm. But any user trying to intrude upon the system brings an abnormal change in the generated DNA sequence, leading to early intrusion detection, which limits the damage and quickens the process of recovery.

[Figure: A COMPUTER ILLUSTRATION]

2. COMPUTER NETWORK TRAFFIC DNA SEQUENCES

2.1. Base Structure Definition

Before a computer DNA sequence can be generated, the structure of the base pairs must first be determined. There are two base structures:

2.1.1. Fundamental Base Structure:
The first base structure contains only three pieces of information, namely the amount of TCP, UDP, and ICMP traffic over a certain time period. Sequences generated from this base present only the most basic information about a computer system’s network activity. A sequence of this type is illustrated in Figure 1.

To retrieve information pertaining to packet type, the IP header must be examined. Although various protocols exist, we use only the TCP, UDP and ICMP protocols. Although very basic in nature, sequences based on this fundamental base structure can provide important information about computer network usage. For example, a sequence may be generated for a home computer that is commonly used to retrieve information from the Internet but is not used to download streaming audio or video files. This sequence would therefore predict average TCP network traffic but minimal UDP network traffic. However, if a computer attacker attempts to access this particular computer via a UDP flood or a similar attack, the increase in UDP network traffic would be detected and the user can be alerted to a possible intrusion.

2.1.2. Detailed Base Structure:
As with the fundamental base structure, this sequence includes the number of UDP packets and ICMP packets over a given time period. However, TCP packets are further separated into three segments.

[Figure: detailed base structure]
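Both base structures are per-interval packet counts. As a minimal sketch of the fundamental base structure of Section 2.1.1, the Python below groups packet records into fixed time windows and emits one [TCP, UDP, ICMP] count per window. The function name, the (timestamp, protocol) record format and the sample data are assumptions made for illustration; the five-second default is the window length the paper later uses for its fig 4 example. Windows with no packets are simply skipped here, which a real monitor might prefer to record as [0, 0, 0].

```python
from collections import Counter


def fundamental_base_structures(packets, window_seconds=5):
    """Return one [tcp, udp, icmp] count per time window, in time order.

    `packets` is an iterable of (timestamp_in_seconds, protocol_name) pairs.
    This record format is a hypothetical stand-in for parsed capture output.
    """
    windows = {}
    for timestamp, protocol in packets:
        index = int(timestamp // window_seconds)  # which window the packet falls in
        windows.setdefault(index, Counter())[protocol.upper()] += 1
    # Counter returns 0 for protocols that never appeared in a window.
    return [
        [counts["TCP"], counts["UDP"], counts["ICMP"]]
        for _, counts in sorted(windows.items())
    ]


# Hypothetical capture: (seconds since start, protocol name).
sample = [(0.2, "tcp"), (1.0, "tcp"), (3.9, "udp"), (5.1, "icmp"), (7.4, "tcp")]
structures = fundamental_base_structures(sample)
print(structures)
```

A sudden jump in the second element of these structures is the kind of change the UDP flood example above relies on.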
The logic behind the detailed base structure is to detect a common computer attack known as a SYN flood. Normally, communication between two computers is initiated by a three-way handshake. When the destination host receives a TCP packet with the SYN flag set, it replies with a SYN ACK packet and then waits for an ACK packet from the source host. While waiting for the ACK packet, a connection queue of finite size on the destination host keeps track of connections waiting to be completed. This queue typically empties quickly, since the ACK is expected to arrive a few milliseconds after the SYN ACK. A computer attacker exploits this design by generating numerous SYN packets with random source IP addresses. These packets are directed towards a victim, and the victim replies with a SYN ACK packet to each random source IP address. Further, an entry is added to the victim’s connection queue. Because the SYN ACK packet is sent to random IP addresses, the victim does not receive the final ACK packet and the last part of the three-way handshake is never completed. However, the entry remains in the connection queue until a timer expires, typically after about one minute. When the connection queue is full, legitimate users will not be able to access TCP services such as e-mail and web browsing.

Suggestions for improvement: Base Structure using Port Numbers
A base structure containing information about the computer ports being used for communications may be beneficial. Most network communications occur via commonly used ports such as port 23 (Telnet), port 80 (HTTP) and port 25 (SMTP). Therefore, a base structure could enumerate the number of packets processed by commonly used ports as well as the packets received and sent by other ports.

3. DNA SEQUENCE GENERATION USING TEIRESIAS ALGORITHM

Once the base structures have been defined, sequences of these bases must be generated to form the Computer DNA using the Teiresias algorithm.

Algorithm:

Step 1: Raw data pertaining to network activity is collected.
Step 2: The raw data is processed into the two base structures. For example, [10, 3, 1] indicates 10 TCP, 3 UDP and 1 ICMP packets.
Step 3: After all base structures have been generated for a given time interval, repeated structures are determined via the Teiresias method.
Step 4: The collection of commonly repeated base structures then forms the computer DNA sequence used to predict computer network activity. In fig 4, the base structure [10, 3, 1] occurs 17 times in the given time period.

Once the computer DNA sequence is created, new base structures generated by real-time network activity can be compared to the structures contained within the DNA sequence. For example, if a new base structure of the form [1000, 500, 40] is generated for the predetermined time period but is absent from the DNA sequence of the computer in question, a flag would be raised indicating that a computer attack may be occurring.

Suggestions for improvement: DNA Evolution
Once the initial DNA sequence has been established, a mechanism should be developed so that the sequence is continuously updated by
• Adding sequences that are not currently present.
• Eliminating initial sequences that were absent in future monitoring.
By continuously evolving, the DNA sequence for a computer will more accurately predict the levels of computer network activity currently being experienced, by adapting to changes in users and user activities.

4. Parameters in Computer DNA Sequence Generation

Time period allocated to a particular instance of a base structure: Each base structure in fig 4 indicates computer network activity for a time period of five seconds. Generally, if a longer time period is used to generate base structures, the number of packets that the computer system must process per base structure increases.

Time period for collecting base structures to create the computer’s DNA: Base structures indicating computer network activity are collected for a certain time interval and then processed by Teiresias to determine commonly repeated structures. Generally, if a longer time interval is allocated for base structure collection, the number of repeated base structures will increase, thereby generating a larger DNA sequence.

Tolerance: The tolerance factor is an integer used as a parameter of a function that rounds a number to the closest multiple of the tolerance factor. The tolerance factor is included to allow similar numbers to be treated as equal.

Suggestions for improvement:

Time allocated per base structure:
(1). A shorter time period may be more effective for a computer subjected to continuously high levels of network activity.

Tolerance:
(1). If security requirements are higher, a lower tolerance factor should be used to generate a more specific DNA sequence capable of detecting smaller variations in network activity.
(2). If network activity volume is high, a larger tolerance factor should be used to limit the number of false positives.

5. NETWORK MONITORING

5.1. Network Monitoring Tools

Monitoring network traffic to track a computer’s network activity is crucial in many intrusion detection and prevention schemes. Several tools exist for this purpose, such as TDIMON, IPMON, and SNIFFIT. Here we use a utility named WINDUMP, which is compatible with the Windows OS. WINDUMP can capture entire packets that are sent to and received by a particular network adapter. By analyzing the raw packets captured by a particular computer system, the information required to generate the base structures, and in turn the system’s DNA sequence, can be gathered.

6. RESULTS AND ANALYSIS

6.1. Sample Initial DNA sequence:
The initial DNA sequence was obtained by monitoring network traffic for a particular period of time. Here any sequence that was repeated twice was added to the DNA sequence. If the time period is large, we get a larger DNA sequence.

6.2. Testing Procedure:
To determine the usefulness of the generated sequences in predicting future levels of computer network activity, continued network monitoring of the home network switch is required. Subsequent base structures generated by the continued network monitoring were compared against the initially generated DNA sequence to determine whether the generated DNA sequences are able to predict future levels of network activity. In order to quantify the effectiveness of a computer intrusion detection method, two main criteria are examined.

False positives: These occur when a non-existent intrusion is detected. This condition generally occurs when the threshold for an intrusion is set too low. False positives are dangerous in that they raise unnecessary worry and concern for the users of the systems.
False negatives: These occur when an intrusion is not detected by the intrusion detection scheme. Needless to say, false negatives are dangerous because potentially devastating intrusions are not detected.

Therefore, the common goal in designing an intrusion detection system is to limit the number of both false positives and false negatives.

6.3. Results
In generating the DNA sequence for the home network location, a very low tolerance factor was used. Consequently, many base structures occurred during normal computer usage that were not predicted by the generated DNA sequence. Some examples of the errors that were detected are shown in the following figures. Based on the results and errors detected, several key conclusions were derived. First, several of the detected errors closely resemble base structures found in the DNA sequences.

For example, the base structure [70, 0, 0] generated an error although the base structure [60, 0, 0] was present in the DNA sequence. Clearly, both base structures represent a similar amount of network activity, and the error generated by the base structure [70, 0, 0] was obviously a false positive. From this fact, it was concluded that the tolerance factor was far too strict. When the tolerance factor was increased to 100, there was a drastic reduction in the number of false positives detected, as shown in the following figure.

Fig 10: Errors detected when tolerance factor=100.

Note: Sequences containing 3-element base structures produce fewer false positives than sequences containing 6-element base structures.

7. DETECTION OF COMPUTER ATTACKS

A computer DNA sequence must be able to detect potential computer attacks. Unfortunately, conducting an actual computer attack is rather difficult to regulate and may result in damage to the victim’s computer.

7.1. Detection of UDP Flood:

The resulting computer network activity when a UDP flood attack happens is shown in fig 7. Clearly, the data used to
illustrate a computer attack in fig 7 would not be predicted by the DNA sequence generated for the home network switch, and therefore would be flagged as abnormal behavior.

7.2. Detection of Internet Worms:

These are another form of common computer attack that would produce abnormal activity levels detectable via a computer DNA sequence. Attacks of this nature generally try to propagate by sending malicious code to all the e-mail addresses present in the victim’s address book. Normal computer users rarely (if ever) try to send e-mail to everyone they know simultaneously. Therefore, a flag would be raised if our computer were infected with such a worm.

8. CONCLUSION

The costs of computer attacks include the loss of information, sales, data, and infrastructure. Needless to say, the prevention of computer attacks and intrusions is paramount. We propose that a computer DNA sequence can be generated for various computer characteristics, and a method for generating a DNA sequence using the Teiresias algorithm was explained. The computer characteristic of particular interest to this paper was computer network activity. Several examples showed that it is possible to generate a DNA sequence which accurately predicts future levels of computer network activity. Furthermore, it has been shown by example that a computer DNA sequence is able to detect common computer attacks. Importantly, several areas of future improvement have been suggested. Finally, in order to confidently conclude that computer DNA characterization is an effective means of network intrusion detection, a system must be given a DNA sequence and then exposed to various network attacks. Such testing would require strict supervision as well as the resources needed to conduct such attacks. This paper illustrates preliminary results of Computer DNA Characterization and its ability to detect computer attacks. Although not entirely conclusive, the preliminary results provide a solid foundation on which to conduct further research.
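
As an illustration of the scheme summarized above, the following minimal Python sketch combines the tolerance rounding of Section 4, the rule from Section 6.1 that a structure repeated twice joins the DNA sequence, and the comparison step described in Section 3. It is not the Teiresias algorithm: a plain frequency count stands in for the pattern discovery, and the function names, the training data and the strict tolerance value of 1 are assumptions made for the example.

```python
from collections import Counter


def apply_tolerance(structure, tolerance):
    """Round each count to the closest multiple of the tolerance factor."""
    half = tolerance // 2
    return tuple(tolerance * ((value + half) // tolerance) for value in structure)


def generate_dna(structures, tolerance, min_repeats=2):
    """Keep every rounded base structure seen at least `min_repeats` times.

    A simple frequency count is used here in place of Teiresias.
    """
    counts = Counter(apply_tolerance(s, tolerance) for s in structures)
    return {s for s, n in counts.items() if n >= min_repeats}


def is_anomalous(structure, dna, tolerance):
    """Flag a new base structure that is absent from the DNA sequence."""
    return apply_tolerance(structure, tolerance) not in dna


# Hypothetical training capture of [TCP, UDP, ICMP] base structures.
training = [[60, 0, 0], [60, 0, 0], [10, 3, 1], [10, 3, 1], [40, 0, 0]]

strict_dna = generate_dna(training, tolerance=1)    # assumed "very low" tolerance
loose_dna = generate_dna(training, tolerance=100)   # tolerance used in Section 6.3

print(is_anomalous([70, 0, 0], strict_dna, 1))
print(is_anomalous([70, 0, 0], loose_dna, 100))
print(is_anomalous([1000, 500, 40], loose_dna, 100))
```

With the strict setting, [70, 0, 0] is flagged even though [60, 0, 0] is in the sequence, which mirrors the false positive of Section 6.3. With a tolerance of 100 both round to the same structure, while [1000, 500, 40] is still flagged.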
