
Network defense using redirection of malicious traffic to honeypots

John Doe

Heaven University
Faculty of Informatics

May 12, 2018

John Doe PA000 May 12, 2018 1 / 34


Overview

1 Motivation

2 Theory background
Active network defense
Honeypots and dynamic honeypots

3 Active network defenses


Past research on dynamic honeypots
Implementation

4 Data analysis
Data mining



Motivation I

What is active network defense? How can it be classified?
Why is it needed? What is the honeypot's role in it?
What is a dynamic honeypot and why is it a subject of focus?
What is the current research on dynamic honeypots?
How can such a dynamic honeypot be employed? What costs does it bring?
How should data collected from honeypots be interpreted?



Motivation II - Passive network defense

Encryption, firewall, automated detection (IDS).
The general strategy is to apply Defense in Depth.
Slows down the attacker but does not answer back: it is purely reactive.
Well suited as a base for active network defense.
Not sufficient on its own in many cases.



Motivation III - Dynamic honeypots

The static vs. dynamic honeypot classification might be less well known.
Being dynamic does not restrict the nature of the honeypot in general.
Only a few research papers cover dynamic honeypots, and none comes with an available implementation (as of completion of this work).
Active network defense built upon dynamic honeypots is flexible and simple at its base.



Active network defense (AND)

No formal definition. The definition is mostly adjusted to the needs of the given AND.
The level of interaction differs from one AND to another.
The two extremes are active-passive network defense and aggressive AND.
Two known measures of the interaction of an AND are the Active Response Continuum and the Rosenzweig typology.
An aggressive AND can be preemptive or take the form of a passive/active counter-attack. It is a subject related solely to cyberattacks.



Definition (Active network defense)
Any measure performed by a network node or a group of network nodes at the time an attack is detected, in order to analyze the attack and subsequently suppress it with a method that has a non-aggressive form. The whole process is automated, without the need for explicit interaction.



Deception-based AND

A collection of network defenses whose purpose is to suppress an attack by deceiving the attacker.
Notable examples are the honeypot, tar pit, beacon file, address hopping and DNS sinkhole.
A tar pit is a weaker form of honeypot. Its main purpose is to slow down the attacker in order to gain time (LaBrea Tarpit or iptables).
A beacon file is used to expose an attacker's identity by letting the attacker open a crafted resource (Georgian-Nato Agreement.zip).
Address hopping (space randomization) is used to mask IP addresses. Synchronization between the communicating parties is needed (can be useful against nmap or DoS/DDoS).
DNS sinkholing is a defense method that deflects traffic to a sinkhole.
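
The synchronization that address hopping needs can be illustrated with a minimal Python sketch. It assumes (this is not stated on the slide) that both parties share a secret key and a clock, and that each derives the currently valid address from the key and the current time slot. The function name, the key, the address pool and the 30-second slot length are all hypothetical.

```python
import hashlib
import hmac
import ipaddress


def current_address(shared_key: bytes, pool: str, now: float, slot_seconds: int = 30) -> str:
    """Pick the address that is valid in the current time slot from a shared pool."""
    hosts = list(ipaddress.ip_network(pool).hosts())
    slot = int(now // slot_seconds)  # both parties compute the same slot number
    digest = hmac.new(shared_key, str(slot).encode(), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(hosts)
    return str(hosts[index])


# Hypothetical values: a documentation-range pool and a made-up shared secret.
key = b"hypothetical-shared-secret"
server_view = current_address(key, "192.0.2.0/28", now=1_000_000.0)
client_view = current_address(key, "192.0.2.0/28", now=1_000_010.0)  # same 30 s slot
```

Both calls fall into the same time slot, so the two parties agree on the address, while a scanner without the key cannot predict where the service will be in the next slot.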



Classifications of AND

Based on the Rosenzweig typology.
The basic idea is to recognize the range of a network defense and its level of interaction.
The range can be within the network or out of the network, while the level of interaction can be one of the following: observation, access, disruption and destruction.
Observation Access Disruption Destruction
In network honeypot
Out network sinkholing beacon file beacon file

AND can be classified; however, the interaction may vary with policy.



Definition (Active response)
Any sequence of actions performed by an individual or an organization in the time between the detection of an attack and the judgment that this attack has ended. These actions can be automated or non-automated, and aim to suppress the attack using the given resources.



Active Response Continuum

Active response may vary depending on new information gained by observing an attack.
A decision model is used to choose the best possible response.
Active defense algorithm and model (ADAM) is a framework which decides on an active response based on costs and the relative probability of successfully suppressing an attack.
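
The idea of weighing cost against probability of success can be sketched as follows. This is not ADAM itself: the scoring rule (cost of the response plus the damage still expected if it fails), the candidate responses and all numbers are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    name: str
    cost: float       # hypothetical cost of carrying out the response
    p_success: float  # estimated probability that it suppresses the attack


def choose_response(candidates, attack_damage):
    """Return the candidate with the lowest expected total cost."""
    def expected_cost(r):
        # Cost of acting plus the damage still expected if the response fails.
        return r.cost + (1.0 - r.p_success) * attack_damage
    return min(candidates, key=expected_cost)


# Made-up candidates and estimates, for illustration only.
candidates = [
    Response("log only", cost=1.0, p_success=0.05),
    Response("redirect to honeypot", cost=5.0, p_success=0.80),
    Response("block source", cost=3.0, p_success=0.60),
]
best = choose_response(candidates, attack_damage=100.0)
```

With these numbers the redirect wins for a damaging attack, while for a negligible attack (a very small `attack_damage`) the cheapest response is chosen instead.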



Honeypots

No formal definition.
A common definition is based on the reason for implementation: honeypots are purposely implemented to be vulnerable.
A flexible tool used to detect or suppress an attack on a network.
Low overhead and no production value: any local communication from or to the honeypot is suspicious.

Definition
A security resource whose value lies in revealing itself in order to give an attacker the possibility to inspect and attack it. (L. Spitzner)



Honeypot deployment
Do not inspect all the data; separate it using a detection device.
Add a device (tool) to balance the traffic between the honeypot and the production network.
The load balancer can filter, change or redirect the traffic, depending on the implementation.
Keep transparency and fairness.



Honeypot classification

The main classification is based on level of interaction and purpose.
A honeypot can be virtual or physical; however, this has changed with virtual computers and OS simulation.
Further categories exist, depending on the usage within research and deployment.



Dynamic honeypot

Honeypots are usually configured once and then deployed. Their configuration does not change: these are static honeypots.
To keep the honeypot configuration fresh, a dynamic honeypot is deployed instead.
It can dynamically reflect the production network and is less likely to be detected by an attacker (blacklisting prevention).
Also known as a catering honeypot in some sources.

Feature       Static                        Dynamic
Behaviour     Fixed                         Changes automatically
Adaptability  None                          Adapts and blends with current env.
Human effort  Need to manually reconfigure  None
Knowledge     Prior skill and experience    None



Deployment of dynamic honeypot

A dynamic honeypot is represented by one or more static honeypots.
Three processes must be deployed: scanning, configuration management and static honeypot maintenance.
Honeypot maintenance and configuration management depend on the policy of the AND, while scanning can be passive or active.
Passive scanning adds no overhead, but it introduces delay and inaccuracy. Active scanning removes these drawbacks, but a firewall may prevent the dynamic honeypot from being effective.
A combination of both scanning approaches is possible.



AND with dynamic honeypot (Iyad Kuwatly et al.)



AND with dynamic honeypot (Christopher Hecker et al.)



Setup for simulation of AND

A scenario reflecting a real network is prepared.
The major nodes of the scenario are the target network, attacker, router and honeypot.
To simplify the case, one station was used as the target.
The attacker performs one type of attack.
The router has extended functionality to make sure that the attack is redirected to the honeypot.
The honeypot is a dynamic honeypot which scans the target network at the beginning and then periodically at a set time interval.



Sequence diagram for simulation



The KYPO - Cyber exercise & Research platform
A virtual environment is more convenient for simulation than a production network.
KYPO is a simulation environment developed by the CSIRT-MU team.
A scenario is provided to KYPO to allocate virtual space for a specified topology.



The KYPO (cont.)



The KYPO (cont.)

Images with OS and needed tools were prepared for each node of the
simulation.
Router node is represented by a router within sandbox in order to
save space.
Target    Honeypot  Attacker  Router
fail2ban  honeyd    nmap      rfw
curl      farpd     hydra     iptables
openSSH   cron
          scan

Fail2ban can run a script when it detects an anomaly in the logs. For this purpose a jail is created whose action is curl.
Hydra runs a brute-force attack on the SSH service running on the target.
A modified rfw is used to signal the router to reroute the attacker's packets.
Honeyd and farpd need to cooperate in order to create a low-interaction honeypot.


Realization of AND with redirection to dyn. honeypot

Four phases: configuration and installation, initial attack, redirection, and logging with analysis.
Two modules of AND: Dynamic honeypot and Forward process.
The input for the dynamic honeypot is a configuration file generated by a scan script written in Perl.
The output is a configuration for the static honeypot.
This process repeats to keep the honeypot fresh.
The forward process takes place after the initial attack is performed.
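
The step from scan results to a static honeypot configuration can be illustrated with a short sketch. The original scan script is written in Perl and is not shown on the slides; the Python function below is a hypothetical stand-in that assumes the scan yields a list of open TCP ports on the target and that the static honeypot is honeyd. The template name and the honeypot address are made up.

```python
def honeyd_config(template, honeypot_ip, open_tcp_ports):
    """Render a honeyd template that mirrors the open TCP ports found by a scan."""
    lines = [
        f"create {template}",
        # Ports that were not seen open on the target answer with a reset.
        f"set {template} default tcp action reset",
    ]
    lines += [f"add {template} tcp port {port} open"
              for port in sorted(set(open_tcp_ports))]
    lines.append(f"bind {honeypot_ip} {template}")
    return "\n".join(lines) + "\n"


# Hypothetical scan result: the target exposes SSH and HTTP.
config = honeyd_config("mirror", "10.0.0.50", [80, 22])
```

Re-running the scan and regenerating this text at each interval is what would keep the honeypot in step with the target under these assumptions.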



Dynamic honeypot module



Sample output from DHM



Forward process of AND
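
The forward process described on the earlier slides (fail2ban's curl action signals the modified rfw on the router, which then reroutes the attacker's packets) could end in a NAT rule like the one built below. The slides only state that rfw and iptables run on the router; the exact rule, the function name and the addresses are assumptions. The sketch only builds the command and does not execute it.

```python
import ipaddress


def dnat_rule(attacker_ip, target_ip, honeypot_ip):
    """Build an iptables command that sends the attacker's traffic for the target to the honeypot."""
    for ip in (attacker_ip, target_ip, honeypot_ip):
        ipaddress.ip_address(ip)  # raises ValueError on malformed input
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-s", attacker_ip, "-d", target_ip,
        "-j", "DNAT", "--to-destination", honeypot_ip,
    ]


# Hypothetical addresses for attacker, target and honeypot.
rule = dnat_rule("198.51.100.7", "10.0.0.10", "10.0.0.50")
```

Matching on both source and destination keeps traffic from other hosts to the target untouched, so only the detected attacker is diverted.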



Data analysis

There are different approaches to collecting data from any type of honeypot.
High-interaction honeypot data is mostly harder to collect, harder to analyze and overall requires much time from the analyst.
Low-interaction honeypot data is of a simpler form, comes in larger amounts and is collected automatically over a long period of time → data mining.
Root Cause Analysis is performed in order to recognize as many different attacks as possible.



Data mining - Association rules

AxB ⇒ C, market basket analysis, web mining and many more.
Support is an indication of how frequently the itemset appears in the dataset: $\mathrm{supp}(X) = \frac{|\{t \in T ;\, X \subseteq t\}|}{|T|}$.
Confidence is an indication of how often the rule has been found to be true: $\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}$.
The Apriori algorithm is used to find frequent itemsets in the input database, given minimum thresholds on support and confidence.
The database contains tables where each table represents a port sequence and each row holds information about one attack source.
The data is mined based on the parameters T, n_i and N, where T (here, unlike in the support formula) is the number of attacker targets, n_i is the number of packets sent to the i-th target and N is the total number of packets.
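
The two measures above can be checked on a toy dataset. The sketch below only computes support and confidence directly from their definitions; it is not the Apriori algorithm, and the transactions are made up.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """supp(X ∪ Y) / supp(X) for the rule X ⇒ Y."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Made-up transactions, for illustration only.
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
]
supp_ab = support(transactions, {"A", "B"})             # 2 of 4 transactions
conf_ab_c = confidence(transactions, {"A", "B"}, {"C"})  # 0.25 / 0.5
```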



Data mining - Association rules (cont.)

Port sequence  Frequent itemsets                         Support (%)
{445}          Cluster 1: T=3, N=9, n1=3, n2=3, n3=3     65.5
               Cluster 2: T=1, N=1, n1=1, n2=0, n3=0     18.4
{80}           Cluster 1: T=1, N=3, n1=0, n2=3, n3=0     55.8
               Cluster 2: T=3, N=11, n1=3, n2=5, n3=3    18.7

To reduce the output, the minimum threshold on support is 1%.
Two different attacks can still have the same parameters → the data itself must be analyzed.
Cluster analysis can be applied to each itemset, based on data similarity.



Data mining - Affinity propagation

An algorithm for clustering data (used e.g. in face recognition).
The number of clusters is not fixed in advance (the algorithm stops either after a specified number of iterations or when the output remains unchanged).
Payloads of packets from one attack source are concatenated.
Levenshtein distance is used as a similarity measure.
Clusters obtained with this method may still include different attacks → simple statistics can be applied.
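
The similarity measure can be sketched in a few lines. The code below is a plain dynamic-programming Levenshtein distance plus the payload concatenation step; the helper names are hypothetical, and the clustering itself (affinity propagation) is not implemented here. A clustering routine that expects similarities rather than distances could, for example, be fed the negated distance, which is an assumption and not something the slides specify.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def attack_word(payloads):
    """Concatenate the packet payloads of one attack source."""
    return b"".join(payloads)


# Made-up payloads, for illustration only.
word_a = attack_word([b"GET ", b"/index.html"])
word_b = attack_word([b"GET /index.htm"])
distance = levenshtein(word_a, word_b)
```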



Data mining - More analysis

Let C be the set of all attack words within a cluster and n = |C|.
Next, let D(a, b) be the distance between a and b, and let i be any attack word.
The average distance from i to the other words is $d_i = \frac{\sum_{j \in C} D(i, j)}{n - 1}$.
The average distance of cluster C is $D_C = \frac{\sum_{i \in C} d_i}{n}$.
The standard deviation can be calculated as well, to see roughly how much the distances differ.
Clusters with zero average distance represent one root cause (e.g. port scanning).
Clusters with a low average distance possibly represent one root cause (randomization within packet headers).
Clusters with a high average distance should be inspected further. Either the algorithms are run with different parameters or a manual inspection is done.
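
The statistics above can be computed as in the following sketch. The distance function is passed in as a parameter; the toy distance used in the example is only a stand-in for Levenshtein distance, and the choice of the population standard deviation over the d_i values is an assumption, since the slides do not say which variant is meant.

```python
from statistics import pstdev


def cluster_stats(words, distance):
    """Return (D_C, standard deviation of the d_i) for one cluster."""
    n = len(words)
    if n < 2:
        return 0.0, 0.0  # a single word has no distance to others
    d = [
        sum(distance(w, other) for j, other in enumerate(words) if j != i) / (n - 1)
        for i, w in enumerate(words)
    ]
    return sum(d) / n, pstdev(d)


def toy_distance(a, b):
    """Stand-in metric: differing positions plus the difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


# Made-up attack words, for illustration only.
same_cause = cluster_stats(["scan", "scan", "scan"], toy_distance)
mixed = cluster_stats(["aaaa", "aaab", "bbbb"], toy_distance)
```

The first cluster has zero average distance and so points to a single root cause, while the second has a clearly non-zero average and would be a candidate for further inspection.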



Conclusion

The provided solution is simple and easy to deploy.
To improve the defense, a tunnel should be provided between the router and the dynamic honeypot.
The moment of redirection is critical. The attacker may be able to scan or ping the real network; however, an attack should be redirected at the right time. In this case, the attack had to be performed first before the redirection could take place.
The overhead of scanning can be influenced in several ways. One tradeoff is to deep-scan only the critical part of the production network.



Thank you

