
CHAPTER I: INTRODUCTION
1.1 Overview
This project concerns network forensics, which makes it possible to recover the
details of network events after they have happened, and the analysis of VoIP
attack data patterns using WEKA, a data mining tool. WEKA is used to examine
captured network traffic in order to investigate network and security attacks or
application performance issues. From the data patterns, an investigation is
conducted to reveal information about network and application interactions, user
sessions, and response time and latency metrics. The investigation also aims to
establish the source of the attacks, when they happened, and what types of attack
were found. Tracking down an attacker depends on keeping extensive records of
network activity with the help of an intrusion detection system.
From the gathered data, a solution can be sought for each attack to prevent it
from happening again. The data analysis also reveals who communicated with whom,
when, and how often. This information can serve as evidence that victims may use
to take further action against the parties that committed network crimes against
them.
1.2 Project Objective
The main objective of this project is to analyze the pattern of attack data in
the captured data. The data indicate the condition of the network events, so the
source of attacks or other problem incidents can be discovered. This helps in
identifying unauthorized access to a computer system and in searching for
evidence of other types of threat or attack.
The second objective is to convert the pcap data into an arff data file that is
recognized by the WEKA data mining tool. The first objective cannot be carried
out unless the second objective is achieved.
1.3 Project Scope
This project focuses on VoIP attack data patterns analyzed with WEKA, a data
mining tool. Denial-of-Service (DoS), Spam over Internet Telephony (SPIT), and
Man-in-the-Middle (MiTM) attacks are the three main focuses of this project.
1.4 Problem Statement
The growth in network connectivity, complexity and activity has increased the
number of crimes committed within networks. Emerging applications such as VoIP
have worsened the situation. Knowing the attack patterns allows network
administrators to defend their networks.
VoIP is one of the newest technologies being rapidly embraced by the market as
an alternative to the traditional Public Switched Telephone Network (PSTN).
Common VoIP threats include network-based DoS, eavesdropping, attacks on
signaling protocols, and spam. These attacks can make conversations
unintelligible or expose them to malicious listeners, and they can overload the
network, causing packet loss or congestion that brings the network down. In
addition, the bandwidth available to each application on the network is reduced
because it is shared among the applications.
1.5 Problem Solving
The problem can be addressed with a range of network tools. In this project,
WEKA, a data mining tool, is used to examine network traffic history in order to
investigate the attacks and identify their source.
1.6 Chapter Organization
This chapter describes the proposed project, VoIP data forensics using WEKA, a
data mining tool. It outlines how VoIP data forensics works and how attacks
affect VoIP applications. More details about VoIP data forensics using WEKA are
given in Chapter 4.
Chapter 2 presents the literature review, which describes the research and
findings related to this project.
Chapter 3 discusses the research methodology and the specific methods used to
design the project. It explains the methods and specifications used in this
project and sets out the budget and costing.
Chapter 4 discusses the testing and implementation of the project and explains
how the project is implemented.
Chapter 5 discusses the project verification. It presents the results of the
project implementation and experiments, from which the reader can understand how
the system runs and what its final output is.
Chapter 6 concludes the project. Suggestions and enhancements are listed in
this chapter for future reference.
CHAPTER II: LITERATURE REVIEW
This chapter discusses several subjects related to this project. The review
starts with the definitions and concepts of VoIP, data mining and network
forensics. It then covers the existing VoIP protocols and VoIP security issues.
Finally, work by other researchers in the area of study is reviewed.
2.1 Background
2.1.1 Voice over Internet Protocol (VoIP)
VoIP, also known as IP telephony, carries voice calls over an IP network using
the Internet Protocol. With the growth in popularity and bandwidth of the
Internet, VoIP allows phone calls to be routed over the Internet rather than the
Public Switched Telephone Network (PSTN). VoIP converts the voice signal into
digital signals that travel over the Internet. The voice signal is packetized and
the packets are sent over the network one by one. In packetization, the caller's
voice signal is compressed, transferred over the IP network, and then
decompressed at the receiving end [1]. VoIP can therefore run on any data network
that uses IP, such as Local Area Networks (LANs), the Internet and intranets [1].
There are several reasons why VoIP telephony is becoming more attractive than
the PSTN to telecommunication providers and users. Reduced call cost is one of
the main reasons. It is relatively cheap to make a long-distance call through a
VoIP service rather than the PSTN, because network resources such as bandwidth,
router CPU and memory are shared between applications on the Internet [2]. With a
PSTN line, users pay for each minute spent on the phone. Because the Internet is
the backbone of VoIP, the only cost to the user is the monthly bill from an
Internet service provider (ISP). Another reason is that VoIP services can be used
for conference calls, as opposed to a phone line on which only two parties can
speak at a time. With VoIP, a conference can be set up with a whole team
communicating in real time.
Figure 2.1: VoIP Processing [3]
Figure 2.1 shows a simple VoIP process. To send data over the Internet, the voice
data are compressed into small packets to reduce the amount of transmission
capacity needed. The packets may arrive out of order and are reassembled into a
stream at the other end. Packet loss can occur during transmission. To recover
from it, a mechanism conceals the loss and rebuilds the data from the pieces of
information that were received [3].
VoIP also has potential problems, such as increased security risks, including
Denial of Service (DoS), and lower Quality of Service (QoS) [4]. In the PSTN, a
circuit or dedicated channel is set up between two points for the duration of the
call. These telephony systems are based on copper wires carrying analog voice
data over dedicated circuits [5]. A set amount of bandwidth is reserved between
the callers for as long as the connection is active. One of the main problems
with PSTN technology is that 64 kbps of bandwidth is reserved even when no data
are being sent and the entire bandwidth is not needed. The actual bandwidth
requirement is usually only a small fraction of what is reserved [4].
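The 64 kbps figure can be reproduced with simple arithmetic. The sketch below assumes the standard PCM digitization used on PSTN channels (8000 samples per second, 8 bits per sample), which is background knowledge rather than something taken from the cited sources. The fraction of time a caller is actually speaking is a hypothetical parameter chosen only for illustration.

```python
# Background arithmetic (an assumption, not taken from the cited sources):
# a PSTN voice channel carries 8-bit PCM samples taken 8000 times per second.
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8


def channel_rate_kbps(sample_rate_hz=SAMPLE_RATE_HZ, bits_per_sample=BITS_PER_SAMPLE):
    """Return the bandwidth reserved for one circuit, in kbps."""
    return sample_rate_hz * bits_per_sample / 1000


def idle_kbps(active_fraction):
    """Return the reserved but unused bandwidth in kbps.

    active_fraction is a hypothetical share of the call during which
    voice data is actually being sent (0.0 to 1.0).
    """
    return channel_rate_kbps() * (1 - active_fraction)


if __name__ == "__main__":
    print(channel_rate_kbps())  # 64.0
    print(idle_kbps(0.25))      # 48.0
```

With a hypothetical 25% of the call carrying speech, three quarters of the reserved circuit would sit idle, which is the inefficiency that packet-switched VoIP avoids.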
VoIP telephony relies on various methods and protocols to establish calls and
transmit data. Most VoIP implementations, however, use the Session Initiation
Protocol (SIP) and the Real-time Transport Protocol (RTP). SIP is a text-based
application-layer protocol used for call initiation, call teardown, and other
call-related signaling sent during the conversation [4]. Besides initiating and
tearing down calls, SIP is also used to bring more users into a conference call.
VoIP that uses SIP relies on a SIP proxy server to authenticate the user's login
credentials. The proxy is also used to signal and route the call, and it acts as
a registrar that is used to locate other users [4].
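Because SIP is text based, a captured SIP message can be inspected with ordinary string handling. The following is a minimal sketch, not a complete SIP parser: it ignores header folding, compact header names and repeated headers. The function name `parse_sip_message` and the example message (with `example.com` addresses) are hypothetical and serve only as an illustration.

```python
def parse_sip_message(raw):
    """Split a SIP message into a start-line summary and a header dict.

    Simplified sketch: header names are lower-cased and only the first
    occurrence of each header is kept.
    """
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]

    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers.setdefault(name.strip().lower(), value.strip())

    parts = start_line.split(" ", 2)
    if start_line.startswith("SIP/"):
        # Responses look like: SIP/2.0 200 OK
        start = {"type": "response", "status": int(parts[1]),
                 "reason": parts[2] if len(parts) > 2 else ""}
    else:
        # Requests look like: INVITE sip:user@host SIP/2.0
        start = {"type": "request", "method": parts[0], "uri": parts[1]}
    return start, headers


# Hypothetical call-initiation request used only for illustration.
EXAMPLE_INVITE = "\r\n".join([
    "INVITE sip:bob@example.com SIP/2.0",
    "Via: SIP/2.0/UDP pc33.example.com;branch=z9hG4bK776asdhds",
    "From: sip:alice@example.com;tag=1928301774",
    "To: sip:bob@example.com",
    "Call-ID: a84b4c76e66710",
    "CSeq: 314159 INVITE",
    "Content-Length: 0",
    "", "",
])

if __name__ == "__main__":
    start, headers = parse_sip_message(EXAMPLE_INVITE)
    print(start["method"], headers["call-id"])
```

In a forensic setting the method (for example INVITE for call initiation or BYE for teardown) and the Call-ID are the fields most useful for grouping messages that belong to the same call.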
RTP provides generic transport capabilities for real-time multimedia
applications, both streaming and conversational, such as video conferencing,
video-on-demand, Internet telephony, Internet radio and music-on-demand. RTP is
carried in User Datagram Protocol (UDP) packets to reduce overhead, which gives
greater transmission speed and better call quality.
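To make the RTP description more concrete, the sketch below decodes the 12-byte fixed RTP header from the payload of a UDP datagram. The field layout follows the RTP specification; the function name `parse_rtp_header` and the sample packet are hypothetical, and optional CSRC entries and header extensions are deliberately not decoded.

```python
import struct


def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header at the start of a UDP payload."""
    if len(packet) < 12:
        raise ValueError("RTP packet is shorter than the 12-byte fixed header")
    # Network byte order: two single bytes, a 16-bit sequence number,
    # a 32-bit timestamp and a 32-bit synchronization source identifier.
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "extension": bool(byte0 & 0x10),
        "csrc_count": byte0 & 0x0F,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence_number": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    # Hypothetical packet: version 2, payload type 0, sequence number 1.
    sample = struct.pack("!BBHII", 0x80, 0x00, 1, 160, 0x12345678) + b"\x00" * 160
    print(parse_rtp_header(sample))
```

Gaps in the sequence numbers of consecutive packets give a simple indication of packet loss, which is one of the metrics relevant to call quality.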
2.1.2 VoIP Attacks
VoIP is gaining ground over the PSTN, but security is a major concern for the
VoIP community. Adding security mechanisms tends to degrade VoIP service
performance. On the other hand, without security mechanisms, VoIP services would
be open to threats and attacks [2]. Man-in-the-Middle (MiTM), Denial of Service
(DoS) and Spam over Internet Telephony (SPIT) are among the VoIP attacks.
In a MiTM attack, the attacker inserts himself between two communicating parties
so that he can delete or modify their communications. MiTM attacks are a real
threat to security. For example, a MiTM in the VoIP signaling or media path can
easily divert, wiretap, and even hijack selected VoIP calls [6]. Such attacks can
seriously affect the targeted VoIP users; for example, attackers can collect
sensitive information such as the victims' bank account numbers, credit card
numbers and PINs. MiTM is a particular problem in Internet communication because
there is no way to recognize someone's face and voice. In a live conversation an
attacker is easier to discover: a suspicious victim might test the caller with a
question about a shared past event, and the attacker would not be able to answer
quickly [7]. MiTM attacks work against web-based systems because the web is not
synchronous in this way [7], so the attacker can simply relay the messages and
pass through to the end of the communication.
A DoS attack denies a service or connectivity on a network or device, or brings
down the servers offering such services, by overloading the internal resources of
the devices or the network. DoS attacks can be carried out by flooding a target
with unnecessary SIP call-signaling messages, which can cause calls to drop
prematurely and halt call processing [8]. The goal of a DoS attack is to make the
service inoperable for as long as possible. By targeting the victim's computer
and network, or those of the site the victim is trying to use, the attacker may
be able to prevent the victim from accessing websites or other services. Floods
are a common type of DoS attack. A flood occurs when the attacker overloads the
server with requests so that it cannot process the victim's request and the
victim cannot access the site. The attacker can also use spam email messages
against the victim's email account. Email services assign each account a specific
quota that limits the amount of data the account can hold at any given time [9].
By sending many or large email messages to the account, the attacker can use up
the victim's quota and prevent legitimate messages from being received [9].
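The SIP flooding behavior described above suggests a simple way to look for it in captured traffic: count how many call-signaling messages each source sends within a short time window. The sketch below is only an illustration of that idea, not the detection method used in this project. The function name `find_invite_floods`, the one-second window and the threshold of 50 messages are all hypothetical values chosen for the example.

```python
from collections import defaultdict, deque


def find_invite_floods(events, window_seconds=1.0, threshold=50):
    """Return the sources that sent more than `threshold` INVITEs in any window.

    events is an iterable of (timestamp_seconds, source_ip, sip_method) tuples.
    The window length and threshold are illustrative assumptions.
    """
    recent = defaultdict(deque)  # source -> timestamps inside the current window
    flagged = set()
    for timestamp, source, method in sorted(events):
        if method != "INVITE":
            continue
        window = recent[source]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while window and timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) > threshold:
            flagged.add(source)
    return flagged


if __name__ == "__main__":
    # Hypothetical traffic: one source sends 100 INVITEs in half a second,
    # another sends three INVITEs spread over twenty seconds.
    flood = [(i * 0.005, "203.0.113.9", "INVITE") for i in range(100)]
    normal = [(t, "192.0.2.10", "INVITE") for t in (0.0, 10.0, 20.0)]
    print(find_invite_floods(flood + normal))
```

A fixed threshold is the simplest possible rule; the data mining approach taken in this project aims to find such patterns from the data rather than from a hand-picked limit.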
What spam is to email, SPIT is to VoIP: unwanted bulk calls or voicemails sent
over VoIP networks [10]. SPIT may be a bigger problem to deal with than spam. It
can cause a bandwidth problem that multiplies bandwidth bills, because voice
messages carry many more bytes than emails, which are only a few kilobytes
apiece. SPIT also differs from spam in how it can be handled. Spam can be
detected before it disturbs the recipient, whereas the phone rings immediately
after session initiation, and once it rings it is too late to prevent SPIT [10].
This disturbs the user's current activity.
2.1.3 Network Forensics
VoIP is an application that resides within the Internet environment. As the
number of people using the Internet increases, the number of illegal activities
such as identity theft and data theft also increases drastically. Network
forensics deals with the capture, recording and analysis of network events. It
makes it possible to analyze historical network traffic in order to investigate
security attacks [11]. The gathered information helps in identifying unauthorized
access to the system and in finding a solution to prevent it from happening
again. This information can be used as evidence if such an incident occurs.
The main goal of network forensics is to provide evidence sufficient to allow
the perpetrator to be successfully prosecuted [12]. Network forensics requires
two steps: first gathering complete network activity data, and then interpreting
the data. The network activity data form the necessary foundation for a network
forensics investigation. Interpreting them can range from extracting files and
reconstructing web sessions to tracing data leakage and detecting advanced
persistent threats [13].
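As a small illustration of the interpretation step, the sketch below summarizes who communicated with whom, how often, and when, from a list of packet records. It is a simplified example under stated assumptions: the function name `summarize_conversations` and the (timestamp, source, destination) record layout are hypothetical, and real captures would carry many more fields.

```python
from collections import Counter


def summarize_conversations(packets):
    """Count packets per (source, destination) pair with first and last time seen.

    packets is an iterable of (timestamp_seconds, source, destination) tuples.
    Returns a list of (pair, packet_count, first_seen, last_seen), busiest first.
    """
    counts = Counter()
    first_seen = {}
    last_seen = {}
    for timestamp, source, destination in packets:
        pair = (source, destination)
        counts[pair] += 1
        first_seen[pair] = min(first_seen.get(pair, timestamp), timestamp)
        last_seen[pair] = max(last_seen.get(pair, timestamp), timestamp)
    return [(pair, count, first_seen[pair], last_seen[pair])
            for pair, count in counts.most_common()]


if __name__ == "__main__":
    # Hypothetical packet records used only for illustration.
    records = [
        (0.0, "192.0.2.10", "198.51.100.5"),
        (1.5, "192.0.2.10", "198.51.100.5"),
        (2.0, "198.51.100.5", "192.0.2.10"),
    ]
    for row in summarize_conversations(records):
        print(row)
```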
2.1.4 WEKA Data Mining Tool
Data mining is the process of analyzing data from different perspectives and
summarizing it into useful information [14]; data mining software is one of
several kinds of analytical tool for analyzing data. Data mining can be separated
into two kinds, directed and undirected. Directed data mining tries to predict a
particular data point, whereas undirected data mining tries to find patterns in
existing data or to create groups of data [15]. Data mining has dozens of
techniques and procedures for examining and transforming data. Its aim is to
create a model that improves the way existing and future data are read and
interpreted [15].
The Waikato Environment for Knowledge Analysis (WEKA) is an open-source data
mining tool. WEKA is a collection of machine learning algorithms for data mining
tasks and is a product of the University of Waikato, New Zealand [15]. The
software is written in Java. It contains tools for data preprocessing,
regression, clustering, classification, association rules and visualization
[16]. WEKA uses a flat text file to describe the data and can work with a variety
of data files, including its own Attribute-Relation File Format (ARFF) and the
C4.5 file format. ARFF is the default WEKA file type for data analysis, but data
can also be imported from various other formats [17]. The data can also be read
from a Structured Query Language (SQL) database or from a Uniform Resource
Locator (URL).
2.2 Previous Work
2.2.1 Skype Forensics in Android Devices
In this research paper, Mohammed I. Al-Saleh and Yahya A. Forihat investigated
the evidence of Skype calls and chats left on Android devices. Smartphones have
some capabilities similar to those of PCs and can store a large amount of data
and different categories of information. Android-based smartphones are becoming
more popular because a wide variety of mobile applications (apps) has been
developed to extend the functionality of the phones. VoIP apps are used
extensively because of their wide availability and low cost, and Skype is one of
the most popular VoIP apps.
Figure 2.2: Investigation Model [18]
The paper assumes that Skype may be one of the means used to commit cybercrimes.
Digital forensics may be conducted on mobile devices, computers and networks in
order to detect cyber-criminal activities and prove the perpetrators guilty under
the law. Fig. 2.2 shows the investigation model that the researchers designed. In
this model, the criminal starts a call conversation session with the victim. The
investigator then extracts evidence of the sessions from the criminal's device by
inspecting both the RAM and the NAND flash memory [18].
Across several experiments, the call conversation patterns showed no differences
between experiments. Chat messages were found in both memories, and the average
number of occurrences decreased over the different time durations. This means
that chat messages remained in the flash memory for a long time without
redundancy, and the remaining messages can still be used as evidence. The
researchers concluded that Skype conversation patterns and chat messages can be
found in both the RAM and the NAND flash memory for a long time, even after call
and chat histories are deleted and the user signs out of Skype [18].
2.2.2 Network Forensics Models for Converged Architectures
A pattern is a solution to a problem that can be used to guide the design or
evaluation of systems. The concept of a forensic pattern is introduced and
illustrated using Unified Modeling Language (UML) object-oriented models. Attack
patterns describe the objectives and steps of an attack. They yield useful
information for working out ways to stop the attacks. A forensic pattern is a
systematic approach to network forensic collection and data analysis. Using these
forensic patterns, investigators or forensic teams have a structured method to
search, collect and analyze network forensic data.
Firewalls and Intrusion Detection Systems (IDS) are general security mechanisms
that are unable to detect and stop attacks at a higher level. To stop such
attacks in the future, details about the attackers' activities need to be
collected and sent for analysis. Sensors with examination capabilities are used
to collect evidence, which helps reduce human intervention. These sensors capture
all voice packets entering or leaving the system. The evidence collector starts
collecting forensic data when an alarm notification reports an attack against
VoIP components. After collecting the forensic data, the evidence collector sends
the data to the network forensics server, which performs the corresponding
forensic analysis. These data are used to discover and reconstruct the attacking
behavior.
Figure 2.3: Class diagram for a VoIP network forensics system [19]
Log correlation and normalization are among the techniques used to analyze
forensic databases and files. The evidence analyzer presents results to the
forensic investigator. These results include information such as the IP address,
the topology of the network, the MAC address, and possibly the geographic
location of the IP address. In Fig. 2.3, Juan C. Pelaez and Eduardo B. Fernandez
describe how a forensic system and IP telephony integrate. The model represents
the three primary components: the forensic server, the evidence collector and the
network investigator. The advantages of using the forensic pattern are that
automated evidence analysis reduces the response time of forensic investigators,
that the analyzer can provide information from logs for tracing back the
attackers, and that it can determine the call history: when a user was using the
VoIP device and with whom the user communicated [19].
2.2.3 Security Patterns for Voice over IP Networks
The authors [REFERENCE REQUIRED] discuss security attacks, relate them to the
ways the system is used, and provide some defense mechanisms. Four security
patterns are presented that provide good practices for VoIP and help in
identifying and understanding the mechanisms needed. The patterns are VoIP
Tunneling, Network Segmentation, Secure VoIP Call, and Signed Authenticated Call.
Unified Modeling Language (UML) is used to make the patterns easier to implement.
There are three types of connection when using the IP protocol: PC-to-PC,
PC-to-Telephone, and Telephone-to-Telephone. VoIP uses the Real-time Transport
Protocol (RTP) for transport, the Real-time Transport Control Protocol (RTCP) for
reporting Quality of Service (QoS), and SIP, H.323 or the Media Gateway Control
Protocol (MGCP) for signaling.
In this journal paper [REFERENCE REQUIRED], the authors present several attacks
against the VoIP network, including theft of service, IP spoofing,
Denial-of-Service (DoS), masquerading, call interception, repudiation, call
hijacking, and brute force. The authors analyze these attacks in detail using the
concept of attack patterns and considering the forensic aspects.
Figure 2.4: Relationships between VoIP security patterns [20]
Fig. 2.4 shows the relationships between the VoIP security patterns and related
cryptographic patterns. The double boxes represent the patterns. The Network
Segmentation pattern minimizes disruption in the event of an attack so that
critical voice traffic is not affected. The VoIP Tunneling pattern uses
encryption to ensure data integrity and confidentiality in VoIP networks. Tunnels
secure the transport of VoIP traffic over the external network and eliminate the
risk of exposing the network. The Signed Authenticated Call pattern provides a
suitable way to authenticate messages in VoIP and is the best countermeasure
against theft of service attacks. In the Secure VoIP Call pattern, VoIP calls are
encrypted and decrypted to provide confidentiality.
The paper concludes that using VPNs and encrypting all voice traffic is the best
security approach for VoIP. This ensures that critical voice traffic is
unaffected if an attack occurs on the data network [20]. To enhance security in
VoIP, filtering and firewalls can be implemented to control the traffic between
the data and voice VPNs [20].
2.2.4 Enhancing Forensic Investigation in Large Capacity Storage Devices Using WEKA: A Data Mining Tool
This research project focuses on large data sets that can be handled by a data
mining system. The WEKA data mining tools are studied to demonstrate the data
mining methodology and obtain results from the data. The WEKA toolkit is flexible
and easily extended. It is written in Java, which makes it easily portable. It
supports data preprocessing and modeling techniques.
WEKA is user friendly and provides a large set of functions and tools, including
attribute selection, preprocessing filters, clustering, classification, data
visualization and association discovery. WEKA is free, open-source software that
is available to all users and can be used to run individual experiments. WEKA
supports various data formats, including ARFF, Comma-Separated Values (CSV), and
the format accepted by decision induction algorithms.
Figure 2.5: Flow of Data Mining Methodology in WEKA [21]
Fig. 2.5 presents the flow of data mining used in WEKA. Data are classified based
on the attribute selection and then divided into clusters based on the type of
grouping that the user selects. The output obtained after clustering gives the
accuracy of the clustered data, which can be used for future predictions.
Finally, regression analysis describes how regression can be applied and how the
results can be visualized.
Figure 2.6: Preprocessing window [21]
This research project imported bank data into WEKA. The source file can be in
either .arff or .csv format. Fig. 2.6 shows the WEKA preprocessing window with
the bank data. The data are saved to bankdata-final.arff after the parameters are
set up. The project was implemented in four modules that represent the stages and
tasks of the data mining process: association, classification, clustering and
regression [21].
2.3 Critical Analysis
The following table reviews the differences between the works covered in the
literature review.
Table 2.1: Critical Analysis
JOURNAL            | JOURNAL 1 [REFERENCE REQUIRED] | JOURNAL 2 [REFERENCE REQUIRED] | JOURNAL 3 [REFERENCE REQUIRED] | JOURNAL 4 [REFERENCE REQUIRED]
RESEARCH DATA      | Skype                          | Converged Network              | Converged Network              | Bank Employee
TOOLS: SOFTWARE    |                                |                                |                                |
TOOLS: HARDWARE    |                                |                                |                                |
VOIP ATTACKS: DoS  |                                |                                |                                |
VOIP ATTACKS: SPIT |                                |                                |                                |
VOIP ATTACKS: MiTM |                                |                                |                                |
PROTOCOL: SIP      |                                |                                |                                |
PROTOCOL: RTP      |                                |                                |                                |
CHAPTER III: RESEARCH METHODOLOGY
This chapter explains in detail the methodology used to complete this project.
The methodology is used to achieve the project objectives. Section 3.1 introduces
the methodology used in this project. Section 3.2 lists the hardware and software
resources. The budget and costing of the tools are listed in Section 3.3.
Sections 3.4 and 3.5 present the Work Breakdown Structure (WBS) and the project
timeline, a Gantt chart that consists of activity duration estimates and the
project schedule.
3.1 Rapid Application Development (RAD) Methodology
The Rapid Application Development (RAD) methodology is selected as the
methodology model because it is a suitable process for software development and
it reflects the flow of each piece of work in this project. This methodology is
based on iteration and prototyping. This project involves existing data and
comprises analysis and reporting of the data, and the RAD process works best
where the data are known, the requirements can be defined and kept unchanged
during development, and the functional requirements can be met within a short
time frame [22]. In this project the RAD methodology consists of six phases:
Initiation, Planning, Design, Testing and Implementation, Verification, and
Documentation.
Figure 3.1: RAD model methodology
The primary advantages of the RAD methodology are quality and speed. RAD
increases the speed of development and decreases delivery time by focusing on
converting requirements to code as quickly as possible [23]. Increased quality is
a primary focus of RAD. Quality is defined as both the degree to which a
delivered application meets the needs of users and the degree to which a
delivered system has low maintenance costs. The use of automation tools and
prototyping gives a considerable reduction in errors. Errors and omissions are
detected in the early stages of development, which prevents extra effort or cost
[24].
3.1.1 Initiation
The initiation phase, or feasibility study, is conducted after getting approval
from the FYP supervisor. During this first phase, the project objective, project
scope and current problem statement are identified. A feasibility study is
conducted to gather all the findings and data related to the project. The
findings draw on information from the Internet, books, journals, articles and
previous studies of projects or systems similar to this one. The literature
review reveals gaps in the literature, from which a research question can be
formulated and the feasibility of the project discussed.
3.1.2 Planning
In the next phase, the planning phase, all of the work to be done is identified:
the hardware and software resource requirements and the research model, along
with the strategy for implementing the project. A project plan is created that
outlines the activities, tasks, dependencies and timeframes, and a project budget
is drawn up with cost estimates for equipment and materials. The budget is used
to monitor and control expenditure during project implementation. The project
plan is shown in Fig. 3.3 and Fig. 3.4.
3.1.3 Design
During the third phase, the design phase, the hardware and software are defined
and the .pcap data files are collected. The system architecture and topology are
designed in this phase. They show the flow of project work and the process of
converting the .pcap data files into a format that WEKA recognizes. Fig. 3.2
shows the architecture of the project.
Figure 3.2: Architecture Topology
The .pcap format is the most widely available file format for logging network
traffic and can be read by almost any network analysis tool. Such tools display
huge amounts of data that must be gone through to find problems with the network.
To be recognized by WEKA, the .pcap data files are first converted into temporary
.csv data files using tshark, the Wireshark command-line tool. The .csv data
files are then converted into the .arff format supported by WEKA by editing them
in a plain text editor such as Notepad and saving the result as an .arff file.
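The second conversion step, from .csv to .arff, can also be scripted instead of edited by hand. The sketch below is one possible way to do it and is not the procedure used in this project. It assumes that the first CSV row holds the column names and that every row has the same number of columns. Columns whose values all parse as numbers are declared numeric and all others are declared string; the function name `csv_to_arff` and the relation name are hypothetical.

```python
import csv
import io


def _is_number(text):
    """Return True if text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def _quote(text):
    """Quote a name or string value for ARFF, escaping backslashes and quotes."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def csv_to_arff(csv_text, relation="packets"):
    """Convert CSV text (header row first) into ARFF text.

    Assumes every data row has as many cells as the header row.
    Empty cells are written as '?', the ARFF missing-value marker.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    lines = ["@relation " + _quote(relation), ""]
    numeric = []
    for col, name in enumerate(header):
        values = [row[col] for row in data if row[col] != ""]
        is_numeric = bool(values) and all(_is_number(v) for v in values)
        numeric.append(is_numeric)
        lines.append("@attribute " + _quote(name)
                     + (" numeric" if is_numeric else " string"))

    lines += ["", "@data"]
    for row in data:
        cells = []
        for col, value in enumerate(row):
            if value == "":
                cells.append("?")
            elif numeric[col]:
                cells.append(value)
            else:
                cells.append(_quote(value))
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical rows in the style of a packet summary export.
    sample = '"No.","Source","Protocol","Length"\n"1","192.0.2.10","SIP","554"\n'
    print(csv_to_arff(sample))
```

Text columns are written as string attributes here. Many WEKA algorithms expect nominal rather than string attributes, so a further step inside WEKA, such as its StringToNominal filter, may be needed before analysis.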
3.1.4 Testing & Implementation
In this phase, the project architecture is tested in order to assess the
effectiveness of the techniques used to convert the .pcap data files into a
format that WEKA recognizes. Implementation starts when the .pcap data file has
been converted successfully and the hardware and software requirements have all
been gathered. All software installation and setup is completed in this phase;
see Section 4.2 in Chapter 4. The collected data are then imported into WEKA and
analyzed to obtain a result. The data collected from the company are limited to
the period during which we worked with it.
3.1.5 Verification
The fifth phase is verification. Here the results of the fourth phase are
verified in order to determine whether the data and the implemented design meet
the requirements of the project. If the testing phase reveals a failure, the
system is modified until it runs successfully. Conclusions are drawn based on the
correctness and completeness of development and operation in the testing phase.
3.1.6 Documentation
The last phase is documentation, in which all the information and results related
to the project are documented in a final report, including corrections and
amendments to the report before submission.
3.2 Project Resources
The project requires the following hardware and software. Table 3.1 shows the
hardware specifications and Table 3.2 shows the software specifications. These
are the minimum requirements needed to carry out the project.
3.2.1 Hardware Specifications
Table 3.1: Hardware Requirement
No. | Device | Quantity | Specifications
1   | Laptop | 1        | ASUS brand; Processor: Intel Core i3; RAM: 6.00 GB; OS: Microsoft Windows 7
3.2.2 Software Specifications
Table 3.2: Software Requirement
No. | Software  | Descriptions
1   | WEKA      | Version: 3.7.10 (latest version); License/Price: Free; OS: Windows 7, 8, XP, Vista, 2000; Programming Language: Java; Size: 25.9 MB
2   | Wireshark | Version: 1.10.1 (64-bit); License/Price: Free; OS: Windows 7, Vista, XP; Networking software tool
3.3 Budget/Costing
The following is a review of the budget and costing of the hardware and software
requirements. Table 3.3 shows the estimated hardware budget and Table 3.4 shows
the estimated software budget.
3.3.1 Hardware Estimated Budget
Table 3.3: Project costing for hardware
No. | Equipment | Quantity | Price (RM) | Remark
1   | Laptop    |          | 1800       | Student's property
3.3.2 Software Estimated Budget
Table 3.4: Project costing for software
No. | Equipment | Quantity | Price (RM) | Remark
1   | WEKA      |          |            | Open source
2   | Wireshark |          |            | Open source
3.4 Work Breakdown Structure (WBS)
The following figure shows the WBS, which contains the levels of the work
breakdown structure and provides further definition and detail.
Figure 3.3: Work Breakdown Structure (WBS)
3.5 Project Timeline

The project timeline in Figure 3.4 shows the time taken to accomplish this
project. It shows every phase of project development and the schedule used to
make sure the project stays on track.
Figure 3.4: Gantt chart
CHAPTER IV: TESTING AND IMPLEMENTATION
This chapter explains the project testing and implementation stages. Section 4.1, the
testing stage, discusses the conversion of pcap files into the arff format. The testing
stage is divided into two subsections: Section 4.1.1 introduces the conversion of pcap
files into csv files, and Section 4.1.2 introduces the conversion of csv files into arff
files. Section 4.2 discusses ethical matters, and Section 4.3 discusses ways to analyze
the data.
4.1 Testing Stage
This section defines the testing method applied to the project architecture
topology. The project architecture is set up as shown in Figure 3.2 on page 22. This
stage is important to ensure that the pcap data files are converted systematically
into a format that WEKA recognizes.
4.1.1 Pcap To Csv Conversion
There is no direct conversion from the pcap format to the arff format. A csv file
serves as an intermediate file between the pcap and arff files. Wireshark is used to
convert pcap files into csv files.
Figure 4.1: Wireshark Export Packet Dissections
Figure 4.2: Wireshark Export File
Open the pcap file in Wireshark and, on the File menu, choose Export Packet
Dissections. This menu item exports some or all of the packets in the capture file
to a file. In this case, choose CSV (Comma Separated Values packet summary) as
shown in Figure 4.1. Then save the file in csv format as shown in Figure 4.2.
4.1.2 CSV to Arff Conversion
This step converts the csv file to arff using WEKA. First, WEKA needs to be
installed. It can be downloaded from http://www.cs.waikato.ac.nz/ml/weka and is
free, open-source software. The WEKA window looks like Figure 4.3. The ArffViewer
option under the Tools menu is used to load or open the csv file in WEKA, as shown
in Figure 4.3.
Figure 4.3: Weka GUI Chooser
Figure 4.4: ARFF-Viewer windows
Open the csv file by changing Files of Type to CSV data files (*.csv) as
shown in Figure 4.4.
Figure 4.5: Weka Save Windows
Then choose Save as and, in the file name, replace ".csv" with ".arff" as in
Figure 4.5. The csv file has now been converted to an arff file.
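The same csv to arff step can also be scripted. The following is a minimal sketch, not WEKA's own converter: it assumes the csv has a header row, treats a column as numeric when every non-empty cell parses as a number, and declares every other column as nominal. The function name csv_to_arff and the two sample rows are hypothetical and only illustrate the seven-column packet summary used in this project.

```python
import csv
import io


def _is_number(text):
    """Return True when the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def _quote(value):
    """Quote a value for ARFF, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def csv_to_arff(csv_text, relation):
    """Convert CSV text (header row first) into ARFF text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = ["@relation " + _quote(relation), ""]
    numeric = []
    for col, name in enumerate(header):
        values = [row[col] for row in data]
        # A column is numeric when every non-empty cell parses as a number.
        is_num = all(_is_number(v) for v in values if v != "")
        numeric.append(is_num)
        if is_num:
            lines.append("@attribute " + _quote(name) + " numeric")
        else:
            # Nominal attribute: list the distinct values in first-seen order.
            distinct = list(dict.fromkeys(v for v in values if v != ""))
            labels = ",".join(_quote(v) for v in distinct)
            lines.append("@attribute " + _quote(name) + " {" + labels + "}")
    lines += ["", "@data"]
    for row in data:
        cells = []
        for col, value in enumerate(row):
            if value == "":
                cells.append("?")  # ARFF marker for a missing value
            elif numeric[col]:
                cells.append(value)
            else:
                cells.append(_quote(value))
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


# Hypothetical two-packet sample in the Wireshark packet summary layout.
sample = (
    '"No.","Time","Source","Destination","Protocol","Length","Info"\n'
    '"1","0.000000","10.2.10.2","10.4.88.88","ICMP","74","Echo (ping) request"\n'
    '"2","1.000000","10.2.99.99","10.2.10.2","ICMP","70","Destination unreachable"\n'
)
print(csv_to_arff(sample, "icmp"))
```

Declaring Source, Destination, Protocol, and Info as nominal is a design choice of this sketch; a decision tree learner such as J48 needs a nominal class attribute.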
4.2 Ethical Matters
The ethical matters pertain to the data that we gathered from third-party
companies. We mention them here to protect the companies and ourselves from
future legal action if the data leaks. The first company that we approached was a
security company, through an employee who was one of our speakers during the
UniKL Security Talk day. However, the company was unable to release the data
because of its sensitivity. The official letter sent to the company is in Appendix X.
The second attempt was through the Malaysian Computer Emergency Response
Team (MyCERT), CyberSecurity Malaysia. After several phone calls over a few
weeks, we got a response from the officer in charge of our request. We then sent
a formal letter (refer to Appendix X) in order to conduct an interview with the
officer. We also asked whether the organization could supply data related to our
project. Unfortunately, it did not keep the type of data that relates to network
attacks. Instead, it provides advisories on what to do when an attack happens.
The third attempt was to set up an interview with Vigilnet Company, which
provides VoIP analysis. The person in charge was out of station for a few weeks,
though the company agreed to supply the data. In the end the company supplied
us with VoIP data; however, the data were clean, with no trace of network or
individual attacks. Nevertheless, we still use this data in one of the analyses.
4.3 Analysis Stage
To understand and improve forensic analysis processes, in this stage an analysis
experiment was conducted on the collected VoIP attack data. This stage makes it
possible to mark or discover the source of security attacks or other problem
incidents.
4.3.1 Analyze Using WEKA
This stage focused on some common types of DoS attack, namely ICMP Echo
flood, UDP flood, and TCP SYN flood, and on data from reliable sources, using
WEKA Explorer preprocessing, classification, clustering, and attribute selection.
4.3.1.1 ICMP Echo Flood
4.3.1.1.1 Preprocessing
The file was loaded into WEKA in the Preprocess window, as shown in Figure 4.8,
by clicking the Open file button and choosing the .arff file from the local file
system.
Figure 4.8: Weka Open File

Once the data is loaded, WEKA recognizes the attributes, which are shown in the
Attributes window.
The left panel of the Preprocess window shows the list of recognized attributes:
No.: a number that identifies the order of the attribute as it appears in the data file.
Selection tick boxes: allow attributes to be selected for the working relation.
Name: the name of the attribute as it was declared in the data file.
During the scan of the data, WEKA computes some basic statistics on each
attribute. The following statistics are shown in the Selected attribute box on the
right panel of the Preprocess window:
Name: the name of the attribute.
Type: most commonly Nominal or Numeric.
Missing: the number (percentage) of instances in the data for which this attribute is unspecified.
Distinct: the number of different values that the data contains for this attribute.
Unique: the number (percentage) of instances in the data having a value for this attribute that no other instances have.
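To make the three counting statistics concrete, the sketch below computes Missing, Distinct, and Unique for one attribute column. It follows the definitions in the list above rather than WEKA's source code; the function name attribute_stats, the "?" missing marker, and the sample columns are assumptions for illustration.

```python
from collections import Counter


def attribute_stats(values, missing_marker="?"):
    """Summarise one attribute column using the definitions listed above."""
    total = len(values)
    present = [v for v in values if v != missing_marker]
    counts = Counter(present)
    missing = total - len(present)
    # A value is unique when exactly one instance carries it.
    unique = sum(1 for c in counts.values() if c == 1)
    return {
        "missing": missing,
        "missing_pct": round(100.0 * missing / total) if total else 0,
        "distinct": len(counts),
        "unique": unique,
        "unique_pct": round(100.0 * unique / total) if total else 0,
    }


# Hypothetical columns from a six-packet capture.
packet_numbers = [1, 2, 3, 4, 5, 6]
protocols = ["ICMP"] * 6

print(attribute_stats(packet_numbers))
print(attribute_stats(protocols))
```

A packet-number column gives Distinct = 6 and Unique = 6 because every value occurs once, while a column holding the same protocol in every row gives Distinct = 1 and Unique = 0.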
Figure 4.9: Weka Selected Attribute Box

No. is numeric. Therefore, the following frequency statistics are shown for this
attribute in the Selected attribute box:
Missing: 0 means that the attribute is specified for all instances (no missing values).
Distinct: 6 means that No. takes six different values in the data.
Unique: 6 means that each of these six values occurs in only one instance, so no other instance has the same value of No.
Figure 4.10: Matched Attribute
Time is a numeric attribute. The statistics describing the distribution of its values
are Minimum, Maximum, Mean, and Standard Deviation. Minimum = 1 is the lowest
time and Maximum = 2.075 is the longest time. Comparing the result with the
attribute table in destunreachble.csv, the numbers in WEKA match the numbers in
the table. Figure 4.11 shows the visualization of all attributes.
Figure 4.11: Attributes Visualization
4.3.1.1.2 Classification
Classifiers in WEKA are models for predicting nominal or numeric quantities.
Figure 4.12: Classify Tab Windows
Figure 4.13: Weka J48 Algorithm Tree
In Figure 4.13, J48, WEKA's implementation of the C4.5 decision tree learner, is
used to analyze the data sample. The C4.5 algorithm was chosen because it can
handle numeric attributes.
Figure 4.14: Classifier Test Option
In this data sample, the classifier is trained on 66% of the data and evaluated on
how well it predicts the remainder. The Percentage split radio button was checked
and kept at its default of 66%. A percentage split evaluates the classifier on how
well it predicts a certain percentage of the data, which is held out for testing. The
amount of data held out depends on the value entered in the % field. When the
options have been specified, the learning process is started by clicking the
Start button.
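The effect of the percentage split on a small file can be sketched as follows. This is an illustration under assumptions, not WEKA's implementation: the training size is taken as the instance count times the percentage, rounded half up, and the shuffle uses Python's own random generator, so the actual instances that land in each part will differ from WEKA's.

```python
import math
import random


def percentage_split(instances, percent=66.0, seed=1):
    """Shuffle the instances and split them into (train, test) lists."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    # Assumption: the training size is rounded half up.
    train_size = math.floor(len(shuffled) * percent / 100.0 + 0.5)
    return shuffled[:train_size], shuffled[train_size:]


# Hypothetical six-instance data set, identified by packet number.
train, test = percentage_split(range(1, 7))
print(len(train), "training instances,", len(test), "test instances")
```

With six instances and a 66% split, four instances are used for training and two are held out, which is why evaluation figures on such a small capture rest on very few test cases.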
4.3.1.1.3 Clustering
Clustering in WEKA finds groups of similar instances in a dataset.
Figure 4.15: Cluster Tab Windows
Figure 4.16: Weka Gui Generic Object Editor Window

Once the cluster scheme SimpleKMeans is selected, the
weka.gui.GenericObjectEditor window comes up by right-clicking on the algorithm,
as shown in Figure 4.16. The value in the numClusters box was set to 7 because
there are seven clusters in the .arff file.
Figure 4.17: Cluster Output
When training is complete, the Cluster output area on the right panel of the
Cluster window is filled with text describing the results of training and testing.
A new entry appears in the Result list box on the left.
4.3.1.1.4 Attribute selection
Attribute selection searches through possible combinations of attributes in the
data and finds which subset of attributes works best for prediction.
Figure 4.18: Select Attribute Tab Windows
In Figure 4.18, the CfsSubsetEval evaluator and the BestFirst search method were
set up to search through combinations of attributes in the data and find which
subset of attributes works best for prediction. When the attribute selection
process is finished, the results are shown on the right part of the window, as in
Figure 4.19.
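As a hedged illustration of how a correlation-based subset evaluator can score a candidate subset, the sketch below implements the commonly cited CFS merit heuristic: merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff), where k is the number of attributes in the subset, r_cf is the mean attribute-to-class correlation, and r_ff is the mean attribute-to-attribute correlation. We assume this is the idea behind CfsSubsetEval; the function name and the correlation values are hypothetical.

```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Score a subset from its mean class and inter-attribute correlations."""
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    mean_cf = sum(feature_class_corr) / k
    if feature_feature_corr:
        mean_ff = sum(feature_feature_corr) / len(feature_feature_corr)
    else:
        mean_ff = 0.0  # a single attribute has no pairwise correlation
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)


# Hypothetical correlations: two attributes, each 0.6 correlated with the class.
print(round(cfs_merit([0.6, 0.6], [0.0]), 4))  # uncorrelated with each other
print(round(cfs_merit([0.6, 0.6], [1.0]), 4))  # fully redundant pair
```

The redundant pair scores no better than one of its attributes alone, which is the behavior that lets the search prefer small subsets of attributes that are predictive but not duplicates of each other.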
Figure 4.19: Attribute Selection Output
The implementation for the other data sets, namely UDP Flood, TCP SYN
Flood, and the data from reliable sources, is not shown because it follows the
same steps as the ICMP Flood data. The results for each data set are analyzed in
the next chapter, Chapter V: Result and Analysis.
CHAPTER V: RESULT AND ANALYSIS
This chapter discusses the results of the experiments conducted as described in
Chapter 4. The results are separated into sections according to the data set.
Section 5.1 discusses the data obtained from reliable sources, Section 5.2 discusses
the ICMP Flood attack data, and Section 5.3 discusses the TCP SYN Flood attack data.
5.1 Reliable Data
The protocols involved in the pcap file can be viewed in the protocol classifier
tree. SIP, RTCP, RTP, and HTTP were among the protocols involved, as shown in
the run information below:
=== Run information ===
Scheme:       weka.classifiers.trees.J48 -C 0.25 -M 2
Relation:     reliabledata
Instances:    4447
Attributes:   7
              No.
              Time
              Source
              Destination
              Protocol
              Length
              Info
Test mode:    split 66.0% train, remainder test

=== Classifier model (full training set) ===
J48 pruned tree
------------------
Time <= 201.325642
| Length <= 78: TCP (911.0)
| Length > 78
| | Length <= 1349
| | | Time <= 155.870121
| | | | Time <= 0.000353: SIP (2.0)
| | | | Time > 0.000353: HTTP (185.0/5.0)
| | | Time > 155.870121
| | | | Length <= 783: SIP (11.0)
| | | | Length > 783: SIP/SDP (3.0)
| | Length > 1349: TCP (195.0/1.0)
Time > 201.325642
| Length <= 202
| | Length <= 82: RTP EVENT (126.0/1.0)
| | Length > 82
| | | No. <= 4444: RTCP (9.0)
| | | No. > 4444: ICMP (2.0)
| Length > 202: RTP (3003.0/15.0)

Number of Leaves: 10
Size of the tree: 19
Time taken to build model: 0.06 seconds
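One way to read the pruned tree is as a chain of threshold tests. The sketch below transcribes the tree printed above into a Python function so that the path taken by a single packet can be followed; the function name classify_packet and the example packets are hypothetical, and the thresholds are copied from the listing.

```python
def classify_packet(no, time, length):
    """Follow the J48 pruned tree above and return the predicted protocol."""
    if time <= 201.325642:
        if length <= 78:
            return "TCP"
        if length <= 1349:
            if time <= 155.870121:
                return "SIP" if time <= 0.000353 else "HTTP"
            return "SIP" if length <= 783 else "SIP/SDP"
        return "TCP"
    if length <= 202:
        if length <= 82:
            return "RTP EVENT"
        return "RTCP" if no <= 4444 else "ICMP"
    return "RTP"


# Hypothetical packets: (No., Time, Length)
for packet in [(1, 0.0, 60), (10, 100.0, 500), (4000, 250.0, 214), (4446, 300.0, 100)]:
    print(packet, "->", classify_packet(*packet))
```

The function has ten return paths, one per leaf of the tree, and uses only the No., Time, and Length attributes.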
SIP is a signaling protocol used for controlling multimedia communication
sessions, such as voice or video calls over IP. The protocol can be used for
creating, modifying, and terminating two-party or multiparty sessions consisting
of one or several media streams. In this capture file, SIP is used to create and
tear down VoIP sessions.
RTP defines a standardized packet format for delivering audio and video
over the Internet. RTP is usually used in conjunction with RTCP. When they are
used together, RTP usually originates and is received on even port numbers,
whereas RTCP uses the next higher odd port number. In this capture file, RTP is
used as the media protocol to transport voice.
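The even/odd port convention can be written as a one-line rule. The sketch below is only an illustration of that convention; the function name rtcp_port_for and the port number are hypothetical.

```python
def rtcp_port_for(rtp_port):
    """Return the RTCP port conventionally paired with an even RTP port."""
    if rtp_port % 2 != 0:
        raise ValueError("RTP is conventionally carried on an even port")
    return rtp_port + 1


print(rtcp_port_for(16384))
```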
RTCP partners with RTP in the delivery and packaging of multimedia
data, but does not transport any media streams itself. RTCP itself does not
provide any flow encryption or authentication methods.
HTTP is a request-response protocol standard for client-server
computing. In this capture file, HTTP is used to communicate with the GUI
frontend of the SIP PBX.
=== Run information ===
Scheme:
Relation:     reliabledata
Instances:    4447
Attributes:   7
              No.
              Time
              Source
              Destination
              Protocol
              Length
              Info
Test mode:

=== Classifier model (full training set) ===
J48 pruned tree
------------------
Protocol = SIP
| Source = 172.25.105.43: Request: OPTIONS sip:100@172.25.105.40 | (1.0)
| Source = 172.25.105.40
| | Length <= 574: Status: 200 OK | (4.0/2.0)
| | Length > 574
| | | No. <= 1298: Status: 401 Unauthorized (0 bindings) | (2.0/1.0)
| | | No. > 1298: Status: 401 Unauthorized | (2.0)
| Source = 172.25.105.3
| | No. <= 1302
| | | No. <= 1297: Request: REGISTER sip:172.25.105.40 (fetch bindings) | (2.0)
| | | No. > 1297: Request: SUBSCRIBE sip:555@172.25.105.40 | (2.0)
| | No. > 1302: Request: ACK sip:1000@172.25.105.40 | (3.0/1.0)
| | No. > 1: User-Agent: Asterisk PBX 1.6.0.10 | FONCORE
At the beginning of the attack, the attacker 172.25.105.43 sent a SIP
OPTIONS request for extension 100 at 172.25.105.40. Fortunately for the attacker,
172.25.105.40 responded to the request with a 200 OK response. The information
that is useful for the attacker is the User-Agent message header field of the
response. Given this information, the attacker now knows that he/she is facing an
Asterisk PBX and a FONCORE Tribox family distribution. With these clues in hand,
the attacker tried to connect to the box with HTTP.
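The fingerprinting step described above comes down to reading one header out of the SIP response. The sketch below shows this with a hand-written response; the message text is hypothetical and only shaped like the 200 OK described above, and the helper name sip_header is an assumption. Header folding and compact header forms are not handled.

```python
def sip_header(message, name):
    """Return the value of the first header called `name`, or None."""
    head = message.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the start line
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == name.lower():
            return value.strip()
    return None


# Hypothetical 200 OK response to the OPTIONS request.
response = (
    "SIP/2.0 200 OK\r\n"
    "Via: SIP/2.0/UDP 172.25.105.43:5060\r\n"
    "User-Agent: Asterisk PBX 1.6.0.10\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
print(sip_header(response, "User-Agent"))
```

An investigator can apply the same lookup to captured responses to see what product information a server disclosed to a scanning host.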
5.2 ICMP Echo Flood
The Internet Control Message Protocol (ICMP) enables users to send an echo
packet to a remote host to check whether it is alive. In an ICMP Echo flood, these
packets request replies from the victim, and this results in saturation of the
bandwidth of the victim's network connection.
=== Run information ===
Scheme:
Relation:     icmp
Instances:
Attributes:   7
              No.
              Time
              Source
              Destination
              Protocol
              Length
              Info
Test mode:

=== Classifier model (full training set) ===
J48 pruned tree
------------------
Source = 10.2.10.2: Echo (ping) request id=0x0200, seq=9472/37, ttl=32 [ETHERNET FRAME CHECK SEQUENCE INCORRECT] (3.0/2.0)
Source = 10.2.99.99: Destination unreachable (Host unreachable) [ETHERNET FRAME CHECK SEQUENCE INCORRECT] (3.0)

Number of Leaves: 2
Size of the tree:
Time taken to build model: 0.01 seconds
=== Evaluation on test split ===
Time taken to test model on training split: 0 seconds

=== Summary ===
Correctly Classified Instances                      %
Incorrectly Classified Instances      100           %
Kappa statistic
Mean absolute error                   0.5
Root mean squared error               0.6124
Relative absolute error               133.3333 %
Root relative squared error           141.4214 %
Coverage of cases (0.95 level)                      %
Mean rel. region size (0.95 level)    50            %
Total Number of Instances
=== Detailed Accuracy By Class ===
               TP Rate  FP Rate  Precision  Recall  F-Measure  MCC    ROC Area  PRC Area  Class
               0.000    0.000    0.000      0.000   0.000      0.000                      Echo (ping) request id=0x0200, seq=9472/37, ttl=32 [ETHERNET FRAME CHECK SEQUENCE INCORRECT]
               0.000    0.000    0.000      0.000   0.000      0.000            1.000     Destination unreachable (Host unreachable) [ETHERNET FRAME CHECK SEQUENCE INCORRECT]
               0.000    1.000    0.000      0.000   0.000      0.000                      Echo ... SEQUENCE INCORRECT]
               0.000    0.000    0.000      0.000   0.000      0.000                      Echo ... SEQUENCE INCORRECT]
Weighted Avg.  0.000    0.000    0.000      0.000   0.000      0.000  0.000     1.000
=== Confusion Matrix ===
a b c d <-- classified as
0 0 0 0 | a = Echo (ping) request id=0x0200, seq=9472/37, ttl=32 [ETHERNET FRAME CHECK SEQUENCE INCORRECT]
0 0 2 0 | b = Destination unreachable (Host unreachable) [ETHERNET FRAME CHECK SEQUENCE INCORRECT]
0 0 0 0 | c = Echo (ping) request id=0x0200, seq=9728/38, ttl=32 [ETHERNET
0 0 0 0 | d = Echo (ping) request id=0x0200, seq=9984/39, ttl=32 [ETHERNET
In the information above, the packet in the capture file is a standard Echo (ping)
request packet from 10.2.10.2 to 10.4.88.88. The listing shows ping queries sent by
the source that did not make it to the destination. In this situation, based on the
ICMP data from Wireshark, ICMP only identifies the packet type and does not give
much other useful information.
5.3 TCP SYN Flood
SYN flooding attacks exploit TCP's three-way handshake mechanism and its
limitation in maintaining half-open connections. When a server receives a SYN
request, it returns a SYN/ACK packet to the client. Until the SYN/ACK packet is
acknowledged by the client, the connection remains in a half-open state for a
period of up to the TCP connection timeout.
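The limitation on half-open connections can be illustrated with a toy model. The sketch below is a simplified simulation under stated assumptions, not a model of any real TCP stack: the server holds a fixed number of half-open entries, each entry expires after a fixed timeout, clients that complete the handshake free their slot immediately, and a SYN that arrives while the backlog is full is dropped. The function name, backlog size, timeout, and arrival times are all hypothetical.

```python
from collections import deque


def simulate_backlog(arrivals, backlog_size, timeout):
    """Return the sources whose SYN was dropped because the backlog was full.

    arrivals: list of (time, source, completes_handshake) tuples.
    """
    half_open = deque()  # (expiry_time, source), oldest entry first
    dropped = []
    for now, source, completes in sorted(arrivals):
        # Remove half-open entries whose timeout has passed.
        while half_open and half_open[0][0] <= now:
            half_open.popleft()
        if len(half_open) >= backlog_size:
            dropped.append(source)
            continue
        if not completes:
            # The SYN/ACK is never acknowledged, so the slot stays occupied.
            half_open.append((now + timeout, source))
    return dropped


arrivals = [
    (0, "attacker", False),
    (1, "attacker", False),
    (2, "attacker", False),
    (4, "client", True),   # arrives while the backlog is full
    (31, "client", True),  # arrives after two attack entries have expired
]
print(simulate_backlog(arrivals, backlog_size=3, timeout=30))
```

In this run the legitimate client is refused at time 4 and only gets through once the attacker's entries start to time out, which is the denial of service effect the paragraph above describes.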
=== Run information ===
Scheme:
Relation:     tcp
Instances:
Attributes:   7
              No.
              Time
              Source
              Destination
              Protocol
              Length
              Info
Test mode:

=== Classifier model (full training set) ===
J48 pruned tree
------------------
Source = 192.168.0.1: boinc-client > neod2 [ACK] Seq=1 Ack=1 Win=8760 Len=0 [ETHERNET FRAME CHECK SEQUENCE INCORRECT] (3.0/2.0)
Source = 192.168.0.2: [TCP Retransmission] neod2 > boinc-client [PSH, ACK] Seq=5841 Ack=1 Win=8760 Len=648 [ETHERNET FRAME CHECK SEQUENCE INCORRECT] (6.0/1.0)

Number of Leaves:
Size of the tree:
Time taken to build model: 0 seconds
=== Evaluation on test split ===
Time taken to test model on training split: 0 seconds

=== Summary ===
Correctly Classified Instances        33.3333 %
Incorrectly Classified Instances      66.6667 %
Kappa statistic                       0.1429
Mean absolute error                   0.2667
Root mean squared error               0.483
Relative absolute error               84.6154 %
Root relative squared error           116.1347 %
Coverage of cases (0.95 level)        33.3333 %
Mean rel. region size (0.95 level)    26.6667 %
Total Number of Instances
=== Detailed Accuracy By Class ===
               TP Rate  FP Rate  Precision  Recall  F-Measure  MCC    ROC Area  PRC Area  Class
               0.000    0.000    0.000      0.000   0.000      0.000  0.500     0.333     boinc-client > neod2 [ACK] Seq=1 Ack=1 Win=8760 Len=0 [ETHERNET
               0.000    0.000    0.000      0.000   0.000      0.000  0.500     0.333     neod2 > boinc-client [PSH, ACK] Seq=5841 Ack=1 Win=8760 Len=648 [ETHERNET FRAME CHECK SEQUENCE INCORRECT]
               0.000    0.333    0.000      0.000   0.000      0.000
               0.000    0.000    0.000      0.000   0.000      0.000
               1.000    0.500    0.500      1.000   0.667      0.500  0.750     0.500     [TCP Retransmission] neod2 > boinc-client [PSH, ACK] Seq=5841 Ack=1 Win=8760 Len=648 [ETHERNET FRAME CHECK SEQUENCE INCORRECT]
Weighted Avg.  0.333    0.167    0.167      0.333   0.222      0.167  0.583     0.389
=== Confusion Matrix ===
a b c d e <-- classified as
0 0 1 0 0 | a = boinc-client > neod2 [ACK] Seq=1 Ack=1 Win=8760 Len=0
0 0 0 0 1 | b = neod2 > boinc-client [PSH, ACK] Seq=5841 Ack=1 Win=8760 Len=648 [ETHERNET FRAME CHECK SEQUENCE INCORRECT]
0 0 0 0 0 | c = boinc-client > neod2 [ACK] Seq=1 Ack=2921 Win=8760 Len=0
0 0 0 0 0 | d = boinc-client > neod2 [ACK] Seq=1 Ack=5841 Win=8760 Len=0
0 0 0 0 1 | e = [TCP Retransmission] neod2 > boinc-client [PSH, ACK] Seq=5841 Ack=1 Win=8760 Len=648 [ETHERNET FRAME CHECK SEQUENCE INCORRECT]
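The accuracy and Kappa figures in the Summary can be checked directly against the confusion matrix. The sketch below uses the standard definition of Cohen's kappa (observed agreement compared with the agreement expected from the row and column totals); the function name is hypothetical and the matrix is copied from the listing above.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1:
        return observed, 0.0  # degenerate case: chance agreement is already perfect
    return observed, (observed - expected) / (1 - expected)


# Rows are actual classes a..e, columns are predicted classes a..e.
confusion = [
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
]
accuracy, kappa = accuracy_and_kappa(confusion)
print(f"accuracy={accuracy:.4%} kappa={kappa:.4f}")
```

Only one of the three test instances lies on the diagonal, which gives the 33.3333 % accuracy and the Kappa of 0.1429 reported in the Summary. With three test instances, these figures say little about how the classifier would behave on a larger capture.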
From the information above, the file begins with standard TCP ACK
packets sent between 192.168.0.1 and 192.168.0.2. When TCP sends a packet to
a destination and does not get a reply, it waits a specified amount of time then
retransmits the original packet. If a response is still not received, the source
(transmitting) computer doubles the amount of time it waits for a response before
sending another retransmission. Once the retransmission attempts have failed, the
connection has completely failed and the data in the transmission is lost.
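The doubling of the waiting time can be sketched as a simple schedule. The initial timeout and the number of retries below are hypothetical values chosen for illustration; real TCP stacks derive the initial timeout from measured round-trip times and apply their own limits.

```python
def retransmission_schedule(initial_timeout, max_retries):
    """Return the wait before each retransmission, doubling every time."""
    return [initial_timeout * (2 ** attempt) for attempt in range(max_retries)]


waits = retransmission_schedule(1.0, 5)
print(waits, "total wait:", sum(waits), "seconds")
```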
CHAPTER VI: CONCLUSION
This chapter contains the conclusion and some recommendations and suggestions
for future improvement and enhancement of the project, drawn after testing and
analysis of the results. The essence of the study is to analyze VoIP traffic traces
using WEKA, a data mining tool. We believe that the objective is achieved.
6.1 Project Accomplishment
In the early days of VoIP, there was no big concern about security issues
related to its use. People were mostly concerned with its cost, functionality and
reliability. Now that VoIP is gaining wide acceptance and becoming one of the
mainstream communication technologies, security has become a major issue. To
address this problem, network forensics is applied to the monitoring and analysis
of computer network traffic for the purposes of information gathering, legal
evidence, or intrusion detection.
This project started with converting pcap (Packet Capture) files into the
Attribute-Relation File Format (arff), the format that WEKA recognizes, and with
learning how to analyze the data using WEKA Explorer preprocessing,
classification, clustering, and attribute selection, before obtaining data from a
company that provides VoIP analysis.
We believe that the objectives set for this project are met. The first
objective is to analyze the pattern of attack data from the captured data. In this
case, the data indicates the condition of the network events.
The second objective is also achieved. It is to convert the pcap data to an arff
data file so that the input is recognized by the WEKA data mining tool. It is
important to state that the first objective depends on this second objective.
We had some difficulty in getting the right data for our analysis, since many
companies are bound by legal constraints that prevent them from sharing their
data with us. However, we still obtained simulated data from another related
project conducted by another student at UniKL. With real attack data, our research
would have produced more interesting findings.
6.2 Future Recommendation
For future work, there are a few aspects that can be further enhanced by
expanding a few features and criteria to make the analysis firmer and stronger.
Suggestion for Improvement / Current Project Situation

Suggestion: Improve the data set or create a traffic simulation program to collect the required data.
Current situation: As the data in this project is not related to VoIP attacks due to certain problems, purely collected data that relates to VoIP attacks can be analyzed in future enhancements.

Suggestion: Include more types of attacks that are related to VoIP. Different types of VoIP attacks such as Vishing (VoIP Phishing), Eavesdropping, and Identity and service theft can also be used in order to find different results.
Current situation: Only looking for SPIT and MiTM attacks.

Suggestion: Expand the analytical knowledge by using WEKA's Simple CLI interface. Scripts can be written to allow the data processing to be executed automatically.
Current situation: Analysis using the GUI interface is user friendly, but would not speed up the process.
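As a rough sketch of the scripting suggestion, the helper functions below build the command lines that would run WEKA's CSV loader and the J48 classifier from outside the GUI. The helper names, the weka.jar path, and the file names are hypothetical; the class names and the -t, -c, -split-percentage, -C and -M options are the ones we understand WEKA's command-line interface to accept, and they should be checked against the installed version before use.

```python
def csv_to_arff_command(weka_jar, csv_path):
    """Command that prints the ARFF version of a csv file to standard output."""
    return ["java", "-cp", weka_jar, "weka.core.converters.CSVLoader", csv_path]


def j48_command(weka_jar, arff_path, split_percentage=66,
                confidence=0.25, min_instances=2, class_index=None):
    """Command that trains and evaluates J48 with a percentage split."""
    command = [
        "java", "-cp", weka_jar, "weka.classifiers.trees.J48",
        "-t", arff_path,
        "-split-percentage", str(split_percentage),
        "-C", str(confidence),
        "-M", str(min_instances),
    ]
    if class_index is not None:
        # 1-based index of the class attribute (assumed default: last attribute).
        command += ["-c", str(class_index)]
    return command


# Hypothetical file names; each list could be passed to subprocess.run(...).
print(" ".join(csv_to_arff_command("weka.jar", "icmp.csv")))
print(" ".join(j48_command("weka.jar", "icmp.arff", class_index=5)))
```

Keeping the commands as argument lists makes it straightforward to loop over several capture files, which is where scripting saves time compared with repeating the GUI steps by hand.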
In conclusion, we would like to highlight that VoIP security is one of the concerns
raised by the VoIP community. Although the problem is still under control, system
administrators are currently not equipped with the right tools to detect VoIP
attacks as early as possible. In most cases Wireshark or another network sniffer
is used to determine the condition of the network. We are trying to provide an
alternative tool to system administrators by providing pattern reports produced
by a data mining tool like WEKA.
REFERENCES
[1] A Brief History of VoIP Document One - The Past. Hallock, Joe. 2004.
[2] Secure VoIP Performance Measurement. Amna Saad. 2013.
[3] How Does VoIP Work? discusstech.org. [Online] [Cited: November 17, 2013.] http://discusstech.org/2011/05/how-does-voip-work/.
[4] Voice over IP: Forensic Computing Implications. Matthew Simon. 2006.
[5] The Difference Between VoIP and PSTN Systems. webopedia.com. [Online] [Cited: November 17, 2013.] http://www.webopedia.com/DidYouKnow/Internet/2008/VoIP_POTS_Difference_Between.asp.
[6] On the Feasibility of Launching the Man-In-The-Middle Attacks on VoIP from Remote Attackers. Ruishan Zhang, Xinyuan Wang, Ryan Farley, Xiaohui Yang, Xuxian Jiang. 2009.
[7] Man-in-the-Middle Attacks. schneier.com. [Online] July 15, 2008. [Cited: November 18, 2013.] http://www.schneier.com/blog/archives/2008/07/maninthemiddle 1.html.
[8] Security Threats In VoIP. voip.about.com. [Online] [Cited: November 18, 2013.] http://voip.about.com/od/security/a/SecuThreats.htm.
[9] Understanding Denial-of-Service Attacks. us-cert.gov. [Online] [Cited: November 18, 2013.] http://www.us-cert.gov/ncas/tips/ST04-015.
[10] SPIT: Spam Over Internet Telephony. asteriskblog.com. [Online] [Cited: November 19, 2013.] http://www.asteriskblog.com/spit-spam-over-internettelephony.
[11] Network Forensics 101: Finding the Needle in the Haystack. WildPackets white paper.
[12] Network Forensic. cyberforensics.in. [Online] [Cited: November 20, 2013.] http://www.cyberforensics.in/(A(cos8NMWQywEkAAAAODMwODM4YWMtNWFmZC00ZWNhLThkNDEtNTlhMWM3MGE5MzA5hkCziwldj9ts_CCtkjYQI68akds1))/Research/NetworkForensics.aspx?AspxAutoDetectCookieSupport=1.
[13] Network Forensics & Packet Capture Analysis. ipcopper.com. [Online] [Cited: November 20, 2013.] http://www.ipcopper.com/data_analysis.htm.
[14] Data Mining: What is Data Mining? anderson.ucla.edu. [Online] [Cited: November 20, 2013.] http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm.
[15] Data mining with WEKA, Part 1: Introduction and regression. [Online] [Cited: November 20, 2013.] http://www.ibm.com/developerworks/library/os-weka1/.
[16] Weka - Modified for Data Mining Course at WPI. [Online] [Cited: November 21, 2013.] http://davis.wpi.edu/~xmdv/weka/.
[17] Introduction to Weka - A Toolkit for Machine Learning.
[18] Skype Forensic in Android Devices. Mohammed I. Al-Saleh & Yahya A. Forihat. 2013.
[19] Network Forensics Models for Converged Architectures. Juan C. Pelaez & Eduardo B. Fernandez. 2010.
[20] Security patterns for Voice over IP Networks. Eduardo B. Fernandez, Juan C. Pelaez and Maria M. Larrondo-Petrie. 2007.
[21] Enhancing Forensic Investigation in Large Capacity Storage Devices using WEKA: A Data Mining Tool. Lanka, Shravya. 2011.
[22] The Rapid Application. Issam J Zeinoun Cambridge Technology Enterprises, Inc. 2005.
[23] Rapid Application Development. Core Partners Inc. s.l.: www.corepartners.com.
[24] Advantages of Rapid Application Development. buzzle.com. [Online] 2002013. [Cited: December 12, 2013.] http://www.buzzle.com/articles/advantages-of-rapid-applicationdevelopment.html