Proceedings of The Third International Conference on Data Mining, Internet Computing, and Big Data, Konya, Turkey 2016

Big Data Analysis with Query Optimization Results in Secure Cloud

Ahmad Shukri Mohd Noor1, Zahariman Farhan Mohd Yatim2, Amir Ngah3
School of Informatics and Applied Mathematics,
Universiti Malaysia Terengganu,
Terengganu, Malaysia, farhanzahari@gmail.com2
Cloud data storage is explored here as a potential solution for the
long-term storage of, and shared access to, data, focusing on data that
can be used widely by various agencies. Cloud computing offers a broad
selection of data preservation options. In contrast to providing
infrastructure within an organization, which usually requires a long
time and upfront capital investment, cloud computing infrastructure is
available on request and is highly scalable. However, the use of
commercial cloud computing services in particular raises issues of
governance, cost effectiveness, safety, data storage and high-capacity
access. The system studied requires data that can be accessed, and a
response returned, immediately when a data inquiry is carried out. The
State Welfare System is a system that gives added value to the
government and the people. It facilitates requests for assistance by
citizens, and at the same time the government can assess the extent of
the needs and current living standards of the people who need help. To
provide the best service to the people, this system is proposed to be
upgraded so that the safety and accuracy of the data can be improved.
The Big Data generated from this system will be used as a measure of
the success of the program and will help the government to improve the
people's living standards over time. This paper discusses some of the
cloud environment techniques that can be used to optimize the use of
data in an efficient and streamlined manner.

Keywords - Cloud computing; Cloud provider; data sharing; big data; data storage



The use of cloud computing increases the need for a policy that covers
the preparation, storage, data security, data transmission security,
application security and security of supply in relation to third
parties [1]. The use of cloud computing technology in the public or
private sector relates to the services on offer to consumers. Cloud
computing is essentially divided into three classes: SaaS (Software as
a Service), PaaS (Platform as a Service) and IaaS (Infrastructure as a
Service). Data security is a major factor to be considered for any data
stored in the cloud, so a safe storage location that meets the IaaS
specifications must be provided. Cloud storage providers use a
distributed file system as the technology base for cloud storage [3].
In the conventional approach used at present, the availability of data
is managed by individuals. However, in the cloud, transaction security
monitoring, evaluation and data service improvements are fully managed
by the service provider, so the owner or customer of the data cannot
control the loss of stored data. Beyond exposure to these hazards,
vulnerabilities in cloud computing environments can be reached or
manipulated in a variety of ways. Therefore, service providers must
ensure that there are controls in place at every level of the
infrastructure. Providing infrastructure to counter data leakage can
avoid exposure to the danger or threat of losing sensitive data,
finances, control and service provision [1][2]; otherwise, sensitive
data can fall into the hands of unscrupulous individuals. Cloud storage
is scalable and gives customers the benefit of large storage capacity.
In addition to these benefits, cloud computing lets users easily access
all their applications, documents and data from any geographical region
where the cloud service provider's network or the Internet can be
accessed. Cloud technology also makes it easier for group members in
different locations to collaborate [2][4].

Fig. 1: Cloud Computing Architecture Sample

A sample cloud computing architecture is shown in Figure 1. Cloud
services from service providers (SP) are fully utilized by both the
government and private sectors. Thus, data integrity and privacy are
major issues that need to be considered. Apart from the improvements
applied by cloud service providers to ensure data privacy, customers
should always be alert and carry out periodic tests to ensure that
stored data can be used in the event of data loss or disaster.
Sections 2 and 3 revisit the proposed methods for secure cloud storage.
Section 4 describes data optimization, and Section 5 presents the
proposed method to improve the centralized data results of the State
Welfare System.




This literature review covers some of the studies that have been
carried out in cloud computing. Seong Han Shin and Kazukuni Kobara
(2010) [4], in their study of cloud storage security, proposed a new
solution known as LR-AKE (Leakage-Resilient Authenticated Key
Exchange), which centers on an authentication and data management
system. This protocol provides a high level of data protection against
hacking attacks and includes encryption of data, which can be stored in
a distributed manner.
The proposed protocol uses a keyword supplied by the user that is
matched against two different servers. Users only need to enter their
keyword to access data, and the LR-AKE protocol matches the client's
request to allow access to the data stored in the cloud.

Fig. 2: Overall Procedure for Secure Cloud Storage

2.1 Overall Procedure

Generally, cloud storage is safe, easy and efficient when proper
procedures are followed in the implementation plan (see Figure 2). A
customer or user of the cloud can store, modify and access data through
encryption and decryption. With the authentication and data management
system, information or data stored in the cloud in particular will be
more resilient to attacks and threats from outside.
2.1.1 Rules
For a password, a cloud user usually chooses a simple combination, also
called a low-entropy password. The LR-AKE procedure combines the
password preferred by the user with high-entropy secret values. The
resulting secret values are used to run the LR-AKE protocol between the
client C and the servers (i.e., authentication server A and
intermediate key server B).
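The idea of strengthening a low-entropy password with a high-entropy secret can be sketched as follows. This is a minimal illustration only, not the LR-AKE construction of [4]: PBKDF2 is a stand-in chosen for the sketch, and the function names `new_client_secret`, `derive_auth_key` and `keys_match` are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_client_secret(num_bytes: int = 32) -> bytes:
    """Generate a high-entropy secret to be kept on the client device."""
    return secrets.token_bytes(num_bytes)


def derive_auth_key(password: str, client_secret: bytes,
                    iterations: int = 100_000) -> bytes:
    """Stretch a low-entropy password, salted with the high-entropy secret.

    PBKDF2-HMAC-SHA256 is an assumption made for this sketch; it is not
    the key derivation used by LR-AKE itself.
    """
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               client_secret, iterations)


def keys_match(a: bytes, b: bytes) -> bool:
    """Compare two keys in constant time to avoid timing leaks."""
    return hmac.compare_digest(a, b)
```

With this arrangement, guessing the simple password alone is not enough to reproduce the derived key, because the stored secret is also required.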

2.1.2 Authentication
After the authentication procedure completes successfully, client C
recovers the valuable data that has been distributed between the two
parties (e.g., client C and secondary server B). Upon verification of
the data, client C receives new encrypted data (i.e., the validated
secret data and the encrypted shares). The recovered data key can be
used to encrypt and decrypt bulk data, and the encrypted data is placed
in cloud storage.
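One simple way to distribute a data key between two parties, so that neither holds it alone, is a two-share XOR split. The sketch below assumes this scheme purely for illustration; the actual distribution mechanism of LR-AKE is not described in this paper, and the names `split_key` and `recover_key` are hypothetical.

```python
import secrets
from typing import Tuple


def split_key(key: bytes) -> Tuple[bytes, bytes]:
    """Split a key into a client share and a server share.

    The client share is uniformly random, so either share on its own
    reveals nothing about the key.
    """
    share_client = secrets.token_bytes(len(key))
    share_server = bytes(k ^ r for k, r in zip(key, share_client))
    return share_client, share_server


def recover_key(share_client: bytes, share_server: bytes) -> bytes:
    """Recombine both shares; intended to run after authentication."""
    if len(share_client) != len(share_server):
        raise ValueError("shares must have the same length")
    return bytes(a ^ b for a, b in zip(share_client, share_server))
```

The recovered key would then be passed to whatever cipher encrypts the bulk data before it is placed in cloud storage.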

2.2 Test Procedure

The test procedure shows how the leakage-resilient authentication and
data management system is used. The secure recommendation is to use the
LR-AKE interface (called LR-Password).

The main function of LR-Password is encryption for authentication of
the personal password. When the password is correct, the data key is
automatically cached in memory for a predetermined time period.
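Caching a key in memory for a predetermined period can be sketched as a small time-limited cache. The class name `TimedKeyCache` and the injectable `clock` parameter are assumptions made for this illustration; they are not part of the LR-Password interface.

```python
import time
from typing import Callable, Optional


class TimedKeyCache:
    """Hold one key in memory until a fixed time-to-live has elapsed."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry can be tested
        self._key: Optional[bytes] = None
        self._expires_at = 0.0

    def store(self, key: bytes) -> None:
        """Cache the key, e.g. right after a correct password is entered."""
        self._key = key
        self._expires_at = self._clock() + self._ttl

    def get(self) -> Optional[bytes]:
        """Return the key, or None if it was never stored or has expired."""
        if self._key is not None and self._clock() >= self._expires_at:
            self._key = None  # drop the expired key from memory
        return self._key
```

A monotonic clock is used by default so that changes to the system time do not extend or shorten the caching period.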


categorized according to the current policy).


Mahima Joshi and Yudhveer Singh Moudgil (2011) [5], in their security
study, suggest a cloud storage architecture for Cryptography Provision
Storage that takes three main components as its storage security focus
(see Fig. 3). These components, the data processor (DP), data verifier
(DV) and token generator (TG), are the benchmark for data requests by
customers.

Fig. 3: Cloud storage architecture for Cryptography Provision

3.1 Accessibility

Secure access to data uses customer identification that includes
several matching keyword pairs, making it difficult for a third party
to break the codes of communications between client and server, which
are vulnerable to threats.

Data processor (DP): processes the data before it is transferred to the
cloud.

Data verifier (DV): checks whether the data in the cloud has been
interfered with or remains secure.

Token generator (TG): generates a token that permits the user to access
the cloud storage to retrieve customer data segments, and implements
the access control policy by issuing a confirmation to the various
parties in the system (this validation will enable customers to dispose
of
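The roles of the three components (DP, DV and TG) can be sketched with keyed hashes. This is an illustration under stated assumptions, not the architecture of [5]: encryption in the data processor is omitted, HMAC-SHA256 is chosen for the sketch, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def new_key() -> bytes:
    """Generate a random 256-bit key for tagging or token signing."""
    return secrets.token_bytes(32)


def process_data(data: bytes, integrity_key: bytes) -> Tuple[bytes, bytes]:
    """DP role: prepare data for upload and attach an integrity tag.

    Encryption is left out of this sketch; only the tag is computed.
    """
    tag = hmac.new(integrity_key, data, hashlib.sha256).digest()
    return data, tag


def verify_data(data: bytes, tag: bytes, integrity_key: bytes) -> bool:
    """DV role: check that data fetched from the cloud is unmodified."""
    expected = hmac.new(integrity_key, data, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


def generate_token(user_id: str, segment_id: str, token_key: bytes) -> str:
    """TG role: issue a token binding one user to one data segment."""
    message = f"{user_id}:{segment_id}".encode("utf-8")
    return hmac.new(token_key, message, hashlib.sha256).hexdigest()


def token_is_valid(token: str, user_id: str, segment_id: str,
                   token_key: bytes) -> bool:
    """Check a presented token before releasing the data segment."""
    expected = generate_token(user_id, segment_id, token_key)
    return hmac.compare_digest(token, expected)
```

In this sketch a token is only valid for the user and segment it was issued for, which is one simple way to express an access control policy.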

The data obtained must be optimal. N. Samatha, K. Vijay Chandu and
P. Raja Sekhar Reddy (2012) explained that a reduction in size in open
applications, such as stack operations, allows data to be optimized
better. This is because the main cost factor is communication cost.
Data processing can be performed better by reducing the size of the
data, so that data access from multiple servers can be carried out more
quickly and accurately.
To obtain optimal data, this paper presents some proposed techniques
that give users quick and accurate access to data [2].

4.1 Query Processing

Query processing is a three-step process flow (Fig. 4) that transforms
a high-level query (in relational calculus/SQL) into an equivalent and
more efficient lower-level query (in relational algebra).
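To illustrate what "equivalent and more efficient" means, the sketch below evaluates one query with two relational-algebra plans, using the EMP and ASG relations and the condition DUR > 37 from the example given later in this section. The column set, the sample rows and the Cartesian-product form of the first plan are assumptions made for the illustration.

```python
from typing import List, NamedTuple, Set, Tuple


class Emp(NamedTuple):
    eno: str
    ename: str


class Asg(NamedTuple):
    eno: str
    dur: int


def plan_cartesian(emps: List[Emp], asgs: List[Asg]) -> Tuple[Set[str], int]:
    """Plan 1: Cartesian product, then selection, then projection on ENAME.

    Returns the result and the size of the largest intermediate relation.
    """
    product = [(e, a) for e in emps for a in asgs]
    selected = [(e, a) for e, a in product
                if e.eno == a.eno and a.dur > 37]
    return {e.ename for e, _ in selected}, len(product)


def plan_select_then_join(emps: List[Emp],
                          asgs: List[Asg]) -> Tuple[Set[str], int]:
    """Plan 2: select on ASG first, then join on ENO, then project ENAME.

    Assumes ENO is unique in EMP, so a dictionary can act as the index.
    """
    long_asgs = [a for a in asgs if a.dur > 37]
    by_eno = {e.eno: e for e in emps}
    joined = [(by_eno[a.eno], a) for a in long_asgs if a.eno in by_eno]
    largest = max(len(long_asgs), len(joined))
    return {e.ename for e, _ in joined}, largest
```

Both plans return the same set of names, but the second never builds the full product of the two relations, so its intermediate result grows with the number of qualifying assignments rather than with the product of both relation sizes.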

Fig. 4: Three-step process flow that transforms a high-level query

Parsing and translation: syntax checking and verification. The query
will translate into an

Optimization: produce an optimal (cost-efficient) plan for the queries
submitted by consumers.

Evaluation: the engine takes the optimal evaluation plan, executes it
and returns the query answers to the system.

A. Data Storage

Data storage can be implemented properly through cloud storage because
of its readiness. The concept makes it easy for users to pose complex
queries to the system without knowing the details of physical data
organization and processing technology. Query processing changes a
high-level user or application query into a low-level execution
strategy, and this transformation must achieve both accuracy and
efficiency. Achieving efficiency is the measure of success, but it is
very difficult. It is also one of the most important points in getting
correct results from any distributed system.
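The three steps of query processing named in this section (parsing and translation, optimization, evaluation) can be observed with an embedded SQL engine. The sketch below uses SQLite from the Python standard library as a stand-in for the cloud database, which is an assumption of this example; the helper name `run_query` is hypothetical.

```python
import sqlite3
from typing import List, Tuple


def run_query(conn: sqlite3.Connection, sql: str) -> Tuple[List, List]:
    """Return (plan_rows, result_rows) for one SQL statement.

    A statement with a syntax error fails during parsing and raises
    sqlite3.OperationalError before anything is executed.
    """
    # Parsing and optimization: SQLite parses the statement and reports
    # the access plan that its optimizer has chosen.
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # Evaluation: the engine executes the plan and returns the answers.
    rows = conn.execute(sql).fetchall()
    return plan, rows
```

The plan rows describe, for example, whether a table is scanned or searched through an index, which is the kind of low-level strategy the user never has to write by hand.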

B. Cloud Storage



It transforms a high-level query (in relational calculus/SQL) on a
stored database (i.e., a set of global relations) into an equivalent
and efficient lower-level query (in relational algebra) on relation
fragments. Query processing in a cloud storage system is more complex,
with additional requirements such as:

Fragmentation of relations
Parallel execution

ii. Lower Level Query

Example: Transformation of an SQL query into an RA query. Relations:

Query: Find the names of employees who are managing a project.

iii. High Level Query

DUR > 37

Two possible transformations of the query are:

Expression 1: Π_ENAME
Expression 2: Π_ENAME (EMP ⋈_ENO (σ_DUR>37 (ASG)))

Expression 2 avoids the expensive and large intermediate Cartesian
product, and is therefore typically better.

C. Sensitive Data Storage

For the storage of sensitive data such as access permissions, IP
addresses, port numbers, all elements in the network, statistics and
graphs of performance data, packet loss, elements of lighting systems,
traffic, alarms and traps, Sanjay Tiwari, Chandresh Bakliwal and Chitra
Garg [7] suggest two different storage resources: a public cloud and a
private cloud. With the configuration in the public cloud, the private
cloud acts as a disaster recovery center (DRC).
The disaster recovery center provides various services for business
continuity through data recovery. Using this infrastructure is better
because it naturally gives more protection to the data and
configuration. Ahmad Shukri Mohd Noor and Mustafa Mat Deris [9] propose
an autonomous, self-configuring failure recovery model for fail-stop
failures, based on a distributed replication system, for data recovery.
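The role of a disaster recovery replica under fail-stop failures can be sketched with a toy in-memory store. This is not the recovery model of [9] or the configuration of [7]; the class `ReplicatedStore`, its synchronous replication and its method names are assumptions made only to illustrate failover and recovery.

```python
from typing import Dict


class ReplicatedStore:
    """Primary store with a disaster-recovery replica (fail-stop model)."""

    def __init__(self) -> None:
        self._primary: Dict[str, bytes] = {}
        self._replica: Dict[str, bytes] = {}
        self._primary_up = True

    def put(self, key: str, value: bytes) -> None:
        """Write to the primary when it is up, and always to the replica."""
        if self._primary_up:
            self._primary[key] = value
        self._replica[key] = value  # synchronous replication (assumed)

    def get(self, key: str) -> bytes:
        """Read from the primary, or from the replica after a failure."""
        source = self._primary if self._primary_up else self._replica
        return source[key]

    def fail_primary(self) -> None:
        """Simulate a fail-stop failure: the primary halts and loses state."""
        self._primary_up = False
        self._primary.clear()

    def recover_primary(self) -> None:
        """Rebuild the primary from the replica and resume normal service."""
        self._primary = dict(self._replica)
        self._primary_up = True
```

Reads continue to succeed while the primary is down, and writes accepted during the outage are restored to the primary on recovery.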



5.1 State Welfare System

The State Welfare System implemented by the Terengganu State Government
is a system used by the people and the government. It began in 2005,
initially as 5 different application systems, each using a separate
application server and database, which required more space and
electricity. The application systems are the Eid Mubarak Fund, School
Clothes, Youth Services, Trishaw & Traditional Boat Operators and Flood
Victims (see Fig. 5).

Fig. 5: Original Architecture for State Welfare System

Fig. 6: Current Architecture for State Welfare System

Fig. 6 outlines the current State Welfare System architecture after the
upgrade to a centralized application system, implemented to support
efforts to save electricity and free up space, in line with green
technology. At the same time, big data integrity can give accurate

Applications for the 2015 Election Fund recorded through this system
amounted to 204,222 applicants for the Eid Mubarak Fund and 104,861
applicants for Youth Services. Some criteria have been

5.1.1 Application Terms

Among the conditions for the application of the Public Fund are:

People's State @ resident over 10
Applicants must be voters in the
Total income of head of household (the applicant) not exceeding
RM 2,000 per month.

Whereas the conditions for Services are as follows:

Aged between 21 and 40 who are not married (single).
Applicants must be voters in the state (excluded applicants aged 21
Monthly income not exceeding RM 1,500 per month for that
5.2 Private Cloud

Looking further, the big data can be a good asset for the State
Government in drawing up an economic plan for the people who are
eligible. This is because the applicants' data from 2005 to 2015 can be
processed to give a boost to the

In order to optimize the use of existing data and hardware, the use of
private cloud computing [6] is proposed to make data access faster and
safer during peak periods.

Fig. 7: Private Cloud Usage

A private cloud is essentially intended to give access to the system
through more channels than the existing local hosting. The term
"private" simply means that you are the only one who has access to it.
A private cloud can also be hosted in a shared data center, while you
remain the only one who has access to a set of physical resources.

5.3 Centralized Database Query

In order to ensure that assistance is received by the target group, a
new mechanism has also been proposed for granting aid. It involves data
from different government agencies. Among them:

National Registration
Accountant General Department
Social Welfare Department
State Religious Affair and Malay Customs Terengganu
Ministry of Education
Election Commission
The domestic institutions
Land Registry District
Companies Commission of
Inland Revenue Board of
Accountant Department of

Engagement with the use of data from the agencies involved will allow
the data produced to be of better quality and reliability. However,
this is costly with a conventional data center compared with a private
cloud [8] (see Fig. 8).

Fig. 8: Proposed Private Cloud & Data Centralization

To ensure data security and that there are no elements of threat, the
rules in Sections 2, 3 and 4, as explained, will be applied to the
improvement of this

As shown in Fig. 8, the private cloud will accelerate the processing of
data requested by the user. This is because the private cloud is
proposed to be placed in the Disaster Recovery Centre in Cyberjaya,
which has a data access speed of up to 6 Gbps.
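A centralized query that checks an application against data from other agencies can be sketched as a join over a shared database. Everything concrete here is an assumption for illustration: the table and column names are hypothetical, SQLite stands in for the centralized database, and the two checks (state of registration and declared versus assessed income) are examples rather than the actual verification rules.

```python
import sqlite3
from typing import List

SCHEMA = """
CREATE TABLE application (applicant_id TEXT PRIMARY KEY, declared_income REAL);
CREATE TABLE registration (applicant_id TEXT PRIMARY KEY, state TEXT);
CREATE TABLE revenue (applicant_id TEXT PRIMARY KEY, assessed_income REAL);
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database with one table per data source."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def flag_applications(conn: sqlite3.Connection, state: str,
                      tolerance_rm: float = 0.0) -> List[str]:
    """Return applicant IDs that need manual review.

    An application is flagged when the applicant has no registration
    record, is registered in another state, or declared less income
    than the revenue record shows (beyond the given tolerance).
    """
    sql = """
        SELECT a.applicant_id
        FROM application AS a
        LEFT JOIN registration AS r ON r.applicant_id = a.applicant_id
        LEFT JOIN revenue AS v ON v.applicant_id = a.applicant_id
        WHERE r.state IS NULL
           OR r.state <> ?
           OR (v.assessed_income IS NOT NULL
               AND v.assessed_income - a.declared_income > ?)
        ORDER BY a.applicant_id
    """
    return [row[0] for row in conn.execute(sql, (state, tolerance_rm))]
```

Applicants with no revenue record are not flagged on income in this sketch, since the absence of a record is not evidence of a false declaration.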

5.4 The Purpose of Upgrading

The aim of the upgrade is to address the fact that the existing system
can still be manipulated by certain quarters, because data checking
involves only a few agencies. This situation would cause the government
to provide assistance to

The proposed upgrade will also allow the system to run fully
automatically, without the need for manual verification by the village
committee. The application system will also be faster during peak hours
even with additional security elements applied, such as cryptography
provision storage, and access to data is faster with the use of data

The proposal also upgrades the system to monitor applications from
Terengganu people who live in other states. This is to ensure that any
information given is accurate, because in the past applicants have used
false information, such as workplace fraud, pay-rate fraud and identity
fraud, as well as overlapping applications for assistance for a child
to

The proposal also involves savings in cost, hardware, space and
electricity, with no compromise on security.

It is expected that this study can serve as a basis and guide
(baseline) for the public sector in implementing ICT-related programs,
to guarantee that the security foundations of confidentiality,
integrity and accessibility are complied with at all

The overall availability of the original system has improved because
the modularity concept was applied, whereby the number of single points
of failure is reduced through redundant components. Adding the resource
monitor further adds robustness in high-availability delivery and
ensures service continuity with fault tolerance and fast recovery.

The proposed framework uses an adaptive fault tolerance technique to
achieve high availability with neighbor replication and master-slave
database replication. This technique is very beneficial for monitoring
resources inside a dynamic distributed environment with a moderate
number of nodes, increasing the overall availability and performance of
the system.

Besides, this study is expected to prompt strong consideration of, and
the utmost attention to, security aspects before a decision is made on
technology adoption. This is because government data and information
are an important, priceless asset of the State.

In addition, analyzing existing data from time to time allows the
government and the administration to provide the best for the
development of the people.

