
Big data is primarily defined by the volume of a data set. Big
data sets are generally huge – measuring tens of terabytes –
and sometimes crossing the threshold of petabytes. The term
big data was preceded by very large databases (VLDBs), which
were managed using database management systems (DBMS).
Today, big data falls under three categories of data sets –
structured, unstructured and semi-structured.

Structured data sets comprise data which can be used in
its original form to derive results. Examples include relational
data such as employee salary records. Most modern
computers and applications are programmed to generate
structured data in preset formats to make it easier to process.

Unstructured data sets, on the other hand, lack proper
formatting and alignment. Examples include human texts,
Google search result outputs, etc. These loosely organized
collections of data require more processing power and time for
conversion into structured data sets so that they can help in
deriving tangible results.

Semi-structured data sets are a combination of both
structured and unstructured data. These data sets might have
a proper structure and yet lack defining elements for sorting
and processing. Examples include RFID and XML data.
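
To make the distinction concrete, here is a minimal Python sketch, assuming a hypothetical XML log of RFID tag reads (the element names and values are invented for illustration). The records share a recognisable structure, but some omit fields, so they have to be normalised before they can be sorted or processed like structured rows.

```python
import xml.etree.ElementTree as ET

# Hypothetical RFID read log: every <read> has a tag id, but the
# location and timestamp elements are optional (invented sample data).
RAW_XML = """
<reads>
  <read><tag>A100</tag><location>dock-1</location><ts>17</ts></read>
  <read><tag>B200</tag><ts>5</ts></read>
  <read><tag>C300</tag><location>dock-2</location></read>
</reads>
"""

def to_rows(xml_text):
    """Flatten semi-structured XML into uniform rows, filling gaps with None."""
    rows = []
    for read in ET.fromstring(xml_text.strip()).findall("read"):
        ts = read.findtext("ts")
        rows.append({
            "tag": read.findtext("tag"),
            "location": read.findtext("location"),  # None when the element is missing
            "ts": int(ts) if ts is not None else None,
        })
    return rows

rows = to_rows(RAW_XML)
# Only rows that carry a timestamp can be sorted by it.
sortable = sorted((r for r in rows if r["ts"] is not None), key=lambda r: r["ts"])
```

The third record has no timestamp, so it cannot take part in the sort until that gap is handled, which is the kind of extra work semi-structured data tends to need.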

Big data processing requires a particular setup of physical
and virtual machines to derive results. The processing is done
simultaneously to achieve results as quickly as possible.
These days big data processing techniques also include
Cloud Computing and Artificial Intelligence. These technologies
help in reducing manual inputs and oversight by automating
many processes and tasks.

The evolving nature of big data has made it difficult to give it
a commonly accepted definition. Data sets are assigned
big data status based on the technologies and tools required for
their processing.

Big data analytics – Technologies and Tools
Big data analytics is the process of extracting useful
information by analysing different types of big data sets. Big
data analytics is used to discover hidden patterns, market
trends and consumer preferences, for the benefit of
organizational decision making. There are several steps and
technologies involved in big data analytics.

Data Acquisition

Data acquisition has two components: identification and
collection of big data. Identification of big data is done by
analyzing the two natural formats of data – born digital and
born analogue.

Born Digital Data

This is information which has been captured through a
digital medium, e.g. a computer or smartphone app. This
type of data has an ever-expanding range since systems keep
on collecting different kinds of information from users. Born
digital data is traceable and can provide both personal and
demographic business insights. Examples include cookies,
web analytics and GPS tracking.

Born Analogue Data

When information is in the form of pictures, videos and other
such formats which relate to physical elements of our world,
it is termed analogue data. This data requires conversion
into digital format by using sensors, such as cameras, voice
recorders, digital assistants, etc. The increasing reach of
technology has also raised the rate at which traditionally
analogue data is being converted or captured through digital
mediums.

The second step in the data acquisition process is the collection
and storage of data sets identified as big data. Since
older DBMS techniques were inadequate for managing big
data, a new approach is used for collecting and storing it.
The approach is called MAD – magnetic, agile and deep.
Since managing big data requires a significant amount of
processing and storage capacity, creating such systems is out
of reach for most entities which rely on big data analytics.
Thus, the most common solutions for big data processing
today are based on two principles – distributed storage and
Massively Parallel Processing (MPP). Most of the high-end
Hadoop platforms and specialty appliances use MPP
configurations in their systems.
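
The idea behind distributed storage and parallel processing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads on a single machine stand in for the separate nodes of a real MPP system, and the function names `partition`, `local_total` and `parallel_total` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(records, n_parts):
    """Split records into n_parts roughly equal slices (round-robin)."""
    return [records[i::n_parts] for i in range(n_parts)]

def local_total(part):
    """Work done independently on one partition (here: a simple sum)."""
    return sum(part)

def parallel_total(records, n_workers=4):
    parts = partition(records, n_workers)
    # Threads stand in for the separate nodes of a real MPP cluster.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(local_total, parts))
    return sum(partials)  # combine the per-partition results

total = parallel_total(list(range(1, 101)))
```

Each partition is processed without reference to the others, so adding more workers (or, in a real system, more nodes) is the main way to scale.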

Non-relational Databases

The databases that store these massive data sets have also
evolved in how and where the data is stored. JavaScript
Object Notation, or JSON, is a widely used format for saving
big data nowadays. Using JSON, tasks can be written in
the application layer, which allows better cross-platform
functionality and enables agile development of scalable
and flexible data solutions. Many companies are
using it as a replacement for XML as a way of transmitting
structured data between the server and the web application.
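
As a small illustration, Python's standard `json` module can turn a record into JSON text and back. The record and its field names below are invented for the example.

```python
import json

# Hypothetical record; the field names are invented for illustration.
event = {"user_id": 42, "action": "page_view", "tags": ["mobile", "returning"]}

payload = json.dumps(event)      # serialise to a JSON string for storage or transport
restored = json.loads(payload)   # parse it back into Python objects
```

Because the payload is plain text, the same record can be read by a server, a browser or a document store without a shared schema.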

In-memory Database Systems

These database storage systems are designed to overcome
one of the major hurdles in the way of big data processing –
the time taken by traditional databases to access and process
information. In-memory database (IMDB) systems store the data
in the RAM of big data servers, thereby drastically reducing
the storage I/O gap. Apache Spark, an engine that processes
data in memory, is one example of this approach. VoltDB,
NuoDB and IBM solidDB are some more examples.
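
The principle can be tried out with SQLite from Python's standard library, which keeps a whole database in RAM when opened with the special name `:memory:`. SQLite is used here only as a convenient stand-in, not as one of the products named above, and the table and rows are invented.

```python
import sqlite3

# ":memory:" asks SQLite to keep the whole database in RAM, not on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (employee TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?)",
    [("asha", 5000), ("ben", 4200), ("chen", 6100)],  # invented sample rows
)
highest = conn.execute(
    "SELECT employee FROM salaries ORDER BY amount DESC LIMIT 1"
).fetchone()[0]
conn.close()
```

Nothing is written to disk, so the data disappears when the connection closes. That is the trade-off in-memory systems make for speed.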

Hybrid Data Storage and Processing Systems – Apache Hadoop

Apache Hadoop is a hybrid data storage and processing
system which provides scalability and speed at reasonable
costs for mid and small-scale businesses. It uses the Hadoop
Distributed File System (HDFS) for storing large files across
multiple systems known as cluster nodes. Hadoop has a
replication mechanism to ensure smooth operation even
when individual nodes fail. Hadoop uses the MapReduce
parallel programming model, originally described by Google,
as its core. The name originates from the ‘map’ and ‘reduce’
operations of functional programming languages, on which its
algorithm for big data processing is based. MapReduce works
on the premise of increasing the number of functional nodes
rather than increasing the processing power of individual
nodes. Moreover, Hadoop can be run on readily available
hardware, which has significantly sped up its development
and popularity.
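
The map and reduce steps can be illustrated with a word count in plain Python. This is a single-process sketch of the programming model only, not Hadoop's API: the function names are hypothetical, and in a real cluster the map and reduce calls would run on different nodes.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)

documents = ["big data is big", "data is everywhere"]  # invented sample input
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Each document can be mapped independently and each key can be reduced independently, which is why the model scales by adding nodes.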

Data Mining
Data mining is based on the contextual analysis
of big data sets to discover the relationships between separate
data items. The objective is to use a single data set for
different purposes by different users. Data mining can be
used for reducing costs and increasing revenues.
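
One simple way to look for relationships between data items is to count how often pairs of items occur together. The sketch below assumes an invented set of shopping baskets and is only one of many possible techniques.

```python
from collections import Counter
from itertools import combinations

# Invented sample: each set is the items seen together in one transaction.
transactions = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"bread", "butter", "milk"},
]

pair_counts = Counter()
for basket in transactions:
    # Sort so that each pair is always counted under the same key.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

most_common_pair, support = pair_counts.most_common(1)[0]
```

Pairs that co-occur far more often than others are candidates for a closer look, for example when deciding which products to promote together.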

Top 10 sectors using big data

Big data is finding usage in almost all industries today. Here
is a list of the top segments using big data to give you an
idea of its application and scope.
1. Banking and Securities: For monitoring financial markets
through network activity monitors and natural language
processors to reduce fraudulent transactions. Exchange
Commissions or Trading Commissions are using big data
analytics to ensure that no illegal trading happens by
monitoring the stock market.

2. Communications and Media: For real-time reportage of events
around the globe on several platforms (mobile, web and TV)
simultaneously. The music industry, a segment of media, is using big
data to keep an eye on the latest trends, which are ultimately
used by autotuning software to generate catchy tunes.
3. Sports: To understand the patterns of viewership of different
events in specific regions and also monitor the performance of
individual players and teams by analysis. Sporting events like
the Cricket World Cup, the FIFA World Cup and Wimbledon make
special use of big data analytics.
4. Healthcare: To collect public health data for faster responses to
individual health problems and identify the global spread of
new virus strains such as Ebola. Health Ministries of different
countries incorporate big data analytic tools to make proper use
of data collected after Census and surveys.
5. Education: To update and upgrade prescribed literature for a
variety of fields which are witnessing rapid development.
Universities across the world are using it to monitor and track
the performance of their students and faculties and map the
interest of students in different subjects via attendance.
6. Manufacturing: To increase productivity by using big data to
enhance supply chain management. Manufacturing companies
use these analytical tools to ensure that they are allocating
production resources in an optimum manner which yields the
maximum benefit.
7. Insurance: For everything from developing new products to
handling claims through predictive analytics. Insurance
companies use big data to keep track of which policy schemes
are most in demand and are generating the
most revenue.
8. Consumer Trade: To predict and manage staffing and inventory
requirements. Consumer trading companies are using it to grow
their trade by providing loyalty cards and keeping a track of
9. Transportation: For better route planning, traffic monitoring and
management, and logistics. This is mainly used by
governments to avoid congestion of traffic in a single place.
10. Energy: By introducing smart meters to reduce electrical
leakages and help users to manage their energy usage. Load
dispatch centers are using big data analysis to monitor the load
patterns and discern the differences between the trends of
energy consumption based on different parameters and as a way
to incorporate daylight savings.

8 Ways you can grow your business with data science
Today, the advent of the Internet of Things and the development
of AI technology have simplified the implementation of big data
solutions to the degree that even medium to small scale
businesses are benefiting from it. And since the top 10 list
comprises sectors which are directly or indirectly
associated with various businesses, the importance of this
technology increases even further. Using big data analytics,
businesses can take informed decisions and better their
operational efficiency in a number of ways, e.g.:

Utilizing company data to identify the need for improvement in
existing policies and processes.
Utilizing customer data available with the company such as
social media streams, credit information, and internal or external
consumer research, to improve or develop new products and
services.
Deploying data science for your business –

1. Empowers management to make better decisions
Big data analytics acts as a trusted advisor for an organization’s
strategic planning. It helps your management and staff in
enhancing their analytical abilities and thereby improving their
overall decision-making skills. Measuring, recording and
tracking performance metrics then allows the upper management
to set new goals.
2. Helps identify trends to stay competitive
As mentioned earlier in this post, one of data analytics’ primary
objectives is to determine patterns within large data sets. This is
particularly useful for identifying new and emerging market
trends. Once identified, these trends could become the key to
gaining a competitive advantage by introducing new products
and services.
3. Increases the efficiency and commitment of staff in handling
core tasks and issues
By making employees aware of the benefits of using the
organization’s analytics product, data science can make them
more efficient at their jobs. Working with greater insight into
company goals, these employees will be able to drive more
action towards core tasks and issues at every stage, thereby
improving the overall operational efficiency of your business.
4. Identifies and acts upon opportunities
Data science is all about constantly looking for areas of
improvement in the organizational workings. By discovering
inconsistencies in the organizational processes and existing
analytical systems, data scientists can introduce new ways of
doing things. This, in turn, can drive innovation and allow new
product development, opening profitable avenues for your
business.
5. Promotes low-risk, data-driven action plans
Big data analytics has made it possible for small and big
businesses to take actions based on quantifiable, data-driven
evidence. Such a strategy can save a business from unnecessary
tasks and sometimes help it foresee risks.
6. Validates decisions
Apart from allowing your business to base decisions on data,
analytics also helps you test these decisions by introducing
variable factors, to check for flexibility and scalability. Using
data science and big data solutions you can introduce
favourable changes in your organizational structure and
7. Helps in selecting target audience
One of the key value propositions of big data analytics is how you can
shape customer data to provide more insight into consumer
preferences and expectations. A deeper analysis of customer data
can help companies in identifying and targeting audiences with
utmost precision using tailor-made products and services.
8. Facilitates sensible recruitment of talent
Human resource departments are constantly at work in
companies to find talent that fits the prescribed criteria. Big data
has made their task simpler by providing comprehensive data
profiles on individuals by merging social media, corporate
profiles and job search databases. Now your HR Department can
process CVs much faster and recruit the right talent quickly and
without compromises.

The world is moving towards a more connected future, and
big data solutions are going to play a big part in automation
and development of AI technologies. Companies like Google
are already using Machine Learning processes for greater
precision in delivering their services. As technologies around
the globe become more synchronous and interoperable, big
data will become the core that connects them together.
Therefore, companies using big data solutions need to keep
up with its evolving nature while those still reluctant to
invest should rethink their organizational policies. There are
a few pointers which can be helpful in getting the most out
of your investment in big data.

Demand a value proposition from big data by investing in
adequate technologies to capture and store data. If you do not
have the data, then you do not have the benefits. Data discovery
tools can help you in digging up big data which is relevant to
your business.
Make use of big data to improve and innovate your applications
and services.
Arrange organization-wide training to accustom your staff to big
data solutions and their usage.
Interact and collaborate with big data users from associated
fields of business to derive more benefits and bring down
usage costs.
Avoid siloed big data management and stay open to integration
with shared enterprise infrastructures.
If shifting to a new data platform, choose those which have a
special support system for big data, such as in-memory
processing, MapReduce, etc.
Develop a tech strategy for your organization’s data and lay out a
plan for capturing and processing it over the long run.

Plan the financials for storage and processing of your big data as

Moreover, big data is also resonating with government and
public-sector agencies, which is a good sign for businesses all
around the world as this will help deepen the public-private
collaboration in a range of fields.


Copyright © 2019 Maruti Techlabs Pvt. Ltd.