
Big Data

The Next Big Thing


MAKING MARKETS FUNCTION BETTER
GLOBAL RESEARCH & ANALYTICS
International Youth Centre, Teen Murti Marg, Chanakyapuri, New Delhi - 110 021, India
Phone: 91-11-23010199, Fax: 91-11-23015452, Email: research@nasscom.in
Website: www.nasscom.in
Big Data: The Next Big Thing
Copyright 2012
International Youth Centre, Teen Murti Marg, Chanakyapuri
New Delhi - 110 021, India
Phone: 91-11-23010199, Fax: 91-11-23015452
Email: research@nasscom.in
Published by
NASSCOM, New Delhi
Designed & Produced by
CREATIVE INC.
Phone: 91-11-41634301
Printed at
P.S. Press Services
Disclaimer
The information contained herein has been obtained from sources believed to be reliable. NASSCOM and
CRISIL GR&A disclaim all warranties as to the accuracy, completeness or adequacy of such information.
NASSCOM and CRISIL GR&A shall have no liability for errors, omissions or inadequacies in the information
contained herein, or for interpretations thereof.
Service provider profiles are representative of the Indian players. We have tried to cover players across
the Big Data spectrum: hardware, software, analytics, system integration and IT services. Identification
of players is based on reliable industry sources, interviews, and organisation websites. This report is not
a recommendation to invest/disinvest in any organisation covered in the report.
The material in this publication is copyrighted. No part of this report may be reproduced either on paper or
electronic media in part or in full without permission in writing from NASSCOM. Request for permission
to reproduce any part of the report may be sent to NASSCOM.
Usage of Information
Forwarding, copying or using this report in publications without approval from NASSCOM will be
considered an infringement of intellectual property rights.
Foreword
Every few years, we come across the next big technological idea which radically transforms the way
businesses function by opening up new opportunities and efficiencies. Big Data has now emerged as the
next big thing: the big idea whose time has come. And like most big ideas in the recent past, Big Data
offers a big opportunity for India.
In this study, jointly conducted by NASSCOM & CRISIL Global Research and Analytics (GR&A), we look at
the opportunity, which lies in offering services around Big Data implementation and analytics for global
multinationals. By 2015, Big Data is expected to become a USD 25 billion industry, driven by uses across
industries such as manufacturing, retail, financial services, telecom and healthcare. We expect the Indian
Big Data industry to grow from USD 200 million in 2012 to USD 1 billion in 2015 at a CAGR in excess of
83 per cent. Indian service providers are already leveraging partnerships, M&As and venture funding to
capture the Big Data outsourcing opportunity. We are confident that India will be at the forefront in offering
Big Data analytics and related IT services. The challenge, however, is in meeting the demand for data
scientists and IT engineers, which is estimated to reach approximately 15,000-20,000 by 2015, growing at
a CAGR of 80 per cent. The signs, though, are encouraging.
India follows close on the heels of the US and is well ahead of other outsourcing destinations in terms of
Big Data talent availability and service providers' initiatives to build such talent for the Big Data opportunity.
To further augment this capacity, organisations are leveraging their academic alliance programmes with
universities in India to introduce courses on various areas of Big Data. Their efforts are being complemented
by private IT training institutes in the country, which are developing talent through courses specific to
Big Data skills.
Today, data is omniscient and omnipresent. This data is getting generated at a rapid pace: around 2.5 billion
GB of data is generated every day, and more than 90 per cent of the data available today has been created
in the past 3-4 years. This has primarily been because of the explosion in our use of click stream, mobile
applications and social media. It is estimated that Twitter alone generates 12 Terabytes of data daily. It is a
gold mine for businesses which can separate the wheat from the chaff to identify the trends. Organisations
across segments are now looking at this pool of data to determine how best it can be mined to gauge
their customers' likes and dislikes.
Storing, analysing and making sense of data of such unwieldy dimension will be a challenge of epic
proportions. However, we believe India is on the right path to steal a march over others. In this study, we
offer a big perspective on Big Data and how it can be turned into actionable insights.
Roopa Kudva
Managing Director and CEO, CRISIL
Som Mittal
President, NASSCOM
Contents
Acknowledgements
Key Takeaways
Introduction to Big Data
Global Perspective on Big Data
India's Advantage in the Big Data Opportunity
The Future of Big Data
Annexure
Acknowledgements
This publication was prepared through a collaborative effort by several institutions and individuals. We
would like to acknowledge the support of our Executive Council for providing the essential and gracious
counsel and guidance. NASSCOM has published, and continues to work on, various reports on the
IT-BPO sector; information from these reports has been used in this study.
We gratefully acknowledge the contribution of our members and partners including Genpact, EMC,
Sears Holding, HP Analytics, Mu Sigma, AbsolutData, Computer Sciences Corporation, Deloitte,
Frost & Sullivan, Marlabs, LatentView, EXL Services, Fidelity Investments, Impetus and JP Morgan Chase
in terms of their valuable time and informative case studies.
We deeply appreciate the efforts of CRISIL Global Research & Analytics (GR&A) and its team comprising
Gaurav Dua, Kumar Rajendran, Priya Khemka, Gunja Rastogi, Mehak Mayor, Praveen Kalani, Hemant Bisht,
Ridhima Sudan, Santosh Kandwal and Sonam Gupta who were instrumental in producing this report.
We also convey our special acknowledgement to NASSCOM's research team for their effort and
contribution towards the production of this report.
Key Takeaways
India showcases competitive advantage in Big Data offerings
An Introduction to Big Data
Big Data is defined by volume, variety and velocity
Organisations worldwide are turning their attention to Big Data as they scramble to derive insights from
the deluge of information generated from various sources. In the past few years, the global marketplace
has seen exponential growth in data volumes, created and consumed by a diverse cross-section of
stakeholders. The term Big Data signifies large datasets in multiple formats, growing at an enormous
rate and posing problems for traditional storage and analytical platforms. Big Data is distinct from large
existing data stored in various relational databases, as it warrants a more advanced mechanism for both
storage and analysis. Technologies such as NoSQL databases and MapReduce/Hadoop frameworks are
at the core of the solutions heralding a paradigm shift. So Big Data is characterised by three attributes
of data: volume, variety and the velocity at which it is generated.
Traditional analytics on transactional or structured data have helped data-driven organisations gain
insights from various enterprise data. As weblogs, social media posts, sensors, images, emails, audio
and video files emerge as sources of insight, they present a huge competitive opportunity for
businesses. The need to derive predictive and actionable insights from this data for improved business
operations and better decision making is what drives Big Data analytics.
The data being generated globally is undergoing exponential growth
Data volume is the primary characteristic of Big Data. With data becoming an indispensable part of
every economy, industry, organisation, business function and individual, it is being actively captured by
organisations to better understand their customers, suppliers, partners and operations. Large datasets
yield more information and hence, improved analysis compared to limited records of data, leading to
better competitive advantage and business operations. This data is being generated at a rapid pace:
around 2.5 billion GB of data is generated every day, and more than 90 per cent of the data available
today has been created in the past 3-4 years. According to IDC, data generated globally is expected to
witness a 41.0 per cent CAGR between 2009 and 2020 to reach 35.0 Zettabytes.
Moreover, the technological landscape has changed with innovation in both managing and storing large
data. As organisations move away from the traditional data storage systems such as file systems and
databases to newer technologies such as cloud-based storage and open source software, data storage
and management costs are seeing a downward trend. According to IDC, storage costs have plummeted
from USD 18.9/gigabyte in 2005 to USD 1.6/gigabyte in 2011, and are expected to further decline to
USD 0.7/gigabyte by 2015. Apart from storage costs, the evolution of several open source analytical tools
and platforms has made data analytics flexible, reliable and relatively affordable for Big Data.
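The growth and cost figures quoted in this section can be related to one another with the standard compound annual growth rate (CAGR) formula. The sketch below is illustrative only: the function names are our own, and the only inputs are the IDC figures cited in the text.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.41 means 41 per cent)."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value reached after compounding `rate` for `years` periods."""
    return start_value * (1 + rate) ** years


# Storage cost per gigabyte quoted in the text: USD 18.9 (2005) to USD 1.6 (2011).
storage_rate = cagr(18.9, 1.6, 2011 - 2005)
print(f"Storage cost change per year: {storage_rate:.1%}")

# Data volume quoted in the text: 35.0 ZB in 2020 at a 41 per cent CAGR from 2009.
# Working backwards gives the 2009 volume that those two figures imply.
implied_2009_zb = 35.0 / (1 + 0.41) ** (2020 - 2009)
print(f"Implied 2009 data volume: {implied_2009_zb:.2f} ZB")
```

On these inputs, storage costs fall by roughly a third each year, and the IDC forecast implies a 2009 base of a little under 1 Zettabyte.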
Today 80 per cent of data existing in any enterprise is unstructured data
Organisations worldwide are increasingly realising that unstructured data, if analysed, can provide a
competitive edge. While structured data is transactional and can be stored in rows and columns with
an identifiable structure, unstructured data such as audio, video and social media messages is raw or
semi-structured. This data is generated in several forms such as web clicks, emails, phone conversations,
weather data, audio and video files, location coordinates and pictures. Moreover, unstructured data
is highly dynamic and does not have a particular format, i.e., it may be in different languages, have
several terminologies, and may exist in the form of X-ray sheets, voice mails, digital photographs, or
phone conversations.
Organisations are overwhelmed by the volume of unstructured data and are looking at ways to manage
and analyse it in a systematic manner. As a result, one of the key focus areas for organisations
wanting to leverage Big Data is to handle unstructured data and adopt new technologies to deal
with it.
It is imperative to develop technologies that can enable storage of such huge data as well as maintain
transactional consistency between structured and unstructured data. Newer technologies such as NoSQL
databases to store unstructured data and processing methods such as Hadoop and massively parallel
processing are gaining prominence in the area of Big Data and Big Data analytics.
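The contrast between structured and unstructured (or semi-structured) data can be illustrated with a small sketch. It is not tied to any particular NoSQL product: the records, field names and the `find` helper are all hypothetical, and plain Python dictionaries stand in for a document store.

```python
from typing import Any

# Structured data: every record has the same columns, as in a relational table.
COLUMNS = ("order_id", "customer", "amount")
orders = [
    (1001, "alice", 250.0),
    (1002, "bob", 99.5),
]

# Semi-structured data: each document carries its own fields, in the style of
# a document-oriented NoSQL store. All field names here are hypothetical.
documents: list[dict[str, Any]] = [
    {"type": "tweet", "user": "alice", "text": "Loving the new phone"},
    {"type": "email", "sender": "bob", "subject": "Refund", "attachments": 2},
    {"type": "call", "caller": "alice", "duration_sec": 312, "language": "hi"},
]


def find(docs, **criteria):
    """Return documents that contain every given field with the given value."""
    return [
        d for d in docs
        if all(key in d and d[key] == value for key, value in criteria.items())
    ]


# No single schema covers all documents, so the set of fields must be discovered.
all_fields = sorted({field for d in documents for field in d})
tweets = find(documents, type="tweet")
```

The fixed tuple layout of `orders` is what a relational database enforces up front. The documents, by contrast, can gain or lose fields record by record, which is why they call for a different storage model.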
Increased data velocity enables real-time use of Big Data
The proliferation of the internet and the mobile era has increased the rate at which data is created and
stored; hence, there is a need for tools and technologies to analyse data at an equal speed. The shelf-life
of data has dropped from months to hours and seconds.
The ubiquitous nature of the internet, coupled with massive computing power and accessibility, has
transformed data processing from an auxiliary function into an essential mechanism that enables
organisations to transform their businesses. Big Data service providers are increasingly leveraging
technologies such as stream processing and in-memory computing that mitigate the shortcomings
of batch processing and enable faster storage and data processing.
Earlier, these technologies were popular in verticals considered more critical, such as the financial and
government sectors. However, as the criticality of analysing data in real-time emerges, several other
industries are also adopting solutions based on these technologies.
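The difference between batch and stream processing can be sketched with a running average. This is a minimal illustration under our own assumptions (the class name, window size and readings are hypothetical), not a description of any vendor's streaming engine.

```python
from collections import deque


class SlidingWindowMean:
    """Mean of the most recent `size` readings, updated as each event arrives."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.window = deque()
        self.total = 0.0

    def add(self, value: float) -> float:
        """Fold one new reading into the window and return the current mean."""
        self.window.append(value)
        self.total += value
        if len(self.window) > self.size:
            self.total -= self.window.popleft()   # drop the oldest reading
        return self.total / len(self.window)


def batch_mean(values, size):
    """Batch equivalent: wait for all the data, then average the last `size` items."""
    recent = list(values)[-size:]
    return sum(recent) / len(recent)


readings = [10.0, 12.0, 11.0, 50.0, 52.0]   # hypothetical sensor or price ticks
stream = SlidingWindowMean(size=3)
running = [stream.add(r) for r in readings]
```

The streaming version yields an up-to-date answer after every event, whereas the batch version gives the same final figure only once the whole dataset has been collected.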
Social media analytics, sentiment analysis and behavioural analysis are the upcoming Big Data analytics services
Big Data analytics is the process of applying advanced analytical techniques to large datasets to
uncover hidden patterns, unknown correlations and other useful information. Big Data analytics
helps businesses:
Take better business decisions: The most important objective of Big Data analytics is to help
organisations make better business decisions, taking into account all the available information.
This is achieved by analysing large volumes of structured and unstructured data from sources that
are left unutilised by conventional business intelligence solutions
Predict and identify change: Big Data analytics helps organisations closely monitor their ecosystem,
discover what has changed, and decide how they should react. It also enables them to predict
change, which is crucial given the current competitive business environment
Identify new opportunities: Advanced Big Data analytics is an effective way to discover new
opportunities such as new business segments, best suppliers, affinities between associated products and
sales seasonality
The evolution of advanced analytical techniques such as machine learning, predictive analytics, data
mining, statistical analysis, artificial intelligence and natural language processing has enabled
organisations to generate insights across all aspects of their businesses. Organisations are now able
to analyse complete datasets, including unstructured data, instead of smaller samples, resulting in
better outcomes. New visualisation tools and techniques are helping data scientists and business
users understand Big Data and make decisions based on it. Visual tools for generating
insights have also evolved from simple graphs, PowerPoint presentations and dashboards to heat maps,
cluster analysis and real-time advanced dashboards. Some of the widely used Big Data visualisation
tools are:
Tag cloud: A weighted visual list where words that appear most frequently are larger and words
that appear less frequently are smaller
Clustergram: Used to visualise how clusters are formed and how cluster members are assigned to
clusters as the number of clusters increases
Heat map: A graphical representation of data where the individual values contained in a matrix are
represented as colours
Dashboard: A real-time graphical presentation of data analysis
History flow: Charts the evolution of a document as it is edited by multiple contributing authors
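Of the tools listed above, the tag cloud is the simplest to illustrate: word frequencies are counted and scaled to display sizes. The sketch below is one possible weighting scheme under our own assumptions; the stop-word list, point sizes and sample text are hypothetical.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "it", "for"}


def tag_cloud_sizes(text, min_pt=10, max_pt=40, top_n=10):
    """Map the most frequent words in `text` to font sizes between min_pt and max_pt."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    counts = Counter(words).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]
    lowest = counts[-1][1]
    spread = (highest - lowest) or 1        # avoid division by zero when all counts tie
    return {
        word: round(min_pt + (count - lowest) / spread * (max_pt - min_pt))
        for word, count in counts
    }


sample = "big data is big: data about data, and the analysis of big data"
sizes = tag_cloud_sizes(sample)
```

Linear scaling is used here for clarity; a renderer might equally scale by the logarithm of the count so that a few very frequent words do not dwarf the rest.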
Big Data analytics applies advanced techniques to big datasets to answer questions previously considered beyond reach
Big Data analytics is an evolving and multifaceted area for analytics players. The key differentiating
factors between traditional analytics, advanced analytics and Big Data analytics are:
Big Data analytics differs from advanced analytics in its data formats and structures, and in the
new application requirements that Big Data brings
While traditional analytics performs rear-view analysis on structured data, advanced analytics and
Big Data analytics provide a forward-looking view, enabling organisations to anticipate and deal with
future opportunities; that is, Big Data analytics has a definitive predictive end-result in its use
Big Data analytics has enabled cross-channel analytics and real-time insights with greater speed, access
and collaboration. For example, detecting a consumer's emotions when a competitor is mentioned on a
call, or converting a service call into a sales opportunity, is more valuable in real time than after the
interaction ends.
Big Data management, analytics, IT services and applications are the key constituents of the Big Data ecosystem
The Big Data ecosystem includes multiple elements, from the data that is analysed and the IT
infrastructure that supports it to the applications that enable its analysis and usage. Elements of
Big Data include:
Data management refers to systems where the data resides. It comprises the legacy systems as well
as Hadoop-based systems and NoSQL databases. Legacy systems include databases that store and
manage structured data, i.e., RDBMS to store and analyse structured data, and MPP systems to scale
up for large structured datasets. Hadoop is an open source software framework to support applications
that enable analysis of petabyte and zettabyte-sized data. Given Hadoop's popularity and wide adoption,
several other open-source projects have become associated with it, adding new functionality and
enterprise-ready features to make it a compelling enterprise solution. These sub-projects include
Hadoop Distributed File System (HDFS), HBase, Hive, Mahout, Pig, ZooKeeper, Avro, Cassandra, and
Chukwa. Once Big Data is collected and processed, it becomes operational data, i.e., it represents Big
Data outcomes or serves as an input data for analytics.
Big Data analytics includes the technologies and tools to analyse the operational data and generate
insight from it. After the data is analysed, it becomes available for business users through various
visualisation techniques.
Data consumption involves enabling the Big Data insights to work in Business Intelligence (BI) and
end-user applications.
IT services enable integration of the Big Data framework with the traditional business intelligence
infrastructure.
Traditional storage architectures limit the potential of Big Data, compelling businesses to move to a new data foundation
The traditional analytics technology stack has evolved into the Big Data analytics technology stack.
The inability of traditional BI applications to process unstructured datasets makes them less relevant
in the Big Data analytics space.
Big Data management, infrastructure and storage systems: Growth in Big Data has led to significant
infrastructure requirements to support the distributed processing of unstructured data analytics. Unlike
traditional relational databases, which are structured, normalised, and densely populated, the Big Data
technology stack mainly comprises Hadoop architecture that has a distributed file system, analytics
and data storage platforms, and an application layer that manages distributed processing, parallel
computation, workflow and configuration management for unstructured data. Other than Hadoop,
there are non-relational databases such as NoSQL databases and MPP systems that are scalable,
network-oriented, semi-structured, and sparsely populated. This layer also comprises servers, networks,
and storage used for scale-out deployment of Big Data technology. With the emergence of Big Data,
traditional RDBMS, MPP and DW are transitioning into a new role of supporting Big Data management
by processing structured datasets as outputs of Hadoop or MapReduce technologies and then input
for BI software and analytical applications.
Big Data analytics: While traditional analytics primarily catered to structured or row/column-based data,
Big Data analytics enables analytical processing of multi-structured data for text analytics, predictive
modelling, and social media analytics, using techniques such as MapReduce and in-database analytical
functions. Moreover, traditional analytics leveraged basic visualisation techniques such as charts and
graphs to communicate analysis to business users, while Big Data analytics uses new visualisation
tools such as real-time dashboards, heat maps and tag clouds.
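The MapReduce technique mentioned above can be sketched in a few lines. The example runs on a single machine and is not Hadoop code; the function names and log lines are our own, chosen only to show the map, shuffle and reduce steps that a framework would distribute across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (key, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle and reduce in sequence over an iterable of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical log lines standing in for input splits stored across a cluster.
logs = ["error disk full", "warning disk slow", "error network down"]
counts = map_reduce(logs)
```

Because each call to `map_phase` depends only on its own record, and each call to `reduce_phase` only on its own key, both steps can be run in parallel on separate nodes, which is what allows the approach to scale to very large datasets.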
Key players across the traditional and Big Data technology stack
As Big Data technologies become mainstream, the vendor landscape is evolving rapidly. Data
management includes vendors of Hadoop-based solutions, other MapReduce technology suppliers
as well as cloud and datacentre providers. The increased demand for Big Data analytics has changed
the competitive landscape for the Big Data analytics service providers. In addition to the incumbent
IT/BPO/Knowledge service players, there are now more pure-play analytics players, some of whom
provide sector-specific analytics solutions. Some of the larger organisations have set up captives, which
provide data analytics solutions to the other divisions and subsidiaries of those organisations. Even
the breadth of the services provided by analytics organisations has substantially increased from data
storage and management to delivering real-time insights and end-to-end data analytics services.
Big Data management and storage: Many new organisations have emerged as providers of Apache open
source Hadoop distributions, with various levels of proprietary customisation for data management.
Cloudera and Hortonworks are the major players for Hadoop distributions. While Cloudera contributes
significantly to Apache HBase, the Hadoop-based non-relational database that enables low latency,
Hortonworks mainly offers next-generation MapReduce architecture. Other pure players include
MapR, Hadapt and Zettaset. Moreover, mega IT vendors have also entered the Big Data market
through acquisitions. The Big Data warehouse market is mainly led by four players: IBM Netezza,
EMC Greenplum, HP Vertica and Teradata Aster Data. Non-Hadoop vendors are also contributing
significantly to the Big Data market opportunity; Splunk, HPCC Systems and DataStax are some of
the key players.
Big Data analytics: With the deluge of data, it has become pertinent to have applications and platforms
that leverage the underlying Hadoop infrastructure for data analytics. Some of the key players in this
segment are: Karmasphere, which offers an analytical development platform to perform ad-hoc queries
on Hadoop-based data via an SQL interface; Datameer, which provides a Hadoop-based business
intelligence platform that leverages a spreadsheet-like interface to analyse data; and service providers
such as QlikView, Revolution Analytics, Informatica, 1010data, and ClickFox which offer cloud-based
Big Data applications and services.
Big Data use: Big Data analytics engages with large datasets which may be difficult to understand for
business users. A number of organisations such as Amazon Web Services, Google, and Intellicus are
launching new user applications which facilitate the usage of Big Data analytics.
Additionally, the landscape for Big Data IT services is growing exponentially, with established service
providers such as Oracle, IBM and CSC building their Big Data service portfolio. Moreover, Indian IT/
BPO players such as TCS, Infosys and Wipro are also bolstering their capabilities in Big Data-specific
software development and implementation.
Big Data enables better customer segmentation, improved productivity and fraud detection across all industry sectors
As organisations adjust to the rapidly changing digital lifestyle of consumers worldwide, they are
beginning to discover the importance of understanding and envisaging the impact of information
generated from non-traditional sources such as blogs, Facebook posts, tweets, emails, smartphone
applications, electronic sensors, images and YouTube videos.
Big Data not only helps organisations gain a multi-dimensional view of their ecosystem, but also
generates powerful insights that can help them better execute their operations and take well-informed
decisions. Big Data is increasingly being leveraged through advanced data analytics tools and techniques
to provide organisations with a better understanding of their customers, competitors, operations,
suppliers and partners. High performance analytics, which previously took days or weeks to perform,
can now be undertaken in seconds, minutes or hours through Big Data technologies.
The public and private sectors are adopting Big Data analytics on a large scale to generate strategic
insights, improve their product/service strategy and operational efficiency, and gain a deeper
understanding of their customers, competitors and suppliers. Big Data analytics is enabling them to
predict the trends in near real-time, make more accurate forecasts and adjust their operations quickly
to changing demand or new business opportunities.
Public sector: Big Data can be of immense use in the public/development sectors. It enables government
departments and developmental organisations to analyse large amounts of data across populations and
to provide better governance and service. Big Data analytics can help them to improve transparency,
enhance decision making, and adopt innovative practices in healthcare, public administration, defence,
disaster management, transportation and energy. For example, Big Data has emerged as a new
focal point for the US Government, which announced a USD 200 million Big Data Research and
Development Initiative in March 2012.
Financial services: Big Data analytics can enable financial institutions to make better trading and
risk decisions, protect themselves from frauds and security threats, and improve their products through
better customer identification and marketing campaigns. Further, Big Data analytics is moving
investment banks away from relying on overnight batch data to make trading decisions. It has improved
risk decisions by leveraging real-time analysis of current data rather than risk management
models based on historical data. For example, CITIC Bank Credit Card Center used Big Data technology
to identify customers unlikely to activate their credit card services, and direct marketing incentives
to those most likely to activate, thereby improving the effectiveness of the marketing campaign by
65 per cent, while Westpac New Zealand used Big Data technology to analyse social media data to
gain real-time insights into the bank's brand health and its product performance across different
geographies by correlating specific branch performance to customers' social data.
Healthcare: The surge in volumes of clinical data on medication, allergies, and procedures owing to the
implementation of electronic health records has led healthcare organisations to seek opportunities to
predict and react more rapidly to critical clinical events, resulting in better care for patients and more
effective cost management. For example, several of the United States' largest integrated delivery
networks such as Cleveland Clinic, MedStar, University Hospitals, St. Joseph Health System, Catholic
Health Partners and Summa Health System use a Big Data platform for real-time exploration,
performance and predictive analytics of clinical data.
Manufacturing: Organisations are increasingly leveraging Big Data and finding new opportunities
to predict maintenance problems, enhance manufacturing quality and reduce costs using Big Data.
For example, Volvo leverages Big Data to analyse information received from its vehicles, customer
relationship management systems, product development and design systems, to identify, in advance,
potential issues such as manufacturing and mechanical problems and proactively resolve the problems
by adjusting its manufacturing process.
Telecommunications: Organisations in the telecom industry are increasingly relying on real-time
analysis of data generated by mobile devices including phone calls, text messages, applications, and
web browsing for better customer service and to build on retention and loyalty. For instance, while
Nokia collects a huge amount of unstructured data from phones in use, services, log files and other
sources and uses it to gain insights and understand the collective behaviour of consumers to improve
the quality of its phones and their features, Cablecom deploys Big Data analytics to identify when a
customer is most likely to make a decision to leave its network and offers special deals and incentives
to retain the customer at the right time.
Retail: With large amounts of data being generated from the point-of-sale at stores, online transactions,
and social media posts, Big Data offers numerous opportunities to retailers to improve marketing,
merchandising, operations and supply chain, and to develop new business models. Retailers are deploying
Big Data analytics to improve the accuracy of forecasts, anticipate changes in demand and react
accordingly. For example, the use of Big Data analytics led to significant growth in the number of active
members of Sears' loyalty programme (membership crossed 80 million customers).
Other industries: Big Data can also be used in other industries. Data-intensive verticals such as utilities,
oil & gas, and transportation, where data is generated through smart meters, GPS systems, and satellites,
are gradually using Big Data analytics to make real-time predictions about their operations.
Social gaming, mobile applications and internet search portals are key end-user applications leveraging Big Data analytics
As adoption of Big Data analytics by enterprises is gaining traction, players are also gearing up towards
mainstream adoption, i.e., B2C applications. Many Big Data players are solving difficult problems for
consumers by providing Big Data applications on PCs, smartphones, tablets and other web-enabled
devices. Consumers are using Big Data analytics for everyday chores such as locating vacant parking
spaces more effectively, and for real-time comparison of prices. With new applications coming into play
every day, the B2C market for Big Data is likely to replicate the success of current mobile applications
in the coming years. While innovation is taking place in Big Data technologies, success would be
determined by mass adoption and a large number of businesses getting valuable insights through the
new and compelling end-user applications that allow regular business users or customers to quickly
derive practical and actionable insights.
Global Perspective on Big Data
North America drives the Big Data opportunity with over 55 per cent of the world's data
North America and Europe, the two major data hubs of the world, account for a substantial portion of
the global demand potential for Big Data analytics. Big Data service providers and leading IT players
have significantly ramped up their capabilities in these developed regions that embraced the concept
of Big Data, particularly in data-intensive industries such as digital media, manufacturing, healthcare,
retail and financial services.
While North America and Europe are poised to drive the growth of Big Data for the next 2-3 years,
developing economies such as India and China are expected to catch up soon, riding high on the rapid
expansion of multimedia content, increasing popularity of social media and proliferation of mobile
devices. Further, while developed economies are likely to continue to be the major Big Data contributors
in terms of revenue opportunity, emerging economies, particularly India, are all set to emerge as the
preferred Big Data analytics and associated IT service providers.
Global Big Data market is estimated at ~USD 8.0 billion in 2012
Though still in an embryonic stage, with large firms piloting Big Data implementation, the industry is
witnessing exponential growth and market penetration. Statistics suggest that the industry is poised to
grow by more than 50 per cent in 2012 to approximately USD 8.0 billion from USD 5.0 billion in 2011.
Tremendous opportunities have mushroomed for players across the technology spectrum hardware
and software applications providers; systems integrators; technology consultants and analytics
service providers with a large number of organisations implementing Big Data technologies. The
IT-BPO industry is expected to account for about 36-38 per cent of the market opportunity, followed
by applications software at approximately 26-28 per cent.
The market is further expected to experience a high penetration rate, with investments expanding
beyond Silicon Valley leaders such as eBay, Amazon, Yahoo and Google (organisations
that initiated the Big Data revolution) to industry verticals such as manufacturing, financial services,
healthcare and retail.
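The market figures above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and the inputs are the USD 5.0 billion (2011) and USD 8.0 billion (2012) estimates and the segment share ranges quoted in this section.

```python
def growth_pct(previous: float, current: float) -> float:
    """Year-over-year growth as a percentage of the previous value."""
    return (current - previous) / previous * 100.0


def segment_range(total: float, low_share_pct: float, high_share_pct: float) -> tuple[float, float]:
    """Absolute size range of a segment, given its share range in per cent."""
    return (total * low_share_pct / 100.0, total * high_share_pct / 100.0)


# Figures quoted in the report (USD billion).
MARKET_2011 = 5.0
MARKET_2012 = 8.0

growth = growth_pct(MARKET_2011, MARKET_2012)
it_bpo = segment_range(MARKET_2012, 36, 38)        # IT-BPO: 36-38 per cent
app_software = segment_range(MARKET_2012, 26, 28)  # applications software: 26-28 per cent

print(f"Implied 2011-12 growth: {growth:.0f} per cent")
print(f"IT-BPO: USD {it_bpo[0]:.2f}-{it_bpo[1]:.2f} billion")
print(f"Applications software: USD {app_software[0]:.2f}-{app_software[1]:.2f} billion")
```

On these inputs the implied growth is 60 per cent, consistent with the "more than 50 per cent" stated above, and the IT-BPO share works out to roughly USD 2.9-3.0 billion.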
Emergence of niche start-ups and technological developments
fostering growth in the Big Data industry
New database architectures and innovative analytics tools and
techniques to facilitate Big Data implementations
The key stimulus for Big Data implementation is the innovation in database architectures and analytical
tools. Technologies are emerging in the areas of:
Data storage and management (architectures): A number of database architectures and systems
such as Hadoop, NoSQL database systems, and MPP systems have emerged, enabling easy storage
and analysis of high volume unstructured data, thus improving scalability and fault tolerance. These
systems perform data management functions much faster through distributed processing and rapid
parallel computations on large clusters of computer nodes.
Data storage, advanced analytics, and data processing: The need for faster data access, storage and
analysis has led to the development of in-memory databases such as SAP HANA and Terracotta's
BigMemory, which store data in a computer's memory, as opposed to disk-based database systems,
thereby enabling faster data processing, low latency and real-time analytical queries. In-memory
databases particularly help in algorithmic trading, e-Commerce and social media analytics, where
datasets are large and real-time analysis is required. Moreover, analytics tools such as Kognitio, SAP
HANA, and SAS analytics server enable rapid computing and real-time analysis by reducing response
time and providing a flexible and agile analytical environment through massively parallel processing of queries.
Advanced visualisation: Tools and techniques such as tag clouds, real-time dashboards, and heat maps
enable representation of multi-dimensional data, enhancing the quality of analysis and insight by
facilitating rapid and accurate observations. Unlike traditional visualisation tools, these new techniques
facilitate integrated display of performance metrics updated in real-time, enabling users to quickly
visualise complex data and get faster insights.
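The distributed processing idea behind systems such as Hadoop and MPP databases can be illustrated with a minimal map-and-reduce sketch. This is not the API of any of the products named above: the function names are hypothetical, and local threads stand in for cluster nodes purely for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition(records: list[str], n_parts: int) -> list[list[str]]:
    """Split records into n_parts chunks, one per hypothetical node."""
    return [records[i::n_parts] for i in range(n_parts)]


def map_count(chunk: list[str]) -> Counter:
    """Map step: each worker counts the words in its own chunk only."""
    counts: Counter = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial results into one."""
    return left + right


def distributed_word_count(records: list[str], n_workers: int = 4) -> Counter:
    chunks = partition(records, n_workers)
    # Threads stand in for cluster nodes here; a real system would ship
    # each chunk to a separate machine and handle node failures.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_count, chunks))
    return reduce(reduce_counts, partials, Counter())


logs = [
    "big data needs storage",
    "big data needs analysis",
    "storage and analysis scale out",
]
totals = distributed_word_count(logs, n_workers=2)
print(totals.most_common(3))
```

Because each chunk is processed independently and the partial results are merged afterwards, adding workers (or, in a real cluster, nodes) lets the same job handle larger volumes, which is the scalability property described above.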
Emergence of niche Big Data start-ups to boost technological
innovation
Tools and technologies required to manage and analyse Big Data present a growth opportunity for start-
ups to innovate and come up with new products. New organisations across the Big Data technology
stack have been thriving on the back of some robust investments anticipated in the Big Data space. The
centrepiece of Big Data technology innovation, the Hadoop distribution, has been put to commercial
use by many start-ups such as Cloudera, Hortonworks, Zettaset, and MapR, with some customisation
of the open source software. Furthermore, the business environment is witnessing a slew of start-ups
in non-Hadoop systems such as NoSQL and next-generation (MPP) data warehousing, including Couchbase,
Splunk and VoltDB. The industry also has many start-ups emerging in analytics platforms and
cloud-based applications as well as in the advanced data visualisation space. While the past 2-3 years
have mainly seen new organisations coming up in the data management space, analytics applications are
the impetus for growth in the next few years. Some of the start-ups in this field include Karmasphere,
Kognitio, 1010Data, Revolution Analytics and QlikView.
The Big Data technology space is witnessing a lot of venture capital activity, with funding in Big Data
start-ups reaching ~USD 2.5 billion in 2011, compared with ~USD 1.5 billion in 2010. These start-ups are
innovation hubs that are gaining importance across industry verticals. Most of these organisations are
witnessing high double-digit revenue growth driven by the huge demand for their solutions. Moreover,
many start-ups are being acquired by larger IT players given the growth opportunities and the need to
build Big Data capabilities. For instance, IBM has acquired Tealeaf Technologies, Vivisimo and Varicent;
Teradata acquired eCircle, and EMC acquired Greenplum.
Large IT players leveraging M&As to add Big Data capabilities to
their service portfolios
The Big Data space is witnessing a string of M&As driven by the need to quickly ramp up capabilities
and to have a complete set of capabilities to service clients who are keen on Big Data
implementation. Leading technology players such as Oracle, IBM, SAP, and EMC are aggressively
acquiring smaller Independent Software Vendors (ISVs) and data analytics firms to strengthen their
Big Data portfolio.
IBM is at the forefront of this phenomenon through multiple acquisitions over 2010-12 in the Big Data
space. It acquired Vivisimo and TeaLeaf Technology in 2012, i2 Limited in 2011 and Coremetrics and
Netezza Corporation in 2010, to bolster its Big Data capabilities. Further, HP acquired Autonomy
for more than USD 10 billion, making it the largest deal in the Big Data industry. HP aims to cater to
the Big Data market by leveraging Autonomy's pattern matching technology that recognises and
processes Big Data.
Emergence of cloud-based development and deployment for
Big Data solutions
As data is increasingly becoming unstructured, complex and varied, it has become imperative to
process and analyse it in real-time. New data-centric solutions such as Database Platform-as-a-
Service (PaaS), on-demand database service, analytics Software-as-a-Service (SaaS), as well as
on-demand data preparation, storage or enrichment through Data-as-a-Service (DaaS) are now
commercially available.
These Big Data cloud solutions enable traditional enterprises to scale up their data management
and storage at lower costs and provide them real-time insights about the data that could not be
stored before.
While the existing SaaS application service providers are working towards product/service differentiation
to ensure that customers derive more value from their applications, new pure-play service providers
are launching Big Data-specific cloud applications and services. For example, Google, Amazon Web
Services and Microsoft have enhanced their cloud offerings to offer PaaS and analytics SaaS for
Big Data. Leading technology players are also launching Big Data cloud solutions: in June 2012, CSC launched
its DaaS ClimateEdge, a suite of reports that leverages data from NASA, the National Oceanic and
Atmospheric Administration (NOAA) and other government sources and uses on-demand advanced
analytics to manage climate-related risk and exposure. New players such as 1010Data and Kognitio
are also offering their cloud-based Big Data solutions to their customers, enabling them to analyse
Big Data on-demand.
However, the adoption of Big Data through cloud applications may witness a few roadblocks in terms
of data privacy and security concerns. For example, regulations such as Health Insurance Portability
and Accountability Act (HIPAA) Privacy Rules that ensure patient privacy of shared data may inhibit
the adoption of Big Data analytics on-demand.
Potential shortfall of 1.5 million data-savvy managers and
~150,000 data scientists in the US in 2018
The Big Data phenomenon has led to an increasing demand for data scientists, professionals
conversant with both the business context and data analytics, who play a crucial role in extracting
insights from large datasets, analysing these and then presenting the value-added information to
business users or non-data experts. Big Data needs a new breed of professionals with deep expertise
in statistics and machine learning, as well as managers and analysts who can leverage insights from
Big Data. The shortage of such talent is a significant challenge that organisations need to address
for successful Big Data implementation. According to McKinsey, the US alone faces a shortage of
140,000-190,000 analysts and 1.5 million managers who can analyse Big Data.
To address the shortage, organisations have embarked on initiatives to train their existing employees
and develop new talent. Organisations such as EMC, Oracle and IBM are partnering with universities
to offer courses on various elements of Big Data. Internally, enterprises are creating organisational
cultures that are favourable for data-driven decisions by hiring employees from academic fields such
as statistics and mathematics, as well as through on-the-job training on emerging technologies in
the Big Data space.
Slow enterprise adoption due to lack of awareness about
benefits of Big Data
While there is a lot of attention on Big Data and organisations worldwide have started investing in
it, adoption by traditional enterprises has been slower than expected. This is partly due to difficulties
in understanding the Big Data paradigm and how to integrate it with legacy systems and extract
business value.
Industry studies show that a majority of respondents, mainly senior executives from diverse industry
verticals the world over, acknowledge that Big Data holds significant business opportunities; however, there
is a lack of understanding about how data can be used to drive businesses forward. Further, ensuring
that investing in Big Data implementation would achieve a high RoI is also a major concern. Given the
gap in understanding the benefits and opportunities of Big Data, many enterprises are less inclined
to give it high priority for immediate investments. However, the market appears receptive as most of
the leading organisations across industry verticals are willing to integrate Big Data into their existing
systems, and are engaging in pilot projects to examine their success.
The value offered by Big Data is not yet beyond doubt, as there are skeptics who still question
whether it is worth all the investments being poured into it. This is in part due to the lack of abundant and
well-publicised business cases on successful implementation and the benefits accrued. Therefore, as
executives lack an understanding of Big Data, and in some cases do not sponsor it, IT organisations
may face additional complexities in terms of budget and bandwidth constraints in the process of
implementing Big Data.
Data related regulations like Dodd-Frank and Basel III to impact
Big Data implementations
An increasing number of regulations are driving organisations to source, analyse and report large
amounts of data. Regulations such as Dodd-Frank, Basel III and HITECH mandate more transparency
and real-time reporting for data collected from multiple systems/sources, their aggregation, analysis
and storage. Consequently, organisations in various industry verticals are leveraging Big Data analytics
to comply and provide more transparency. This has prompted data management, storage and analysis
to be more comprehensive and real-time.
While regulations in industry verticals are driving Big Data adoption, regulations such as the EU Data
Protection Directive may impact adoption of Big Data analytics, particularly in cloud-based delivery
models. Further, with businesses collecting and storing large amounts of customer data, privacy-related
concerns have also increased. Some countries have already enacted legislation to protect the privacy
of individuals and many are in different stages of formulating it. Therefore, businesses will also
have to consider certain regulatory aspects as they move towards leveraging Big Data analytics using
stored customer data.
India's Advantage in the
Big Data Opportunity
India's Big Data market opportunity estimated
at ~USD 200 million in 2012
India is rising to play an important role as a key outsourcing destination in the overall global Big Data
landscape for services relating to Big Data technology implementation and analytics, capitalising on its
already well-established IT/BPO and knowledge service outsourcing industry, which offers significant
cost and intellectual arbitrage to global multinationals.
India's domestic demand for Big Data analytics is at a nascent stage since most Indian organisations
still consider Big Data mere hype. The opportunity for Indian service providers arises from
offering Big Data technology implementation and analytics outsourcing services, which is growing
robustly. In 2011, India's Big Data outsourcing opportunity was estimated by CRISIL GR&A to be around
USD 90 million and is projected to grow by ~110-115 per cent in 2012 to USD 200-205 million. The IT
services segment, which primarily comprises the Big Data technology implementation, including data
collection, integration, and designing of Big Data architecture and data analytical tools, is expected
to account for 82-84 per cent of this growth projection, while the Big Data analytics services is likely
to account for 16-18 per cent.
Although an immense amount of data is being generated across all industry verticals including financial
services, manufacturing, retail, healthcare, telecom, logistics, and others, financial services and telecom
are early adopters of Big Data technologies.
Key factors pushing organisations to adopt Big Data analytics include the large volumes of data
being generated across global organisations as a result of the increasing use of the Internet, mobile, social
media marketing, and Machine-to-Machine (M2M) conversations, and the need to utilise this data
to derive meaningful insights that help organisations make well-informed decisions.
Global In-house centres, pure-play analytics firms and IT/BPO
players expected to benefit from the Big Data opportunity
The Big Data outsourcing market, though still at an embryonic stage, is being tapped aggressively by
the global in-house centres (captive centres of multinationals) as well as the Indian service providers
comprising IT/BPO players, pure-play analytics firms and knowledge service providers.
Global In-house Centres: Global multinationals have set up these centres across India to offer
support on various back-end processes such as accounting, HR, and payroll as well as to offer
an offshore base for knowledge services such as business research, financial research, data
management and analytics and legal services. With growing interest in Big Data, organisations
are leveraging their already established in-house centres for Big Data technology implementation
as well as to handle large volumes of unstructured data to provide business intelligence and data
analytics solutions.
Global in-house centres have been successfully leveraged to unleash the power of Big Data as
they enable seamless sharing of data given that they are a business unit/division of the parent
organisation. This is because there are no data security/privacy issues and there is a high level of
data integration with the parent. Further, the management enjoys tighter control over the data and
applies analytics closely related to business needs given that these centres have built-in domain
knowledge. Some of the key players who have set up in-house centres to deliver Big Data analytics
to their parent organisation are:
- Retailers such as Sears Holdings and Walmart
- IT/technology service providers such as Google, Yahoo, HP, SAP, Oracle, IBM and Dell
- Financial service organisations such as JPMorgan Chase, Merrill Lynch, HSBC, American Express,
Goldman Sachs, Barclays, Bank of America, Citigroup and Wells Fargo
Pure-play Analytics Players: These primarily comprise Indian and global pure-play analytics
firms as well as major knowledge service outsourcing providers who offer analytics and are now
establishing their presence in the Big Data analytics field. Key pure-play analytics firms operating in
the industry are: Bridge i2i, Nuevora, MuSigma, Cognilytics, Fractal and AbsolutData. Key knowledge
services outsourcing players such as CRISIL GR&A, Ugam Solutions, and SmartCube are increasingly
taking interest in expanding their analytics capabilities to harness the potential of Big Data. These
service providers enjoy strong subject matter expertise, leverage the best practices in the industry
to offer analytics services, and offer optimally priced services, given the economies of scale coming
from serving various clients with Big Data analytics. These players face key challenges such as low
levels of data integration with the clients, intellectual property and data security.
Integrated IT/BPO Providers: Several integrated IT/BPO players engaged in application development
& management, and infrastructure management as well as BPO players providing outsourcing
services for back-end functions have also entered the Big Data market and are moving from
simpler business process services to providing Big Data implementation, tools, and technologies.
To strengthen their presence in Big Data, these players leverage their global presence and existing
multinational client base looking at Big Data implementation as well as utilise their strong
technology orientation to provide Big Data tools and technologies. This business model mainly
comprises two categories of players:
- IT-BPO providers such as Infosys, TCS, Wipro, and HCL. TCS and Infosys are helping their global
multinational clients in designing and implementing Big Data technology
- Key BPO vendors such as Genpact, EXL, and WNS
Pure-play providers and integrated IT service providers are active
in providing services in the Big Data environment
Global in-house centres to be the front-runners in Big Data
servicing; but IT/Analytics players follow closely
Big Data analytics came into play globally in late-2011. In 2011, many multinationals were skeptical
about Big Data implementation and trying to quantify the Return on Investment (RoI) to build a
case for Big Data implementation. The early adopters of Big Data analytics have tried to leverage
their in-house global centres in India, given the talent shortage in the developed world, to generate
meaningful insights from Big Data. The ease of seamlessly sharing data and information also prompted
multinationals to leverage their analytics and knowledge centres in India to conduct Big Data analytics.
Global multinationals across verticals such as financial services, retail, technology, and healthcare have
started leveraging their Indian centres for Big Data implementation and analytics.
In 2012-13, the success of global in-house centres in the Big Data market is expected to catapult the
emergence of a hybrid service model in which the in-house centres of global organisations would offer
analytical services to external clients in addition to their internal business units. Further, pure-play
analytics firms present in India are increasingly deploying advanced analytical tools and techniques on
Big Data sets to gain significant business traction as more and more Big Data business opportunities
move to India. Integrated IT/BPO service providers are building Big Data architecture and offering
analytics services to their clients.
Some of the key initiatives taken by Indian service providers and global multinationals are:
In 2012, Sears Holdings, the fourth largest retailer in the US, created a wholly-owned
subsidiary, MetaScale, to target and sell its managed Hadoop services (or Big Data services) to
customers with revenue of between USD 1.0 million and USD 10.0 million across healthcare and
entertainment verticals
Walmart expanded its e-Commerce operations to India by opening a @Walmartlabs facility in
Bengaluru, India, in April 2012, to develop social media analytics and Big Data infrastructure
In July 2012, Yahoo also set up a Grid Computing Lab at the IIT-Madras campus in partnership with
the institute to enable researchers to access web-scale data and conduct research on Big Data
issues such as search, personalisation and digital advertising
Infosys aggressively focuses on offering major enablers for Big Data analytics adoption including
solutions, services, and expertise across key industry verticals such as financial services,
manufacturing, healthcare, and telecom
In 2012, TCS won Big Data contracts to deliver next-gen insights using Big Data frameworks for a
global airline, a US-based bank and a global market research firm as well as to set up a
leading-edge distributed data warehouse for a hi-tech firm using Big Data
BPO service providers such as Genpact and IBM Daksh are also being seen as strong contenders in
the analytics domain and are well poised to capitalise on the Big Data trend
The Big Leap in Big Data is expected to come by 2014 when the stage of testing waters would have
been successfully crossed and Big Data pilot projects would have delivered profitable results or expected
RoI for clients. Once the multinational organisations realise the potential opportunity offered by
Big Data analytics, more and more organisations are expected to undertake Big Data implementation in
a big way to strengthen their business and enhance profitability. All the players are expected to expand
their operations to tap the growth in the market. Hence, the industry is expected to witness:
The emergence of several new Big Data analytics firms to cash in on the growing Big Data opportunity.
Further, these analytics firms and knowledge service players are expected to play a dominant role
in the Big Data analytics space
Integrated IT services providers offering services across the Big Data value chain, from
implementation and consulting to analytical services
Global in-house centres are likely to continue to grow, and more and more multinational organisations
are expected to leverage this business model and set up/expand their in-house centres for
Big Data implementation
Service providers are leveraging partnerships, M&As and venture
funding to capture the Big Data outsourcing opportunity in India
Major services providers across the country are undertaking several strategic initiatives to capitalise
on the Big Data outsourcing opportunity. The industry is witnessing an increasing thrust on leveraging
venture capital funding; collaboration for developing Big Data technologies and joint go-to-market;
mergers and acquisitions to enhance capability across Big Data software and services as well as
expanding overseas presence to capture the market.
Venture Capital (VC) funding: In recent months, venture and growth capital firms have
invested huge amounts in Big Data organisations, primarily to enable these firms to strengthen
their operations
Partnerships with foreign players: Big Data service providers are entering into technology
partnerships and collaborations to expand their capabilities to serve new markets and
industry verticals
- In August 2012, Intel built partnerships with India-based Independent Software Vendors (ISVs)
across various business segments such as financial services, manufacturing, education, retail,
telecom, and healthcare to foster its presence in the Big Data ecosystem in India
- In July 2012, BPO players such as Infosys BPO announced plans to look for partners in the
Big Data analytics field to strengthen their capabilities
Strategic M&A to gain Big Data capabilities: The hype in the industry has led to the mushrooming
of various smaller players offering Big Data services such as application development, system
integration, consulting, storage and architecture design. Established integrated IT/BPO service
providers and pure-play analytics firms are aggressively acquiring niche players to broaden
their capabilities
- In June 2012, Wipro acquired Australia-based Promax Applications Group, a specialised trade
promotion management firm, for USD 36.6 million to reinforce its presence in the Australian
market and strengthen its capabilities in Big Data analytics solutions
Geographic expansion: Indian organisations are also looking to expand their overseas presence to
market their Big Data capabilities and capture the market opportunity
Strengthening workforce: Various organisations are planning to collaborate with the academia to
train and certify data scientists to counter the impending shortage of data scientists, analysts,
and managers that is likely to challenge the Big Data market growth
- In August 2012, Intel announced plans to collaborate with educational institutions to bring
innovation in data analytics and research, and has tied up with ~300 colleges and universities
in India including the IITs and other educational institutes such as Pune University to foster
research and innovation in Big Data analytics
India has an early mover advantage vis-à-vis other geographies
in creating a strong base of Big Data workforce
India is expected to be a forerunner in Big Data talent supply, not as a cheaper alternative but as a
go-to-destination for the quality of talent in the country. India churns out more than 2.5 million university
graduates and about 750,000 post graduates every year, of which ~700,000 students are graduates
in Mathematics and Science and ~300,000 are post graduates in these fields. With its firepower of
intellectual pool in Mathematics and Science, India is all geared up for the Big Data revolution. Further,
with the ever-increasing number of students having domain expertise in decision sciences, India is
well-positioned to address the global demand for Big Data solutions.
With India already catering to the business analytics needs of global multinationals at the best possible
performance-to-cost ratio, the country has a huge potential to supply data scientists for the Big Data
industry. Tier I cities such as NCR (Delhi, Gurgaon, and NOIDA), Bengaluru, and Mumbai have emerged
as good breeding grounds in India for global organisations to set up their analytics centres of excellence
and they account for more than two-thirds of the analytics professionals in India. Further, more than
60 per cent of the analytical workforce in India has a work experience of 3-10 years, which is a boon to
Big Data analytics. These professionals have the ability to apply advanced analytics and can be trained
internally by organisations to work on Big Data.
Indian academia is also aggressively developing capabilities to match the ever-growing demand for and
dearth in supply of data scientists with analytical training through solemn intervention at the education
level and imparting training on analytical and statistical tools. Premier colleges/universities in India
already have courses in place to impart training in analytics. Key analytics courses in India include:
Business Analytics and Intelligence (BAI), IIM Bengaluru: An executive course, BAI requires at
least five years of work experience and is suitable for professionals who are already working in
analytics to enhance their knowledge as well as for those with an analytical aptitude
Executive Programme in Business Analytics, IIM Calcutta: This is a one-year distance programme
offered in association with Hughes Education, and covers topics such as data mining, soft computing,
design of experiments, survey sampling, statistical inference, investment management, financial
modelling, and advanced marketing research
Advanced Certificate Programme in Business Analytics, IIT Bombay: Designed in partnership
with HughesNet Global Education, it is a part-time course for analysts to develop the skills and
competencies of key analytics techniques such as behaviour and data modelling
Business Analytic & Data Mining, Indian Statistical Institute (ISI) Pune: Designed to guide
business analytics professionals in analysing large quantities of data to study unknown interesting
patterns through cluster analysis, dependencies (association rule mining), classification of data,
and predictive analytics
Post Graduate Certificate in Research and Analytics, MICA Ahmedabad: This is a
one-year programme based on a practical and non-technical approach through various data
analysis software
Indian universities continue to introduce courses in statistics and data analytics to produce graduates to
meet the manpower shortage in the global Big Data market. Recent academia initiatives for developing
the talent pool for Big Data analytics include:
In August 2012, Academy of Decision Science and Analytics started offering an e-learning Post
Graduate Programme (PGP) course in data analytics in association with Ivory Education
In July 2012, The Institute of Management Technology (IMT), Ghaziabad, signed an MoU with
Genpact to develop and implement analytics elective for the two-year post graduate diploma in
management programme to provide both theoretical and practical work experience in analytics as
applied in different industries
- Pankaj Kulshreshtha, Senior Vice President and business leader, Smart Decision Services,
Analytics and Research, Genpact, stated, "The emergence of big data, regulatory changes
and social media are causing a big shift in the way businesses operate and students of IMT
will learn how to combine process, analytics and technology to make organisations smarter in
this dynamic new world. It is also a great example of two organisations, both leaders in their
respective fields working together to build talent in an area which is expected to more than
double in the next 2-3 years in India."
In May 2012, IIM Lucknow partnered with the US-based Kelley School of Business to provide two
certificate programmes in business analytics and global strategy
- Dan Smith, Dean of the Kelley School, said, "Our collaborative goal is to fundamentally advance
the quality of decision making by business leaders by improving their ability to draw meaningful
insights from the massive amounts of data available to them today."
In November 2011, Indian School of Business (ISB) Hyderabad launched Asia Analytics Lab for
its students, which is a focal point for data analytics initiatives, education, research and business
applications in the Asian context
In 2011, the Indian Institute of Science (IISC) Bengaluru launched Master of Management, a
two-year course to focus on training students in Technology Management and Business
Analytics
Indian service providers are also making large investments and innovation in creating and grooming
a new breed of talent. For example, IBM has partnered with 500 universities in India to help more
than 30,000 students develop skills in predictive analytics. India is at an advantage vis-à-vis other
geographies, as apart from the ample number of graduates it produces each year, organisations in
India are also making huge investments in breeding and grooming such talent. Further, India retains
advantages due to demographic factors, and the fact that the education system is producing a huge
pool of analytical talent.
Indian service providers offering Big Data solutions
across verticals
1. Manufacturing: Indian service providers enable manufacturing
organisations to analyse large datasets for effective
decision making
The manufacturing sector generates large volumes of text, image and numerical data in its production
processes, R&D and engineering functions. The sector generates data from a multitude of sources,
including instrumented production machinery (process control), supply chain management systems,
and performance monitoring systems.
Large volumes of datasets thus aggregated are then subjected to different Big Data analytical tools
and techniques to generate useful insights across the value chain. Hence, Big Data finds application
across R&D, product design, supply chain management, production, marketing and sales, and
after-sales service.
R&D and product design: The use of Big Data in the R&D processes offers opportunities to accelerate
product development, help designers focus on product features based on concrete customer inputs
as well as use designs that minimise production costs
- Aggregate customer data and make them available to improve service and enable
design-to-value
- Source and share data through virtual collaboration sites (idea marketplaces to enable
crowd sourcing)
- Build consistent, interoperable, cross-functional R&D and product design databases to enable
rapid experimentation, simulation, and co-creation
Procurement: Manufacturing firms use Big Data analytics during the procurement process to drive
efficiency in their supply chain and improve demand forecasting processes. Manufacturers deploy
Big Data analytics to
- Gather sales, customer feedback, and demand patterns from distributors/retailers to rectify
any deviation in real-time, thereby improving the supply chain responsiveness
- Conduct a path analysis to design ways to move a product more effectively from the factory
to the customer
- Automate stock optimisation and replenishment decisions based on the analysis of
inventory-related data trends
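As an illustration of the automated replenishment decision described above, the sketch below derives a reorder point from recent daily demand. It is a minimal example and not a description of any provider's system: the function names, the service factor of 1.65 (roughly a 95 per cent service level if demand is assumed to be normally distributed) and the sample figures are assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def reorder_point(daily_demand, lead_time_days, z=1.65):
    """Expected demand over the supplier lead time plus a safety stock.

    daily_demand: recent daily unit sales (hypothetical sample data).
    z: service factor; 1.65 is an assumed value, not taken from the report.
    """
    average = mean(daily_demand)
    spread = stdev(daily_demand)
    safety_stock = z * spread * math.sqrt(lead_time_days)
    return average * lead_time_days + safety_stock


def should_replenish(on_hand, daily_demand, lead_time_days, z=1.65):
    """True when the stock on hand has fallen to or below the reorder point."""
    return on_hand <= reorder_point(daily_demand, lead_time_days, z)


if __name__ == "__main__":
    demand = [10, 12, 8, 10, 10]  # hypothetical units sold per day
    print(round(reorder_point(demand, lead_time_days=4), 1))
    print(should_replenish(40, demand, lead_time_days=4))
```

In practice the demand history would come from the inventory and sales datasets mentioned above, and the service factor would be chosen per product.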
Production: The deployment of the Internet of Things or actuators and sensors also allows
manufacturers to leverage real-time data from sensors to track parts, monitor machinery, and
guide actual operations. At the production stage, Big Data analytics is used in
- Digital factory simulations: Manufacturers take inputs from product development and historical
production data and apply advanced computational methods to create a digital model of the
production process and thus design optimal production layouts and digital shop floor control
and improved fault detection
- Sensor-based operations: Firms leverage Big Data analytics on the volumes of real-time, highly
granular data gathered from the sensors deployed across production lines to forecast operational
costs, schedule predictive systems maintenance, monitor labour and equipment performance,
and improve fault detection by identifying patterns that lead to potential equipment failure
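One simple way to picture the sensor-based fault detection described above is a rolling check that flags readings far outside the recent norm. The sketch below is only an assumption-laden illustration: the window size, the threshold of three standard deviations and the sample readings are hypothetical, and production systems typically use richer models.

```python
from collections import deque
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return the indices of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations."""
    history = deque(maxlen=window)
    flagged = []
    for index, value in enumerate(readings):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(index)
        history.append(value)
    return flagged


if __name__ == "__main__":
    # Hypothetical vibration readings from one machine; the spike is index 6.
    vibration = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0, 5.0, 1.0]
    print(flag_anomalies(vibration))
```

Flagged indices could then feed the maintenance scheduling mentioned above.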
Sales & Distribution: Manufacturing organisations track customer-related transaction data to
generate actionable insights on the customer buying patterns and behaviour, strengthen their
marketing and sales strategies and make informed product decisions. Analytics can be applied on
this data to
- Ensure improved customer segmentation and better customer relationship management
- Improve product inventory tracking
- Enhance the effectiveness of the sales force and marketing campaigns
After-Sales Service: Warranty analytics and real-time analysis of sales and feedback data are
the key Big Data analytics applications being leveraged by manufacturing firms.
These applications primarily involve analysing large volumes of warranty claims to improve product
development with the aim of improving product quality and reducing warranty costs. Further,
after-sales and feedback data can help enhance after-sales service as well as detect and rectify
manufacturing and design errors to enhance customer satisfaction
Some of the key benefits delivered by Big Data analytics for the manufacturing sector include:
Product demand forecasting and supply planning: Using real-time data from sales and demand
patterns or from customer feedback and purchasing behaviours, manufacturers can rectify any
deviation in real-time, engage in effective demand forecasting, adjust production levels and increase
the frequency of planning supply cycles to match with the production cycles
Improved collaborative engineering through crowdsourcing: Manufacturers leverage crowdsourcing to collect
product-/market-related data to enable collaborative engineering that results in innovative design
from customers. For example, auto manufacturing organisations encourage ideas from consumers
to make improvements to new car models. Big Data analytics enables these organisations to gather
and analyse data from tweets, blogs and other social media platforms effectively to offer innovative
features in newer versions of the vehicles
Mass customisation: By enabling design-to-value, Big Data analytics allows manufacturers to
leverage quantitative customer insights mined from sources such as PoS, customer feedback from
retail surveys, and social media platforms, and improve their output quantities as well as facilitate
mass customisation
Efficient planning and operations: Big Data aids in designing, simulating and testing product or
factory plans in a virtual manner, before the actual production or construction. Further, it is used
to predict equipment failures and system replacements to better anticipate any roadblocks in the
manufacturing processes.
To capitalise on this huge opportunity, various Indian Big Data service providers such as Infosys, Intel,
Fractal, and Wipro have built capabilities to win new clients as well as to better serve the existing ones
in the manufacturing sector.
In 2012, Infosys was selected as the sole sourced partner for cloud strategy and Big Data infrastructure
for a North American manufacturer, to devise a Big Data strategy and roadmap
In August 2012, Intel announced the signing of partnerships with India-based ISVs across various
business segments, including manufacturing among others, to build Big Data analytics capabilities
across India
Case examples: Indian service providers serving global
manufacturers on custom designed Big Data implementations
and analytics
2. Retail: Indian service providers help retailers understand
customer buying patterns and maintain optimal stock levels
Retailers generate Big Data through various sources such as social media, Point of sale (PoS) and web/
online sales platform (credit cards and rewards cards, purchases), consumer surveys, loyalty programme
profiles, in-store tools and footfalls. This customer-focused data can be used to gain significant and
meaningful insights into consumer behaviour, their buying patterns, and changing preferences.
Big Data analytics helps both online as well as brick and mortar retailers to improve their decision making,
manage the supply chain, inventory levels, merchandising and pricing, enhance focus on customer
segmentation and hence introduce targeted products/services as well as marketing/promotional
campaigns. Further, Big Data allows retailers to enhance their margins and productivity by enabling
them to perform real-time analysis of customer response to pricing/product changes/productivity and
refine their strategies based on such analysis.
Some of the important areas within the retail industry where Big Data analytics is being used are:
Supply chain and procurement: Retailers use Big Data analytics to help them better manage their
own and their suppliers' inventory levels, relationships with suppliers, and make informed decisions on
stock levels. For example, Barnes & Noble deployed a Big Data analytics solution from IBM to enable
suppliers to monitor its inventory and take appropriate replenishment decisions. Big Data enables
retailers to
- Improve inventory management, stocking decisions and stock forecasting by combining multiple
datasets such as sales history, weather predictions and seasonal sales cycles
- Optimise transportation and vehicle routing by using GPS-enabled Big Data telematics to
improve fleet and distribution management, enhance productivity by rationalising fuel efficiency,
preventive maintenance, driver behaviour, and vehicle routing
- Base their supplier negotiations for price discounts, and change in raw material preferences by
analysing customer preferences and buying behaviour data
Merchandising: Big Data implementation and analytics on the POS and RFID data can help retailers
to easily strengthen their merchandising-oriented decisions such as
- Assortment optimisation: Retailers make product assortment decisions in stores based on the
demographic and purchasing pattern data
- Price optimisation: Retail firms can leverage advanced demand-elasticity models on the pricing
and sales data available for deciding the optimum pricing of products and services
- Placement and design optimisation: Brick and mortar retailers optimise the placement of
goods and visual designs of their store layout by mining sales data at the SKU level and even
foot-traffic data, while online retailers adjust website placements based on data on page interaction
such as website traffic, scrolling, clicks, and mouse-overs
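The demand-elasticity idea behind the price optimisation item above can be sketched with a basic log-log regression. This is a simplified illustration under a constant-elasticity assumption, not the advanced models the report refers to; the function name and the sample price and sales figures are hypothetical.

```python
import math


def price_elasticity(prices, quantities):
    """Estimate price elasticity of demand as the ordinary least squares
    slope of ln(quantity) against ln(price)."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical observations: unit sales recorded at four price points.
    prices = [1.0, 2.0, 4.0, 8.0]
    units = [1000.0, 250.0, 62.5, 15.625]
    print(price_elasticity(prices, units))
```

An estimate below -1 indicates elastic demand, where a price cut would be expected to raise revenue; an estimate between -1 and 0 indicates the opposite.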
Operations: To create operational value and efficiency, retail firms are deploying Big Data
implementation to
- Ensure performance transparency by analysing store sales, SKU sales, and sales per
employee data
- Reduce costs while maintaining service levels by leveraging the labour input, time and attendance
data, and tracking labour scheduling information
Sales and marketing: It is the most common business function for which retail firms use Big Data
analytics. Key sales and marketing functions where Big Data implementation finds use are:
- Use customers' demographics, purchase history, preferences, and real-time location data for
cross-selling and up-selling of goods
- Undertake location-based marketing for offering promotional discounts, and special offers,
primarily leveraging the personal data generated by smartphones
- Enable customer micro-segmentation to deliver personalisation of products/services
to customers based on traditional market research data as well as data available from
behavioural tracking
- Use sentiment analysis that leverages consumer data generated by social media platforms
to make informed business decisions such as assessing the real-time response to
marketing campaigns
- Study in-store consumer behaviour to improve store layout, product mix, and shelf positioning
by tracking shopping patterns, real-time location data from smartphone applications, and
shopping cart transponders
Customer services: By applying Big Data analytics on customer behaviour, which can be tracked
through service centres (IVR and call centres) and social media platforms, retailers can improve their
interaction with customers for better service delivery
Big Data analytics has found significant acceptance in the retail sector, especially among the leading
players. Walmart acquired social media firm Kosmix to create WalmartLabs and is using this specialist
R&D unit to redesign its business by merging social, mobile and retail data, to understand consumers'
buying habits. Further, in April 2012, Walmart expanded its e-Commerce operations to India and
opened the @Walmartlabs facility in Bengaluru, India, to develop social media analytics and Big
Data infrastructure. Other retailers such as Sears utilise their in-house IT/technology centres in India
to provide Big Data analytics to set product prices in real-time and move inventories. Sears also has a
subsidiary, Metascale, which helps other organisations in industries such as energy and healthcare
implement Hadoop.
Big Data-driven analytics hold much potential for retailers in the realm of customer intelligence.
These include:
The ability to profile and segment customers based on socioeconomic characteristics can allow
firms to market to different segments based on their discrete preferences and hence generate
better customer retention rates
Online social network analysis enables businesses to monitor consumer sentiments towards their
brands, react to trends as they develop, and identify influential individuals within networks for
direct marketing
Using Big Data to construct predictive models for customer behaviour and purchase patterns
facilitates the accurate appraisal of each Customer's Lifetime Value (CLV) to a firm, allowing
resource allocation towards acquiring and retaining profitable clients, thereby raising the
overall profitability
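To make the CLV appraisal mentioned above concrete, the sketch below uses one common simplified form: the expected margin in each year is weighted by the probability that the customer is still active and discounted back to today. The report does not specify a formula, so this form, the function name and the sample inputs are assumptions for illustration only.

```python
def customer_lifetime_value(annual_margin, retention_rate, discount_rate, years):
    """Discounted sum of expected yearly margins over a fixed horizon.

    Year 0 is counted in full; each later year is scaled by the chance the
    customer has been retained and by the discount factor.
    """
    return sum(
        annual_margin * retention_rate ** t / (1 + discount_rate) ** t
        for t in range(years)
    )


if __name__ == "__main__":
    # Hypothetical customer: 100 in margin per year, 80 per cent retention,
    # 10 per cent discount rate, five-year horizon.
    print(round(customer_lifetime_value(100.0, 0.8, 0.10, 5), 2))
```

In a predictive setting, the retention rate would itself be an output of the customer behaviour models described above rather than a fixed input.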
Sears is leveraging Big Data analytics to turn itself around, and is
also keen on offering analytics services to external clients
3. Financial services: Witnessing increased adoption of Big Data
analytics, to reduce risk and uncover new market opportunities
Financial services is considered to be a very data-intensive sector, with more data per million of revenue/
operating expenditure or per employee, than almost all other sectors. Within the sector, structured
and unstructured data is available from a variety of sources such as customer and transaction data
from various channels such as branch, kiosks, mobile and web; social media; emails; credit cards
data; insurance claims data; stock market data; statistical data, PDF and Excel files, news, videos, and
government filings.
With the industry facing a multitude of challenges such as higher customer expectations, uncertain
operating environment, strict regulations, stiff competition, and slowing economic growth, Big Data
analytics can help banks, capital markets and insurance organisations by providing tools to reduce
costs and improve productivity. Increasing regulatory compliances and the need for collecting every
piece of data and standardising them is driving the growth of Big Data analytics. Several areas within
the financial services sector are expected to gain from Big Data technologies. They include:
Banking
Credit reward programme analysis: Banks are increasingly using unstructured data to understand
customer profiles and introduce successful credit cards with innovative rewards programmes
- For example, a national bank used a Big Data solution to analyse data from sources such as call
centres, customer service emails, and social media conversations to create a credit card offering
with a rewards programme to attract a young, professional demographic. This helped in providing
information to the marketing department to create a targeted promotion campaign, including
strategically placed social messaging and monitoring
Capital Markets
Trading surveillance: The financial sector leverages Big Data to monitor trading activities and identify
abnormal trading patterns. In surveillance, Big Data analytics allows online access to trade-by-trade
history for investigation, trending, and discovery to be combined with real-time data to provide a
real-time and historical context to behaviour
- For example, organisations combine data about the parties that participate in a trade with the
complex data that describes relationships among those parties and how they interact with one
another. The combination allows the bank to recognise unusual trading activity and to flag it
for review
Insurance
Insurance organisations are increasingly using unstructured data to predict client longevity, along
with examining the prospective clients' medical status by analysing their general comments, visits to
particular websites, and enquiries about specific products.
Insurers also use weather and calamity information to manage claims exposures and losses, based on
unstructured data from weather measurements and soil observations.
- E.g., an insurance organisation sells Total Weather Insurance, which pays local farmers
when they are impacted by weather events that affect their profits. The organisation uses a
cloud-driven Big Data analytics service to predict the possibility of extreme weather, along
with the potential impact. It builds and prices its insurance policies accordingly, based on 2.5 million
daily weather measurements, 150 billion soil observations, and 10 trillion scenario data points
Big Data is being extensively used across all domains of financial services for risk management,
fraud detection, compliance and customer relationship management:
Risk management: Predictive modelling of customer behaviour and scoring techniques enable
financial sector organisations to assess and minimise default risks at an individual level and make
customised offerings, in line with the customer's risk profile
- E.g., a large bank wanted to use 12 years of monthly account-level credit card data, credit
bureau information and bank account information to better assess the risk before granting
loans or raising credit limits. Ideally, it wanted this information in real time. To speed the
computing, it used an in-database Big Data approach, which helped the bank to calculate risk
70 times faster
Fraud detection: Big Data technologies give financial services organisations the ability to run
exploratory modelling and discovery on data, thereby increasing the accuracy of fraud detection
models. The faster processing capability enables organisations to quickly build or refresh fraud
detection models, and also helps in detecting fraud in real-time by analysing streaming
transaction data
Compliance and regulatory reporting: Increased oversight and scrutiny of organisations'
operations, funding and investment portfolios has led financial services organisations to adopt
sophisticated Big Data technologies to store and process vast amounts of data to simplify and
streamline their regulatory and compliance reporting
- For example, the Reserve Bank of India (RBI) has directed all Indian banks to standardise their
regulatory reporting by following an Automated Data Flow (ADF) approach to ensure
100 per cent accuracy and zero human intervention in every stage of reporting: right from data
extraction from source systems to the actual submission of returns. Firms that could not utilise
complete information and firms that believed reporting did not really require management
attention are increasingly focusing on Big Data analytics
Customer relationship management: Big Data analytics also helps financial service organisations
acquire new customers and cross-sell their offerings to existing customers by identifying
the most profitable customers and running effective marketing campaigns. The
large volume of unstructured data from social media is combined with the CRM systems to
study customer behaviour and optimise customer experience. Apart from customer acquisition,
organisations can improve customer retention by using predictive analytics to detect early signs
of disengagement
Financial services organisations are gaining business advantage by mining and analysing Big Data to
stay ahead of the competition, improve customer service, detect fraud, accurately calculate risks and
maximise operational efficiencies, along with adhering to stringent regulations and compliances.
Indian service providers are enabling Big Data analytics in the area of fraud detection, client behaviour
analysis, trading pattern analysis, risk calculation on large portfolio of loans, and improved and targeted
marketing campaigns. Further, Indian financial sector organisations are increasingly favouring Big Data
analytics to tackle terabytes of unstructured data:
- YES Bank is seeking solutions to handle the increasing pile of unstructured data from mobile
devices and social media networks, as well as customer transaction data such as cash withdrawals
at branches and ATMs. The bank feels the regulatory requirement of storing internally generated
data is driving banks to adopt Big Data
Case examples: Financial services firms are using Big Data to
prevent fraud and better understand customer profiles
4. Telecom services: Telcos are using Big Data to
boost marketing, reduce attrition rate and enhance
network productivity
The telecommunication service industry is characterised by extremely high levels of competition. This
has resulted in the telecom organisations' shift in focus from simply reducing costs and increasing
profitability to delivering value and managing customers' experience over their networks. Further,
commoditisation of traditional voice-based services has led to declining Average Revenue per User
(ARPU) and margins. So it has become important for the telecom service providers to differentiate
themselves by providing innovative and high quality services, while avoiding network overload and
cost overruns.
The telecom industry generates large volumes of real-time data, including customer call logs, billing
and usage data as well as data from networks and routers, access points, mobile devices, and social
media platforms. This presents a huge opportunity for telecom players to leverage Big Data analytics
to derive meaningful insights, help gain better control of services and make effective operating and
investment decisions. Following are some areas where Big Data analytics can play a significant part:
Network planning and optimisation: Big Data implementation can help operators to efficiently
plan and predict network growth based on past capacity utilisation, marketing demand forecasts
and service consumption trends and implement network changes just ahead of the demand curve. It
can also help them to analyse the data available on various web metrics to assess the bandwidth
utilisation and better plan how to use the unused resources
Service quality management: Big Data analytics can give telecom service providers the ability to
analyse real-time streaming data from network elements and consumer devices to predict network
failures and take preemptive steps
Price & product customisation: Using insights generated by combining customer usage and
subscription data with network, cost and revenue data, telecom organisations can provide a wide
range of services to their customers
Strategy assessment and decision making: Leveraging the data generated from customer
records across various platforms, telecom organisations can design their marketing campaigns
and promotional schemes/discounts/offers to better target customer groups and offer more
personalised/targeted services
Customer attrition management: By conducting predictive churn management analytics on their
customer data, telecom service providers can identify high risk customers who are likely to leave
the network and offer them timely and attractive deals to retain them
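The churn identification described in the list above can be sketched as a scoring step that converts customer attributes into a probability of leaving. The example below is purely illustrative: the attribute names, the weights and the 0.5 cut-off are hypothetical, and in practice the weights would be fitted to historical churn data rather than set by hand.

```python
import math

# Hypothetical logistic-model coefficients, assumed for illustration only.
WEIGHTS = {"dropped_calls": 0.4, "complaints": 0.8, "months_on_network": -0.05}
BIAS = -2.0


def churn_probability(customer):
    """Logistic score in [0, 1] for how likely a customer is to leave."""
    z = BIAS + sum(weight * customer[name] for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def high_risk(customers, cutoff=0.5):
    """Identifiers of customers whose churn score is at or above the cut-off."""
    return [c["id"] for c in customers if churn_probability(c) >= cutoff]


if __name__ == "__main__":
    subscribers = [
        {"id": "A", "dropped_calls": 5, "complaints": 2, "months_on_network": 6},
        {"id": "B", "dropped_calls": 0, "complaints": 0, "months_on_network": 24},
    ]
    print(high_risk(subscribers))
```

The resulting list is the group that would be targeted with the retention offers mentioned above.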
As the telecom industry has been one of the early adopters of Big Data tools and technologies, it is
reaping several benefits from the usage of Big Data analytics, including planning efficient utilisation
of network bandwidth, improving service levels by proactively detecting network and router failure,
and better customer retention through targeted marketing and promotion campaigns.
Various Indian telecom service providers have adopted Big Data technologies and realised
its benefits
Reliance Communications plans to use Big Data implementation on the data generated by its
telecom business for analytical planning and strategic decision making. Reliance has adopted Multi
Parallel Processing DB to store CDR and unstructured data and perform analytics on it
Bharti Airtel creates more than 5,000 targeted campaigns a day using Big Data generated from its
customer usage, billings and sales details
Case example: Impetus provided a pragmatic approach using
NoSQL Apache Cassandra
5. Healthcare: Is getting transformed with the adoption of
Big Data analytics, substantially improving patient care
In the healthcare industry, data is being generated at a faster pace owing to rapid digitisation of patient
healthcare records, monitoring of in-patients and out-patients through sensors, generation of epidemic
data, genomics research, medical imaging (MRI, CT-Scan) and implementation of Hospital Information
System (HIS), Picture Archiving and Communication System (PACS) as well as gathering of patient
behaviour and sentiment data from social media platforms.
Currently, only a few physician offices and hospitals, mainly in the US and UK, have Electronic Health
Records (EHR) systems in place, but that number is likely to increase as the Health and Human Services
departments and private hospitals are likely to support EHR adoption rapidly in the coming years. This
has prompted healthcare providers, payers, pharmaceutical and medical products organisations to adopt
Big Data and explore measures to manage costs, develop products and provide better healthcare to
patients. There are several areas where Big Data technologies play a critical role:
R&D, Life Sciences/Biomedicine: In this area, Big Data technologies are useful in drug discovery
analysis, data annotations and validity analysis of genomic, proteomic, and metabolic data and
studying gene expression for next-generation sequencing and read mapping
- E.g., Clinical Genomics uses algorithms and analytics to find treatments for conditions based on
a patient's genetic profile. Doctors can use Clinical Analytics to analyse patients' similarities,
predict outcomes, evaluate risk benefits and view treatment options
Patient Care: Big Data analytics is increasingly being used in the areas of patient care such as patient
monitoring and assessment, patient care personalisation, providing effective and value-added
services, preventative care, identifying potential causes for infections, readmission, and diseases.
Some of the instances of how Big Data improves patient care are:
- Application of Big Data analytics to patient profiles (e.g., segmentation and predictive modelling)
which help to identify individuals who would benefit from proactive care or lifestyle changes.
For instance, these approaches can help identify patients who are at high risk of developing a
specific disease (e.g., diabetes) and would benefit from a preventive care programme
- Using visual analytics, doctors can look more deeply into care processes to identify the most
effective ones and how they can be fine-tuned
- Improve patient care by analysing data coming from a myriad of remote patient monitoring
devices such as wearable devices, home sensing devices, and video monitoring
Healthcare operations: Healthcare operations include activities such as understanding and
influencing consumer behaviour, optimising physician interactions, clinical decision support system,
monitoring and educating patients, and Comparative Effectiveness Research (CER).
- E.g., automated diagnoses of early-stage breast cancer by using Big Data analytics technology.
By automatic analysis of large sets of mammographic images, using a unique Image Classification
approach, healthcare organisations can classify large collections of images based on a small set of
training images, which helps the radiologists to speed up their time-to-diagnose
Epidemiology: Big Data technologies are helpful for pattern analysis and trends in health issues
across a geography, tracking of the spread of disease based on streaming data, and visualisation
of global outbreaks, enabling the determination of source of infection
Healthcare security: The healthcare sector loses huge sums of money due to medical fraud. Big
Data technologies enable government and insurance organisations to detect fraud in real-time and
prevent financial losses arising from fraudulent claims
Therefore, Big Data helps organisations within the healthcare, life sciences and pharmaceutical space
to improve the quality of patient care or proactive care, lower the cost of healthcare services and
patient care, enhance fraud detection and make hospital operations more efficient, and accelerate
research and development. Organisations such as Quintiles and Accenture are leaders in providing
Big Data analytics for the healthcare and pharmaceutical space.
The Future of Big Data
Global Big Data market to reach ~USD 25 billion by 2015
As enterprises undertake pilots for Big Data implementation and large IT organisations and start-ups
compete for market share, the global Big Data market is expected to grow by about 46 per cent to
more than USD 25 billion by 2015. The IT & IT-enabled services, including analytics, are expected to
grow the fastest, at a rate of more than 60 per cent, with their share in the total Big Data market
expected to increase to ~45 per cent in 2015 from ~31 per cent in 2011. Big Data analytics is likely to be
driven by the near-ubiquitous nature of the data and proliferation of technologies and applications
such as mobile sensors, smartphones and social networking, along with the growing realisation of the
benefits of Big Data by enterprises.
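The growth arithmetic behind figures of this kind can be checked with the standard compound annual growth rate (CAGR) relation. The sketch below assumes, which the text does not state outright, that the 46 per cent figure is a compound annual rate over the four years from 2011 to 2015; under that assumption it backs out the market size the forecast would imply for 2011. The function names are illustrative.

```python
def implied_base(end_value, annual_growth, years):
    """Starting value implied by an end value and a compound annual growth rate."""
    return end_value / (1 + annual_growth) ** years


def cagr(start_value, end_value, years):
    """Compound annual growth rate between a start and an end value."""
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # USD 25 billion in 2015 at an assumed 46 per cent a year over four years.
    base_2011 = implied_base(25.0, 0.46, 4)
    print(round(base_2011, 2))
    print(round(cagr(base_2011, 25.0, 4), 2))
```

Under these assumptions the implied 2011 market is roughly USD 5.5 billion; a different reading of the 46 per cent figure would give a different base.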
While Big Data could add momentous value in the coming years, it might have to overcome certain
roadblocks. Though early movers are formulating Big Data strategies, mass adoption may be hindered
by the lack of best practices and the significant cultural change organisations require for sharing data.
However, as organisations leverage large datasets from within and outside, Big Data is likely to continue
to grow as an area which can deliver substantial benefits. Finally, the aggressive efforts of service
providers (both large IT organisations and niche start-ups) to demonstrate their domain expertise
and ability to derive valuable insights from Big Data would be an enabler to this opportunity.
India Big Data outsourcing opportunity to increase over
2012-15 to lie between USD 1.1-1.2 billion
India's Big Data outsourcing opportunity is likely to grow by about 83 per cent annually to ~USD 1.0
billion during 2011-15. India is expected to be the preferred destination for analytics and IT services for
Big Data due to its pre-eminence in IT/BPO services, knowledge services outsourcing and analytics as
well as for its intellectual pool of talent. The share of analytics in the overall Big Data opportunity is
expected to rise from ~16 per cent in 2011 to 25 per cent in 2015.
The key drivers for India include the efforts of service providers to develop talent and increase their
domain expertise and breadth of services. Moreover, a number of Indian service providers are leveraging
partnerships with Big Data technology players to facilitate delivery of Big Data solutions. Finally, while
the current demand for Big Data analytics is generated from global clients, domestic demand in India
is also gaining traction. For example, Asian Paints and Star India have leveraged Big Data analytics to
track and analyse large datasets.
Global Big Data market to evolve, India to emerge as a preferred
destination for analytics and IT services
The Big Data industry is likely to continue to strengthen its foundation over the coming years by investing
more in Big Data technologies and tools. While the emphasis was on technology innovation
in database storage and management in 2011-12, the focus is expected to shift to the delivery of
Big Data analytics with newer applications coming in for analytics and visualisation. Further, with
the growing use cases of Big Data implementation, best practices could help in the wider adoption of
Big Data analytics.
The US is likely to remain the major market, while demand for Big Data solutions from APAC and
Europe is expected to gain traction in the next 2-3 years. Service providers are expected to continue
expanding their offerings and educating clients about the benefits of Big Data. Large integrated
IT-BPO players are likely to leverage technological partnerships with large Big Data players such as
Cloudera, EMC (Greenplum), and HortonWorks, as well as evaluate M&As as a tool to build a robust
portfolio and provide a global delivery network.
As things stand, India is ideally poised to capitalise on Big Data, but there is still work to be done if it
is to fully realise its potential in terms of a refined talent pool, a mature service-provider
landscape and innovative service delivery. Continued excellence, along with India's key value proposition,
will ensure the country's position as a hub for Big Data analytics.
Indian service providers expected to hold a lion's share in
analytics and IT services for Big Data
With more businesses embracing a data-driven decision making culture, Indian IT/BPO service providers
and analytics players are providing clients the necessary tools and solutions required to harness
Big Data. They offer analytics-led solutions for better customer insights, unique market differentiation
and managing risks and financial metrics more effectively.
Indian players are likely to build capabilities across the entire spectrum of the Big Data ecosystem.
While IT/BPO players would build capabilities in the development of infrastructure, implementation,
and delivery, analytics and knowledge service providers are expected to scale up their capabilities
in providing Big Data analytics. As the Indian IT/BPO players already have a leading position in
industry-specific software development and implementation, they have a huge growth opportunity
to build Big Data end-user applications, and develop a Big Data management and storage service
portfolio. On similar lines, analytics players such as Genpact, Mu Sigma, CRISIL GR&A, AbsolutData,
and LatentView are gearing up to build the robust advanced analytics required to manage the insights
engine for Big Data.
Concerted efforts by the service providers and academia to
improve talent employability
The rising demand for Big Data analytics is expected to lead to a global shortfall of two groups: IT and
analytics professionals with the skills to implement Big Data technologies and manage project
mandates that derive business value from these datasets, and data scientists who can apply
complex techniques to unravel the insights in them. If this issue is not addressed,
businesses might not be able to gain from the potentially valuable
insights in Big Data. However, Big Data analytics as an integrated discipline has only just emerged in
the academic curriculum, and it will take some time before academic institutions start producing
Big Data professionals. As the supply of freshly trained data scientists will be limited, the industry is
aggressively implementing effective recruitment practices and training modules to develop the existing
pool of BI analysts and IT professionals for Big Data analytics.
While major efforts are being undertaken globally to develop Big Data talent, India is at the
forefront and has an early-mover advantage over other outsourcing destinations in terms of initiatives
by academia and corporates in building fresh talent for Big Data. Indian enterprises and academia
have also started addressing the Big Data skills shortage. Technology firms such as EMC, Oracle, and
IBM are planning to work with universities in India and overseas to introduce full-length electives or
crash courses on various elements of Big Data. Training organisations such as NIIT and Aptech are
also exploring designing curricula for developing specialised skilled talent for the Big Data industry.
Further, enterprises are creating organisational cultures that are conducive to data-driven decision
making by:
Effective recruitment: While recruiting new talent, the focus is shifting from business-oriented
degrees to other academic fields such as hard sciences, statistics, and mathematics. Further,
candidates are being tested for intellectual curiosity and technical depth to address
Big Data challenges
On-the-job training: Organisations, both global and Indian, are investing in on-the-job training in
emerging technologies of Big Data to eliminate skill gaps in their existing workforce
Annexure
AbsolutData Research & Analytics
Accenture
CRISIL Global Research & Analytics
CSC
EMC Corporation
Fractal Analytics
Genpact
IBM
Impetus Technologies
Infosys Limited
LatentView Analytics
Marlabs
MetaScale LLC
Mu Sigma
Nuevora
Glossary
Apache Cassandra
Is an open source distributed database management system. It is a NoSQL solution, designed to handle very large amounts of data
spread out across many commodity servers while providing a highly available service with no single point of failure.
Apache Thrift
Is an interface definition language that is used to define and create services for numerous languages.
Apache Avro
Is a remote procedure call and serialisation framework developed within Apache's Hadoop project. It defines data types and protocols,
and serialises data in a compact binary format.
Apache Pig
Refers to the data flow language and execution framework for parallel computation, built on Hadoop.
BASEL III
BASEL III is a global regulatory standard on bank capital adequacy, stress testing and market liquidity risk agreed upon by the
members of the BASEL Committee on Banking Supervision in 2010-11.
BFSI Services
Refer to banking, financial services and insurance services, and include players like banks, asset managers, mutual funds, insurers,
brokers, traders, etc.
Big Data
Big Data relates to rapidly growing, structured and unstructured datasets with sizes beyond the ability of conventional database
tools to store, manage, and analyse them. In addition to size and complexity, the term covers the ability of such data to support
evidence-based decision making that has a high impact on business operations.
Business Intelligence
Use of analysis tools to query data repositories and generate analyst reports, supporting managers in business decision making by
identifying trends and patterns in the industry.
Conventional File System
Is the log-structured file system, designed for high write throughput. All updates to data and metadata are written sequentially
to a continuous stream known as log.
Chukwa
Is a Hadoop subproject for large-scale log collection and analysis.
Clustergram
Is used to visualise how clusters are formed and how cluster members are assigned to clusters as the number of
clusters increases.
Data Acquisition
Refers to primary data collection through phone, field surveys and interviews, as well as secondary
data collection through web searches, printed sources and databases.
Data Analytics
Refers to the extensive use of data, statistical and quantitative analysis, explanatory and predictive
models, and fact-based management to drive business decisions and actions.
Data Collection
Refers to collection of data, survey programming and hosting, data search integration
and programming.
Data Integration
Involves combining data residing in different sources and providing users with a unified view of
these data.
Data Management
Refers to the development and execution of architectures, policies, practices and procedures that
properly manage the full data lifecycle needs of an enterprise.
Data Scientist
Refers to professionals conversant with both the business context and data analytics. Their role
encompasses extracting insights from large datasets, analysing these and then presenting the
value-added information to business users or non-data experts.
Data Warehousing
Refers to the systems where data resides or is stored.
Delivery Centre
Refers to a regional office at an onshore or offshore location, established to deliver services
to clients.
Distributed Hardware
Consists of multiple autonomous computers that communicate through a computer network.
Dodd-Frank
The Dodd-Frank Wall Street Reform and Consumer Protection Act places regulation of the financial
industry in the hands of the government. It aims to prevent another significant financial crisis by creating
new financial regulatory processes that enforce transparency and accountability while implementing
rules for consumer protection.
Equity Research
Study and analysis of fundamental parameters of companies and industries, to offer insights on
investment opportunities.
ETL & Data Integration
ETL is a process in database usage and in data warehousing that involves extracting data from multiple
sources, transforming it to fit operational needs, and loading it into a database, operational data store,
data mart or data warehouse.
EU Data Protection Directive
Enables protection of individuals with regard to the processing of personal data and on the free
movement of such data in the European Union.
Financial Research
Includes equity research, fixed income research, credit research, knowledge support for investment and
wealth management, commodities and foreign exchange, derivatives, as well as emerging services
like risk management analytics and insurance actuarial services.
FTE
Refers to Full-Time Equivalent (FTE) employees, who work an equivalent number of hours as a
full-time employee.
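Fixed Income Research
Refers to research in instruments with fixed interest payment obligations, such as bonds and swaps.
Gigabytes
Refers to units of digital information storage; one gigabyte (GB) is roughly one billion bytes.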
Hadoop Distributed File System (HDFS)
Is the primary storage system used by Hadoop applications. HDFS creates multiple replicas of data
blocks and distributes them on compute nodes throughout a cluster to enable reliable, extremely
rapid computations.
Hadoop Sqoop
Enables import and export of data from structured data stores such as relational databases, enterprise
data warehouses, and NoSQL systems.
Hadoop
A Big Data technology framework. It is an open source software framework, licensed under the
Apache v2 license, that supports data-intensive distributed applications.
HBase
Refers to a Bigtable-like structured storage system for Hadoop HDFS.
Heat Maps
Is a graphical representation of data where the individual values contained in a matrix are represented
as colours.
HIPAA
Refers to the Health Insurance Portability and Accountability Act (HIPAA) that protects health
insurance coverage for workers and their families when they change or lose their jobs, and requires
the establishment of national standards for electronic healthcare transactions and national
identifiers for providers, health insurance plans, and employers.
History Flow
Charts the evolution of a document as it is edited by multiple contributing authors.
HITECH Act
The HITECH Act set meaningful use of interoperable EHR adoption in the healthcare system as a critical
national goal and incentivised EHR adoption.
Hive
Is a data warehouse infrastructure which allows SQL-like ad-hoc querying of data (in any format)
stored in Hadoop.
Global In-house Centres
Are the offshore centres of global corporations for services that are to be kept in-house and involve
intellectual property and sensitive data.
In-memory Analytics
Is a BI methodology used to solve complex and time-sensitive business scenarios. It works by increasing
the speed, performance and reliability when querying data in the server's random access memory.
In-memory Databases
Refers to the database management system that primarily relies on main memory for computer
data storage.
Machine Learning
Is a scientific discipline that deals with the design and development of algorithms that take as input
empirical data, such as that from sensors or databases, and yield patterns or predictions thought to
be features of the underlying mechanism that generated the data.
Mahout
Refers to a library of scalable Machine Learning algorithms that run on Hadoop.
MapReduce
Is a programming model for processing large datasets, and the name of an implementation of the model
by Google. MapReduce is typically used to do distributed computing on clusters of computers.
Massively Parallel Processing
Refers to the coordinated processing of a programme by multiple processors that work on different
parts of the programme, with each processor using its own operating system and memory.
NoSQL Databases
Refers to the next-generation databases that are non-relational, distributed, open-source and
horizontally scalable. NoSQL databases are not primarily built on tables and can manage large
unstructured datasets.
Online Analytical Processing
Provides solutions to multi-dimensional analytical queries. OLAP is a category of business intelligence,
which also encompasses relational reporting and data mining. Key applications of OLAP include business
reporting for sales, marketing, management reporting, business process management, budgeting and
forecasting, financial reporting, etc.
Pure-play Service Providers
Refer to independent service providers who are specialists in offering a broad range of
Big Data services.
RDBMS
A Relational Database Management System (RDBMS) is a traditional system of storing data in which
data is stored in tables and the relationships among the data are also stored in tables.
Real-time Dashboards
Is a real-time graphical presentation of data analysis.
Software-as-a-Service
Software-as-a-Service refers to software that is accessed via a web browser and is paid on a
subscription basis, and for which the user does not have to pay for ownership, maintenance
and installation.
Spatial Information Flow
Describes the physical location of objects and the metric relationships between objects such as
aerial and satellite remote sensing imagery, the Global Positioning System (GPS), and Computerised
Geographic Information Systems (GIS).
Stochastic Optimisation
Are optimisation methods that generate and use random variables. Stochastic optimisation methods
also include methods with random iterates.
Structured Datasets
Data that resides in fixed fields within a record or file. Relational databases and spreadsheets are
examples of structured data.
Tag Cloud
Refers to the weighted visual list where words that appear most frequently are larger and words that
appear less frequently are smaller.
Unstructured Datasets
Refers to information that does not have a pre-defined data model and/or does not fit well into
relational tables. Unstructured information is typically text-heavy, but may contain data such as dates,
numbers, and facts as well. Examples include tweets, emails, RSS feeds and XML documents.
Zookeeper
Is a high performance coordination service for distributed applications.
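As an illustration of the MapReduce entry in this glossary, the following is a minimal single-machine sketch in Python of the programming model, applied to counting words. It is not taken from Hadoop or from Google's implementation; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample documents are hypothetical and chosen only for this example. In a real cluster, the map and reduce steps would run in parallel on many nodes and the grouping step would be handled by the framework.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all the counts emitted for one word."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    # Hypothetical sample input, used only for illustration.
    docs = ["big data big insights", "data drives decisions"]
    print(word_count(docs))
```

The same per-word frequencies are what a tag cloud, as defined above, would use to size each word.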
List of Abbreviations
APAC Asia Pacific
B2C Business-to-Consumer
BFSI Banking, Financial Services and Insurance
BI Business Intelligence
BPO Business Process Outsourcing
CA Chartered Accountant
CFA Chartered Financial Analyst
CAGR Compounded Annual Growth Rate
CARC Cumulative Aggregate Rate of Change
CoE Centre of Excellence
CLV Customer Lifetime Value
CPG Consumer Packaged Goods
CRISIL Credit Rating Information Services of India Ltd.
GR&A Global Research & Analytics
CRM Customer Relationship Management
CEO Chief Executive Officer
COO Chief Operating Officer
CIO Chief Information Officer
CTO Chief Technology Officer
Delhi NCR Delhi National Capital Region
DW Data Warehouse
ERP Enterprise Resource Planning
ETL Extract, Transform and Load
EU European Union
FTE Full-Time Equivalent
FTP File Transfer Protocol
GB Gigabyte
HITECH Health Information Technology for Economic and Clinical Health
HR Human Resources
IP Intellectual Property
IPO Initial Public Offering
IT Information Technology
ITeS Information Technology-enabled Services
IVR Interactive Voice Response
MBA Master of Business Administration
MIS Management Information Systems
M&A Mergers and Acquisitions
MNC Multinational Corporation
MPP Massively Parallel Processing
M.S. Master of Science
NoSQL Not Only Structured Query Language
OLAP Online Analytical Processing
PACS Picture Archiving and Communication System
PB Petabyte
Ph.D Doctor of Philosophy
PoS Point of Sale
P&L Profit & Loss
RDBMS Relational Database Management System
RFID Radio-Frequency Identification
R&D Research and Development
RoI Return on Investment
RoW Rest of the World
RSS Rich Site Summary/Really Simple Syndication
SAS Statistical Analysis System
SaaS Software-as-a-Service
SKU Stock Keeping Unit
SME Small and Medium Enterprises
SPSS Statistical Package for the Social Sciences
SQL Structured Query Language
SVP Senior Vice President
TB Terabyte
UK United Kingdom
US United States (of America)
USD United States Dollar
VC Venture Capital
VP Vice President
XML Extensible Markup Language