
Empowering the Data-driven Enterprise

SOLIX COMMON DATA PLATFORM: Advanced Analytics and the Data-Driven Enterprise

EXECUTIVE SUMMARY

“You cannot manage what you cannot measure.”

Those prescient words from management guru Peter Drucker more than 30 years ago encapsulated the
evolution of enterprise software platforms and have paved the path to the era of the data-driven enterprise.

Data is remaking industries and reshaping the global economy. Those who embraced data found growth
areas and improved earnings. Organizations that ignore the promise of data can no longer survive in the
new world economy.

Today, Drucker’s words are as true as ever. To continue being data-driven, organizations must be able to
ingest and analyze new forms of data from constantly developing new sources. Those who are ready to mine
new data streams, such as social, IoT and more, are primed to transform economies and reap the benefits
with new growth opportunities. Businesses ready to leap into the fray are adopting Big Data. Big Data brings
together structured, semi-structured and unstructured data. When all forms of data are brought together,
it not only multiplies the value of every single piece of data, but it also presents a new set of challenges
around data storage, governance and consumption.

In today’s enterprise world, business users want to make real-time, data-driven decisions using the vast
amount of data available. Yet, IT departments are faced with the challenge of increasing storage and
Business Intelligence (BI) costs, complex governance and Information Lifecycle Management (ILM) for data,
which is now on the scale of petabytes and beyond. Unfortunately, current enterprise-ready technology
offerings are not capable of managing this data tsunami, let alone taking advantage of all the possibilities
this data offers. The tension created within organizations is clear.




As Forrester also points out in its research report, in the era of Big Data, traditional EDW is
failing to meet new business requirements, such as support for real-time and ad hoc customer
analytics, new sources of data, and self-service capabilities.1

The Solix Common Data Platform (CDP) allows organizations to embrace Big Data, while keeping the
challenges in check. The Solix CDP helps organizations leverage their existing infrastructure and allows
them to collect, store and analyze massive amounts of data from every source without sacrificing
governance, security or management. Further, with the Solix CDP all data keeps its original context and
structure, allowing organizations to ask complex questions and gain deep contextual insights from data at
any point. The Solix CDP creates a new paradigm fostering a meaningful, frictionless partnership between IT
departments and business users. IT departments can now become the guardians of data and business users
can become the owners and direct consumers of data.

Solix created the CDP to bring ILM to the Data Lake and innovation to the EDW. The Solix CDP is the next
evolution in the new enterprise blueprint, offering Enterprise Data Archiving and Enterprise Data Lake to
create an Advanced Analytics platform with unprecedented levels of ILM in a Big Data setting.

1. Forrester report: The Next-Generation EDW Is the Big Data Warehouse, August 2016


INTRODUCTION — SOLIX COMMON DATA PLATFORM

[Figure: the Solix Common Data Platform at the center of a ring formed by Enterprise Archiving,
Enterprise Data Lake and Information Governance.]
Solix CDP = Enterprise Archiving + Enterprise Data Lake + Information Governance

At the core of the Solix CDP are the Enterprise Archive and the Enterprise Data Lake.

The Solix CDP utilizes the Solix Big Data Suite to provide comprehensive enterprise data management and
robust ILM. With the CDP, organizations can vastly expand the reach of analytics by creating an Advanced
Analytics platform. For the CIO, Enterprise Archiving offers a quick ROI that will ensure budgetary support
from the organization and dissolves the obstacles between Big Data and ILM.

The Solix CDP brings enterprise-grade capabilities to the Hadoop framework, addressing all shortcomings
of the Data Lake. Solix CDP provides uniform data collection, metadata management, ILM and secure data
access for Advanced Analytics.

The Solix CDP does this all while maximizing an organization’s existing infrastructure. With no need to
reboot the organization’s enterprise architecture, the Solix CDP harnesses the current architecture to
develop a new enterprise blueprint, capable of evolving with the business requirements of an organization.

The Solix CDP is also capable of evolution. As businesses stretch Hadoop to its limits, new Big Data
technologies will emerge. The Solix CDP is primed to adapt with them.



WHY SOLIX COMMON DATA PLATFORM?

The need to become data-driven is clear. Transformation has hit every major industry and disruptors have
become powerhouses in the global economy based on their capabilities to mine data. Any organization
wanting to compete must become data-driven or it is destined to fail.

The current enterprise architecture offering, EDW, provides a canonical, top-down view of enterprise data to
meet end user requirements, but those views rarely satisfy the function-specific requirements of data-driven
applications.

As per Forrester research, Big Data platforms such as Hadoop have made Big Data architectures
more affordable, allowing companies to pursue new business insights for increased data-driven
competitive advantage.2

The Solix CDP, built on top of Hadoop distributions, enables data-driven organizations to gain more value
from their data because now data can be visualized in more specific ways.

The cost of relying on the EDW to collect and analyze all of this data would also exceed the budget of most
organizations. The Solix CDP is a uniform data collection system for structured, unstructured and semi-
structured data featuring low-cost data storage and Advanced Analytics. Solix CDP stores data “as-is” to
reduce costly Extract, Transform and Load (ETL) operations, as well as transforms data to feed downstream
NoSQL and analytics applications. Solix CDP enables organizations to create a true enterprise Data Lake
with full access to the data, rather than a data swamp where the data gets lost. This enables the CIO to find
a better solution than trying to collect and store all of the enterprise data in the expensive Tier 1 storage
and existing EDW architectural offerings.

The Solix CDP does not require costly infrastructure and offers the scalability and flexibility the Big Data
platform architecture provides, along with enterprise-grade governance and security. The Solix CDP lays
the foundation for information governance, efficient infrastructure utilization and Advanced Analytics at
petabyte scale.


2. Forrester report: Big Data Fabric Drives Innovation and Growth, March 2016


The following table compares the Solix CDP with a traditional Data Warehouse and a Data Lake:

                                 DATA WAREHOUSE                   DATA LAKE                                 SOLIX CDP

Data                             Processed, Structured            Structured, Semi-structured/Unstructured  Processed, Structured, Semi-structured/Unstructured
Schema                           On write                         On read                                   On read / On write
Storage Costs                    High                             Low                                       Low
Scalability                      Low                              High                                      High
Agility                          Low, Fixed configuration         High, Configure & Reconfigure             High, Configure & Reconfigure
Metadata Repository              Centralized Metadata Repository  No                                        Centralized Metadata Repository
Data Access                      Query                            Search                                    Query + Search
Query Performance                High                             Medium                                    Medium
Security / Governance            Mature                           Maturing                                  Mature
Users                            Business Users                   Data Scientists                           Business Users, Data Analysts, Data Scientists
Role-based Access                Yes                              No                                        Yes
ILM                              No                               No                                        Yes
Regulatory Retention Management  No                               No                                        Yes
Legal Hold                       No                               No                                        Yes
ROI                              High                             Low                                       High
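
The "Schema" row in the comparison can be made concrete with a small sketch. This is an illustration in plain Python, not Solix code: the event records, the field names and both function names are assumptions made up for the example. It contrasts enforcing a fixed schema when data is loaded (on write) with keeping the raw records and applying a schema only when a question is asked (on read).

```python
import json

# Hypothetical raw events, stored exactly as they arrived ("as-is").
RAW_EVENTS = [
    '{"order_id": 1, "amount": "19.90", "channel": "web"}',
    '{"order_id": 2, "amount": "5.00"}',
    '{"order_id": 3, "amount": "12.50", "coupon": "SPRING"}',
]

# Schema on write: a fixed set of typed columns is enforced at load time.
WRITE_SCHEMA = {"order_id": int, "amount": float}


def load_schema_on_write(lines):
    """Parse each record once at load time and keep only the declared columns."""
    table = []
    for line in lines:
        record = json.loads(line)
        table.append({col: cast(record[col]) for col, cast in WRITE_SCHEMA.items()})
    return table


def query_schema_on_read(lines, schema):
    """Leave the raw lines untouched and apply a schema only when querying."""
    for line in lines:
        record = json.loads(line)
        yield {
            col: (cast(record[col]) if col in record else None)
            for col, cast in schema.items()
        }


warehouse_rows = load_schema_on_write(RAW_EVENTS)

# A later question needs a field that the load-time schema never declared.
lake_rows = list(query_schema_on_read(RAW_EVENTS, {"order_id": int, "coupon": str}))
```

In this sketch the `coupon` field is gone from `warehouse_rows` because it was not part of the load-time schema, while the schema-on-read query can still reach it in the raw records.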


PRODUCT/SOLUTION OVERVIEW OF THE SOLIX COMMON DATA PLATFORM

Built on top of Hadoop distributions, such as Cloudera CDH or Hortonworks HDP, the Solix CDP provides an
integrated suite of enterprise connectors through its Object Workbench to build a consolidated repository of
enterprise data and metadata. The Solix Analyst Workbench allows multiple teams and users to collaborate,
create virtual workspaces and projects to access the data without compromising on compliance and
security. Because it runs on top of both of the most popular Hadoop distributions, it eliminates one of the
basic questions behind the creation of a Hadoop stack, and it can bridge the two in environments where
both are being used.

The data-driven enterprise does not wait for the business question to develop and then use the data to
answer it. The data-driven organization uses Advanced Analytics and Business Intelligence to mine the data
for the questions and then the answers. With Solix CDP, business users can create data models and derive
the insights needed to move the organization forward. The self-service model takes IT out of the equation,
freeing it to focus on its work, while ensuring security and governance measures are also met. The Solix CDP
brings robust ILM to all data.

The Solix CDP ensures all data retains context by retaining its metadata, meaning its value is never lost.
This ensures the business questions being raised by the data are truly valid and the answers analysts find
are relevant.



Enterprise Archiving

“In the era of Big Data, Archiving is a No-Brainer Investment.”3

Up to 80 percent of production data used by core applications is inactive. Data archiving has emerged as
an ILM best practice to meet data growth challenges. With the Solix CDP, Enterprise Archiving improves
production application performance, reduces infrastructure costs and meets regulatory and compliance
needs.

As part of Enterprise Archiving, application data running online is first moved into Tier 2 or Hadoop
infrastructure, and then purged from its source location, according to ILM policies. Data archiving best
practice requires that MOVE and PURGE processes be coordinated and validated. Enterprise Archiving
on Solix CDP ensures proper data governance since enterprise data is ingested and stored based on ILM
retention policies and business rules.
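
The coordinated MOVE, validate and PURGE sequence described above can be sketched as follows. This is a minimal illustration and not the Solix implementation: two in-memory SQLite databases stand in for the production RDBMS and the archive tier, and the `invoices` table, the cutoff date and the function names are assumptions invented for the example.

```python
import hashlib
import sqlite3


def checksum(rows):
    """Order-independent fingerprint of a set of rows."""
    digest = hashlib.sha256()
    for row in sorted(rows):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def archive_invoices(source, archive, cutoff):
    """MOVE rows older than `cutoff`, validate the copy, then PURGE the source."""
    rows = source.execute(
        "SELECT id, invoice_date, amount FROM invoices WHERE invoice_date < ?",
        (cutoff,),
    ).fetchall()
    if not rows:
        return 0

    # MOVE: copy the eligible rows into the archive store.
    archive.executemany("INSERT INTO invoices VALUES (?, ?, ?)", rows)
    archive.commit()

    # VALIDATE: re-read the archive and compare it with what was extracted.
    ids = [row[0] for row in rows]
    placeholders = ",".join("?" * len(ids))
    copied = archive.execute(
        f"SELECT id, invoice_date, amount FROM invoices WHERE id IN ({placeholders})",
        ids,
    ).fetchall()
    if checksum(copied) != checksum(rows):
        raise RuntimeError("archive validation failed; source left untouched")

    # PURGE: only now delete the archived rows from the source.
    source.executemany("DELETE FROM invoices WHERE id = ?", [(i,) for i in ids])
    source.commit()
    return len(rows)


def make_db():
    """Create an empty in-memory database with the hypothetical invoices table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE invoices (id INTEGER PRIMARY KEY, invoice_date TEXT, amount REAL)"
    )
    return db


source, archive = make_db(), make_db()
source.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [(1, "2009-03-01", 120.0), (2, "2012-07-15", 80.5), (3, "2016-01-10", 42.0)],
)
source.commit()

moved = archive_invoices(source, archive, cutoff="2013-01-01")
```

The point of the ordering is that the purge step cannot run unless the validation step has confirmed that the archive holds exactly the rows that were extracted.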

Archive data is classified for security and compliance requirements, such as legal hold, and universal access
is provided for business users through structured reports and full text search for business objects.

[Figure: Enterprise Archiving on the Solix Common Data Platform. Active data resides in the RDBMS;
semi-active and inactive data reside in Hadoop. Structured data is moved and copied through database
archiving; semi-structured and unstructured data (videos, machine data, email, images, XML, file shares
and print streams) is moved, copied or printed into the Solix Big Data Suite. Labels shown for access and
services: native access, universal access, Solix EDMS, Solix APM (repository, query, search), custom apps,
reporting/BI tools, BI reporting, Big Data analytics, Enterprise Business Record, search and query access,
retention management and legal hold.]


3. Forrester report: Vendor Landscape: Big Data Archiving, August 2015


Enterprise Business Record (EBR)

An EBR is a denormalized, point-in-time snapshot of a business transaction, which may include structured
or unstructured elements. The Solix CDP helps model, ingest and manage EBR data in a Hadoop-optimized
file format that is fully accessible for text search or structured query.

Data can be ingested to build both a long-term Enterprise Archive and a transient Enterprise Data Lake. For
the archiving use case, older inactive data is moved from the source application to the Solix CDP. For the
Data Lake use case, current data can be transformed and then copied from the source application to the
Solix CDP.
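
As a rough illustration of what "denormalized, point-in-time snapshot" means for an EBR, the sketch below flattens a hypothetical AR invoice and its related master data into one self-contained document. The table contents, the field names and the `build_ebr` function are assumptions for the example and do not reflect the actual Solix EBR format.

```python
import copy
import json
from datetime import datetime, timezone

# Hypothetical normalized source tables of an enterprise application.
customers = {"C-17": {"name": "Acme Corp", "city": "Austin"}}
invoices = {"INV-1001": {"customer_id": "C-17", "invoice_date": "2014-05-02"}}
invoice_lines = [
    {"invoice_id": "INV-1001", "item": "Widget", "qty": 3, "unit_price": 10.0},
    {"invoice_id": "INV-1001", "item": "Gasket", "qty": 1, "unit_price": 4.5},
]


def build_ebr(invoice_id, captured_at=None):
    """Assemble one complete, denormalized business object for an invoice."""
    header = invoices[invoice_id]
    lines = [line for line in invoice_lines if line["invoice_id"] == invoice_id]
    record = {
        "record_type": "AR Invoice",
        "invoice_id": invoice_id,
        "invoice_date": header["invoice_date"],
        # Master data is embedded, not referenced, so the record stands alone.
        "customer": customers[header["customer_id"]],
        "lines": [
            {key: value for key, value in line.items() if key != "invoice_id"}
            for line in lines
        ],
        "total": sum(line["qty"] * line["unit_price"] for line in lines),
        "captured_at": (captured_at or datetime.now(timezone.utc)).isoformat(),
    }
    # Deep copy: later changes in the source tables must not alter the snapshot.
    return copy.deepcopy(record)


ebr = build_ebr("INV-1001")

# The source application keeps changing after the snapshot was taken.
customers["C-17"]["city"] = "Dallas"

ebr_json = json.dumps(ebr, sort_keys=True)
```

Because the customer details are embedded in the record, the snapshot still shows the customer as it was at capture time, which is what decouples the EBR from the source application.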

[Figure: Enterprise Business Record. Labels recoverable from the figure: an enterprise application with
transactions (AR invoice), master data, reference data, attachments and report files; "Create EBR" and
"Ingest into SBDS"; retention policies of 0 - 3 years and 3 - 10 years; Solix Application Portfolio Manager;
Solix Big Data Suite with search and query, retention management, legal hold, scheduler and CLI; role-based
security through LDAP/Active Directory; dashboards, search reports and BI reports through SAP BO, Oracle BI,
Crystal Reports and IBM Cognos. The EBR is described as a complete business object with a denormalized
structure, a point-in-time snapshot, decoupled from the application.]

EDW Augmentation

Enterprises are currently struggling to contain the storage and processing costs of traditional EDW
implementations. Offloading storage, as well as costly ETL functions, to Hadoop running on commodity
hardware lets enterprises reserve the existing Data Warehouse infrastructure for what it does best:
BI and Advanced Analytics.

Migrating warm or cold data from the EDW, via archiving, onto a low-cost bulk storage system such as Hadoop
enables organizations to save millions on storage costs. It also frees processing capacity in the data
warehouse, so valuable insights can be extracted from the collected data at a quicker pace.



Enterprise Data Lake and Advanced Analytics

The Solix CDP, based on Apache Hadoop, establishes new capabilities for Advanced Analytics applications.
It stores data “as-is”, eliminating the need for demanding ETL processes during ingestion. It captures and
maintains the metadata connected to each byte of data, which is half or more of the value of the data itself.
The Enterprise Data Lake may then be mined for critical business insights using text search, structured query
or further processing by downstream analytical applications. The Solix CDP utilizes either the Hive or the
Spark query framework, depending on user requirements.

The Solix Enterprise Data Lake reduces the complexity and processing burden of staging EDW and analytics
applications and provides highly efficient, bulk storage of enterprise data for later use. Once resident within
HDFS, enterprise data may be more easily distilled and better described at petabyte-scale by business
analytics applications. This allows organizations to develop an enterprise architectural strategy that is
responsive to the business stakeholders without driving up the investment in hardware and software.

[Figure: Enterprise Data Lake. Structured data (cloud apps, ERP, custom apps, data warehouse, CRM),
semi-structured data (JSON, XML, CSV, machine data, sensors, logs) and unstructured data (audio, videos,
social media, images, email, documents, Word documents) feed the Data Lake. Other labels shown: discovery,
search, stage, transform, archive, Hive, data mart, analytics, data mining and reporting.]

Information Governance

Analysts have warned that applying existing information governance practices to Big Data will result in
failure. Comprehensive information governance provided by Solix CDP establishes the control framework
necessary for proper data access control, data assessment, data discovery, data classification, data
validation, retention management, legal hold and privilege management.




Forrester estimates that the average Hadoop repository doubles in size every year; some
implementations double in volume every month. More Hadoop silos are creating data challenges
around security, integration, governance and delivery.4

To achieve robust ILM, new security and governance measures must be put into place to match the variety
and complexity of the new data assets. The Solix CDP provides a true ILM continuum that addresses the
complexity of governance in the Big Data world, while ensuring governance for core enterprise applications
is not sacrificed. The Solix ILM framework manages the data within HDFS and provides integrated
retention-management and legal-hold capabilities.

Structured and unstructured data from various data sources are migrated into HDFS with full data-validation
and audit reports. These reports provide the necessary defensibility and chain of custody for compliance
and data governance. ILM policies and business rules may be pre-configured to meet industry standard
compliance objectives, such as COBIT, or custom designed to meet more specific requirements.

[Figure: Information Governance cycle. Stages shown around the ring: Create; Classify; Secure, Encrypt;
Archive, Retire; Hold, Retain, Dispose; Monitor, Assess. At the center: Control, Audit, Reconcile.]

Additionally, ILM also helps to solve the data growth problem by moving less frequently accessed data from
high-cost Tier 1 infrastructure to Hadoop, leveraging cheap commodity infrastructure. Relocating inactive
data to low-cost bulk data storage creates enormous infrastructure cost savings. Because governance, risk
and compliance concerns grow by the terabyte, the Solix CDP ensures ILM for data throughout its lifecycle.

4. Forrester report: Big Data Fabric Drives Innovation and Growth, March 2016


COMPONENTS OF THE SOLIX COMMON DATA PLATFORM


Solix Object Workbench

Integrated Connectors

Solix Object Workbench provides integrated connectors that can extract and ingest vast amounts of data
“as-is” from an extensive set of enterprise data sources, including structured, semi-structured, unstructured
and streaming data sources. The Object Workbench provides functionality to copy, move, and transform data
from various data sources into the Solix CDP.

Extract, Transform and Load (ETL)

The Solix CDP Object Workbench also enables the ETL process to be undertaken as data is moved into the
Enterprise Data Lake. This provides the ability to transform complex application data into meaningful data in
a ready-to-use format from which the business user can gain immediate insight, with the use of BI tools.

Solix Virtual Printer

The Solix Virtual Printer provides functionality to capture print stream output from any application,
transform it into a PDF document, automatically ingest it into Hadoop, index it and make it available for
search access with full role-based security.


[Figure: Solix Virtual Printer. Print stream output from ERP, CRM, HR and custom applications passes through
the Solix Virtual Printer into the Solix Big Data Suite (discovery, search, stage, transform, archive) and
the Data Lake.]

The virtual printer can be used to supplement an archiving project by capturing key report output —
including all formatting — from the source application and storing it alongside the structured data.

The virtual printer can also be used to support a streamlined archiving approach called “print-and-purge.”
Using this approach, key documents, such as invoices or customer documents, are first “printed” by the
Solix Virtual printer and ingested into Hadoop, after which the underlying data from the source application
can be purged.

Real-Time and Streaming Data




“Business users want data that’s integrated in real time from multiple sources, including legacy
data, social media, sensor data and weblogs, so they can make better decisions and increase
their company’s competitiveness.”5

Insights from enterprise data are no longer restricted to formulated data repositories, which only contain
data at the end of its operational life. Huge amounts of data are now collected from both internet-enabled
devices and terminals in real-time and streaming formats.

This data can also be captured via the Solix CDP Object Workbench, enabling teams to create views of data
to analyze and deliver actionable intelligence. This will enable enterprises across industries to become truly
data-driven.

5. Forrester report: Big Data Fabric Drives Innovation and Growth, March 2016


Solix Analyst Workbench


The Analyst Workbench is designed for business analysts, data scientists, and DBAs to securely access the
data within the Solix CDP and build virtual workspaces to manage analytics projects. All data within the
platform is automatically made searchable and reportable in a secure and governed manner.

The Analyst Workbench includes the following functionality:

Data Lake Visualizer

The Data Lake Visualizer is a graphical inventory of the data contained in the lake. Using the visualizer the
data analyst can quickly find the data sets needed to complete their analytics assignment. Once the data
sets are identified they can be selected for inclusion in the analytics project.


Virtual Projects and Workspaces

For each analytics assignment a virtual project can be created by the analyst. Within each project one or
more virtual workspaces can be created. The objects identified in the visualizer can then be virtually copied
into the workspace, eliminating the need to make physical copies of data.

Once the virtual workspace has been created, the data analyst can do data mashups by creating new
composite objects to support the analytics assignment.
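
One way to picture "virtually copied" objects and data mashups is a workspace that stores only references to data sets in a shared catalog and builds composite objects on demand. The sketch below is a conceptual illustration under that assumption; the `CATALOG` contents, the `Workspace` class and its methods are hypothetical and are not the Analyst Workbench API.

```python
from dataclasses import dataclass, field

# Hypothetical catalog of the lake: data set name -> physical rows.
CATALOG = {
    "crm.customers": [
        {"customer_id": 1, "name": "Acme"},
        {"customer_id": 2, "name": "Globex"},
    ],
    "erp.orders": [
        {"order_id": 10, "customer_id": 1, "amount": 250.0},
        {"order_id": 11, "customer_id": 1, "amount": 99.0},
        {"order_id": 12, "customer_id": 2, "amount": 40.0},
    ],
}


@dataclass
class Workspace:
    """A virtual workspace holds references to data sets, never copies of rows."""

    name: str
    refs: list = field(default_factory=list)

    def add(self, dataset):
        if dataset not in CATALOG:
            raise KeyError(f"unknown data set: {dataset}")
        self.refs.append(dataset)  # only the name is stored

    def rows(self, dataset):
        if dataset not in self.refs:
            raise PermissionError(f"{dataset} is not part of workspace {self.name}")
        return CATALOG[dataset]  # resolved against the catalog at read time

    def mashup(self, left, right, key):
        """Build a composite object by joining two referenced data sets on `key`."""
        index = {row[key]: row for row in self.rows(left)}
        return [
            {**index[row[key]], **row}
            for row in self.rows(right)
            if row[key] in index
        ]


ws = Workspace("churn-analysis")
ws.add("crm.customers")
ws.add("erp.orders")
orders_with_names = ws.mashup("crm.customers", "erp.orders", key="customer_id")
```

Reading through a reference returns the catalog's own rows, so no second physical copy exists, and a data set that was never added to the workspace cannot be read through it.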

Data Preparation

Data preparation is the foundation for becoming a data-driven organization. To properly use data it must
not only be collected from its ever-increasing variety of sources, it must also be put into a repository where
those varied forms can be used by the analysts.

As more organizations utilize Data Scientists and Advanced Business Analysts to wrangle their data to
enable digital transformation, the lack of proper tools hinders this progression. In fact, research shows
that Data Scientists spend 80 percent of their time just cleaning data.

Every analytics project requires the proper preparation of the data set. The nature of the data, whether
semi-structured (such as log files), unstructured (such as social and IoT data) or structured (such as
relational databases), must be understood, organized and transformed quickly and efficiently. The Solix
CDP offers powerful, easy-to-use self-service data preparation capabilities, including the ability to parse,
clean, join and enrich data, as well as populate missing information and calculate new metrics.
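
The preparation steps named above (parse, clean, join and enrich, populate missing values, calculate a new metric) can be sketched on a tiny data set. This is plain Python for illustration only, not the Solix tooling and not Spark; the log format, the device reference data and the function names are assumptions made up for the example.

```python
import re
from statistics import mean

# Hypothetical semi-structured input: sensor log lines, one of them malformed.
LOG_LINES = [
    "2016-08-01 10:00:01 device=d1 temp=21.5",
    "2016-08-01 10:00:02 device=d2 temp=",
    "garbage line",
    "2016-08-01 10:00:03 device=d1 temp=22.5",
]

# Hypothetical structured reference data used for enrichment.
DEVICES = {"d1": {"site": "Santa Clara"}, "d2": {"site": "Austin"}}

LINE_RE = re.compile(r"^(\S+ \S+) device=(\w+) temp=(\S*)$")


def parse(lines):
    """Parse and clean: keep well-formed lines, mark empty readings as None."""
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            ts, device, temp = match.groups()
            yield {"ts": ts, "device": device, "temp": float(temp) if temp else None}


def prepare(lines, devices):
    rows = list(parse(lines))
    known = [row["temp"] for row in rows if row["temp"] is not None]
    fallback = mean(known) if known else None
    prepared = []
    for row in rows:
        # Populate missing readings with the mean of the known ones.
        temp = row["temp"] if row["temp"] is not None else fallback
        # Join and enrich with the reference data for the device.
        enriched = {**row, **devices.get(row["device"], {}), "temp": temp}
        # Calculate a new metric: the temperature in Fahrenheit.
        enriched["temp_f"] = None if temp is None else temp * 9 / 5 + 32
        prepared.append(enriched)
    return prepared


prepared = prepare(LOG_LINES, DEVICES)
```

Filling a gap with the mean is only one possible imputation rule, chosen here to keep the example short; a real project would pick the rule to suit the data.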

The Solix CDP utilizes the Spark framework. Spark runs in-memory within the cluster and provides machine
learning capabilities for faster and more advanced data preparation.

Search and Reporting Functionality

Solix CDP supports universal access to all enterprise data on a petabyte scale via text search, structured
query or further processing by downstream analytical applications. End users gain improved data-driven
results because their data can be better described.


Information Governance
The Solix CDP provides the ability to govern all of the data within the Hadoop repository for compliance
and security. For example, it can automatically purge data based on a time horizon, apply legal holds on
files and transactions, and enforce Kerberos/LDAP-based access control for users. This level of security and
governance can be maintained because the CDP ensures all data retains its metadata.

Information governance establishes the control framework necessary for proper data access control, data
assessment, data discovery, data classification, data validation, retention management, legal hold and
privilege management.
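
The interaction between time-based purging and legal hold can be sketched as a simple rule: a record becomes eligible for purging once its retention period has expired, unless a legal hold is in place. The retention periods, record classes and function names below are assumptions for illustration, not Solix policy definitions.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention policy: record class -> years to retain.
RETENTION_YEARS = {"invoice": 7, "web_log": 1}


@dataclass
class ArchivedRecord:
    record_id: str
    record_class: str
    created: date
    legal_hold: bool = False


def add_years(day, years):
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        return day.replace(year=day.year + years, day=28)


def is_purgeable(record, today):
    """A record may be purged only after retention expires and with no legal hold."""
    if record.legal_hold:
        return False
    expiry = add_years(record.created, RETENTION_YEARS[record.record_class])
    return today >= expiry


def purge(records, today):
    """Split records into those to keep and those eligible for purging."""
    kept, purged = [], []
    for record in records:
        (purged if is_purgeable(record, today) else kept).append(record)
    return kept, purged


records = [
    ArchivedRecord("inv-1", "invoice", date(2008, 1, 15)),
    ArchivedRecord("inv-2", "invoice", date(2008, 2, 1), legal_hold=True),
    ArchivedRecord("log-1", "web_log", date(2016, 3, 1)),
    ArchivedRecord("log-2", "web_log", date(2015, 6, 30)),
]
kept, purged = purge(records, today=date(2016, 9, 1))
```

In this example `inv-2` is past its retention period but stays in place because the legal hold takes precedence over the time horizon.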

Solix Common Data Platform API


To enable development of custom applications and integration with existing BI and Advanced Analytics
tools, the platform provides extensive APIs to access the unified repository. The API allows users to
seamlessly access data from the Data Lake to enable the data-driven enterprise.

Solix App Store


The Solix App Store makes inductive BI user-friendly. The App Store offers out-of-the-box analytics through
pre-integrated applications and also offers the opportunity to utilize third-party apps.

[Figure: Solix Analytics App Store offers: cash-flow analytics; sales/pipeline analytics; search and ad-hoc
reporting tools; social and marketing analytics; an executive dashboard; integration with standard BI tools
such as Tableau, Splunk, etc.]


DEPLOYMENT MODELS
The Solix CDP is enabled for a number of deployment models including bare metal infrastructure, data
center deployment, cloud infrastructure and also hosted multi-tenant deployment.

Solix never delivers a one-size-fits-all solution. The Solix team has experts ready to address customer needs
from IT, business use and financial perspectives. The Solix team will work to understand organizational
needs and then implement the best solution.

SPARK ON SOLIX CDP


The Solix CDP utilizes the Spark programming model. Spark runs in-memory within the cluster and does not
depend on the two-stage Hadoop MapReduce paradigm. Therefore, repeat access to data is much faster.
On the Solix CDP, Spark relies on HDFS for storage and runs on Hadoop YARN to analyze the data stream.

Solix is committed to adding support for new Big Data tools as they appear in the fast-evolving Big Data
ecosystem. This future-proofs Solix CDP installations, an important commitment given the speed with which
the open source Big Data stack is evolving. It also simplifies the creation and evolution of an enterprise’s
Big Data environments.

BENEFITS OF THE SOLIX CDP

The benefits of the Solix CDP include:

• Combining the advantages of Hadoop with the ability to preserve the full metadata.

• Providing advanced ILM capabilities, including the ability to copy data from the data warehouse
and to archive older data.

• Supporting advanced data security, as well as third party analysis packages, including machine
learning and cognitive computing analysis of the data.

• Preserving all data in its original format and with full metadata and supporting established open
standard interfaces. It future-proofs the Data Lake, ensuring the data will be usable by the new
technologies and for new use cases that are as yet undefined.

• Providing a unified data governance layer from the time of data ingestion to use of data by
business users for operational insights and Advanced Analytics.

• Supporting either the Hive or the Spark query framework, depending on user requirements.

• Offering cloud, on-premises and hybrid deployment models.

• Working with all Hadoop distributions such as Cloudera and Hortonworks.


The Solix CDP is the first solution to address all the data needs of an organization. From governance
to analytics, the Solix CDP works with an organization’s existing infrastructure to create a true ILM
continuum that ensures the onslaught of data can be an asset and not a hindrance to business growth and
development. The Solix CDP brings together Enterprise Archiving, the EDW and the Enterprise Data Lake
while preserving metadata, allowing for schema on read, analytics opportunities, low-cost implementation
and maintenance, and high scalability.

CONCLUSION
The era of game-changing digital disruption is here, and to thrive in this competitive environment,
organizations need leadership that can effectively leverage all the data to derive actionable insights to fuel
growth.

Reducing infrastructure costs, attaining operational efficiencies and deriving insights from BI and Advanced
Analytics are goals of many organizations. Solix CDP maximizes the insights that can be achieved while
reducing risk and ensuring compliance and governance, creating a true ILM framework to lead organizations
into the future. The Solix CDP gives organizations all the tools necessary to lower the total cost of
ownership and satisfy the desire for return on investment.


Solix Technologies, Inc.


4701 Patrick Henry Dr., Bldg 20
Santa Clara, CA 95054

Toll Free: +1.888.GO.SOLIX (+1.888.467.6549)


Telephone: +1.408.654.6400
Fax: +1.408.562.0048
URL: http://www.solix.com

Copyright ©2016, Solix Technologies and/or its affiliates. All rights reserved.

This document is provided for information purposes only and the contents hereof are subject to change
without notice.

This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether
expressed orally or implied in law, including implied warranties and conditions of merchantability or
fitness for a particular purpose.

We specifically disclaim any liability with respect to this document and no contractual obligations are formed
either directly or indirectly by this document. This document may not be reproduced or transmitted in any
form or by any means, electronic or mechanical, for any purpose, without our prior written permission.

Solix is a registered trademark of Solix Technologies and/or its affiliates. Other names may be trademarks
of their respective owners.

For Enterprise Architecture Professionals

The Next-Generation EDW Is The Big Data Warehouse
Big Data Warehouses Drive Faster, Integrated, Self-Service Analytics

by Noel Yuhanna
August 29, 2016

Why Read This Report

EDW is not dead; it’s evolving! Enterprise data warehouses have come a long way in delivering
value by predicting trends, minimizing churn, and identifying new business opportunities.
However, in the era of big data, traditional EDW is failing to meet new business requirements, such
as support for real-time and ad hoc customer analytics, new sources of data, and self-service
capabilities. Enterprise architects should read this report to learn how the new big data warehouse
addresses these gaps by delivering timely and actionable insights to gain competitive edge and
enable innovation and growth.

Key Takeaways

Without Modernizing Your Current EDW Platform, You Will Likely Fail
Business users are demanding faster, more real-time, and integrated customer analytics from
multiple sources, so they can make better decisions and increase their company’s competitiveness.
Current EDW platforms have gaps and limitations that fail to meet these new requirements.

Forrester’s Big Data Warehouse Strategy Extends The Existing EDW Framework
Based on interviews of customers and vendors, Forrester has laid out an architecture to guide
enterprise architects in creating a big data warehouse framework tailored to their firm’s
requirements to support both existing and new actionable business insights.

You Need A Big Data Warehouse Strategy To Succeed
Big data warehouse is a modern data warehouse architecture that leverages traditional
and new data repositories, in-memory, cloud, and other technologies.

For Enterprise Architecture Professionals

The Next-Generation EDW Is The Big Data Warehouse


Big Data Warehouses Drive Faster, Integrated, Self-Service Analytics

by Noel Yuhanna
with Gene Leganza and Shreyas Warrier
August 29, 2016

Table Of Contents

EDW Has Been The Analytics Platform King For Decades
   But New Business Requirements Are Changing EDW Requirements
   EDW Technology Gaps Are Making Enterprises Look Elsewhere

The Big Data Warehouse Extends The EDW Platform
   Big Data Fabric Connects The Superset Of Your Data Sources, Including Your BDWs
   The BDW Provides A Comprehensive View And Integrated Analytics

The Major EDW Vendors Provide BDW Components
   BDW Use Cases Go Beyond Traditional Analytics

Recommendations
   Extend Your Current EDW Platforms Toward A BDW Strategy

Supplemental Material

Notes & Resources

Forrester interviewed various customers in the financial, oil and gas, retail, and healthcare
sectors.

Related Research Documents

Big Data Fabric Drives Innovation And Growth
The Forrester Wave™: Enterprise Data Warehouse, Q4 2015
TechRadar™: Big Data, Q1 2016

Forrester Research, Inc., 60 Acorn Park Drive, Cambridge, MA 02140 USA
+1 617-613-6000 | Fax: +1 617-613-5000 | forrester.com
© 2016 Forrester Research, Inc. Opinions reflect judgment at the time and are subject to change. Forrester®,
Technographics®, Forrester Wave, RoleView, TechRadar, and Total Economic Impact are trademarks of Forrester
Research, Inc. All other trademarks are the property of their respective companies. Unauthorized copying or
distributing is a violation of copyright law. Citations@forrester.com or +1 866-367-7378

EDW Has Been The Analytics Platform King For Decades


The enterprise data warehouse is an architecture, not a technology. The traditional EDW platform has
served and continues to serve a broad range of business users, including enterprise architecture (EA)
pros, feeding both analytical and operational systems. EDWs:

›› Organize and aggregate historical analytical data from functional domains. EDWs house
information from data subject areas such as customer, manufacturing, finance, and human
resources that align with key processes, applications, and roles. Most traditional EDW
platforms have been built on relational database management system (DBMS) and columnar
database platforms, fed by extract-transform-load (ETL), change data capture (CDC), and replication
technology (see Figure 1).

›› Offer a strong decision support framework. EDWs provide in-database analytics, predictive
models, and embedded business algorithms to drive business decisions.

›› Are central to a firm’s data ecosystem. The EDW is a proven ecosystem that supports integration
with data models and security frameworks, automation, and a broad range of business intelligence
(BI) and visualization tools.1

›› Provide the foundation for BI. EDWs support timely reports, ad hoc queries, and dashboards and
supply other analytics applications with trusted and integrated data. Many use the EDW to deliver
operational intelligence — in the form of query responses, reports, dashboards, charts, and other
analytic views — in support of various decision scenarios.

FIGURE 1 The Traditional Enterprise Data Warehouse Platform

[Diagram: sources (OLTP, CRM, ERP, social, SaaS) feed relational and columnar storage/persistence
through ETL/CDC/replication. The compute/processing layer covers modeling, data quality, security,
governance, transformation, and integration, and serves business intelligence, operational
reporting, analytics, and predictive analytics. Deployment options: on-premises, hybrid, cloud.]

But New Business Requirements Are Changing EDW Requirements

Today, business users are demanding real-time analytics that’s integrated from legacy, social, and
cloud sources, while business execs want self-service and autonomous access to fit-for-purpose
customer data insights. In our 2016 global survey, 59% of respondents stated that leveraging big data
and analytics was a critical or high priority (see Figure 2). But increasing data volume and dealing with
multimodel customer data are slowing down timely analytics and putting constraints on traditional
warehouse platforms, causing firms to revisit their EDW architectures. Businesses are reporting that
current EDW platforms:

›› Can’t share current data quickly enough for timely business decisions. With increasing big
data comes a major challenge for any enterprise: knowing what to look for and where, and then
making sense of it. In our survey, 30% of businesses reported growth of data volume and variety
affecting their BI strategy (see Figure 3). Firms are realizing that traditional data warehouses fall
short when it comes to real-time analytics.2

“With data explosion and increasing demand for real-time analytics by the business, we are
finding it challenging to support our LOB users. While we already use Hadoop, our traditional
data warehouses still are important for analytics, but we are now looking at modernizing that
architecture.” (Enterprise architect, oil and gas, North America)

›› Don’t support ad hoc and dynamic analytics for new customer trends. EDWs were built for a
limited set of uses, providing answers to known questions. But 27% of enterprises report that fast-
changing analytics and reporting requirements are one of the biggest challenges when orchestrating
their BI strategy, while 30% cite the growth and variety of their data. Processes using traditional
EDWs don’t scale well when you introduce ambiguity or add new and dynamic questions. EDWs
need to ingest, process, and curate data continuously and support dynamic insights.

“We are now looking to [build] a modern data warehouse that can provide insights to all kinds of
tough questions critical for our business to succeed. Including identifying business risks and
opportunity.” (Business analyst, financial services, Europe)

›› Don’t provide a self-service platform for strategic and operational decision-making. When
executives need to determine why something is happening or what the best course of action is,
they can’t wait for a data processing cycle to make data available. Analysts need to be able to
aggregate and prepare data sets without technology management’s involvement. Twenty
percent of companies reported lack of end user self-service capabilities as one of the biggest
challenges in executing their BI strategy (see Figure 3). Self-service customer analytics has become critical for
organizations to succeed.

“Self-service for all data is our long-term strategic direction, and we know it’ll take us some time
to get there, but we have to start somewhere. We have started to integrate our current EDW
appliances to Hadoop and in-memory to create [a] unified and integrated analytical platform.”
(Enterprise architect, financial services, North America)

FIGURE 2 Big Data And Analytics Have Become A Priority

“Which of the following initiatives are likely to be your organization’s top business priorities?”
(Better leverage big data and analytics in business decision-making)

Don’t know 1%

Critical priority 19%

High priority 40%

Moderate priority 28%

Low priority 9%

Not on our agenda 0%

Base: 3,343 data and analytics decision-makers

Source: Forrester’s Global Business Technographics® Data and Analytics Survey, 2016
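
As a quick check on the 59% cited in the text, the Figure 2 responses can be tallied in a few lines of Python. The dictionary below only transcribes the percentages shown in the figure; the variable names are illustrative and are not part of the survey.

```python
# Percentages transcribed from Figure 2 (base: 3,343 respondents).
figure_2 = {
    "Don't know": 1,
    "Critical priority": 19,
    "High priority": 40,
    "Moderate priority": 28,
    "Low priority": 9,
    "Not on our agenda": 0,
}

# "Critical or high priority" is the sum of the top two response bands.
critical_or_high = figure_2["Critical priority"] + figure_2["High priority"]
print(f"Critical or high priority: {critical_or_high}%")
```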

FIGURE 3 Data Growth And Variety Are Affecting Business Intelligence And Analytics Strategy

“What are the biggest challenges your firm faces when orchestrating
its business intelligence strategy?”

Data security and privacy 35%

Growth of data volume/variety 30%

Fast-changing analytic and reporting requirements 27%

Lack of alignment between IT and business 25%

Lack of adequate user training 25%

Poor data quality 25%

Inadequate or missing relevant internal skills 22%

Legal and regulatory compliance 22%

Lack of data standards 21%

Lack of end user self-service capabilities 20%

Lack of access to data and insights 19%


Inadequate change management programs (communications, incentives, etc.) 17%

Widespread utilization of insights for decision-making and planning 17%

Lack of business C-level executive support 16%

Don’t know/does not apply 4%

Base: 3,343 data and analytics decision-makers

Source: Forrester’s Global Business Technographics® Data and Analytics Survey, 2016

EDW Technology Gaps Are Making Enterprises Look Elsewhere

While traditional data warehouses often took years to build, deploy, and reap benefits from, today’s
organizations want more simplified, agile, integrated, cost-effective, and automated solutions. Firms
are revisiting their EDW strategies, as they spend too much time loading, unloading, transforming,
securing, integrating, and curating customer data. Enterprises face:

›› A data volume explosion that’s affecting customer analytics. Traditional structured data
continues to grow rapidly, slowing down legacy data warehouse systems and affecting analytics
and timely insights. Regulatory requirements now mandate storing compliant data for several years,
and business growth is generating more data at a faster pace than ever before.

“We are experiencing tremendous data explosion for traditional data sets that’s impacting our data
warehouses. While we are still looking at improving the performance of existing data warehouses
for the short term, we are now starting to look at alternatives, both supplementary and replacement
as longer-term strategy.” (Enterprise architect, oil and gas, North America)

›› Data variety that’s becoming harder to support with traditional warehouses. Business users
can’t easily spot patterns and trends in content such as documents, email, images, audio, and
social media. In addition, storing, processing, and accessing unstructured data in data warehouses
pushes the limits of traditional technologies and architectures, which were not designed to handle
such data types.3

›› Data speed that’s making it harder to keep up. New sources of data are coming in a lot faster,
such as sensor and machine data, log and clickstream data, cloud and software-as-a-service
(SaaS) data, and other streaming data. Storing, transforming, and processing such data requires
new technologies and systems to support new customer analytics, real-time analytics, and
operational intelligence reporting.4

“For us, real-time data sharing is critical internally among business users but also with various
partners that we engage with. Currently, not all of our data is available to everyone, but we are
looking at ways of expanding to support a more self-service real-time big data platform.” (Data
scientist, biotechnology company, North America)

The Big Data Warehouse Extends The EDW Platform


Firms are already using a variety of technologies in their big data strategy to support new, next-
generation analytics (see Figure 4). The big data warehouse (BDW) is a modern data warehouse
architecture that leverages traditional data warehouse architectures as well as modern big data
technologies (see Figure 5). Forrester defines the big data warehouse as:

A specialized, cohesive set of data repositories and platforms used to support a broad variety
of analytics running on-premises, in the cloud, or in a hybrid environment. BDW leverages both
traditional and new technologies such as Hadoop, columnar and row-based data warehouses,
ETL and streaming, and elastic in-memory and storage frameworks.

FIGURE 4 Cloud, Streaming, And Distributed In-Memory Are Already Part Of Firms’ Big Data Strategy

“Which of the following are included in your plans for big data?”

Public cloud big data services 40%

Large-scale predictive modelling, data mining, or other advanced analytics 36%

Streaming analytics/computing 33%

Distributed in-memory databases, grids, analytics tools 30%

Unstructured data mining/analytics 28%

Packaged analytics technologies that brand themselves as big data 27%

Marketing or digital data management platforms and service providers that brand their offerings as . . . 26%

Creating or building out a data lake 26%

Data anonymization or de-identification 23%

Hadoop (including HBase or Accumulo) 23%

Semantic technologies (ontology building, search, autocuration, graph, etc.) 22%

A massively parallel processing (MPP) data warehouse 18%

NoSQL other than Hadoop 16%

Don’t know 8%

Base: 2,094 data and analytics decision-makers

Source: Forrester’s Global Business Technographics® Data and Analytics Survey, 2016

FIGURE 5 Big Data Warehouse Architecture

[Diagram: sources (OLTP, CRM, ERP, social, SaaS, devices, sensors) flow through ETL/CDC/replication
and streaming into storage/compute processing (relational, columnar, Apache Hadoop, in-memory/Apache
Spark). A management layer provides integration, data quality, security, transformation, governance,
and modeling. Interaction is through self-service and ad hoc interactions. Use cases: business
intelligence, operational reporting, analytics, predictive analytics, machine learning, and real-time
analytics. Deployment options: on-premises, hybrid, cloud.]

Big Data Fabric Connects The Superset Of Your Data Sources — Including Your BDWs

The big data warehouse is part of a larger big data fabric architecture, which embodies data from
multiple — potentially distributed — data sources, including BDWs and data lakes. The big data
fabric architecture enables integration, data quality, security, governance, data curation, data
preparation, and data management to support an end-to-end, real-time big data platform (see Figure
6).5 The two architectures:

›› Can exist separately but work best as complements. Multiple traditional EDWs, BDWs, and data
lakes have become the new norm to support the variety of analytical workloads. While both BDWs
and big data fabric architectures can exist independently of each other, firms typically leverage
both to deliver a blend of real-time and batch analytics across various distributed enterprise data sets to
support broader use cases. For example, some financial services organizations use the BDW to
support mostly financial data analytics — leveraging columnar data warehouses, Hadoop, and ETL
technologies. The BDW also acts as a source within the big data fabric architecture that delivers
real-time customer analytics across BDW, Twitter, Salesforce, and clickstream data.

›› Vary significantly in the amount of data transformation required. We often see big data fabric
used for real-time analytical use cases that integrate data across many disparate sources, including
the BDW. The BDW itself is used mostly for batch and near-real-time analytics on data stored in data
warehouses and Hadoop clusters, which requires aggregation, transformation, and further processing
before it becomes available to BI users or analytical processes. Exploration occurs within the fabric,
with transformations captured within the BDW.

FIGURE 6 Big Data Fabric Architecture Integrated With Big Data Warehouse

[Diagram: a big data fabric spans processing and persistence platforms (BDW, EDW, Hadoop, Spark)
located at sites such as New York and Singapore, fed by data ingestion (streaming/replication/batch)
from on-premises and cloud sources.]

The BDW Provides A Comprehensive View And Integrated Analytics

A key component of the BDW architecture is the ability to leverage various specialized data
repositories such as traditional relational data warehouses, columnar data warehouses, and Hadoop.
Unlike traditional data warehouses, the BDW minimizes complexity and hides heterogeneity by
embodying a trusted model, supports all kinds of data types including unstructured data, and adapts
to changing business requirements more rapidly through a self-service platform. The BDW centralizes
administration of distributed data repositories, in-memory compute resources, metadata, storage,
access, and processing functions. It leverages new technologies such as:

›› Hadoop to support diverse data sets and distributed computing. By leveraging Hadoop, the
BDW enables organizations to deal with a wider variety of data structures than traditional EDWs.
Hadoop can also deal with extremely large data sets that are inappropriate for traditional EDW
platforms. Enterprise architects can choose to store data in relational, columnar, wide-column, or
Hadoop repositories based on business needs. For example, a retailer leverages legacy structured
data stored in a traditional data warehouse, and Hadoop for clickstream data, and integrates them
to deliver a 360-degree view of the customer for recommendations and churn analysis.

›› In-memory to enable faster customer analytical capabilities. A key component of the BDW is
the ability to use in-memory to deliver performance and faster access to business data. We are
heading toward having large memory platforms that will store petabytes in DRAM and Flash/SSD in
the coming years. For example, several retailers are using BDW to leverage customer-related data
to determine product discounting strategy, optimize product distribution across stores, and enable
personalized customer experiences.

›› Streaming engines to support new data channels for ingestion and processing. Market data,
clickstream, mobile devices, and sensors are new sources for analytical information that are not in
your existing data warehouse. Streaming technology helps integrate, transform, and curate
data on diverse data streams in real time.6 Integrating streaming technology with data platforms
such as Hadoop and Spark — as well as traditional data warehouses — has become critical. For
example, we see oil and gas industry firms leveraging streaming technology for insights into new
business opportunities, such as predicting staffing and resource requirements for various drilling
sites and performing machine failure analysis.
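
To make the streaming idea more concrete, the following is a minimal sketch in plain Python of windowed aggregation over a sensor feed. It does not use a real streaming engine; the event shape, the sensor names, and the function `tumbling_window_averages` are hypothetical, and the sketch assumes events arrive in timestamp order, which a production engine would not require.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (timestamp_seconds, sensor_id, reading).
Event = Tuple[int, str, float]


def tumbling_window_averages(
    events: Iterable[Event], window_seconds: int
) -> Iterator[Tuple[int, Dict[str, float]]]:
    """Yield (window_start, {sensor_id: mean reading}) for each window.

    Assumes events arrive in timestamp order.
    """
    current_start = None
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)

    for ts, sensor_id, reading in events:
        window_start = ts - (ts % window_seconds)
        if current_start is not None and window_start != current_start:
            # The previous window is complete: emit it and reset state.
            yield current_start, {s: sums[s] / counts[s] for s in sums}
            sums.clear()
            counts.clear()
        current_start = window_start
        sums[sensor_id] += reading
        counts[sensor_id] += 1

    if current_start is not None:
        # Flush the final, possibly partial, window.
        yield current_start, {s: sums[s] / counts[s] for s in sums}


# Illustrative readings only; not data from the report.
sample_events = [
    (0, "pump-1", 70.0),
    (20, "pump-1", 74.0),
    (45, "pump-2", 50.0),
    (61, "pump-1", 90.0),
]
windows = list(tumbling_window_averages(sample_events, window_seconds=60))
```

Each emitted window could then be written to Hadoop or a warehouse table, or compared against a threshold for the kind of machine failure analysis mentioned above.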

The Major EDW Vendors Provide BDW Components


From an implementation viewpoint, most enterprises are currently building BDW platforms themselves
by integrating their traditional data warehouses with Apache Spark, Hadoop, Storm, and in-memory
technologies. Forrester sees many enterprises already using an extract-Hadoop-load (EHL) approach to:

1. Extract data from various source systems such as traditional databases and flat files.

2. Load data into Hadoop to perform aggregation and transformation using Apache Hadoop
ecosystem tools.

3. Finally load the result into the EDW platform.7
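
The three steps above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not a Forrester or vendor implementation: an in-memory CSV stands in for the source system, a dictionary reduction stands in for a job on a Hadoop cluster, and SQLite stands in for the EDW. The table name `sales_by_region` and the sample rows are hypothetical.

```python
import csv
import io
import sqlite3
from collections import defaultdict

# Step 1 (extract): read rows from a flat file. An in-memory CSV stands in
# for a real source system.
SOURCE_CSV = """region,amount
east,100.0
west,40.0
east,25.5
"""


def extract(text):
    return list(csv.DictReader(io.StringIO(text)))


# Step 2 (Hadoop): aggregate and transform. A plain dictionary reduction
# stands in for work that would run on a Hadoop cluster.
def aggregate(rows):
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += float(row["amount"])
    return sorted(totals.items())


# Step 3 (load): write the aggregated result into the warehouse. SQLite
# stands in for the EDW platform.
def load(conn, totals):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region "
        "(region TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO sales_by_region VALUES (?, ?)", totals)
    conn.commit()


edw = sqlite3.connect(":memory:")
load(edw, aggregate(extract(SOURCE_CSV)))
result = edw.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
```

The point of the ordering is that the heavy aggregation happens before the load, so the warehouse receives only the reduced result rather than the raw rows.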

BDW Use Cases Go Beyond Traditional Analytics

Adoption of BDW architectures will accelerate as enterprises run into existing EDW challenges. But
building a BDW platform internally will require more time and effort, which will likely put pressure on
the overall business technology (BT) agenda. The good news is that solutions are starting to emerge
from vendors such as IBM, Microsoft, Oracle, SAP, Snowflake, and Teradata that provide some or all of
the components to build and deploy a BDW strategy.8 Enterprises are already using BDWs to support
social analytics, risk analysis, campaign analysis, fraud assessment, and pricing trends. The top BDW
use cases include:

›› Integrated analytics. A key challenge in the traditional EDW approach was that if data didn’t
exist in the warehouse, you couldn’t do any analytics — full stop. With BDW architecture, you can
perform integrated analytics across data warehouse and Hadoop clusters. Hadoop can store and
process large sets of semistructured and unstructured data, log files, and streaming data with ease.
For example, health research often requires looking at complex patient data and determining how
effective a treatment is likely to be based on factors like age, sex, and health status. The BDW
enables gathering and storing millions of data points in Hadoop and performing complex navigation
and modeling using traditional data warehouse and in-memory technology.

›› Internet-of-things (IoT) analytics. Traditional data warehouses don’t deal with IoT data.
However, the BDW offers the ability to store, process, and access large volumes of IoT data from
sensors and devices in Hadoop repositories efficiently through automation and machine learning
technologies. Manufacturers deal with highly sophisticated machinery to support their plants,
whether they’re building a car, airplane, or tire or bottling wine or soda. Every minute of machine
downtime can cost a manufacturer dearly. IoT analytics on BDW platforms enables manufacturers
to predict machine failures based on sensor data, minimizing or eliminating production slowdown.

›› Right-time business analytics. Traditional EDW architectures were based on mostly batch
processing, with ETL doing the heavy lifting of data from traditional systems to operational systems
to data warehouses. As a result, by the time data arrived in data warehouses, it was already 12 to
48 hours old. BDWs enable right-time analytics by leveraging streaming and replication with direct
access to data sources, whether on-premises or cloud, bypassing traditional ETL approaches. The
financial services industry has been an early adopter of BDW to support right-time analytics for
portfolio management, fraud detection, and asset management.

›› Adaptive, self-service analytics. Most EDWs use predefined data sources to deliver predictive
analytics, trends, and insights. The BDW enables organizations to dynamically leverage new data
sources quickly to deliver new insights. It enables self-service capabilities for business users to
ask complex and new questions so they can make more accurate decisions. The BDW adapts to
the new sources and can help correlate data using machine learning and adaptive intelligence. For
example, a major European bank recently built a BDW framework that business units now use to
support self-service for making better decisions on investments and risks. The platform represents
a major shift from the static reports the bank used previously.
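
As a rough illustration of the integrated analytics use case, the sketch below combines structured warehouse records with semistructured clickstream events. All names and values are hypothetical: SQLite stands in for the data warehouse, a block of JSON lines stands in for files held in a Hadoop cluster, and the customers are invented for the example.

```python
import json
import sqlite3
from collections import Counter

# Structured customer data, with SQLite standing in for the data warehouse.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE customers "
    "(customer_id INTEGER PRIMARY KEY, name TEXT, lifetime_spend REAL)"
)
warehouse.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme Corp", 1200.0), (2, "Globex", 300.0)],
)

# Semistructured clickstream events, with JSON lines standing in for files
# that would sit in a Hadoop cluster.
CLICKSTREAM = """\
{"customer_id": 1, "page": "/pricing"}
{"customer_id": 1, "page": "/support"}
{"customer_id": 2, "page": "/pricing"}
{"customer_id": 1, "page": "/pricing"}
"""

# Count page views per customer from the raw events.
views = Counter(
    json.loads(line)["customer_id"] for line in CLICKSTREAM.splitlines()
)

# Join the two sources into one combined view per customer.
combined = [
    {"name": name, "lifetime_spend": spend, "page_views": views.get(cid, 0)}
    for cid, name, spend in warehouse.execute(
        "SELECT customer_id, name, lifetime_spend FROM customers "
        "ORDER BY customer_id"
    )
]
```

In a traditional EDW the clickstream events would first have to be modeled and loaded before any such query could run; here they are read where they sit and joined on the shared customer key.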

Recommendations

Extend Your Current EDW Platforms Toward A BDW Strategy


Don’t throw away your existing EDW platform! The investments you have already made in EDWs will form
the foundation of the next-generation BDW strategy. However, attaining this demands that you rearchitect
your existing EDW platform and invest in new technologies to deliver on a new vision of right-time
analytics, self-service, and intelligent and contextualized customer analytics. Forrester recommends that
enterprise architects extend existing EDW platforms toward a BDW strategy by leveraging:

›› Hadoop for low-cost storage and processing of big data. Let Hadoop be the first stop for your
big data that has no other home in your data warehouse. Hadoop offers the ability to store very
large volumes of data (including unstructured data) more efficiently than traditional warehouses —
and at a fraction of the cost. In addition, Hadoop helps you offload data from traditional warehouses
and leverage a distributed computing framework to perform transformation, aggregation, and
curation quickly.

›› In-memory technology to support right-time analytics. Without in-memory technology,
customer analytics, personalization, and right-time analytics will run slowly. This could cause you
to miss key trends like customer churn or miss the opportunity to offer new products and services
or identify weak markets. You can also use data from the BDW as part of the bigger big data
fabric framework that leverages distributed in-memory computing to deliver a broader enterprise
information fabric.

›› Hybrid platforms to support on-demand and scalable BDWs. Storing all of your data on-
premises need no longer be the default. Cloud platforms like those from Amazon Web Services,
Google, IBM, Microsoft (Azure), Oracle, and Rackspace offer pay-as-you-go facilities to store,
process, and access any amount of data.9 Hybrid is the new norm — look at utilizing both on-
premises and cloud data warehouse platforms as part of your BDW architecture, with a common
administration facility.

›› Vendor solutions that help achieve faster time-to-value. Data warehouse, Hadoop, and other big
data solutions from vendors such as Cloudera, Hortonworks, IBM, MapR Technologies, Microsoft,
Oracle, SAP, and Teradata can reduce time-to-value by automating and simplifying various BDW
functions and implementation steps. Look at vendors that support broader solutions and can
support your business data. Ask your vendor how it plans to provide the BDW vision. Review the
various components that the vendor has integrated and ask how it plans to fill any gaps.

Supplemental Material
Forrester’s Global Business Technographics® Data And Analytics Survey, 2016 was fielded in March
2016. This online survey included 3,343 respondents in Australia, Brazil, Canada, China, France,
Germany, India, New Zealand, the UK, and the US from companies with 100 or more employees.

Forrester’s Business Technographics ensures that the final survey population contains only those with
significant involvement in the planning, funding, and purchasing of business and technology products
and services. Research Now fielded this survey on behalf of Forrester. Survey respondent incentives
include points redeemable for gift certificates.

Please note that the brand questions included in this survey should not be used to measure market
share. The purpose of Forrester’s Business Technographics brand questions is to show usage of a
brand by a specific target audience at one point in time.

Endnotes
Today, organizations still rely on EDW platforms to deliver actionable, timely, and trustworthy intelligence. EDW
1

technology organizes and aggregates analytical data from various functional domains and serves as a critical
repository for organizations’ operations. See the “The Forrester Wave™: Enterprise Data Warehouse, Q4 2015”
Forrester report.

It takes a long time to measure a business process. Enterprise data hubs need to accommodate more data and an
2

infinite set of queries. See the “Create A Road Map For A Real-Time, Agile, Self-Service Data Platform” Forrester
report.

Data consumers — from casual data analysts to data scientists to your customers — are looking across a broad
3

variety of data today to find answers to their questions. See the “Compose Digital Data To Create A Symphony Of
Insight” Forrester report.

4. Data bottlenecks create business bottlenecks. The days of provisioning data to simply meet the requirements of
systems of record are over. Business stakeholders at the executive and line-of-business levels need data faster to
keep up with customers, competitors, and partners. See the “Create A Road Map For A Real-Time, Agile, Self-Service
Data Platform” Forrester report.

5. Forrester defines big data fabric as “bringing together disparate big data sources automatically, intelligently, and
securely, and processing them in a big data platform technology, such as Hadoop and Apache Spark, to deliver a
unified, trusted, and comprehensive view of customer and business data.” See the “Big Data Fabric Drives Innovation
And Growth” Forrester report.

6. Streaming technology helps integrate, transform, and curate data on diverse data streams in real time. See the
“The Forrester Wave™: Big Data Streaming Analytics, Q1 2016” Forrester report.

7. Forrester sees many enterprises already using an extract-Hadoop-load approach: they extract data from various source
systems, such as IoT devices and cloud and traditional platforms, load it into Hadoop, perform aggregation
and transformation there, and finally load it into the EDW to support business analytics. See the “The Forrester Wave™:
Enterprise Data Warehouse, Q4 2015” Forrester report.

8. Most big data integration vendors focus on making classic processes faster with tools for moving data into a lake and
working with it there. Three innovative vendors (Looker Data Sciences, SnapLogic, and Snowflake Computing)
offer alternative approaches. See the “Breakout Vendors: Big Data Integration” Forrester report.

9. According to Forrester customer feedback, such cloud-based storage is typically over 20% less expensive than
on-premises deployment.

Client support
For information on hard-copy or electronic reprints, please contact Client Support at
+1 866-367-7378, +1 617-613-5730, or clientsupport@forrester.com. We offer quantity
discounts and special pricing for academic and nonprofit institutions.

Forrester Research (Nasdaq: FORR) is one of the most influential research and advisory firms in the world. We work with
business and technology leaders to develop customer-obsessed strategies that drive growth. Through proprietary
research, data, custom consulting, exclusive executive peer groups, and events, the Forrester experience is about a
singular and powerful purpose: to challenge the thinking of our clients to help them lead change in their organizations.
For more information, visit forrester.com. 128005