
Designing An Enterprise Data Fabric

Alan McSweeney

http://ie.linkedin.com/in/alanmcsweeney
What Is An Enterprise Data Fabric?

• Set of hardware and software infrastructure, tools and facilities to
implement, administer, manage and operate data operations across the
entire span of the data within the enterprise across all data activities
including data acquisition, transformation, storage, distribution,
integration, replication, availability, security, protection, disaster recovery,
presentation, analytics, preservation, retention, backup, retrieval, archival,
recall, deletion, monitoring, capacity planning across all data storage
platforms enabling use by applications to meet the data needs of the
enterprise
• Mesh enabling the movement of data around the enterprise
• Provides access to all data assets
• Supports the flow, processing, distribution, management and exchange of
data throughout the enterprise
• Provides a coherent data framework for use by custom and acquired
applications
• Independent of specific applications
• Independent of specific data platforms
18 February 2018 2
Building An Enterprise Data Fabric

Core Data Fabric Conceptual Model

Data Fabric Conceptual Model – Components - 1 of 2
Component – Description
• External Interacting Parties – The range of external parties that supply data to and access data from the enterprise
• External Party Interaction Zones, Applications, Channels and Facilities – The set of applications and data interface and exchange points provided specifically to External Interacting Parties to allow them to supply data to and access data from the enterprise. These can be hosted internally or externally or a mix of both
• External Third Party Applications – Third-party applications (such as social media platforms) that contain information about the enterprise, that are used by the enterprise to present information to or interact with External Interacting Parties, or where the enterprise is referred to, affecting the perception or brand of the enterprise
• External Data Sensors – Sources of remote data measurements
• External Party Interaction Zones Data Stores – Applications and sets of data created by the enterprise to be externally facing, where external parties can access information and interact with the enterprise
• External Devices – Devices connected with services offered by the enterprise (such as ATMs and kiosks)
• Data Intake/Gateway – The set of facilities for handling data supplied to the enterprise, including validation and transformation and a possible integration or service bus. This can be hosted internally or externally or a mix of both
• Line of Business Applications – The set of line of business applications deployed on enterprise owned and managed infrastructure, used by business functions to operate their business processes
• Organisation Operational Data Stores – The various operational data stores used by the Line of Business Applications

Data Fabric Conceptual Model – Components - 2 of 2
Component – Description
• Line of Business Applications Hosted Outside the Organisation – The set of line of business applications deployed on external infrastructure, used by business functions to operate their business processes. This includes cloud facilities such as external data storage and XaaS facilities and an integration service to connect external data to internal data
• External Application Operational Data Stores – The various operational data stores used by the Line of Business Applications Hosted Outside the Organisation
• Data Mastering – Facilities to create and manage master data and data extracted from operational data to create a data warehouse and data extracts for reporting and analysis. This includes an extract, transformation and load facility. These can be hosted internally or externally or a mix of both
• Data Reporting and Analysis Facilities – The range of tools and facilities to report on, analyse, mine and model data. These can be hosted internally or externally or a mix of both
• Document Sharing and Collaboration – Tools used within the enterprise to share and collaborate on the authoring of documents
• Document Management Systems – Systems used to manage transactional and ad hoc structured and unstructured documents in a formal and controlled manner, including the metadata assigned to documents
• Desktop Applications – Applications used by individual users to view and author documents
• Document and Information Portal – Provides structured access to documents and information, including externally hosted applications providing these facilities
• Unstructured Data Stores – Storage locations for enterprise documentation
Zones Within Data Fabric Conceptual Model

• Sets of components of conceptual data fabric model can
be grouped into zones:
− Internal – within the enterprise’s boundary
− Cloud Extension – extensions to enterprise applications and data
held in external cloud platforms
− Interface – set of components responsible for getting data into
and out of the enterprise and presenting data and applications
externally
− Externally Located Extension – infrastructure and applications
that are connected to the wider enterprise network
− External Controlled – components outside the enterprise but
under the control of the enterprise
− External Uncontrolled – components outside the enterprise and
not under the direct control of the enterprise
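
The zone grouping above can be sketched as a small component inventory. This is only an illustration: the zone names come from the list above, but the assignment of each component to a zone is an assumption, and each enterprise would decide its own placement.

```python
from dataclasses import dataclass
from enum import Enum


class Zone(Enum):
    INTERNAL = "Internal"
    CLOUD_EXTENSION = "Cloud Extension"
    INTERFACE = "Interface"
    EXTERNALLY_LOCATED_EXTENSION = "Externally Located Extension"
    EXTERNAL_CONTROLLED = "External Controlled"
    EXTERNAL_UNCONTROLLED = "External Uncontrolled"


@dataclass(frozen=True)
class Component:
    name: str
    zone: Zone


# Illustrative assignments only (assumed, not taken from the model diagram).
INVENTORY = [
    Component("Line of Business Applications", Zone.INTERNAL),
    Component("Organisation Operational Data Stores", Zone.INTERNAL),
    Component("Line of Business Applications Hosted Outside the Organisation",
              Zone.CLOUD_EXTENSION),
    Component("Data Intake/Gateway", Zone.INTERFACE),
    Component("External Third Party Applications", Zone.EXTERNAL_UNCONTROLLED),
]


def group_by_zone(components):
    """Return a dict mapping every zone to the sorted names of its components."""
    grouped = {zone: [] for zone in Zone}
    for component in components:
        grouped[component.zone].append(component.name)
    return {zone: sorted(names) for zone, names in grouped.items()}


def uncontrolled(components):
    """List components that sit outside the enterprise's direct control."""
    return [c.name for c in components if c.zone is Zone.EXTERNAL_UNCONTROLLED]
```

Keeping empty zones in the grouped result makes it visible where the inventory has no components yet.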

Why Create A Conceptual Data Fabric Model?

• Conceptual data fabric model represents a rich picture of the enterprise’s data
context
− Embodies an idealised and target data view
• Detailed visualisations represent information more effectively than lengthy
narrative text
− More easily understood and engaged with
• Show relationships, interactions
• Capture complexity easily
• Provides a more concise illustration of state
• Better tool to elicit information
• Gaps, errors and omissions more easily identified
• Assists informed discussions
• Evolve and refine rich picture representations of as-is and to-be situations
• Cannot expect to capture every piece of information – focus on the important
elements
• A rich picture is not a data management process map (yet)

Differences Between Current And Target Conceptual
Data Model
• Use the conceptual data fabric model to identify gaps
between the current and desired target
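
One way to make this gap identification concrete is to compare a current and a target inventory of components per conceptual area. The sketch below assumes inventories are plain dictionaries keyed by component name; the function name `fabric_gaps` and the inventory entries are hypothetical placeholders, not real products.

```python
def fabric_gaps(current, target):
    """Compare two inventories keyed by conceptual component area.

    Returns, per area, what the target needs that is missing today and
    what exists today but has no place in the target.
    """
    gaps = {}
    for area in sorted(set(current) | set(target)):
        have = set(current.get(area, ()))
        want = set(target.get(area, ()))
        missing = sorted(want - have)
        not_in_target = sorted(have - want)
        if missing or not_in_target:
            gaps[area] = {"missing": missing, "not_in_target": not_in_target}
    return gaps


# Hypothetical inventories for illustration.
current_state = {
    "Data Intake/Gateway": ["file transfer endpoint"],
    "Data Reporting and Analysis Facilities": ["spreadsheet reports"],
}
target_state = {
    "Data Intake/Gateway": ["file transfer endpoint", "integration bus"],
    "Data Reporting and Analysis Facilities": ["reporting platform"],
    "Data Mastering": ["master data store"],
}
```

Calling `fabric_gaps(current_state, target_state)` lists only the areas where the two states differ, which is the starting point for planning the steps towards the target.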

18 February 2018 9
Core Data Fabric Conceptual Model

• Conceptual level is one representation of data related components


and their interactions within, across and outside the enterprise
• Not all components apply to all enterprises
• Useful as a basis for understanding the enterprise’s ideal data
architecture
− Creating an inventory of components in each conceptual area
− Defining an idealised target data fabric
• Just one dimension of defining, detailing and describing data
infrastructure
• Other dimensions include:
− Data types
− Data volumes
− Individual data flows
− Individual applications
− Individual data platforms and applications

18 February 2018 10
Responding To Interrelated Data Trends
Internal and
External Digital
Expectations,

Cloud Offerings Data


Data Regulations
and Services Trends

Analytics
Capabilities

18 February 2018 11
Responding To Interrelated Data Trends

• Designing a data fabric enables the enterprise respond to and take


advantage of key related data trends
− Internal and External Digital Expectations
• External actors expect to be able to interact digitally
• Within the enterprise there is an imperative to offer digital interactions and extensions
• Gives rise to large amounts of direct and indirect data that may or may not be processed
− Cloud Offerings and Services
• There are multiple providers of cloud-based services that enable the enterprise to invest in
and avail of application and data capabilities with low cost and time of entry
• Data location changes and data must be integrated across platforms
− Data Regulations
• The data regulation landscape is changing - GDPR, ePrivacy Regulation, Digital Single
Market, eIDAS, NIS Directive
• This requires greater data compliance and governance effort
• Uncontrolled data platforms and storage represent a significant and real risk to the
enterprise
− Analytics Capabilities
• New analytics capabilities across dimensions of data volumes and complexity enable
more complex analysis

18 February 2018 12
IT Function Data Leadership

• Enables the IT function demonstrate positive data


leadership
• Shows the IT function is able and willing to respond to
business data needs

18 February 2018 13
What Are The Data Challenges?

• More and more data of many different types


• Increasingly distributed platform landscape with data
movement, integration and management across multiple
service providers and cloud-based services
• Compliance and regulation requiring greater control of
personal data
• Newer data technologies and facilities outside the core
competence of the enterprise
• Shadow IT occurs when the IT function cannot deliver IT
change and new data facilities quickly

18 February 2018 14
Data Fabric Is Much More Than A Move To The
Cloud
• Enterprise data fabric should enables appropriate and seamless
move to multiple cloud/XaaS platforms - public, private and
hybrid - across the entire data infrastructure
− Storage
− Business applications
− Data management
− Reporting and analytics tools
• Cloud impacts the enterprise’s approach to data
− Enterprises cannot ignore cloud and XaaS options
• Enterprise data fabric needs to encompass the diversity of data
storage infrastructures
• Design an open and flexible data fabric that improves the
responsiveness of the IT function and reduces shadow IT

18 February 2018 15
Why Have An Enterprise Data Fabric?

• Enables adoption of new data technologies, platforms, systems and


infrastructures within an overall data context
• Enables move to simplification of data infrastructure
• Enables scalability of data infrastructure
• Enables industrialisation and automation of data operations,
administration, management, governance and common security
model
• Reduce the effort and cost of management and administration
• Focus on extracting data value
• Improve the reliability of data operations
• Manage risk of mixed data platforms, uncontrolled data on
uncontrolled platforms
• Allows benefits of scalable data infrastructures that are located
anywhere to be achieved
18 February 2018 16
Why Have An Enterprise Data Fabric?

• Focus on achieving benefits from data rather than on data


operations
− Reduce time to manage, find, combine and curate data
− Reduce wasted time, capacity, resources, cost
• Abstract data infrastructure from data usage
• Enable use of data in currently unanticipated ways through
flexible and adaptable facilities
• Reduce time to achieve insights

18 February 2018 17
Creating A Data Vision

• Data fabric is concerned with creating a data vision for the


enterprise
− Data capabilities, competencies
− Where the enterprise is and where it wants to be
• Define the future target landscape and define the required
journey to achieve it
• Ensures the vision can be executed
• Allows the delivery effort and resources to be quantified
• Permits the enterprise to move away from traditional
approaches to managing data

18 February 2018 18
Creating A Data Vision – Making The Enterprise Data
Focussed
• Enable value to be derived from data
− Shorten the distance between business and analytics
• Facilitate data initiatives by removing the barriers to data
enablement
• IT needs to understand the data needs and associated data
business processes of the business and deliver results
− IT showing data leadership
• Top-down visualisation that is then implemented by
appropriate components at different layers

18 February 2018 19
Current Data Fabric State

18 February 2018 20
Target Data Fabric Future State

18 February 2018 21
Achieving The Target Data Fabric State

• Use the data fabric as a model to describe the target future state
• Articulate the future state vision
• Identify the steps needed to achieve the vision
• Data fabric is linked to the applications that generate and use data
Data Fabric And Digital Enablement

• One element of digital business transformation is being
able to handle and process large amounts of data and
numbers of data sources
• The data environment changes very quickly while at the
same time becoming more distributed
• Traditional data management approaches, toolsets and
infrastructures fail to scale
• Analytics tools tend to be linked to individual business
function and data silos

18 February 2018 23
Key Design Principles Of A Data Fabric

Administration, Management and Control – Keep control of and be able to


manage and administer data irrespective of where it is located

Security – Common security standards across entire fabric, automate


governance and compliance and manage risk
Automation – Management and housekeeping activities automated

Integration – All components interoperate together across all layers

Stability, Reliability and Consistency – Common tools and facilities used to


delivery stable and reliable fabric across all layers
Openness, Flexibility and Choice – Ability to choose and change data
storage, data access, data location
Performance, Retrieval, Access and Usage – Applications and users can get
access to data when it is needed, as soon as it is needed and in a format in
which it is usable
18 February 2018 24
Business And IT Drivers For Data Fabric
[Figure: Business and IT drivers: Reduce Cost of Change and Reaction; Balance Cost of Maintenance and Cost of Change; React and Move Quickly; Have A Choice Of And Be Able To Adopt New Technologies; Offer Innovative Facilities and Functions; React and Move Substantially; React Quickly To New Requirements; Enable Growth Opportunities]
Data Fabric Is A Basic Building Block Of An Enterprise
Data Strategy

You Cannot Have This ...
• Insight/Forecast
• Reporting
• Monitoring
• Analysis
... Without This Solid Data Management Foundation and Framework
• Data Architecture Management
• Data Governance
• Data Warehousing and Business Intelligence Management
• Data Security Management
• Reference and Master Data Management
• Document and Content Management
• Metadata Management
• Data Development
• Data Quality Management
• Data Operations Management

Every Enterprise Aspires To Data Driven Insights ...

• Reporting – What Happened?
• Monitoring – What Is Currently Happening?
• Analysis – Why It Happened?
• Insight/Forecast – What Is Likely To Happen In The Future?


Data Driven Trailing And Leading Indicators

Trailing Indicators
• Reporting – Report on Gathered Information On What Happened To Understand Pinch Points, Quantify Effectiveness, Measure Resource Usage And Success
• Monitoring – Gather Information In Realtime To Understand Activities, Respond And Make Reallocation Decisions

Leading Indicators
• Analysis – Understand Reasons For Outcomes and Modify Operation To Embed Improvements
• Insight and Forecast – Quantify Propensities, Forecast Likely Outcomes, Identify Leading Indicators, Create Actionable Intelligence
Objective Of Designing An Enterprise Data Fabric

• Understanding all the data flows throughout the
enterprise
• Understanding yields insight into what is needed and what
will generate a benefit

18 February 2018 29
Extended Data Fabric Conceptual Model

• Extended data fabric considers operating principles across core
fabric components and their interactions
− Administration, Management – Ability to manage and administer the entire data fabric; have a single view of the data fabric
− Utility, Usability – Be usable and be able to be used
− Operations – Support the automation of data fabric operations, perform capacity planning and management
− Monitoring, Alerting, Event Management – Provide monitoring of data fabric and support event management and alerting of problems
− Governance, Compliance, Risk Management – Support data governance principles and enforcement of regulatory compliance; manage data risks
− Security, Protection – Enforce data security and ensure protection of data
− Archival, Recall – Support necessary and appropriate data archival and recall if required
− Preservation, Retention, Deletion – Provide facilities to enforce and automate data preservation, retention and deletion policies
− Capacity Planning – Manage capacity across all dimensions of data storage and I/O volumes and throughput
− Logging – Log and maintain details on data activities for reporting and analysis
− Installation, Upgrade, Reconfiguration – Support the seamless installation, upgrade and reconfiguration of new hardware and software components
− Backup, Recovery, Replication, Continuity, Availability – Implement backup and recovery, including business continuity, availability and replication across infrastructure components
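
As an illustration of automating the preservation, retention and deletion principle, a policy could be expressed as data and evaluated per data item. This is a minimal sketch under assumptions: the `RetentionPolicy` class, the `retention_action` function and the one-year and seven-year thresholds are all hypothetical, not values from the model.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    archive_after: timedelta  # inactivity period before data is archived
    delete_after: timedelta   # inactivity period before data is deleted


def retention_action(last_active: date, today: date, policy: RetentionPolicy) -> str:
    """Return 'retain', 'archive' or 'delete' based on how long data has been inactive."""
    age = today - last_active
    if age >= policy.delete_after:
        return "delete"
    if age >= policy.archive_after:
        return "archive"
    return "retain"


# Hypothetical thresholds; real values depend on regulation and business need.
EXAMPLE_POLICY = RetentionPolicy(
    archive_after=timedelta(days=365),
    delete_after=timedelta(days=7 * 365),
)
```

Expressing the policy as data rather than as code paths is one way to apply the same rules across every storage platform in the fabric.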
18 February 2018 31
Data Fabric Needs To Support Entire Data Lifecycle

18 February 2018 32
Data Lifecycle View

• The stages in this generalised lifecycle are:
− Architect, Budget, Plan, Design and Specify - This relates to the design and specification of the data
storage and management and their supporting processes. This establishes the data management
framework
− Implement Underlying Technology - This is concerned with implementing the data-related hardware and
software technology components. This relates to database components, data storage hardware, backup
and recovery software, monitoring and control software and other items
− Enter, Create, Acquire, Derive, Update, Integrate, Capture - This stage is where data originates, such as
through data entry or data capture, or is acquired from other systems or sources
− Secure, Store, Replicate and Distribute - In this stage, data is stored with appropriate security and access
controls including data access and update audit. It may be replicated to other applications and distributed
− Present, Report, Analyse, Model - This stage is concerned with the presentation of information, the
generation of reports and analysis and the creation of derived information
− Preserve, Protect and Recover - This stage relates to the management of data in terms of backup,
recovery and retention/preservation
− Archive and Recall - This stage is where information that is no longer active but still required is archived
to secondary data storage platforms from which the information can be recovered if required
− Delete/Remove - This stage is concerned with the deletion of data that cannot or does not need to be
retained any longer
− Define, Design, Implement, Measure, Manage, Monitor, Control, Staff, Train and Administer, Standards,
Governance, Fund - This is not a single stage but a set of processes and procedures that cross all stages
and is concerned with ensuring that the processes associated with each of the lifecycle stages are
operated correctly and that data assurance, quality and governance procedures exist and are operated
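
The lifecycle can be sketched as an ordered set of stages with a simple rule for moving between them. The stage order follows the list above; the transition rule (forward movement only, plus recall from archive back to storage) is an assumption made for illustration, and the cross-stage governance processes are deliberately left out because they are not a stage.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Lifecycle stages, numbered in the order they are listed above."""
    ARCHITECT_AND_SPECIFY = 1
    IMPLEMENT_TECHNOLOGY = 2
    ENTER_AND_CAPTURE = 3
    SECURE_AND_STORE = 4
    PRESENT_AND_ANALYSE = 5
    PRESERVE_AND_RECOVER = 6
    ARCHIVE_AND_RECALL = 7
    DELETE = 8


def can_move(current: Stage, new: Stage) -> bool:
    """Assumed rule: data moves forward, or is recalled from archive to storage."""
    if current is Stage.DELETE:
        return False  # deleted data has no further stage
    if current is Stage.ARCHIVE_AND_RECALL and new is Stage.SECURE_AND_STORE:
        return True  # recall from secondary storage (an assumption)
    return new > current
```

A check like `can_move` could sit in the fabric's operations tooling to flag data that skips backwards, for example reports built from data that was already due for deletion.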



Using The Core Conceptual Model

• Understand the true complexity of data requirements
within and across the enterprise
• Use this complexity to derive a simplified and integrated
data fabric

Data As A Realisable Asset

• Raw data must be refined into a format that can be used in order to
be viewed as an asset with realisable value
• For data to be an asset it must:
− Have its underlying value extracted
− Be accessible
− Be usable
• Data has physical and tangible characteristics:
− Mass – it has bulk and requires resources to store, process and move
− Heat – it gets cold over time with different levels of dissipation
− Energy – data has different levels of energy based on its movement and value
− Volatility – the underlying value of the data can be lost at differing rates
− Complexity – the content and structure of the data is variable
− Motion – data moves from location to location as it is generated, stored and
processed
− Structure – data may be structured, semi-structured or unstructured
− Size to Value Ratio – the usable value with the data may be large or small
relative to the volume of the raw data

External Interacting Parties

• Enterprises typically operate in a complex environment, interacting and communicating with many parties of many different types over different channels
• Every interaction will involve data being accessed, presented, transferred and processed
• Many types of external party the enterprise interacts with:
− Business Customer
− Client
− Collaborator
− Competitor
− Contractor
− Counterparty
− Dealer
− Distributor
− Franchisee
− Intermediary
− Licensee
− Licensor
− Outsourcer
− Partner
− Provider
− Public
− Regulator
− Regulated Entity
− Representative
− Retail Customer
− Service
− Shareholder
− Sub-Contractor
− Supplier

18 February 2018 37
External Party Interaction Zones, Applications,
Channels and Facilities

18 February 2018 38
External Party Interaction Zones, Applications,
Channels and Facilities
• This is the range of application-based modes and methods
of interaction between the enterprise and the External
Interacting Parties (rather than pure email)

18 February 2018 39
External Party Interaction Zones Data Stores

18 February 2018 40
External Party Interaction Zones Data Stores

• The data belonging to and data about the interactions with


External Interacting Parties using External Party Interaction
Zones, Applications, Channels and Facilities will be stored
and managed

18 February 2018 41
Date Intake/Gateway

18 February 2018 42
Date Intake/Gateway

• Generalised representation of the set of facilities for enabling and


managing all communications between the enterprise (and its systems)
and external parties
− Broker and integration facilities for centralising all external communications –
messaging, file transfer, web services
− Allows two-way communications – send/receive and to/from internal and external
− Supports multiple external channels and protocols
− Supports multiple authentication schemes and standards
− Provides asynchronous messaging
− Includes application programming interface
− Allows the exposure of endpoints which external parties can access such as SFTP
− Provides management and administration facilities to define how communications
should operate and for support and problem identification and resolution
− Delivers facilities for orchestration, transformation, development and deployment
management, traffic management
− Ensures data quality
− Provides workflow definition, implementation and operation
− Maintains an audit trail of all messages and communications
− Delivers high performance, resilience and availability
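
A few of the gateway facilities listed above (validation for data quality, transformation and an audit trail of all messages) can be sketched together. This is a toy model under assumptions: the `IntakeGateway` class, the `require_fields` helper and the message fields `party_id` and `payload` are hypothetical, and a real gateway would add channels, authentication and asynchronous messaging.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


def require_fields(*names):
    """Build a validator that reports any required fields missing from a message."""
    def check(message: dict) -> Optional[str]:
        missing = [name for name in names if name not in message]
        return f"missing fields: {', '.join(missing)}" if missing else None
    return check


@dataclass
class IntakeGateway:
    validators: list                      # callables returning an error string or None
    transform: Callable[[dict], dict]     # normalises an accepted message
    audit_trail: list = field(default_factory=list)

    def receive(self, channel: str, message: dict) -> Optional[dict]:
        """Validate, audit and transform one inbound message."""
        errors = []
        for validate in self.validators:
            error = validate(message)
            if error is not None:
                errors.append(error)
        # Every message is recorded, whether accepted or rejected.
        self.audit_trail.append(
            {"channel": channel, "accepted": not errors, "errors": errors}
        )
        if errors:
            return None
        return self.transform(message)


def normalise(message: dict) -> dict:
    """Hypothetical transformation: tidy the party identifier."""
    return {"party_id": str(message["party_id"]).strip().upper(),
            "payload": message["payload"]}
```

Recording rejected messages as well as accepted ones keeps the audit trail complete, which supports the problem identification and resolution facility mentioned above.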

18 February 2018 43
External Third Party Applications

18 February 2018 44
External Third Party Applications

• The enterprise may use external applications (such as


social media platforms) as sources of external party data,
as routes to advertise or direct a message to external
parties or as channels to interact with external parties
− Information and content stored directly on applications
− Information about usage and interactions available from
applications
• The enterprise may also use external applications for
collaboration and information sharing either within the
enterprise or with external parties

18 February 2018 45
External Data Sensors

18 February 2018 46
External Data Sensors

• These represent measurement infrastructure and


applications owned by the enterprise, located externally
on some wide area network or other communications
facility that generate data that is transmitted to the
enterprise
− Telemetry units

18 February 2018 47
External Devices

18 February 2018 48
External Devices

• These represent infrastructure and applications owned by


the enterprise, located externally on some wide area
network or other communications facility that are
accessed and used by external parties to interact with the
enterprise
− ATMs
− Kiosks
− Point of sale devices

18 February 2018 49
Line of Business Applications

18 February 2018 50
Line of Business Applications

• This represent the applications used by individual business


functions or across the enterprise that are hosted on
internal enterprise infrastructure or are hosted externally
by application or platform service providers

18 February 2018 51
Data Storage Platforms

18 February 2018 52
Data Storage Platforms

• These represent the various structure data stores and


associated database management software used by
applications that are hosted on internal enterprise
infrastructure or are hosted externally by application or
platform service providers

18 February 2018 53
Data Reporting and Analysis Facilities

18 February 2018 54
Data Reporting and Analysis Facilities

• This represents the set of facilities to extract operational


data from business applications, create, store and manage
reference and master data, create and store enduring data
and analyse the data including reporting, visualisation,
mining and modelling

18 February 2018 55
Document Management Systems And Document
Sharing and Collaboration

18 February 2018 56
Document Management Systems And Document
Sharing and Collaboration
• This represents the facilities to store structure and
unstructured document-oriented data including document
metadata, extract information from documents and
support ad hoc and formal workflows related to these
documents

18 February 2018 57
Desktop Applications

18 February 2018 58
Desktop Applications

• These are the suite of desktop applications including email


to create, update, distribute and collaborate on
documents

Many Data Types

• Transactions and Application Data
• Unstructured Data
• Documents
• Document Images
• Videos
• Sound
• Usage Logs
• Third-Party Data
• Files
• Messages
• Reports
• Derived Data
• Data Models
• Web Content
• Telemetry Data
• Data Warehouse and Data Marts
• Reference and Master Data
• Emails
• Metadata

Data Fabric As Data Plumbing And A Data Refinery

• Data fabric should enable the flow of data throughout the
enterprise and the refinement of data to create appropriate
refined and derived data products from raw data

Data Layers Across Data Fabric
Layer – Components – Data Scope
• Layer 8+ – Data Operations, Usage, Management, Control, Governance, Analysis, Modelling – Unified management across all environments and all layers and ensure performance, availability, reliability, scalability, maintainability and supportability
• Layer 7 – Data Presentation, Platforms, Applications, Systems and Business Processes – Set of data accessing and data using business applications
• Layer 6 – Data Security and Governance – Implement common data security policies across all environments and platforms
• Layer 5 – Data Logical Access and Integration – Insulate and abstract access from knowledge of environments and platforms and integrate data systems and data management
• Layer 4 – Data Transportation – Provide a common data transport that connects all environments
• Layer 3 – Data Network and Connectivity – Connections to storage and physical access irrespective of location across entire network
• Layer 2 – Data Physical Access – Provide physical access to data on storage layer
• Layer 1 – Data Storage and Transmission Infrastructure – Store data transparently on multiple environments and move data between environments
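
The Layer 5 idea of insulating data access from knowledge of environments and platforms can be sketched with a small catalogue that maps dataset names to interchangeable stores. Everything named here is hypothetical (`DataStore`, `InMemoryStore`, `LogicalAccess`, the dataset and location names); the in-memory store simply stands in for any real storage platform.

```python
from typing import Protocol


class DataStore(Protocol):
    """Minimal contract any storage platform would need to satisfy (assumed)."""
    location: str

    def read(self, key: str) -> bytes: ...
    def write(self, key: str, value: bytes) -> None: ...
    def keys(self) -> list: ...


class InMemoryStore:
    """Stand-in for a real platform, whether internal or cloud hosted."""

    def __init__(self, location: str):
        self.location = location
        self._blobs = {}

    def read(self, key: str) -> bytes:
        return self._blobs[key]

    def write(self, key: str, value: bytes) -> None:
        self._blobs[key] = value

    def keys(self) -> list:
        return list(self._blobs)


class LogicalAccess:
    """Applications ask for a dataset by name and never see where it is stored."""

    def __init__(self):
        self._catalog = {}

    def register(self, dataset: str, store: DataStore) -> None:
        self._catalog[dataset] = store

    def read(self, dataset: str, key: str) -> bytes:
        return self._catalog[dataset].read(key)

    def location_of(self, dataset: str) -> str:
        return self._catalog[dataset].location

    def relocate(self, dataset: str, new_store: DataStore) -> None:
        """Copy the dataset to a new platform, then switch the catalogue entry."""
        old_store = self._catalog[dataset]
        for key in old_store.keys():
            new_store.write(key, old_store.read(key))
        self._catalog[dataset] = new_store
```

Because applications only call `read` with a dataset name, moving a dataset between platforms with `relocate` requires no application change, which is the kind of openness and choice of data location the design principles call for.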
18 February 2018 62
Building A Comprehensive Data Vision
Strategy Area

Enterprise Data Strategy …

Strategy Area

Component

Component Type …

Component
Core Data Fabric Conceptual
Model Components …
Component

Component Type …
Data Management and
Comprehensive Data Vision Component
Operations Facility
Extended Data Fabric
Conceptual Model …
Data Management and
Operations Facility
Stage

Data Lifecycle …

Stage

Type

Data Types …

Type
18 February 2018 63
Extending Conceptual Model To Additional Levels Of
Detail To Build A Comprehensive Data Vision
• Individual data views can be combined to articulate a
comprehensive data vision
− Enterprise Data Strategy
• Individual strategy areas
− Core Data Fabric Conceptual Model Components
• Individual elements within each component
− Extended Data Fabric Conceptual Model
• Operating principles and interactions
− Data Lifecycle
• Individual stages within lifecycle
− Data Types
• Individual data types
• Builds an understanding of how the enterprise wants and
needs to handle and use data
18 February 2018 64
Extending Conceptual Model To Additional Levels Of
Detail To Build A Comprehensive Data Vision

Additional
Data
Dimensions
and Views

Data Fabric Landscape

18 February 2018 65
Summary

• Data fabric is concerned with creating a data vision for the enterprise
• The conceptual data fabric model represents a rich picture of the enterprise’s data
context
− Detailed visualisations represent information more effectively than lengthy narrative text
• Use the conceptual data fabric model to identify gaps between the current and
desired target
• Data fabric provides a basis for understanding the enterprise’s ideal data
architecture
• Designing a data fabric enables the enterprise to respond to and take advantage of
key related data trends
− Shadow IT occurs when the IT function cannot deliver IT change and new data facilities
quickly
− Uncontrolled data platforms and storage represent a significant and real risk to the
enterprise
• Enterprise data fabric should enable an appropriate and seamless move to multiple
cloud/XaaS platforms - public, private and hybrid - across the entire data
infrastructure
• Enables the enterprise to focus on achieving benefits from data rather than on data
operations

More Information

Alan McSweeney
http://ie.linkedin.com/in/alanmcsweeney
