Business Continuity HA Backup DR

Business Continuity and Disaster Recovery with VMware Infrastructure 3
Larry Ellison
Recovery Expert
AccessFlow, Inc.
August 7, 2007
Virtual DR Solutions
Agenda
Importance of Business Continuity and Current Challenges
Virtualization as a BC enabler
Better Business Continuity with VMware Infrastructure
  Preventing downtime
  Protecting data and systems
  Rapid Disaster Recovery
Implementing Better Business Continuity
Agenda
Importance of Business Continuity and Current Challenges
  Business Continuity Definition
  BC Importance and Focus
  Traditional Challenges
Virtualization as a BC enabler
Better Business Continuity with VMware Infrastructure
  Preventing downtime
  Protecting data and systems
  Rapid Disaster Recovery
Defining business continuity

What is business continuity?
Business continuity is about protecting data, systems, and services:
  Preventing data loss
  Minimizing planned downtime
  Preventing unplanned downtime
  Ensuring rapid recovery
[Diagram: power failure, virus infection, fire, server failure, data corruption, storage failure, user error, disk failure, OS fault, backup windows, hurricane, software failure, server maintenance]
Why So Much Focus on Business Continuity?

Standards for availability are rising
  Faster pace of business, more critical change
  Agility a competitive advantage, demands highest service levels
Number and severity of threats increasing
  1 out of 500 data centers will have a major outage each year (Disaster Recovery Journal)
  43% of companies experiencing disasters never re-open, and 29% close within two years (McGladrey and Pullen)
Circuit breakers wipe out the Web
PG&E's faulty equipment reveals the Internet's vulnerability to a disruption of its power source
Verne Kopytoff, Chronicle Staff Writer
Thursday, July 26, 2007

All it took to wipe out some of the Internet's biggest sites Tuesday was some faulty PG&E electrical breakers that caused a blackout in downtown San Francisco. Some of the Web's hottest destinations - Craigslist, Yelp, Second Life - were suddenly inaccessible from San Mateo to Singapore after back-up generators failed at the facility housing their computer equipment.

Although mostly fixed within 12 hours, the incident shows how easy it is to send major swaths of the online world to the dark ages. Sites that millions of people rely on can be knocked offline by freak accidents, not to mention major catastrophes, and this event served as a wake-up call to the executives that operate them.

"If the data center was that vulnerable to a power outage, what if something really catastrophic happened like an earthquake?" asked Derek Gordon, marketing vice president for Technorati, a search engine of blogs that was brought down for a couple of hours Tuesday after the blackout. "What does that say about the vulnerability of the Internet in the Bay Area?"

The troubles started when 365 Main, a key data center in downtown San Francisco that touts its "state-of-the-art electrical system," failed to get its backup generators started immediately after the power outage hit around 1:45 p.m. A number of companies that house their computer servers in the facility were suddenly offline, setting off a mad scramble to get the Web sites up and running.

Shoppers at RedEnvelope, an online retailer, couldn't buy monogrammed pillow soaps. Hipsters on Yelp, the review site, had to take a break from sharing their reports of fabulous and not-so fabulous restaurants. Users of online classified service Craigslist were out of luck in finding a second-hand futon.

The backup generators were turned on 45 minutes after the blackout started, a delay that 365 Main said it was still investigating yesterday. But it took some of the facility's customers anywhere from another hour to 11 hours to get their servers safely rebooted and their Web sites operational.

What the episode exposed is that some companies operate entirely from one data center, a decision described by some security experts as risky. In emergencies, such companies can't shift traffic to an alternative facility where they keep additional servers.

"There's all kinds of things that can happen from a power outage to a tornado to a backhoe," said Jason Needham, director of product management at F5 Networks, a Seattle company that sells software and equipment for data centers. "All these things seem far-fetched until they happen."

However, Needham said the trend is for companies to put all their eggs in one basket, so to speak, in an effort to save money. In fact, just hours before Tuesday's power outage, 365 Main put out a press release trumpeting the fact that RedEnvelope had moved all its operations to its facility and closed an unneeded center in the Midwest.

Data centers are usually designed with redundant equipment to ensure power during outages, earthquakes and floods. Backup electricity is supposed to kick in within seconds after an outage through a complex system that keeps servers humming without interruption.

Gordon, from Technorati, called opening several data centers ruinously expensive for thinly funded Internet startups, of which there are hundreds in the Bay Area. Only profitable companies can afford such an extravagance, he said, though he acknowledged that Technorati, which isn't profitable, is in the process of moving into a second facility.

Tuesday's outage "added to the sense of urgency," Gordon said. "The lesson here is despite all of your planning and all of your promises, you are vulnerable."

This article appeared on page C-1 of the San Francisco Chronicle.
Challenges in Implementing Business Continuity

Cost
  Additional hardware; identical 1:1
  Additional tools and training
[Diagram: Site A, Site B]
Complexity
  Management and provisioning
  Recovery steps: configure hardware, install OS, configure OS, install backup/restore agent, start "single-step" automatic recovery
  Application-specific business continuity needs
Reliability
  Complex solutions are hard to test
  Requires specialized training for personnel
Agenda
Properties of Virtual Machines
  Hardware Independence
  Encapsulation
  Isolation
  Partitioning
Preventing downtime
Protecting data and systems
Ensuring rapid recovery from failures
Business Continuity: The Killer App for Virtualization!
Press
  Best Disaster Recovery Product of 2006 (TechTarget)
Customers
  55% of customers using virtualization for BC/DR
2006 Customer Survey (n=2265)
  85% use VMware in production; 43% set it as a default policy for production servers
VMware Virtualization Basics
Before Virtualization:
  Software tied to hardware
  Single OS image per machine
  One application workload per OS
  Inflexible, costly infrastructure
After Virtualization:
  Hardware-independence of OS and applications
  Virtual machines can be provisioned to any system
  Manage OS and application as a single unit
Virtualization Enablers for Business Continuity

Hardware Independence
Run a virtual machine on any server without modification
  Eliminate need for 1:1 hardware duplication for BC
  Eliminate risk of hardware configuration drift
  Re-use older servers for BC-DR
Encapsulation
Encapsulate entire systems as simple files
  System portability
  Simplify provisioning for recovery
  Simplify backup and replication
  Simplify copying and cloning of systems
[Diagram: physical server (apps, system, data) = file]
Copyright 2006 VMware, Inc. All rights reserved.
Virtualization Enablers for Business Continuity

Isolation
Each VM isolated from other VMs
  Easier testing of a BC-DR plan
  Stability and security
  Utilize DR hardware for other tasks
[Diagram: batch job, application, and DR test VMs, each with its own app and OS, on VMware Infrastructure]
Partitioning
Safely run multiple VMs simultaneously on a single server
  Consolidate servers
  Boost utilization
  Provide significant cost savings
[Chart: % utilization]
Agenda
Preventing downtime
Protecting data and systems
Rapid Disaster Recovery
Elements of preventing downtime
  Eliminate planned downtime
  Reduce unplanned downtime with better fault tolerance
Avoiding Planned Downtime Has the Biggest Impact on Business Continuity

Per studies from IBM & Sun, planned downtime is responsible for 80-90% of total system downtime
[Chart: share of total downtime per the Sun and IBM estimates. Planned downtime: 80-90%; unplanned downtime: 10-20%]
Eliminating planned downtime can increase system availability by a full order of magnitude
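As a rough check on the order-of-magnitude claim, the arithmetic can be sketched as below. The function name `downtime_reduction_factor` is illustrative and not from the slides, and the sketch assumes unplanned downtime stays unchanged once planned downtime is removed.

```python
def downtime_reduction_factor(planned_share: float) -> float:
    """Factor by which total downtime shrinks if all planned downtime is eliminated.

    planned_share is the fraction of total downtime that is planned, in [0, 1).
    Assumption: unplanned downtime is unaffected.
    """
    if not 0 <= planned_share < 1:
        raise ValueError("planned_share must be in [0, 1)")
    return 1 / (1 - planned_share)


# The two planned-downtime shares quoted on the slide (80% and 90%).
for share in (0.80, 0.90):
    factor = downtime_reduction_factor(share)
    print(f"planned share {share:.0%}: total downtime shrinks about {factor:.0f}x")
```

With a 90% planned share, total downtime shrinks about tenfold, which matches "a full order of magnitude"; with an 80% share the reduction is about fivefold.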
Planned Downtime: Zero-downtime maintenance using VMware technology

Use VMware VMotion to evacuate hosts
  Move running applications to other servers without disruption
  Perform maintenance at any time of day
  Zero downtime for hardware maintenance
Automate with DRS maintenance mode
  Automates moving virtual machines to other hosts
  Automates re-balancing after maintenance is complete
1. Activate Maintenance Mode for physical host
2. DRS migrates running virtual machines to other hosts
3. Shut down idle host and perform maintenance
4. Restart host; DRS automatically rebalances workloads
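The host-evacuation part of the maintenance sequence can be illustrated with a small simulation. This is a sketch only: it does not call any VMware API, the host and VM names are made up, and the "fewest VMs" placement rule is an assumed stand-in for whatever placement logic DRS actually applies.

```python
def enter_maintenance_mode(placement: dict, host: str) -> dict:
    """Return a new placement (host -> list of VM names) with `host` emptied.

    Each VM is moved to whichever remaining host currently runs the fewest
    VMs (hypothetical rule, not the real DRS algorithm).
    """
    others = {h: list(vms) for h, vms in placement.items() if h != host}
    if not others:
        raise ValueError("no other host available to receive VMs")
    for vm in placement[host]:
        target = min(others, key=lambda h: len(others[h]))
        others[target].append(vm)
    # The evacuated host is now idle and can be shut down for maintenance.
    return {**others, host: []}


placement = {"esx1": ["web", "db"], "esx2": ["mail"], "esx3": []}
after = enter_maintenance_mode(placement, "esx1")
print(after["esx1"])  # the host under maintenance runs no VMs
```

The original `placement` is left untouched, so the same function can be replayed against different hosts when planning a maintenance window.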
Unplanned Downtime: Server Failure - VMware HA

Simple, cost-effective high availability for all servers
  Automatic restart of virtual machines in case of server failure
  No need for dedicated stand-by hardware
  None of the cost and complexity of clustering
[Diagram: failed host (X) in a resource pool]
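The idea of restarting virtual machines on surviving hosts, rather than on dedicated stand-by hardware, can be sketched as follows. Nothing here is a VMware API: `restart_after_failure`, the slot-based `capacity` model, and the host names are assumptions made for illustration.

```python
def restart_after_failure(placement, capacity, failed_host):
    """Restart the VMs of failed_host on surviving hosts that have free slots.

    placement: host -> list of VM names
    capacity:  host -> maximum number of VMs the host can run (assumed model)
    Returns (new_placement, vms_not_restarted).
    """
    survivors = {h: list(vms) for h, vms in placement.items() if h != failed_host}
    not_restarted = []
    for vm in placement.get(failed_host, []):
        candidates = [h for h in survivors if len(survivors[h]) < capacity[h]]
        if not candidates:
            not_restarted.append(vm)  # the pool has no spare capacity left
            continue
        # Prefer the host with the most free slots.
        target = max(candidates, key=lambda h: capacity[h] - len(survivors[h]))
        survivors[target].append(vm)
    return survivors, not_restarted


placement = {"esx1": ["web", "db"], "esx2": ["mail"], "esx3": ["file"]}
capacity = {"esx1": 2, "esx2": 2, "esx3": 2}
new_placement, lost = restart_after_failure(placement, capacity, "esx1")
print(new_placement, lost)
```

The second return value makes the trade-off visible: without dedicated stand-by servers, every VM restarts only if the pool as a whole keeps enough spare capacity.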
Unplanned Application/OS Failure: Virtual Infrastructure Makes Clustering Easier

Use the same clustering software you use today but gain:
More flexible options:
  Cluster physical machines with virtual machines
  Cluster virtual machines with virtual machines
Lower cost:
  Cluster applications using fewer physical servers
  Test cluster configurations on a single physical server

Unplanned: Protecting from Hardware Failures

Tolerate network path failures
  Built-in NIC teaming
  Ability to share redundant components across workloads
Tolerate storage path failures
  Built-in storage multi-pathing
  Share redundant storage paths among multiple virtual machines
Provides, at a lower cost, fault-tolerance equivalent to that possible with physical systems
Unplanned Downtime: Preventing downtime due to resource bottlenecks

Physical Infrastructure
  Resource bottlenecks create outages
  Inflexible resources
  Lengthy, manual process to rebalance workloads
With VMware Infrastructure
  Prevent resource bottlenecks with DRS
  Automated load balancing across a pool of servers
  Ability to dynamically add resources to server pool
[Diagram: VMware Infrastructure resource cluster with VMotion]
1. Overloaded host: automatic workload balancing
2. Dynamically add resources: DRS rebalances load
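Both scenarios, an overloaded host and a pool that gains a new host, can be illustrated with a greedy rebalancing sketch. This is not the DRS algorithm: the 80% threshold, the "move the smallest VM to the least loaded host" rule, and all names are assumptions chosen to keep the example short.

```python
def host_load(vms):
    """Total demand of the VMs on one host, in percent of host capacity."""
    return sum(vms.values())


def rebalance(hosts, threshold=80, max_moves=100):
    """Greedy sketch: while some host exceeds `threshold`, move its smallest VM
    to the least loaded host, unless that would overload the target.

    hosts: host -> {vm name -> demand in percent}. Modified in place.
    Returns the list of moves as (vm, from_host, to_host).
    """
    moves = []
    for _ in range(max_moves):
        busiest = max(hosts, key=lambda h: host_load(hosts[h]))
        if host_load(hosts[busiest]) <= threshold:
            break  # no host is overloaded
        idlest = min(hosts, key=lambda h: host_load(hosts[h]))
        vm = min(hosts[busiest], key=hosts[busiest].get)
        if host_load(hosts[idlest]) + hosts[busiest][vm] > threshold:
            break  # moving the VM would overload the target
        hosts[idlest][vm] = hosts[busiest].pop(vm)
        moves.append((vm, busiest, idlest))
    return moves


hosts = {"esx1": {"web": 50, "db": 40}, "esx2": {"mail": 70}}
print(rebalance(hosts))  # esx2 has no headroom, so nothing can move
hosts["esx3"] = {}       # dynamically add a host to the pool
print(rebalance(hosts))  # the overloaded host can now shed a VM
```

In the first call the pool is too full for any safe move; after the empty host is added, the same call relieves the overloaded host, which mirrors step 2 on the slide.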
Agenda
Preventing downtime
Protecting data and systems
Rapid Disaster Recovery
Implementing Better Business Continuity
Keys to protecting data and systems
  Minimize complexity
  Minimize impact on services
  Ensure comprehensive protection
Protecting data and systems with VMware Infrastructure

Virtual machines store system and data state
  Entire system encapsulated in files: hardware configuration, operating system, applications, data
Impact
  Systems are data
  Protect system using same tools and processes used to protect data
  Virtual machines are the simplest, most portable way to store a system
[Diagram: physical server = virtual machine files (.nvram, .vmx, .vmdk)]
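Because a virtual machine is a set of files, protecting it can look like an ordinary file copy. The sketch below copies files with the three extensions named on the slide (`.vmx` configuration, `.vmdk` virtual disk, `.nvram` BIOS state) from one directory to another. The function name, directory layout, and VM name `web01` are hypothetical, and a real copy would need the VM powered off or snapshotted first, which this sketch does not handle.

```python
from pathlib import Path
import shutil
import tempfile

# File extensions named on the slide.
VM_FILE_SUFFIXES = {".vmx", ".vmdk", ".nvram"}


def copy_vm_files(vm_dir: Path, dest_dir: Path) -> list:
    """Copy the files that make up one VM; return the copied names, sorted."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in sorted(vm_dir.iterdir()):
        if f.is_file() and f.suffix in VM_FILE_SUFFIXES:
            shutil.copy2(f, dest_dir / f.name)
            copied.append(f.name)
    return copied


# Demo with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    vm_dir = Path(tmp) / "primary" / "web01"
    vm_dir.mkdir(parents=True)
    for name in ("web01.vmx", "web01.vmdk", "web01.nvram", "notes.txt"):
        (vm_dir / name).write_text("placeholder")
    print(copy_vm_files(vm_dir, Path(tmp) / "recovery" / "web01"))
```

The unrelated `notes.txt` is skipped, while configuration, disk, and BIOS state travel together as one unit.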
Backup Options with VMware Reduce Backup Windows

[Diagram: backup agents inside each VM (app, OS) and in the Service Console, sending data to a backup server and tape]
In-VM: Agent in each VM
  Same architecture as physical system backup
  File-level incremental backup possible
  Any storage
In-Console: Agent in Service Console
  Simplified backup of full-disk images
  Any storage
VCB: Consolidated Backup Agent on Proxy Server
  Move backup out of VM
  Provide LAN-free backup
  Eliminate backup windows
  Requires FC SAN
  Pre-integrated with 3rd party backup products
Agenda
Failure Types
DR Challenges
Physical-Virtual; Virtual-Virtual
Replication Technologies
BC Budgeting
DR Challenges Today DR Challenges: Infrastructure and Recovery

[Diagram: Production site and DR site connected by WAN. Each site shows an application, OS, and OS files stack on x86 servers with local storage, plus storage. Labels: bound to HW; 5-10% utilized; "Boot & Pray" from CD, tape or ghost image]
Infrastructure: expensive
  OS & applications have 1:1 dependencies on hardware configuration
Slow and unreliable process
  Separate processes for system and application data
  Complex to physically recover OS, applications & data
  Tier 2 & 3 applications left unprotected, adding to Tier 1 RTO risk
Recovery Process in a Virtualized Environment

Example recovery process comparison
P-P: 40+ hrs
  Configure hardware
  Install OS
  Configure OS
  Install backup agent
  Start "single-step" automatic recovery
V-V: < 4 hrs
  Restore VM
  Power on VM
RTO of minutes to a few hours, not days to weeks!
Physical to Virtual (P-V) Recovery

P-V a viable option:
  - Server ownership issues
  - Lockdown on servers
[Diagram: Web, App, and DNS servers at the primary site go through imaging and P2V conversion, then WAN replication to the secondary site]
How:
  VMware Converter creates virtual machines matching physical machines
  Copy virtual machines to recovery site
If Rapid Recovery is required:
  Boot virtual machines on any hardware
  Start data recovery of application data if necessary
Host Based Replication - How it works

[Diagram: virtual machine disks on source storage replicated over the WAN to virtual machine disks on target storage]
1. Write data
2. Replicate data to DR site
3. Site failure
4. Boot VM using replicated data
5. Done!
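The write, replicate, fail, and boot sequence of host-based replication can be illustrated with a toy model. The class below is not any vendor's replication product: `ReplicatedDisk`, its block-dictionary storage, and the explicit `replicate()` call are assumptions made to show what the DR site holds when the primary site fails.

```python
class ReplicatedDisk:
    """Toy model of a virtual disk replicated from a primary site to a DR site."""

    def __init__(self):
        self.source = {}   # block number -> data at the primary site
        self.target = {}   # block number -> data at the DR site
        self.pending = []  # writes not yet sent over the WAN

    def write(self, block, data):
        """Write at the primary site and queue the change for replication."""
        self.source[block] = data
        self.pending.append((block, data))

    def replicate(self):
        """Ship pending writes to the target in order; return how many were sent."""
        sent = len(self.pending)
        for block, data in self.pending:
            self.target[block] = data
        self.pending.clear()
        return sent


disk = ReplicatedDisk()
disk.write(0, b"boot sector")
disk.write(1, b"orders v1")
disk.replicate()               # both writes reach the DR site
disk.write(1, b"orders v2")    # written just before the site failure
# After a primary-site failure, the VM boots from whatever the target holds.
print(disk.target[1])
```

The last write never reached the target, so the recovered VM sees the earlier version of block 1; how often replication runs determines how much recent data is at risk.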
Replication with VMware: Array-Based Replication

[Diagram: source VMFS at the PRIMARY site replicated to target VMFS at the DR SITE by array-based replication over WAN or dark fiber; site failure at the primary]
Customer Example
Result: <17 minutes to failover!
Real VMware Customer Results

Business Metric             Results
Server Utilization          4X-5X increase
Consolidation Ratio         From 2:1 up to 30:1
Server Provisioning Time    > 60% reduction
Planned Downtime            > 95% reduction
Unplanned Downtime          > 30% reduction
Time to Recovery            Down to minutes
Payback (Break-Even)        < 6 months
TCO                         30-70% reduction
Source: VMware customers surveyed post-use of VMware products.
Business Continuity Budgeting without VMware Software

[Chart: total cost vs. applications protected, with a budget line]
Business continuity implementation limited by budget
Business Continuity Budgeting with VMware Software

[Chart: total cost and total cost with virtualization vs. number of applications protected, with a budget line]
More applications (Tier 0, 1, 2) protected with the same budget
Customer Results
"Our virtual IT infrastructure will help us provide greater availability than ever before for our most critical applications."
-- Paul Poppleton, IT Manager, QUALCOMM
"Using VMware virtual infrastructure, we can offer the same levels of service and more flexibility for up to 40 percent lower server and operating system costs."
-- Rob Jones, Director of Technology, Northern Europe, ALSTOM
"We can move a virtual machine to another physical server, apply a patch, and move it back without any service interruption."
-- Jamey Vester, Member of Professional Staff, Subaru of Indiana
Agenda
The VMware Difference Rapid, Reliable, Affordable Business Continuity Products
VMware Infrastructure 3
VMware Virtual SMP
  Enables a single VM to use up to 4 physical processors simultaneously
VMware Consolidated Backup
  Centralized agentless backup for VMs
Virtual Machine File System (VMFS)
  High-performance cluster file system; allows multiple ESX Servers to access the same VM storage concurrently
VMware High Availability
  Cost-effective automatic restart of virtual machines in case of server failure
VMware VirtualCenter
  Centralized management of VM infrastructure
VMware Distributed Resource Scheduler
  Dynamic and intelligent balancing of computing resources across resource pools based on pre-defined rules
VMware Converter
  Automates conversion of physical to virtual machines (physical-to-virtual)
VMware VMotion
  Moves live, running VMs from one host to another while maintaining continuous service availability
VMware ESX Server 3.0
  Production-proven, bare-metal virtualization layer that abstracts hardware resources into multiple virtual machines (VMs)
Business Continuity: The VMware Difference
Rapid
  Hardware-independent failover and recovery for HA, DR
  Eliminate backup windows with LAN-free backup
  Rapid provisioning of systems/data; backup and replication
Affordable
  Realize early savings from consolidation
  Increase HA and DR coverage for more applications
  Fund your BC plan with hardware and operational savings
Reliable
  Zero-downtime planned maintenance
  Automatic restarts for unplanned server failure
  Frequent non-disruptive DR testing with dual use of DR site
Get Started Today

Learn
  www.vmware.com/solutions/continuity/
Plan a PoC
  AccessFlow can be engaged to assess your business continuity needs and design an appropriate roadmap to implement a robust DR solution: http://www.accessflow.com
Try
  Free evaluation download: www.vmware.com/download/vi/eval.html
