
Business Continuity and Disaster Recovery with VMware Infrastructure 3

Larry Ellison Recovery Expert AccessFlow, Inc. August 7, 2007

Virtual DR Solutions

Agenda
Importance of Business Continuity and Current Challenges
Virtualization as a BC enabler
Better Business Continuity with VMware Infrastructure
  Preventing downtime
  Protecting data and systems
  Rapid Disaster Recovery

Implementing Better Business Continuity

Agenda
Importance of Business Continuity and Current Challenges
  Business Continuity Definition
  BC Importance and Focus
  Traditional Challenges
Virtualization as a BC enabler
Better Business Continuity with VMware Infrastructure
  Preventing downtime
  Protecting data and systems
  Rapid Disaster Recovery
Implementing Better Business Continuity

Defining business continuity


What is business continuity?
Business continuity is about protecting data, systems, and services:
Preventing data loss
Minimizing planned downtime
Preventing unplanned downtime
Ensuring rapid recovery

Causes of data loss and downtime (from the slide graphic): power failure, virus infection, fire, server failure, data corruption, storage failure, user error, disk failure, OS fault, backup windows, hurricane, software failure, server maintenance

Why So Much Focus on Business Continuity?


Standards for availability are rising
Faster pace of business, more critical change
Agility a competitive advantage, demands highest service levels
Number and severity of threats increasing

1 out of 500 data centers will have a major outage each year (Disaster Recovery Journal)

43% of companies experiencing disasters never re-open, and 29% close within two years (McGladrey and Pullen)

Circuit breakers wipe out the Web
PG&E's faulty equipment reveals the Internet's vulnerability to a disruption of its power source
Verne Kopytoff, Chronicle Staff Writer
Thursday, July 26, 2007

All it took to wipe out some of the Internet's biggest sites Tuesday was some faulty PG&E electrical breakers that caused a blackout in downtown San Francisco. Some of the Web's hottest destinations - Craigslist, Yelp, Second Life - were suddenly inaccessible from San Mateo to Singapore after back-up generators failed at the facility housing their computer equipment.

Although mostly fixed within 12 hours, the incident shows how easy it is to send major swaths of the online world to the dark ages. Sites that millions of people rely on can be knocked offline by freak accidents, not to mention major catastrophes, and this event served as a wake-up call to the executives that operate them.

"If the data center was that vulnerable to a power outage, what if something really catastrophic happened like an earthquake?" asked Derek Gordon, marketing vice president for Technorati, a search engine of blogs that was brought down for a couple of hours Tuesday after the blackout. "What does that say about the vulnerability of the Internet in the Bay Area?"

The troubles started when 365 Main, a key data center in downtown San Francisco that touts its "state-of-the-art electrical system," failed to get its backup generators started immediately after the power outage hit around 1:45 p.m. A number of companies that house their computer servers in the facility were suddenly offline, setting off a mad scramble to get the Web sites up and running.

Shoppers at RedEnvelope, an online retailer, couldn't buy monogrammed pillow soaps. Hipsters on Yelp, the review site, had to take a break from sharing their reports of fabulous and not-so fabulous restaurants. Users of online classified service Craigslist were out of luck in finding a second-hand futon.

The backup generators were turned on 45 minutes after the blackout started, a delay that 365 Main said it was still investigating yesterday. But it took some of the facility's customers anywhere from another hour to 11 hours to get their servers safely rebooted and their Web sites operational.

What the episode exposed is that some companies operate entirely from one data center, a decision described by some security experts as risky. In emergencies, such companies can't shift traffic to an alternative facility where they keep additional servers.

"There's all kinds of things that can happen from a power outage to a tornado to a backhoe," said Jason Needham, director of product management at F5 Networks, a Seattle company that sells software and equipment for data centers. "All these things seem far-fetched until they happen."

However, Needham said the trend is for companies to put all their eggs in one basket, so to speak, in an effort to save money. In fact, just hours before Tuesday's power outage, 365 Main put out a press release trumpeting the fact that RedEnvelope had moved all its operations to its facility and closed an unneeded center in the Midwest.

Data centers are usually designed with redundant equipment to ensure power during outages, earthquakes and floods. Backup electricity is supposed to kick in within seconds after an outage through a complex system that keeps servers humming without interruption.

Gordon, from Technorati, called opening several data centers ruinously expensive for thinly funded Internet startups, of which there are hundreds in the Bay Area. Only profitable companies can afford such an extravagance, he said, though he acknowledged that Technorati, which isn't profitable, is in the process of moving into a second facility.

Tuesday's outage "added to the sense of urgency," Gordon said. "The lesson here is despite all of your planning and all of your promises, you are vulnerable."

This article appeared on page C - 1 of the San Francisco Chronicle

Challenges in Implementing Business Continuity


Cost
Additional hardware; identical 1:1 (Site A to Site B)
Additional tools and training

Complexity
Management and provisioning
Recovery steps: configure hardware, install OS, configure OS, install backup/restore agent, start single-step automatic recovery
Application-specific business continuity needs

Reliability
Complex solutions are hard to test
Requires specialized training for personnel

Agenda
Importance of Business Continuity and Current Challenges
Virtualization as a BC enabler
  Properties of Virtual Machines: Hardware Independence, Encapsulation, Isolation, Partitioning
Better Business Continuity with VMware Infrastructure
  Preventing downtime
  Protecting data and systems
  Ensuring rapid recovery from failures
Implementing Better Business Continuity

Business Continuity: The Killer App for Virtualization!

Press
Best Disaster Recovery Product of 2006 (TechTarget)

Customers
55% of customers using virtualization for BC/DR
85% use VMware in production; 43% set as a default policy for production servers
Source: 2006 Customer Survey (n=2265)

VMware Virtualization Basics

Before Virtualization:
Software tied to hardware
Single OS image per machine
One application workload per OS
Inflexible, costly infrastructure

After Virtualization:
Hardware-independence of OS and applications
Virtual machines can be provisioned to any system
Manage OS and application as a single unit

Virtualization Enablers for Business Continuity


Hardware Independence
Run a virtual machine on any server without modification
Eliminate need for 1:1 hardware duplication for BC
Eliminate risk of hardware configuration drift
Re-use older servers for BC-DR

Encapsulation
Encapsulate entire systems (apps, system, data) as simple files
System portability
Simplify provisioning for recovery
Simplify backup and replication
Simplify copying and cloning of systems

Copyright 2006 VMware, Inc. All rights reserved.

Virtualization Enablers for Business Continuity


Isolation
Each VM isolated from other VMs
Easier testing of a BC-DR plan
Stability and security
Utilize DR hardware for other tasks (the slide graphic shows a batch job running beside a DR test)

Partitioning
Safely run multiple VMs simultaneously on a single server
Consolidate servers
Boost utilization
Provide significant cost savings


Agenda
Importance of Business Continuity and Current Challenges
Virtualization as a BC enabler
Better Business Continuity with VMware Infrastructure
  Preventing downtime
    Elements of preventing downtime: eliminate planned downtime; reduce unplanned downtime with better fault tolerance
  Protecting data and systems
  Rapid Disaster Recovery
Implementing Better Business Continuity

Avoiding Planned Downtime Has the Biggest Impact on Business Continuity


Per studies from IBM & Sun, planned downtime is responsible for 80-90% of total system downtime

[Chart: planned vs. unplanned downtime share under the Sun and IBM estimates; planned 80-90%, unplanned 10-20%]

Eliminating planned downtime can increase system availability by a full order of magnitude
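
The order-of-magnitude claim comes down to simple arithmetic, sketched below. The 100 hours of total annual downtime is a hypothetical figure chosen only for illustration; it does not come from the IBM or Sun studies, and the function names are likewise made up for this example.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def availability(downtime_hours: float) -> float:
    """Fraction of the year the system is up."""
    return 1.0 - downtime_hours / HOURS_PER_YEAR


def remaining_downtime(total_hours: float, planned_share: float) -> float:
    """Downtime left over if all planned downtime is eliminated."""
    return total_hours * (1.0 - planned_share)


total = 100.0  # hypothetical total downtime per year, in hours
for planned_share in (0.80, 0.90):  # the two ends of the 80-90% range
    left = remaining_downtime(total, planned_share)
    print(
        f"planned share {planned_share:.0%}: "
        f"{total:.0f} h -> {left:.0f} h of downtime "
        f"({total / left:.0f}x less), "
        f"availability {availability(total):.3%} -> {availability(left):.3%}"
    )
```

Under these assumptions the reduction in downtime is about 5x at the 80% end of the range and about 10x at the 90% end, so the "full order of magnitude" corresponds to the upper estimate.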

Planned Downtime: Zero-downtime maintenance using VMware technology


Use VMware VMotion to evacuate hosts
Move running applications to other servers without disruption
Perform maintenance at any time of day
Zero downtime for hardware maintenance

Automate with DRS maintenance mode
Automates moving virtual machines to other hosts
Automates re-balancing after maintenance complete

1. Activate Maintenance Mode for physical host
2. DRS migrates running virtual machines to other hosts
3. Shut down idle host and perform maintenance
4. Restart host; DRS automatically rebalances workloads
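
The maintenance-mode sequence can be illustrated with a toy model. This is a sketch only: it does not call any VMware API, and the host names, VM names, load units, and the greedy placement rule are all assumptions made for illustration. DRS itself uses its own placement logic, which the slides do not describe.

```python
from typing import Dict

# host name -> {VM name: load units}; all names and numbers are hypothetical
Cluster = Dict[str, Dict[str, int]]


def host_load(cluster: Cluster, host: str) -> int:
    """Total load currently placed on one host."""
    return sum(cluster[host].values())


def evacuate(cluster: Cluster, host: str) -> None:
    """Maintenance mode: move every VM on `host` to the least-loaded other host."""
    others = [h for h in cluster if h != host]
    if not others:
        raise ValueError("no other host available to receive the VMs")
    # Place the largest VMs first so the result is reasonably even.
    for vm, load in sorted(cluster[host].items(), key=lambda kv: -kv[1]):
        target = min(others, key=lambda h: host_load(cluster, h))
        cluster[target][vm] = load
    cluster[host].clear()


def rebalance(cluster: Cluster) -> None:
    """After maintenance: shift VMs from the busiest host to the idlest one
    for as long as a move strictly narrows the gap between them."""
    while True:
        busiest = max(cluster, key=lambda h: host_load(cluster, h))
        idlest = min(cluster, key=lambda h: host_load(cluster, h))
        gap = host_load(cluster, busiest) - host_load(cluster, idlest)
        movable = [(load, vm) for vm, load in cluster[busiest].items() if 0 < load < gap]
        if not movable:
            return
        load, vm = min(movable)
        del cluster[busiest][vm]
        cluster[idlest][vm] = load


cluster = {"esx1": {"web": 4, "db": 6}, "esx2": {"mail": 3}, "esx3": {"batch": 2}}
evacuate(cluster, "esx1")   # host is now idle and can be shut down for maintenance
print("during maintenance:", cluster)
rebalance(cluster)          # host is back; workloads spread out again
print("after rebalancing: ", cluster)
```

The point of the model is that no VM is ever powered off: each one is moved while running, which is what removes the maintenance window.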

Unplanned Downtime: Server Failure - VMware HA


Simple, Cost effective high availability for all servers

Automatic restart of virtual machines in case of server failure

Copyright 2005 VMware, Inc. All rights reserved.

No need for dedicated stand-by hardware
None of the cost and complexity of clustering
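
A toy model of the restart behavior, assuming spare capacity in the resource pool stands in for dedicated stand-by hardware. Nothing here is a VMware API: the function name, the per-host VM-count capacity rule, and the host and VM names are hypothetical. Note that the VMs are restarted on another host, not migrated live, so each one sees a reboot.

```python
from typing import Dict, List


def restart_after_failure(
    placement: Dict[str, List[str]],  # host name -> names of VMs running on it
    capacity: Dict[str, int],         # host name -> max VM count (hypothetical sizing rule)
    failed_host: str,
) -> List[str]:
    """Restart the failed host's VMs on surviving hosts.

    Returns the VMs that could not be placed because the pool ran out of room.
    """
    orphans = placement.pop(failed_host, [])
    unplaced: List[str] = []
    for vm in orphans:
        candidates = [h for h in placement if len(placement[h]) < capacity[h]]
        if not candidates:
            unplaced.append(vm)
            continue
        # Pick the surviving host with the most free slots.
        target = max(candidates, key=lambda h: capacity[h] - len(placement[h]))
        placement[target].append(vm)
    return unplaced


placement = {"esx1": ["web", "db"], "esx2": ["mail"], "esx3": []}
capacity = {"esx1": 2, "esx2": 2, "esx3": 2}
unplaced = restart_after_failure(placement, capacity, "esx1")
print("placement after failure:", placement)
print("could not restart:", unplaced)
```

If the pool is sized too tightly, `unplaced` is non-empty, which is the trade-off behind doing without stand-by servers: the headroom has to exist somewhere in the pool.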

Unplanned Application/OS Failure: Virtual Infrastructure Makes Clustering Easier


Use the same clustering software you use today but gain:
More flexible options:
Cluster physical machines with virtual machines
Cluster virtual machines with virtual machines

Lower cost:
Cluster applications using fewer physical servers
Test cluster configurations on a single physical server

Unplanned: Protecting from Hardware Failures


Tolerate network path failures
Built-in NIC teaming
Ability to share redundant components across workloads

Tolerate storage path failures
Built-in storage multi-pathing
Share redundant storage paths among multiple virtual machines

Provides, at a lower cost, fault-tolerance equivalent to that possible with physical systems

Unplanned Downtime: Preventing downtime due to resource bottlenecks


Physical Infrastructure
Resource bottlenecks create outages
Inflexible resources
Lengthy, manual process to rebalance workloads

With VMware Infrastructure
Prevent resource bottlenecks with DRS
Automated load balancing across a pool of servers
Ability to dynamically add resources to server pool

1. Overloaded host: automatic workload balancing
2. Dynamically add resources: DRS rebalances load

Agenda
Importance of Business Continuity and Current Challenges
Virtualization as a BC enabler
Better Business Continuity with VMware Infrastructure
  Preventing downtime
  Protecting data and systems
    Keys to protecting data and systems: minimize complexity; minimize impact on services; ensure comprehensive protection
  Rapid Disaster Recovery
Implementing Better Business Continuity

Protecting data and systems with VMware Infrastructure


Virtual machines store system and data state
Entire system encapsulated in files: hardware configuration, operating system, applications, data

Impact
Systems are data
Protect system using same tools and processes used to protect data
Virtual machines are the simplest, most portable way to store system

[Diagram: a physical server becomes a virtual machine stored as .nvram, .vmx, and .vmdk files]
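
Because the whole system lives in a handful of files, a system backup can be an ordinary file copy. A minimal sketch, assuming the VM is powered off (or otherwise quiesced) so its files are consistent, and assuming all of its files sit in one directory. The function name and directory layout are hypothetical; the three file suffixes are the ones named on the slide.

```python
import shutil
from pathlib import Path
from typing import List

# File types named on the slide: BIOS state, VM configuration, virtual disk.
VM_FILE_SUFFIXES = {".nvram", ".vmx", ".vmdk"}


def backup_vm(vm_dir: Path, backup_root: Path) -> List[Path]:
    """Copy one VM's files into backup_root/<VM directory name>.

    Assumes the VM is powered off so that the copied files are consistent.
    Returns the paths of the copies that were written.
    """
    dest = backup_root / vm_dir.name
    dest.mkdir(parents=True, exist_ok=True)
    copied: List[Path] = []
    for f in sorted(vm_dir.iterdir()):
        if f.is_file() and f.suffix.lower() in VM_FILE_SUFFIXES:
            # copy2 preserves timestamps along with the file contents
            copied.append(Path(shutil.copy2(f, dest / f.name)))
    return copied
```

A call such as `backup_vm(Path("vms/web01"), Path("backups"))` would copy that VM's configuration, BIOS state, and disks in one pass, with no separate OS reinstall step needed at restore time. Both paths in that call are placeholders.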

Backup Options with VMware Reduce Backup Windows


[Diagram: three backup architectures feeding a backup server and tape: a backup agent inside each VM, a backup agent in the service console, and a backup agent on a proxy server]

In-VM: agent in each VM
Same architecture as physical system backup
File-level incremental backup possible
Any storage

In-Console: agent in Service Console
Simplified backup of full-disk images
Any storage

VCB: Consolidated Backup agent on proxy server
Move backup out of VM
Provide LAN-free backup
Eliminate backup windows
Requires FC SAN
Pre-integrated with 3rd-party backup products

Agenda
Importance of Business Continuity and Current Challenges
Virtualization as a BC enabler
Better Business Continuity with VMware Infrastructure
  Preventing downtime
  Protecting data and systems
  Rapid Disaster Recovery
    Failure Types
    DR Challenges
    Physical-Virtual; Virtual-Virtual
    Replication Technologies
    BC Budgeting
Implementing Better Business Continuity

DR Challenges Today: Infrastructure and Recovery

[Diagram: a production site and a DR site linked by a WAN, each with x86 servers, local storage, and shared storage; the application, OS, and OS files at the DR site are rebuilt from CD, tape, or ghost image ("Boot & Pray")]

Infrastructure: expensive
OS & applications have 1:1 dependencies on hardware configuration
DR site bound to hardware, 5-10% utilized

Recovery: slow and unreliable process
Separate processes for system and application data
Complex to physically recover OS, applications & data

Tier 2 & 3 applications left unprotected, adding to Tier 1 RTO risk

Recovery Process in a Virtualized Environment


Example recovery process comparison
P-P (40+ hrs): configure hardware, install OS, configure OS, install backup agent, start single-step automatic recovery
V-V (< 4 hrs): restore VM, power on VM

RTO of minutes to a few hours, not days to weeks!

Physical to Virtual (P-V) Recovery


P-V a viable option:
- Server ownership issues
- Lockdown on servers

[Diagram: Web, App, and DNS servers at the primary site are imaged and converted (P2V), then carried by WAN replication to the secondary site]

How:
VMware Converter creates virtual machines matching physical machines
Copy virtual machines to recovery site

If Rapid Recovery is required:
Boot virtual machines on any hardware
Start data recovery of application data if necessary

Host Based Replication - How it works


[Diagram: data written to source storage (virtual machine disks) is copied by host-based replication over the WAN to target storage (virtual machine disks)]

Replicate data to DR site
Site failure
Boot VM using replicated data
Done!

Replication with VMware: Array-Based Replication


[Diagram: array-based replication from the source VMFS at the primary site to the target VMFS at the DR site over WAN or dark fiber, with a site failure at the primary]

Customer Example

Result: <17 minutes to failover!

Real VMware Customer Results

Business Metric: Results
Server Utilization: 4X-5X increase
Consolidation Ratio: from 2:1 up to 30:1
Server Provisioning Time: > 60% reduction
Planned Downtime: > 95% reduction
Unplanned Downtime: > 30% reduction
Time to Recovery: down to minutes
Payback (Break-Even): < 6 months
TCO: 30-70% reduction
Source: VMware customers surveyed post-use of VMware products.

Business Continuity Budgeting without VMware Software


[Chart: cost vs. applications protected, showing total cost against the budget line]

Business continuity implementation limited by budget

Business Continuity Budgeting with VMware Software


[Chart: cost vs. number of applications protected, showing total cost, total cost with virtualization, and the budget line]

More applications (Tier 0, 1, 2) protected with the same budget

Customer Results
"Our virtual IT infrastructure will help us provide greater availability than ever before for our most critical applications."
-- Paul Poppleton, IT Manager, QUALCOMM

"Using VMware virtual infrastructure, we can offer the same levels of service and more flexibility for up to 40 percent lower server and operating system costs."
-- Rob Jones, Director of Technology, Northern Europe, ALSTOM

"We can move a virtual machine to another physical server, apply a patch, and move it back without any service interruption."
-- Jamey Vester, Member of Professional Staff, Subaru of Indiana

Agenda
Importance of Business Continuity and Current Challenges
Virtualization as a BC enabler
Better Business Continuity with VMware Infrastructure
  Preventing downtime
  Protecting data and systems
  Rapid Disaster Recovery
Implementing Better Business Continuity
  The VMware Difference: Rapid, Reliable, Affordable Business Continuity
  Products

VMware Infrastructure 3

VMware Virtual SMP
Enables a single VM to use up to 4 physical processors simultaneously

VMware Consolidated Backup
Centralized agentless backup for VMs

Virtual Machine File System (VMFS)
High-performance cluster file system. Allows multiple ESX Servers to access the same VM storage concurrently

VMware High Availability
Cost-effective automatic restart of virtual machines in case of server failure

VMware VirtualCenter
Centralized management of VM infrastructure

VMware Distributed Resource Scheduler
Dynamic and intelligent balancing of computing resources across resource pools based on pre-defined rules

VMware Converter
Automates conversion of physical to virtual machines (physical-virtual)

VMware VMotion
Moves live, running VMs from one host to another while maintaining continuous service availability

VMware ESX Server 3.0
Production-proven bare-metal virtualization layer that partitions server resources into multiple virtual machines (VMs)

Business Continuity: The VMware Difference

Rapid
Hardware-independent failover and recovery for HA, DR
Eliminate backup windows with LAN-free backup
Rapid provisioning of systems/data; backup and replication

Affordable
Realize early savings from consolidation
Increase HA and DR coverage for more applications
Fund your BC plan with hardware and operational savings

Reliable
Zero-downtime planned maintenance
Automatic restarts for unplanned server failure
Frequent non-disruptive DR testing with dual use of DR site

Get Started Today


Learn www.vmware.com/solutions/continuity/

Plan a PoC

AccessFlow can be engaged to assess your business continuity needs and design an appropriate roadmap to implement a robust DR solution http://www.accessflow.com

Try

Free evaluation download: www.vmware.com/download/vi/eval.html
