
Evaluator Guide

Site Recovery Manager 1.0


Site Recovery Manager Evaluator Guide

© 2006-2008 VMware, Inc. All rights reserved. Protected by one or more of U.S. Patent Nos. 6,397,242,
6,496,847, 6,704,925, 6,711,672, 6,725,289, 6,735,601, 6,785,886, 6,789,156, 6,795,966, 6,880,022,
6,944,699, 6,961,806, 6,961,941, 7,069,413, 7,082,598, 7,089,377, 7,111,086, 7,111,145, 7,117,481,
7,149,843, 7,155,558, 7,222,221, 7,260,815, 7,260,820, 7,269,683, 7,275,136, 7,277,998, 7,277,999,
7,278,030, 7,281,102, and 7,290,253; patents pending.
VMware, the VMware “boxes” logo and design, Virtual SMP and VMotion are registered trademarks or
trademarks of VMware, Inc. in the United States and/or other jurisdictions. All other marks and names
mentioned herein may be trademarks of their respective companies.

VMware, Inc.
3401 Hillview Ave.
Palo Alto, CA 94304
www.vmware.com

2 VMware, Inc.
CONTENTS

About This SRM 1.0 Evaluator Guide

SRM Quick Start Checklist

SRM Evaluation Checklist

Chapter 1: Overview of VMware Site Recovery Manager (SRM)


Chapter 2: Planning for BC/DR when using VMware SRM

Chapter 3: SRM Workflow setup at the Protected and Recovery sites

Chapter 4: Using SRM to run a Test against a Recovery Plan

Chapter 5: Using SRM to failover the Protected Site to the Recovery Site

Chapter 6: Failback from the Recovery Site to the Protected Site

Chapter 7: SRM Alarms and Site Status Monitoring

Chapter 8: SRM Roles and Privileges

Conclusion


About This Evaluator Guide

Intended Audience
The Site Recovery Manager (SRM) Evaluator Guide walks SRM customers and evaluators through the
SRM workflow that must be completed to allow successful, automated failover of services from the
designated SRM protected site to the designated SRM recovery site. This guide also provides an
overview, including considerations and guidance, for executing a failback of services from the
recovery site to the site that was originally designated as the SRM protected site.

To successfully use this SRM Evaluator Guide the following is assumed:

• ESX Server 3.0.2 or ESX 3.5 has been installed on physical servers in the SRM protected and
recovery sites.

• An instance of VirtualCenter 2.5 exists in each of the SRM protected and recovery sites.

• A multisite SAN infrastructure is in place and set up to replicate designated VMFS datastores
between the SRM protected and recovery sites.

• The virtual machines (VMs) that have been selected to be protected VMs for the SRM
evaluation have been moved onto the designated replicated datastores. VMs that have not
been selected to be protected VMs for the evaluation should be moved to non-replicated
datastores. If you are running ESX 3.5, Storage VMotion can be used to complete the move
with zero downtime.

It is also assumed that the following have been completed. Refer to the SRM Installation and
Administration Guide for details:

• The basic installation of Site Recovery Manager on the SRM or VirtualCenter servers in the
SRM protected and recovery sites has been completed.

• An SRM license is installed on the VirtualCenter license server at the protected and recovery
sites.

• The installation of the SRM plug-in has been completed and the SRM plug-in has been enabled
on the Virtual Infrastructure Client instances that will be used to access the SRM protected and
recovery sites.

VMware Infrastructure Documentation


If you need additional information on VMware Virtual Infrastructure, consult the VMware Infrastructure
documentation, which consists of the combined VMware VirtualCenter and ESX Server documentation
set. Documentation is available from: http://www.vmware.com/support/pubs/

Disaster Recovery (DR), Virtual Infrastructure (VI) and SRM abbreviations Used in this Guide
The following DR, VI and SRM abbreviations are used throughout this evaluator guide:

Abbreviation   Description
BC/DR          Business Continuity and Disaster Recovery
SRM            Site Recovery Manager
VC             VirtualCenter
VI Client      Virtual Infrastructure Client used to access VirtualCenter and SRM
VM             Virtual machines on a managed host
RP             Virtual Infrastructure Resource Pool
VMFS           Virtual Machine File System
SAN            Storage area network type datastore shared between managed hosts

Disaster Recovery (DR) and SRM Terminology Used in this Guide


The following DR and SRM terminology is used throughout this evaluator guide:

DR and SRM Terminology              Description
array-based replication             Replication of virtual machines that is managed and executed by
                                    the storage subsystem itself rather than from inside the virtual
                                    machines, the vmkernel or the Service Console.
logical unit number (LUN)           Refers to a single SCSI storage device on the SAN that can be
                                    mapped to one or more ESX Servers.
Failover                            Event that occurs when the recovery site takes over operation in
                                    place of the protected site after the declaration of a disaster.
Failback                            Reversal of failover, returning IT operations to the primary site.
datastore                           Storage for the managed host
Host                                VirtualCenter managed hosts
SRM Server                          Manages and monitors the SRM recovery plans
protected VM                        A VM that is protected by SRM because it is located on a
                                    replicated datastore
un-protected VM                     A VM that is not protected by SRM because it is located on a
                                    non-replicated datastore
protected site                      The site that contains the protected VMs
recovery site                       The site that contains the replicated protected VMs from the
                                    protected site
datastore group                     Replicated datastores containing complete sets of protected VMs
protection group                    A group of VMs that will be failed over together to the recovery
                                    site during test or recovery
Storage Replication Adapter (SRA)   Enables SRM to interact with a storage array
shadow VM                           An artifact in the recovery site VC inventory that represents a
                                    protected VM from the protected site VC
inventory mappings                  Associations between protected resource pools, VM folders,
                                    networks and their destination counterparts at the recovery site
recovery plan                       Contains the complete set of steps needed to recover (or test
                                    recovery of) the protected VMs in one or more protection groups

Document Feedback
VMware welcomes your suggestions for improving our documentation. If you have comments, send
your feedback to:

docfeedback@vmware.com.


Technical Support and Education Resources


The following sections describe the technical support resources available to you.

Online Support for the Site Recovery Manager


Technical support is available through the Support Request (SR) system. Go to
http://www.vmware.com/support/ and click Create Support Request.

Please note that you need a valid VMware account to open an SR. If you do not already
have an account, you will need to register for one.

Support Offerings
To find out how VMware support offerings can help meet your business needs, go to

http://www.vmware.com/support/services.

VMware Education Services


VMware courses offer extensive hands-on labs, case study examples, and course materials designed
to be used as on-the-job reference tools. For more information about VMware Education Services,
go to

http://mylearn1.vmware.com/mgrreg/index.cfm.

SRM Quick Start Checklist

Before starting the SRM installation and configuration workflows outlined in this SRM Evaluator
Guide, we recommend that you refer to the SRM Storage Release Notes for the SRM supported storage
platform that you will be using in your protected and recovery sites, and work through the checklist
there to ensure your storage platforms are ready for integration with SRM. Once you have worked
through the appropriate storage checklist, proceed to the SRM Pre-Install Checklist below.
Completing it will ensure you are ready to proceed with the setup of SRM.

SRM Pre-Install Checklist

Site        Description                                                                    Yes / No

            Using your VMware Store Account, access the URL below to download the
            SRM software, Storage Replication Adapter, and other relevant product and
            program information:
            http://www.vmware.com/download/srm/eval.html
Protected   SQL Enterprise 2005 or Oracle Database server set up and ready for use.
Protected   A database instance has been created for VirtualCenter
Protected   A database user created with 'db owner' and 'create table' privileges
Protected   A system DSN created for the VC database
Protected   VirtualCenter 2.5 server installed and ready for use.
Protected   The ability to access the VirtualCenter 2.5 server via the VI Client
Protected   At least one ESX Server 3.0.2 or ESX Server 3.5 installed and integrated
            into VirtualCenter, with access to a LUN on a SAN that has been
            configured as a VMFS datastore and set up for data replication to a
            corresponding SAN in the recovery site.
Protected   A database instance has been created for Site Recovery Manager (SRM)
Protected   A database user created with 'db owner' and 'create table' privileges
Protected   A system DSN created for the SRM database
Protected   Identify a system (physical or virtual) on which to install the SRM software
            and the Storage Replication Adapter (SRA) for your respective array
Recovery    SQL Enterprise 2005 or Oracle Database server set up and ready for use.
Recovery    A database instance has been created for VirtualCenter
Recovery    A database user created with 'db owner' and 'create table' privileges
Recovery    A system DSN created for the VC database
Recovery    VirtualCenter 2.5 server installed and ready for use.
Recovery    The ability to access the VirtualCenter 2.5 server via the VI Client
Recovery    At least one ESX Server 3.0.2 or ESX Server 3.5 installed and integrated into
            VirtualCenter, with access to a LUN on a SAN that has been configured as
            a VMFS datastore and set up for data replication to a corresponding SAN in
            the recovery site.
Recovery    A database instance has been created for Site Recovery Manager (SRM)
Recovery    A database user created with 'db owner' and 'create table' privileges
Recovery    A system DSN created for the SRM database
Recovery    Identify a system (physical or virtual) on which to install the SRM software
            and the Storage Replication Adapter (SRA) for your respective array


SRM Evaluation Checklist


To aid you with your SRM evaluation, refer to the SRM Evaluation Checklist below, which
provides a high-level summary of the SRM workflows and configuration tasks that should be
completed during your SRM evaluation.

SRM Evaluation Checklist

Site        Description                                                                    Yes / No

Protected   Connection: This involves pairing the VirtualCenter servers at the
            protected and recovery sites.
            CH 3 - Pg 15
Protected   Array Managers: SRM leverages array-based replication between a
            protected site and a recovery site. When working through the Array
            Manager configuration wizard, SRM will identify which arrays are available,
            including the datastore groups that have been set up for replication
            between the protected and recovery sites.
            CH 3 - Pg 16
Protected   Inventory Preferences: Using the Inventory Mapper wizard, the protected
            VMs now need to be mapped to the Networks, Compute Resources and
            Virtual Machine Folders that are available at the recovery site.
            CH 3 - Pg 19
Protected   Protection Groups: A protection group is a group of VMs that will be failed
            over together to the recovery site. Work through the Protection Groups
            configuration wizard to complete the Protection Group setup.
            CH 3 - Pg 20
Recovery    Recovery Plan: A recovery plan describes the steps necessary to recover
            the protected VMs in one or more protection groups. These steps can be
            predefined (e.g. Power On VM) or user-defined callouts. When a recovery
            plan is defined, the basic steps necessary to recover the protection groups
            it contains are automatically generated.
            CH 3 - Pg 25
Protected   IP Address Network Customization to allow protected VMs to start with
            the correct IP addresses and network configuration in the recovery site.
            CH 3 - Pg 25
            Note: This task may not be required for protected VMs that use DHCP to
            obtain an IP address or in environments that have a Stretched VLAN
            network topology.
Recovery    Test your Recovery Plan via the SRM TEST Recovery Plan option.
            CH 4
Recovery    Run your Recovery Plan via the SRM RUN Recovery Plan option.
            CH 5

            Note: SRM does not support an automated 'push one button' failback via
            the SRM User Interface. A failback to the original protected site is possible
            and is documented in CH 6 should you want to resync the data in the
            recovery site back to the protected site.
Recovery    Failback (Optional). Refer to CH 6 for the failback procedure, which will
Protected   involve you working closely with your storage team to complete failback.
Protected   SRM Alarms and Site Monitoring. Enable the appropriate notifications
Recovery    and alarms to stay in compliance with your documented monitoring policies.
            CH 7
Protected   SRM Roles and Privileges. Assign the appropriate SRM Roles to stay in
Recovery    compliance with your documented security policies.
            CH 8

Chapter 1: Overview of VMware Site Recovery Manager (SRM)

VMware Site Recovery Manager (SRM) provides business continuity and disaster recovery protection
for virtual environments. Protection can extend from individual replicated datastores to an entire virtual
site. VMware's virtualization of the data center offers advantages that can be applied to business
continuity and disaster recovery:
• The entire state of a virtual machine (memory, disk images, I/O and device state) is
encapsulated. Encapsulation enables the state of a virtual machine to be saved to a file. Saving
the state of a virtual machine to a file allows the transfer of an entire virtual machine to another
host.
• Hardware independence eliminates the need for a complete replication of hardware at the
recovery site. Hardware running ESX at one site can provide business continuity and disaster
recovery protection for hardware running ESX at another site. This eliminates the cost of
purchasing and maintaining a system that sits idle until disaster strikes.
• Hardware independence allows an image of the system at the protected site to boot from disk
at the recovery site in minutes or hours instead of days.
SRM leverages array-based replication between a protected site and a recovery site. The workflow that
is built into SRM automatically discovers which datastores are set up for replication between the
protected and recovery sites. SRM can be configured to support bi-directional protection between two
sites.
SRM provides protection for the operating systems and applications encapsulated by the virtual
machines running on ESX. An SRM server must be installed at the protected site and at the recovery
site. The protected and recovery sites must each be managed by their own VirtualCenter Server. The
SRM server uses the extensibility of the VirtualCenter Server to provide:
• Access control
• Authorization
• Custom events
• Event-triggered alarms
Site Recovery Manager Prerequisites
SRM has the following prerequisites:
• A VirtualCenter server installed at the protected site.
• A VirtualCenter server installed at the recovery site.
• Pre-configured array-based replication between the protected site and the recovery site.
• Network configuration that allows TCP connectivity between SRM servers and VC servers.
• An Oracle or SQL Server database that uses ODBC for connectivity in the protected site and in
the recovery site.
• An SRM license installed on the VC license server at the protected site and the recovery site.
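The TCP connectivity prerequisite can be spot-checked from the machines that will host the SRM servers before installation begins. The following is a minimal sketch and not part of SRM: the function names are hypothetical, and the port to probe is an assumption that should be taken from the SRM Installation and Administration Guide for your environment.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_site_links(endpoints, timeout=3.0):
    """Map each (host, port) pair to True/False reachability."""
    return {(host, port): can_connect(host, port, timeout)
            for host, port in endpoints}
```

For example, `check_site_links([("dr-vc-vim23.eng.vmware.com", 443)])` would probe the recovery site VC server used later in this guide; port 443 is shown only as an assumed value.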


Site Recovery Manager Configuration and Protection


Setup and configuration are accomplished by following workflows for the protected and recovery sites.
SRM is installed as a plugin into a Virtual Infrastructure Client (VI Client). SRM uses the VI Client as the
User Interface (UI). The SRM UI is accessed by clicking on the Site Recovery icon in the VI Client
toolbar and is used for the setup of the SRM workflows, recovery plan testing and services
failover from the protected site to the recovery site.
It is important to complete the workflows in the order they are presented in this guide.
The recovery site configuration workflow involves the following activities:
• The user installs the SRM server.
• The user installs the SRM plugin into the VI Client.
The protected site configuration workflow involves the following activities:
• The user installs the SRM server.
• If a different VI Client is used to access the protected and recovery sites, the user installs the
SRM plugin into the VI Client; otherwise this activity can be skipped.
• Security certificates are established between the SRM servers and the VC servers.
• The user pairs the SRM servers at the protected and recovery sites.
• SRM identifies available arrays and replicated datastores and determines the datastore groups.
The protected site protection workflow involves the following activities:
• Using the Inventory Mapper, the user maps the networks, compute resources and virtual
machine folders in the protected site to their counterparts in the recovery site.
• The user creates protection groups from the datastores discovered by SRM.
• For each protected VM, the user can override default values.
The recovery site protection workflow involves the following activities:
• The user creates the recovery plan.
• SRM creates the recovery plan steps.
• Optionally, the user can customize the recovery plan.
Failover and Testing
SRM automates many of the tasks required at failover. With the push of one button, SRM:
• powers down the protected VMs if there is connectivity between sites and they are online.
• suspends data replication and read/write enables the replica datastores.
• rescans the ESX servers at the recovery site.
• registers the replicated protected VMs.
• shuts down non-essential VMs at the recovery site if required to free up resources for the
protected VMs being failed over.
• completes power-up of replicated protected VMs in accordance with the recovery plan.
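
The ordering and the two conditional steps in the failover list above can be summarized in a few lines of code. This is an illustrative sketch only: the function name and its two flags are hypothetical, and it models the sequence described in this guide rather than calling any SRM interface.

```python
from typing import List


def failover_steps(sites_connected: bool, free_up_resources: bool) -> List[str]:
    """Return the failover steps in the order described in this guide."""
    steps = []
    if sites_connected:
        # Only possible when the protected site is still reachable.
        steps.append("power down protected VMs at the protected site")
    steps.append("suspend replication and read/write enable replica datastores")
    steps.append("rescan ESX servers at the recovery site")
    steps.append("register replicated protected VMs")
    if free_up_resources:
        # Optional: make room for the VMs being failed over.
        steps.append("shut down non-essential VMs at the recovery site")
    steps.append("power up replicated protected VMs per the recovery plan")
    return steps
```

When the protected site is unreachable and no capacity needs to be freed, the sequence reduces to the four unconditional steps.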

SRM does not require production system downtime to run tests. This means you can test often to
ensure that you are protected in case of a disaster. For testing, SRM:
• creates a test environment that includes network and storage infrastructure that is isolated from
the production environment.
• rescans the ESX servers.
• registers the replicated VMs.
• completes power-up of protected VMs in the order specified during creation of the disaster
recovery plan.
• provides a report of test results.
• resets everything in preparation for a disaster or the next scheduled SRM test.


Chapter 2: Planning for BC/DR when using VMware SRM

This chapter will provide an overview of the site planning and preparation that should be completed to
ensure the SRM protected and recovery sites are prepared for the SRM Evaluation.

Figure 2.1

Figure 2.1 represents an SRM protected site that contains 'local services' and 'protected services'.
The 'local services' are infrastructure type services (Active Directory, Print services, Virus Management
services and Security Camera services) and are generally bound to the data center. The 'protected
services' are application type services, and these are the services that need to be made available to the
business at time of test or disaster. This will be accomplished using SRM. Using the SRM protected site
depicted in Figure 2.1, we will now review the planning and preparations that should be completed to
ensure both the SRM protected and recovery sites are ready for a successful SRM deployment.

Site planning and preparation at the protected site involves the following:
• Identify which VMs will be designated as protected VMs.
    o app_vm1 through app_vm12

• Identify which VMs will be designated as un-protected VMs.
    o ad_server, print_server, security_camera_server and virus_mgt_server

• Determine the number of datastore groups that will be required to hold the protected VMs.
    o Based on the 12 VMs we have designated as protected VMs, and for the purposes of
    the SRM configuration depicted in this evaluator guide, we will require two
    datastore groups, each containing six complete VMs.

• If existing datastores will be used for the protected VMs, identify which datastores need to be
configured as datastore groups; otherwise provision the required number of new datastores to host
the protected VMs. Working with your SAN team, ensure all the datastores that will host protected
VMs are configured as datastore groups, i.e. set up for replication between the protected and
recovery sites.
    o Referring to Figure 2.2, we will require two datastore groups, shared-san-1 and
    shared-san-2, which were previously configured to allow for the replication of data to the
    recovery site. Note: The setup and configuration of SAN replication differs from
    array vendor to array vendor. If you are unsure of how to complete the necessary
    replication setup and configuration, consult your array vendor, who should be able
    to provide you with the necessary information.

• Move all the designated protected VMs onto the SRM datastore groups. Storage VMotion can
be used to complete the relocation of the protected VMs with zero service downtime. If possible,
ensure there are only protected VMs on the datastores that are being replicated from the
protected site to the recovery site. Referring to Figure 2.2, which is a VMware topology map
view, it is clear that:
    o app_vm1 through app_vm6 are hosted from datastore group shared-san-1
    o app_vm7 through app_vm12 are hosted from datastore group shared-san-2
    o The non-replicated infrastructure VMs are hosted from datastore vim22-storage1
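
The placement rule in this plan (protected VMs only on replicated datastores, un-protected VMs only on non-replicated ones) lends itself to a simple pre-flight check. The sketch below is an illustration, not an SRM feature: the VM and datastore names come from this guide, while the function name and the idea of feeding it a hand-built inventory dictionary are assumptions.

```python
# Datastores that the SAN team has set up for replication (from Figure 2.2).
REPLICATED = {"shared-san-1", "shared-san-2"}

# VMs designated as protected in this guide.
protected = {f"app_vm{i}" for i in range(1, 13)}

# VM -> datastore placement as described for Figure 2.2.
placement = {
    f"app_vm{i}": ("shared-san-1" if i <= 6 else "shared-san-2")
    for i in range(1, 13)
}
for name in ("ad_server", "print_server",
             "security_camera_server", "virus_mgt_server"):
    placement[name] = "vim22-storage1"


def placement_problems(placement, protected, replicated):
    """List VMs whose datastore does not match their protection status."""
    problems = []
    for vm, datastore in sorted(placement.items()):
        if vm in protected and datastore not in replicated:
            problems.append(f"{vm}: protected VM on non-replicated datastore {datastore}")
        elif vm not in protected and datastore in replicated:
            problems.append(f"{vm}: un-protected VM on replicated datastore {datastore}")
    return problems
```

With the placement described above, `placement_problems(placement, protected, REPLICATED)` returns an empty list; moving a protected VM to vim22-storage1 would produce one entry.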

Figure 2.2


Figure 2.2 represents a different view of the same SRM protected site depicted in Figure 2.1 which
contains 'local services' being hosted from datastore vim22-storage1 and the 'protected services'
hosted from datastore groups shared-san-1 and shared-san-2 respectively. The 'protected services'
are under the control of SRM and will be made available at time of test or disaster via SRM at the
recovery site.

Figure 2.3

Figure 2.3 represents a SRM recovery site which contains 'local services' and will also service the failed
over 'protected services' which are all the protected VMs hosted from datastore groups shared-san-1
and shared-san-2 in the SRM protected site depicted in Figure 2.2. Once again the 'local services' are
infrastructure type services (Active Directory, Print services, Virus Management services and Security
Camera services) that are bound to the recovery site data center.

Chapter 3: SRM Workflow setup at the Protected and Recovery sites

This chapter will provide an overview of the SRM workflows that have to be completed to ensure SRM
is providing BC/DR services for the designated virtual machines at time of test or during an actual event
that necessitated the declaration of a disaster.

Figure 3.1

The SRM workflows outlined below are associated with the virtual data centers vim22dc
and vim23dc depicted in Figures 2.1, 2.2 and 2.3.
Figure 3.1 shows part of the VI Client window, with the Site Recovery icon and the Setup pane
highlighted. The protected site SRM workflows are completed via the VI Client by selecting
Configure and working through the configuration wizard for each of the steps identified below:
• Connection: This involves pairing the VirtualCenter servers at the protected and recovery
sites.
The VC server in the local data center vim22dc is dr-vc-vim22.eng.vmware.com. The VC
server in the remote data center vim23dc is dr-vc-vim23.eng.vmware.com.


Figure 3.2

Once the remote VC server's information has been entered, you will be presented with the
following Connect to Remote Site window.

Figure 3.3

Once reciprocity has been established, click Close to complete the setup. You are now ready
to move on to the Array Manager configuration step.
Note: If you are using certificates that are not properly signed, the last two check marks in
Figure 3.3 may appear as 'yellow' warning triangles when 'Reciprocity is established'. The
use of certificates that are not properly signed will not prevent you from moving on to the next
SRM configuration step, which involves configuring the Array Managers.
• Array Managers: SRM leverages array-based replication between a protected site and a
recovery site. When working through the Array Manager configuration wizard, SRM will identify
which arrays are available, including the datastore groups that have been set up for replication
between the protected and recovery sites.

Virtual machines reside on VMFS datastores, which are created on LUNs that reside on the storage
arrays. SRM uses the term "datastore group" to identify one or more replicated datastores that
protect virtual machines. If SRM detects a virtual machine spanning more than one datastore (i.e. a VM
that has two virtual disks, one on each datastore), then to allow the whole VM to be failed over, the SRM
datastore group *must* contain both datastores, and SRM will enforce this. This is covered further
below. In SRM a datastore group is the basic unit of replication.

[Figure: SRM protected site storage array. Datastore A (replicated LUN A) forms Datastore Group 1.
Datastore B resides on non-replicated LUN B. Datastore C (replicated LUN C) and Datastore DE
(replicated LUNs D and E) form Datastore Group 2. Each VM is shown as a .vmx file and a disk on
VMFS. VM 1 is protected by SRM; VM 2 is not protected by SRM; VM 3, VM 4 and VM 5 are protected
by SRM.]

Figure 3.4

In Figure 3.4, VM1 is contained in Replicated LUN A (Datastore Group 1), so VM1 is a protected
VM. VM2 is contained on a non-replicated LUN and therefore VM2 is not protected by SRM.
Datastore Group 2 consists of Datastore C and Datastore DE, which contain protected VMs 3,
4, and 5. It is worth noting that even though VM4 spans two datastores, it is completely
contained within Datastore Group 2 and as a result is fully protected by SRM. In SRM an entire
VM needs to be located within a datastore group, which has a one-to-one mapping to a
protection group. Datastore groups are automatically discovered by SRM; a datastore group is
defined by the configuration of the virtual machines selected for protection by SRM. SRM
protection groups will be covered at a later stage in this guide.
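
One way to picture how a VM that spans datastores ties those datastores into a single group is as a connected-components problem. The sketch below uses a small union-find over replicated datastores. It is an illustration under stated assumptions, not SRM's actual discovery logic, which this guide does not describe: the function name is hypothetical, and the exact placement of VM3 and VM5 is assumed (the guide only states that VM4 spans Datastore C and Datastore DE).

```python
def datastore_groups(vm_datastores, replicated):
    """Group replicated datastores that share at least one spanning VM."""
    parent = {ds: ds for ds in replicated}

    def find(ds):
        # Walk to the root, halving the path as we go.
        while parent[ds] != ds:
            parent[ds] = parent[parent[ds]]
            ds = parent[ds]
        return ds

    for stores in vm_datastores.values():
        # Only replicated datastores take part in grouping; a VM with files on
        # a non-replicated datastore could not be fully protected anyway.
        stores = [ds for ds in stores if ds in parent]
        for other in stores[1:]:
            parent[find(other)] = find(stores[0])

    groups = {}
    for ds in replicated:
        groups.setdefault(find(ds), set()).add(ds)
    return sorted(sorted(group) for group in groups.values())


# Layout loosely following Figure 3.4 (VM3/VM5 placement is assumed).
figure_3_4 = {
    "VM1": ["Datastore A"],
    "VM2": ["Datastore B"],          # non-replicated, so not grouped
    "VM3": ["Datastore C"],
    "VM4": ["Datastore C", "Datastore DE"],
    "VM5": ["Datastore DE"],
}
replicated = {"Datastore A", "Datastore C", "Datastore DE"}
```

Calling `datastore_groups(figure_3_4, replicated)` yields two groups, one holding Datastore A alone and one holding Datastore C together with Datastore DE, matching Datastore Group 1 and Datastore Group 2 in the figure.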
It is worth noting at this time that VMware is actively working with our storage partners, who are
responsible for the development of their own Storage Replication Adapters (SRAs), which will
enable their storage arrays to integrate with SRM. For this reason VMware anticipates that the list of
storage Manager Types will become more extensive over time as the storage partners complete
work on their respective SRAs.


If you do not see a Manager Type for a storage array in your environment that you wish to
integrate with SRM, VMware strongly urges you to follow up directly with the storage vendor
in question to enquire about the availability of their SRM storage replication adapter.
VMware is not in a position to comment on the availability of products currently
under development by our storage partners.
Working through the Array Manager configuration wizard will take you to the Add Array
Manager window depicted in Figure 3.5. Select the correct Manager Type for the SAN in your
environment.

Figure 3.5

Once you have selected the correct Manager Type from the drop-down box, complete the entry
of all the appropriate information within the Array Manager Information section and click
Connect to start the SRM Discover Storage Array process, which will run for several minutes.
The Array Manager configuration wizard will walk you through the configuration for the
Protection Side Array Managers and Recovery Side Array Managers.
Figure 3.6 is a consolidated view of the three Configure Array Managers windows to show
an example of the information that will be presented at the end of the SRM Discover Storage
Array process for the protected and recovery sites. The information shown below is for the
Virtual Data Centers vim22dc and vim23dc. At this stage you are ready to move on to the
Inventory Preferences configuration step.

Figure 3.6

• Inventory Preferences: Using the Inventory Mapper wizard, the protected VMs now need to
be mapped to the Networks, Compute Resources and Virtual Machine Folders that are
available at the recovery site. The mappings are completed via the Inventory Preferences
pane below by selecting the respective Primary Site Resources category, clicking
Edit and working through the Inventory Mapper wizard for that resource category.


Figure 3.7
In Figure 3.7 the Inventory Preferences pane shows the configured Inventory Preferences for
the protected VMs in the virtual data center vim22dc and how they are mapped to the appropriate
resources in the virtual data center vim23dc which is the designated recovery site.
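
Conceptually, inventory mappings are three lookup tables (networks, compute resources, VM folders) from protected site names to recovery site names, and a protected VM is ready only when every resource it uses has an entry. The sketch below illustrates that idea. All resource names in it are hypothetical placeholders, the function name is an assumption, and it does not reflect how SRM stores its mappings internally.

```python
from typing import Dict, List


def unmapped_resources(vm_resources: Dict[str, Dict[str, str]],
                       mappings: Dict[str, Dict[str, str]]) -> List[str]:
    """Report each VM resource that has no recovery site counterpart."""
    missing = []
    for vm, resources in sorted(vm_resources.items()):
        for kind, name in sorted(resources.items()):
            if name not in mappings.get(kind, {}):
                missing.append(f"{vm}: no {kind} mapping for '{name}'")
    return missing


# Hypothetical mappings from vim22dc resources to vim23dc resources.
mappings = {
    "network": {"prod-net": "dr-net"},
    "compute": {"prod-pool": "dr-pool"},
    "folder": {"apps": "recovered-apps"},
}

# Hypothetical resource usage of two protected VMs.
vm_resources = {
    "app_vm1": {"network": "prod-net", "compute": "prod-pool", "folder": "apps"},
    "app_vm2": {"network": "test-net", "compute": "prod-pool", "folder": "apps"},
}
```

Here `unmapped_resources(vm_resources, mappings)` flags only app_vm2, whose network has no counterpart defined at the recovery site.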

• Protection Groups: A protection group is a group of VMs that will be failed over together to the
recovery site. A protection group is associated with a single datastore group. A datastore group
can contain a single datastore or multiple datastores, as illustrated in Figure 3.4.
Working through the Protection Groups configuration wizard, you will get to the window shown
in Figure 3.8. During the creation of the Protection Groups, SRM requires a location to store
some temporary VirtualCenter inventory files for the protected VMs. SRM will present the
available datastores at the recovery site that can be selected for storing these
temporary files. We suggest that you select a non-replicated datastore at the recovery site
for these temporary files.

Figure 3.8

Select the datastore that will store the temporary virtual machine files and click Next which will
take you to the next Create Protection Group window shown in Figure 3.9. You will now be
presented with the list of all the protected VMs that will be assigned to the Protection Group
currently being created.

Figure 3.9

Figure 3.9 shows the six protected VMs assigned to the first protection group called
'Protection Group 1' which was created in the virtual data center vim22dc.


Figure 3.10

Figure 3.10 shows the six protected VMs assigned to the second protection group called
'Protection Group 2' which was created in the virtual data center vim22dc.
The creation of the protection groups completes the SRM workflow activities for the protected site. To
recap, we have worked through the following SRM workflow activities so far:
• Connection: This involved pairing the VirtualCenter servers at the protected and recovery
sites.
• Array Managers: SRM leverages array-based replication between a protected site and a
recovery site. The integration of array-based replication with SRM is achieved by selecting the
correct Manager Type for the SAN in your protected and recovery sites.
• Inventory Preferences: Using the Inventory Mapper, the protected VMs are mapped to the
Networks, Compute Resources and Virtual Machine Folders that are available at the
recovery site.
• Protection Groups: A protection group is a group of VMs that will be failed over together to the
recovery site. The creation of a protection group results in VC inventory updates in the recovery
site.

Figure 3.11 shows the view of the recovery site virtual data center vim23dc. Note that once
the protection groups were created, the virtual infrastructure inventory in the recovery site was
automatically updated with new inventory objects. The first new inventory object is the nested resource
pool called recovery under the top level resource pool called shared. The remaining new inventory
objects are the protected VMs from the SRM protected site.

Figure 3.11

The remaining workflow activity, which is the creation of the SRM Recovery Plan and any subsequent
customizations to the recovery plan, is completed via the VI Client connected to the VC server in the
designated recovery site.

The recovery site protection workflow involves the following activity:
• Building a Recovery Plan. A recovery plan describes the steps necessary to recover the
protected VMs in one or more protection groups. These steps can be predefined (e.g. Power
On VM) or user-defined callouts. When a recovery plan is created for one or more protection
groups, it is automatically populated with the basic steps needed to fail over the protected VMs
to the recovery site. You can then customize these steps by re-ordering existing steps or by
adding new steps and callouts.
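
The idea of an automatically populated plan that is then customized can be sketched as an ordered list of steps. This is only a conceptual model in Python: the `RecoveryPlan` class, its methods and the step descriptions are hypothetical and do not correspond to any SRM interface.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryPlan:
    name: str
    steps: list = field(default_factory=list)

    def add_step(self, description, position=None):
        """Insert a step at a 0-based position, or append when position is None."""
        if position is None:
            self.steps.append(description)
        else:
            self.steps.insert(position, description)

    def move_step(self, old_index, new_index):
        """Re-order an existing step."""
        self.steps.insert(new_index, self.steps.pop(old_index))


# Hypothetical starting point: two generated power-on steps.
plan = RecoveryPlan("Recovery Plan 2", ["Power On app_vm7", "Power On app_vm8"])
plan.add_step("Callout: notify application owners", position=0)  # user-defined callout
plan.move_step(2, 1)  # power on app_vm8 before app_vm7
print(plan.steps)
```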


Figure 3.12 shows the recovery steps from an SRM recovery plan called ‘Recovery Plan 2 –
Protection Group 2’ required to complete a partial site failover for the local data center
vim22dc which is protected by SRM. The protected VMs that will be failed over are app_vm7
through to app_vm12 from Protection Group 2 which is associated to the datastore group
shared-san-2.

Figure 3.12

In Figure 3.13 the VI Client lists three Recovery Plans that were created by working through the
Recovery Plan wizard. To create a new Recovery Plan, click on the Add button on the toolbar
or Add Recovery Plan under the Commands section and work through the Recovery Plan
wizard.

Figure 3.13

The following section will illustrate one of the supported ways to customize settings associated with a
protected VM. By working through the steps that follow you will be able to customize the network
configuration settings for a protected VM which will allow the protected VM to start up at the recovery
site after an actual failover with an IP address that is correct for the network in the recovery site.

Referring to Figure 3.14, work from the VI Client that is connected to the recovery site. From the
Edit menu, click on Customization Specifications and work through the wizard that follows.

Figure 3.14

A Customization Specification Manager window opens. Click on New and complete the
information requested by the wizard. Ensure you select the correct Target Virtual Machine OS
and provide a name for the virtual machine customization profile in the Customization Specification
Information as shown in Figure 3.15.


Figure 3.15

Click Next to continue to the window below which is depicted by Figure 3.16.

Figure 3.16

At this point it is worth noting that the only information SRM will apply to the protected VM you wish to
customize is the Network information. For the SRM 1.0 release you may still need to provide
information for all the virtual machine properties highlighted in Figure 3.16 by the two ‘red boxes’,
because the Next button only becomes active once you enter text into the property field currently
being displayed.

Figure 3.17

When you get to the Network properties section as depicted in Figure 3.17, please be sure to
click on the Custom Settings radio button, and then click on the Next button to proceed to the
Network Custom Settings window shown in Figure 3.18.

Figure 3.18


Referring to Figure 3.18, highlight the NIC 1 and click on the Customize button which will take you to
the standard Network Properties window below which is shown in Figure 3.19. Complete entering all
the relevant network configuration information you wish to assign to the protected VM when it is started
in the recovery site and click OK.
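
Before typing static network values into the Network Properties window, it can help to sanity-check them. The sketch below uses Python's standard `ipaddress` module; the function name and the sample addresses are hypothetical, and the check is independent of SRM and the VI Client.

```python
import ipaddress


def check_nic_settings(ip, netmask, gateway):
    """Return a list of problems found in a static IPv4 configuration."""
    problems = []
    # strict=False lets us derive the subnet from a host address plus netmask.
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    address = ipaddress.ip_address(ip)
    if address in (network.network_address, network.broadcast_address):
        problems.append("IP address is the network or broadcast address")
    if ipaddress.ip_address(gateway) not in network:
        problems.append("gateway is outside the VM's subnet")
    return problems


# Hypothetical recovery-site values for app_vm12.
print(check_nic_settings("192.168.23.112", "255.255.255.0", "192.168.23.1"))
print(check_nic_settings("192.168.23.112", "255.255.255.0", "192.168.22.1"))
```

An empty list means no problem was detected by these two checks; it does not prove the address is free or routable at the recovery site.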

Figure 3.19

Complete working through the Customization Specifications wizard which will bring you to the
Customization Specification Manager window depicted in Figure 3.20.

Figure 3.20

The steps that follow outline how to apply the network customization profile to a protected VM, in this
case app_vm12. This is completed via the VI Client that is connected to the protected site. Select the
protection group that contains the virtual machine you wish to customize. In this case we want to
customize app_vm12, which is associated to Protection Group 2 as indicated by Figure 3.21. Click on
Configure Protection, which will launch the Recovery Launch wizard.

Figure 3.21

Working through the Recovery Launch wizard, you will get to the Virtual Machine Customization
window that is shown in Figure 3.22. From the drop down box select the virtual machine customization
profile you wish to assign to the protected VM, in this case customization_appvm12, and click Next to
proceed to the last two customizations, Before Power On and After Power On, which conclude
the wizard.

Figure 3.22


Once the wizard closes you will be taken back to the VI Client view
depicted in Figure 3.23. Under the Recent Tasks pane you should see an entry which serves as
confirmation that the customization profile customization_appvm12 was successfully applied to
app_vm12. The protected VM app_vm12 has now been customized and will start up with the network
configuration information assigned to the customization profile customization_appvm12.

Figure 3.23

Chapter 4: Using SRM to run a Test against a Recovery Plan

This chapter provides an overview of how SRM enables you to ‘Test’ a recovery plan by simulating a
failover of virtual machines from the protected site to the recovery site. The benefit of using SRM to run
a failover simulation against a recovery plan is that it allows you to confirm that the recovery plan has
been set up correctly for the protected VMs. You will be able to confirm that the protected VMs start up in
the correct order, taking into account the various application service dependencies for the protected
VMs in your environment.
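
One way to reason about a start-up order that respects service dependencies is a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The dependency map is invented for illustration; SRM itself orders VMs according to the recovery plan, not according to this code.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical dependencies: each VM maps to the VMs that must be up before it.
depends_on = {
    "app_vm8": {"app_vm7"},
    "app_vm9": {"app_vm7"},
    "app_vm10": {"app_vm8", "app_vm9"},
}

# static_order() yields every VM after all of its prerequisites.
order = list(TopologicalSorter(depends_on).static_order())
print(order)
```

Comparing such a computed order with the power-on order observed during a ‘Test’ run is one way to check that the plan matches the intended dependencies.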

When you select the option to ‘Test’ a recovery plan via SRM, the simulated
failover is executed in an isolated environment that includes network and storage infrastructure at the
recovery site that is isolated from the protected site (production environment). This ensures the
protected VMs at the protected site are not subject to any kind of service interruption during the testing
of the recovery plan. SRM will also create a test report that can be used to demonstrate your level of
preparedness to the business or individual business units whose services are being protected by SRM,
as well as to auditors and compliance officers if required.

The simulated failover completes by resetting the environment to be ready for the next event which
could be another simulated failover, or an actual failover for a scheduled BC/DR test or in response to
an event which resulted in the business declaring a disaster.

Figure 4.1


We will now work through a simulated failover using the SRM ‘Test’ a recovery plan option.
In Figure 4.1 the VI Client lists the three Recovery Plans that were created by working through the
Recovery Plan wizard. There are two ways to initiate the simulated failover: click on the
‘Test’ button in the toolbar, or click on the ‘Test Recovery Plan’ link under the Commands section. Both
are highlighted in Figure 4.1. Before the simulated failover is started you will be presented with a
dialog box (Figure 4.2) that informs you that the performance of local virtual machines may be impacted if
there are insufficient compute resources at the recovery site to support both the local virtual machines and
the protected VMs. The dialog box also informs you that the replication of the datastore groups may be
suspended during the simulated failover. Click ‘Yes’ to start the ‘Test’ of your recovery plan.

Figure 4.2

While the simulated failover test is running, the status of each step that makes up the recovery plan can
be monitored on the Recovery Steps tab in the VI Client, which shows which steps are
currently Running and which steps have completed with a Success status. Some steps in a recovery
plan are only executed during a simulated test; these steps are identified by ‘Test Only’ under the
Mode column. Other steps are only executed during an actual failover; these steps are identified by
‘Recovery only’ under the Mode column.
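
The effect of the Mode column can be pictured as a filter over the plan's steps. In the Python sketch below the `Step` record, the "Both" label for unrestricted steps and the step descriptions are all hypothetical; only the ‘Test Only’ and ‘Recovery only’ labels come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    description: str
    mode: str = "Both"  # hypothetical label for steps with no Mode restriction


def steps_for(steps, run_type):
    """Return the descriptions executed for run_type, either 'test' or 'recovery'."""
    skipped = {"test": "recovery only", "recovery": "test only"}[run_type]
    return [s.description for s in steps if s.mode.lower() != skipped]


# Invented example plan.
plan = [
    Step("Prepare storage"),
    Step("Create isolated test network", "Test Only"),
    Step("Shut down protected VMs at protected site", "Recovery only"),
    Step("Power on protected VMs"),
]

print(steps_for(plan, "test"))
print(steps_for(plan, "recovery"))
```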

Figure 4.3

Figure 4.4 shows a partial view of the recovery site’s VI Client window. SRM provides an audit trail via a
report which is generated automatically at the end of each SRM Test or SRM Recovery. The reports
are accessible via the History tab and can be viewed by clicking on the View link under the Actions
column, which will result in a browser window opening that contains a log of the steps executed during
the test, with the total time to execute the recovery plan and the time it took to execute each step in the
recovery plan.
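
If you copy the per-step times from a report into a script, the total and the slowest step are easy to derive. The Python sketch below assumes you have transcribed the values by hand; the step names and durations are made-up sample values, not measurements, and no SRM report format is parsed here.

```python
from datetime import timedelta

# Made-up sample values transcribed from a hypothetical report.
step_times = [
    ("Prepare storage", timedelta(seconds=95)),
    ("Rescan ESX servers", timedelta(seconds=40)),
    ("Power on protected VMs", timedelta(seconds=310)),
    ("Reset environment", timedelta(seconds=25)),
]

# Sum with a timedelta start value so the result is also a timedelta.
total = sum((duration for _, duration in step_times), timedelta())
slowest = max(step_times, key=lambda item: item[1])

print("Total:", total)
print("Slowest step:", slowest[0], slowest[1])
```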

Figure 4.4

The following is a recap of the high-level tasks executed by SRM when performing a simulated failover
via the ‘Test’ a recovery plan option that is available via the SRM enabled VI Client. With the push of one
button, SRM:
• creates a test environment that includes network and storage infrastructure that is isolated from
the production environment.
• rescans the ESX servers.
• registers the replicated VMs.
• completes power-up of protected VMs in the order specified during creation of the disaster
recovery plan.
• provides a report of test results.
• resets everything in preparation for a disaster or the next scheduled SRM Test.


Chapter 5: Using SRM to failover the Protected Site to the Recovery Site

This chapter will provide an overview of how SRM enables you to ‘Run’ a recovery plan which will result
in the actual failover of virtual machines from the protected site; the failover process via SRM is rapid,
repeatable, reliable, manageable and auditable.

Figure 5.1

We will now work through an actual failover using the SRM ‘Run’ a recovery plan option.
In Figure 5.1 the VI Client lists the three Recovery Plans that were created by working through the
Recovery Plan wizard. There are two ways to initiate the actual failover: click on the
‘Run’ button, or click on the ‘Execute Recovery Plan’ link under the Commands section. Both are
highlighted in Figure 5.1.

The Run Recovery Plan dialog box represented by Figure 5.2 warns you that you are about to run
a recovery plan which will result in changes to the protected virtual machines and the infrastructure of
both the protected and recovery site datacenters. Click the radio button to confirm you understand the
implications of running your recovery plan and then click on the Run Recovery Plan button that is
highlighted in Figure 5.2 to start the failover of protected VMs from the protected site to the recovery
site.

The Run Recovery Plan dialog box also provides a summary of the Recovery Plan Information, that
includes the Recovery Plan that is going to be run, along with the names of the protected and recovery
sites, the number of protected VMs that will be failed over as well as a connectivity status from the
recovery site back to the protected site.

Figure 5.2

While the failover is being executed, the status of each step that makes up the recovery plan can be
monitored on the Recovery Steps tab (highlighted in Figure 5.1) of the recovery site’s VI Client, which
shows which steps are currently Running and which steps have completed with a Success
status. As noted in the previous chapter, some steps in a recovery plan are only
executed during a simulated test; these steps are identified by ‘Test Only’ under the Mode column.
Other steps are only executed during an actual failover; these steps are identified
by ‘Recovery only’ under the Mode column.

Once all the protected VMs have been failed over and are reported to be powered on, which can be
confirmed in several places from within the VI Client, you are ready to start validating that all application
services restarted cleanly at the recovery site. In this case we are referring to the protected VMs
app_vm7 through app_vm12 from Protection Group 2, which is associated to the datastore group
shared-san-2. Once you have completed the validation of the failed over application services at the
recovery site you are in a position to report the successful failover to the business and allow the
respective business users to access the application services which are now being hosted out of the
recovery site.

Figure 5.3 shows a VMware topology map view. It shows that app_vm7 through app_vm12
were successfully failed over to the recovery site and that they are being hosted from a datastore
connected to a host in the recovery site. The non-replicated infrastructure VMs (ad_server,
print_server, security_camera_server and virus_mgt_server) are hosted from datastore vim23-storage1
in the recovery site.

Note: SRM will automatically perform a re-signature of the replicated datastore in the recovery site, which
means LVM.EnableResignature will be set to 1 on the ESX hosts that have access to the replicated
datastores in the recovery site. The re-signature that is initiated by SRM will result in the replicated
datastores being presented with a prefix of snap-0000000X- where X is a number. This is evident in
Figure 5.3 which shows the replicated datastore presented as snap-000000020-shared-san-2.
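
When reconciling inventory lists after a failover, it can be handy to map a re-signatured datastore name back to its original name. The Python sketch below assumes the prefix is always the literal text snap-, followed by decimal digits and a hyphen, as in the example above; that pattern and the function name are assumptions made for illustration.

```python
import re

# Assumed pattern: "snap-", one or more digits, "-", then the original name.
SNAP_PREFIX = re.compile(r"^snap-(\d+)-(.+)$")


def original_datastore_name(name):
    """Strip a re-signature prefix if present; otherwise return the name unchanged."""
    match = SNAP_PREFIX.match(name)
    return match.group(2) if match else name


print(original_datastore_name("snap-000000020-shared-san-2"))
print(original_datastore_name("vim23-storage1"))
```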


Figure 5.3

As pointed out in the previous chapter, SRM will automatically generate a report. In this instance the
report is for an SRM ‘Run’ operation against the recovery plan we selected. The report is accessible via
the History tab and can be viewed by clicking on the View link under the Actions column.

The steps to fail back services from the recovery site to the protected site once the disaster event
is over are outlined in the next chapter of this guide.

The following is a recap of the high-level tasks executed by SRM when performing a failover of virtual
machines from the protected site to the recovery site via the ‘Run’ a recovery plan option that is available
via the SRM enabled VI Client. SRM automates many of the tasks required at time of failover. With the
push of one button, SRM:
• powers down the protected VMs if there is connectivity between sites and they are online.
• suspends data replication and Read/Write enables the replica datastores.
• rescans the ESX servers at the recovery site.
• registers the replicated protected VMs.
• shuts down non-essential VMs at the recovery site if required to free up resources for the
protected VMs being failed over.
• completes power-up of replicated protected VMs in accordance with the recovery plan.

Chapter 6: Failback from the Recovery Site to the Protected Site

Although not included as an automated procedure in SRM 1.0 this chapter will provide an overview
which includes the considerations and guidance to execute a failback of services from the recovery site
back to the site that was originally designated as the protected site or to a new site due to a
catastrophic disaster that has destroyed the original site.

For the purpose of this SRM evaluator guide we will consider the failback scenario outlined below:

Failback Scenario: Failback of application services from the recovery site back to the protected site
after a scheduled BC/DR test which was conducted using SRM to perform the Recovery ‘actual
failover’ and not a ‘test failover’ of services from the protected site to the recovery site.

With this scenario we will assume the infrastructure at the protected site has not changed during the BC/DR
test and that SRM completed a successful recovery of the protected VMs required for the BC/DR test to
the recovery site. The protected VMs from the protected site that were failed over for the BC/DR test
were confirmed to have been shut down by SRM in the protected site. The business has a requirement
to restore all data generated during the BC/DR test at the recovery site back to the protected site.

To aid in the explanation of the failback steps that will follow we will use the abbreviations listed in the
following table.

Abbreviation   Description
Site A         original protected site
Site B         original recovery site
PG 1           original protection group that was defined at Site A
RP 1           original recovery plan that was defined at Site B
PG 2           new protection group that is defined at Site B to facilitate the
               failback from Site B via SRM back to Site A
RP 2           new recovery plan that is defined at Site A to facilitate the
               failback from Site B via SRM back to Site A
PG 3           new protection group that is defined at Site A to facilitate the
               failover to Site B via SRM. Note: this protection group is
               basically the same protection group that was defined for PG 1
Source LUN     VMFS Datastore being replicated to alternate data center
Target LUN     Replicated Datastore in the alternate data center
Clone LUN      A clone of the Target LUN, used during a ‘Test’ failover only

Note: Failback in SRM 1.0 is a multi-step procedure that is for the most part a manual process.
However, SRM can be leveraged to provide some automation during the failback as outlined in
Steps 1 through 20 below, when working in conjunction with your storage team to complete the
necessary storage configuration work (storage personality swap) outlined in Step 7 and Step 17.

Before starting with Step 1, you will need to set LVM.DisallowSnapshotLun = 0 on all the ESX
hosts in the protected site that are zoned to the LUNs which have been assigned for SRM protected
virtual machines and configured for replication between the protected and recovery site. This
one-time operation is required to ensure the ESX hosts in the protected site are able to access the
replicated datastores after the storage personality swap has occurred.
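
A simple way to track this prerequisite is to record the current value per host and list the hosts that still need the change. The Python sketch below works on a hand-maintained dictionary; the host names are hypothetical and the code does not query or modify any ESX host.

```python
def hosts_needing_change(host_settings, option="LVM.DisallowSnapshotLun", wanted=0):
    """Return the sorted names of hosts whose recorded value differs from wanted.

    A host with no recorded value for the option is also reported.
    """
    return sorted(
        host for host, options in host_settings.items()
        if options.get(option) != wanted
    )


# Hypothetical protected-site hosts with values noted down by an administrator.
protected_site_hosts = {
    "esx-a1": {"LVM.DisallowSnapshotLun": 1},
    "esx-a2": {"LVM.DisallowSnapshotLun": 0},
    "esx-a3": {},
}

print(hosts_needing_change(protected_site_hosts))
```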

Steps 1 through 20 outline the steps that need to be completed for a successful failback from
Site B to Site A as well as the steps to complete the re-protection of Site A after the failback
from Site B.


Storage configuration after running a Recovery in SRM (Actual Failover)
from Site A to Site B for datastore ‘shared-san-2’

Data Replication is suspended. Source LUN in Site A with Target LUN in Site B.

                Site A - Protected Site        Site B - Recovery Site
LUN             Source LUN (shared-san-2)      Target LUN (shared-san-2)
Access          Write Disabled (read only)     Read Write Enabled
Protected VMs   app_vm7 to app_vm12            app_vm7 to app_vm12
VM state        All powered off by SRM at      All powered on by SRM during
                start of SRM Recovery          the SRM Recovery

Note: Datastore ‘shared-san-1’ will be in the same configuration state as ‘shared-san-2’.
A Clone LUN is not used during a Recovery in SRM.

Figure 6.1

1. Ensure all users that were involved with the BC/DR test have completed their test
scripts and are no longer accessing any of the protected VMs that were recovered from
Site A for the BC/DR test.
2. Shut down all of the protected VMs that were recovered to Site B for the BC/DR test.
3. Ensure you have created a list of all the protected VMs that were recovered to Site B.
4. Perform a cleanup of the directory in Site B that contains the VM configuration files
created during protection group creation in Site A (this is the location selected during
the creation of the original protection groups in Site A, Protection Group 1 and
Protection Group 2; refer to Figure 3.8). Refer to Figure 6.2 for an example of the
placeholder VM configuration file information that was written to vim23-storage1
during the creation of the protection groups in Site A. You can use the list you created
in Step 3 as a reference during this cleanup step.

Figure 6.2

5. Connect to the VC instance in Site A and delete PG 1 (Protection Group 1 and
Protection Group 2). Refer to Figure 6.3: a protection group can be removed by right
clicking on the protection group in the right hand side Site Recovery pane.

Figure 6.3

6. Connect to the VC instance in Site A and perform a remove from inventory operation
on all the protected VMs in Site A that were recovered to Site B. In Figure 6.4 all the
protected VMs in Site A have been selected. Right click on the selected VMs to open
the menu and click on Remove from Inventory, which will
remove all the highlighted protected VMs from the VC inventory in Site A.

Figure 6.4


7. Work with your Storage team to complete a storage configuration change ‘personality
swap’ whereby the Source LUN is now associated with Site B and the Target LUN is
associated with Site A, as depicted in Figure 6.5. Rescan the ESX servers at the
protected and recovery site to ensure they become aware of the underlying storage
changes.

Storage configuration prior to running a Recovery in SRM from Site B to
Site A (Failback) after a storage configuration change ‘personality swap’

Data Replication is now configured from Site B to Site A. Source LUN is in Site B with Target LUN in Site A.

                Site A - Recovery Site         Site B - Protected Site
LUN             Target LUN (shared-san-2)      Source LUN (shared-san-2)
Access          Write Disabled (read only)     Read Write Enabled
Protected VMs   app_vm7 to app_vm12            app_vm7 to app_vm12
VM state        Protected VMs offline          Protected VMs that will be
                in Site A                      recovered to Site A

Note: A Clone LUN is not configured in Site A and will not be used during
the Failback from Site B back to Site A.

Figure 6.5

8. Complete the Array Manager configuration wizard from Site B. After the storage
configuration work is completed in Step 7, the Source LUN is now assigned to Site B
and the Target LUN is assigned to Site A as depicted in Figure 6.6.

Figure 6.6

9. Configure the Inventory Preferences in Site B. These inventory preferences will be
assigned to the protected VMs when they are restarted in Site A after the failback.
Figure 6.7 shows the inventory preferences that will be mapped to the protected VMs in
Site A after the failback from Site B.

Figure 6.7

10. Connect to the VC instance in Site B and configure PG 2 (Failback Protection Group
1 and Failback Protection Group 2) in Site B as depicted in Figure 6.8 for the
protected VMs you wish to failback to Site A.

Figure 6.8

11. Connect to the VC instance in Site A and configure RP 2 in Site A. Note: You should
not delete RP 1 (Recovery Plan 3 – Complete Site Failover, refer to Figure 6.8) that
was created in Site B during the initial SRM workflows that were completed to protect
the designated VMs in Site A. Refer to Figure 6.9, which shows RP 2 (Failback
Recovery Plan 3) as created in Site A.

Figure 6.9

12. Using SRM complete the Failback of the original protected VMs back to Site A. This is
accomplished by performing a Recovery against RP 2 (Failback Recovery Plan 3)
shown in Figure 6.9. Figure 6.10 depicts the storage configuration once the SRM
Recovery against RP 2 completes.

Storage configuration after running a Recovery in SRM (Actual Failover)
from Site B to Site A for datastore ‘shared-san-2’

Data Replication is suspended. Source LUN in Site B with Target LUN in Site A.

                Site A - Recovery Site         Site B - Protected Site
LUN             Target LUN (shared-san-2)      Source LUN (shared-san-2)
Access          Read Write Enabled             Write Disabled (read only)
Protected VMs   app_vm7 to app_vm12            app_vm7 to app_vm12
VM state        All powered on by SRM during   All powered off by SRM at
                the SRM Recovery               start of SRM Recovery

Note: A Clone LUN is not used during a Recovery in SRM (Actual Failover).

Figure 6.10

13. Shut down all of the protected VMs in Site A that were failed back from Site B during the
SRM Recovery operation performed in Step 12.
14. Perform a cleanup of the directory in Site A that contains the VM configuration files
created during protection group creation in Site B (this is the location selected during
the creation of PG 2 in Site B, refer to Figure 3.8). Refer to Step 4 above for guidance if
required.
15. Connect to the VC instance in Site B and delete PG 2 (Failback Protection Group 1
and Failback Protection Group 2) that was created in Site B. Refer to Step 5 above
for guidance if required.
16. Connect to the VC instance in Site B and perform a remove from inventory operation
on all the protected VMs in Site B that were recovered to Site A. In this scenario this
would be app_vm1 to app_vm12. Refer to Step 6 above for guidance if required.
17. Work with your Storage team to complete a second storage configuration change
‘personality swap’ whereby the Source LUN is now re-associated with Site A, the
Target LUN is re-associated with Site B along with the Clone LUN as depicted in
Figure 6.11. Rescan the ESX servers at the protected and recovery site to ensure they
become aware of the underlying storage changes. Note: The Storage configuration
has now been reverted back to the original configuration that was handed over to the
Virtualization team prior to the setup of SRM and for this reason we depict the storage
configuration in Figure 6.11 with the Clone LUN in Site B. The data synchronization
method (snapshot at intervals or continuous synchronization) of the Target LUN to the
Clone LUN is determined by the Storage Array vendor. When a simulated failover is
initiated via the ‘Test’ option in SRM, final data synchronization is performed from the
Target LUN to the Clone LUN.

Storage configuration during a SRM Test failover from Site A to Site B
for datastore ‘shared-san-2’

Data Replication continues between the Source LUN and Target LUN.
The data synchronization between the Target LUN and the Clone LUN is suspended.

                Site A - Protected Site      Site B - Recovery Site       Site B - Recovery Site
LUN             Source LUN (shared-san-2)    Target LUN (shared-san-2)    Clone LUN (shared-san-2)
Access          Read Write Enabled           Write Disabled (read only)   Read Write Enabled
Protected VMs   app_vm7 to app_vm12                                       app_vm7 to app_vm12
VM state        Protected VMs that will be                                Protected VMs powered on
                recovered to Site B                                       in Site B during the SRM
                                                                          Test failover

Note: Datastore ‘shared-san-1’ will be in the same configuration state as ‘shared-san-2’.

Figure 6.11

18. Create PG 3 (Protection Group 1 and Protection Group 2) in Site A for the
protected VMs. Note: The protection groups you create here should be identical to the
protection groups that were originally associated with RP 1 (Recovery Plan 3 –
Complete Site Failover), the recovery plan that was executed in Recovery mode and
resulted in the startup of the protected VMs in Site B.


19. Re-associate the protection groups created in Step 18 in Site A with RP 1 (Recovery
Plan 3 – Complete Site Failover) in Site B. Refer to Figure 6.12, which highlights the
first part of the re-association, initiated by right clicking on Recovery Plan 3 –
Complete Site Failover. Working through the Edit Recovery Plan wizard you will then
get to a screen that requires you to select which protection groups you wish to
re-associate with the recovery plan. Refer to Figure 6.13, which shows the two protection
groups Protection Group 1 and Protection Group 2 that should be selected so they
can be re-associated with RP 1 (Recovery Plan 3 – Complete Site Failover) in Site B.
Note: You do not need to delete the RP 2 (Failback Recovery Plan 3) that was
created in Site A to facilitate the recovery back to Site A from Site B.

Figure 6.12

Figure 6.13

20. Once the protection groups have been re-associated with the original recovery plan as
detailed in Step 19, you have completed the re-protection of the protected VMs in
Site A with SRM. It is highly recommended that you now complete a final ‘Test’ (a
simulated failover) against RP 1 (Recovery Plan 3 – Complete Site Failover) to
ensure that Site A is protected and ready for any event that may necessitate a
Recovery via SRM to Site B should the business deem it necessary to declare a
disaster.
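
The two storage personality swaps in Step 7 and Step 17 can be pictured as swapping the Source and Target roles between the sites, with the Clone LUN present only in the original direction. The Python sketch below is a conceptual model only: the `Replication` record and `personality_swap` function are hypothetical, and the real change is made on the storage array by your storage team.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Replication:
    source_site: str
    target_site: str
    clone_site: Optional[str] = None  # site holding the Clone LUN, if any


def personality_swap(current, clone_site=None):
    """Return the configuration with Source and Target roles exchanged."""
    return Replication(
        source_site=current.target_site,
        target_site=current.source_site,
        clone_site=clone_site,
    )


original = Replication("Site A", "Site B", clone_site="Site B")
after_step_7 = personality_swap(original)                         # no Clone LUN in Site A
after_step_17 = personality_swap(after_step_7, clone_site="Site B")

print(after_step_7)
print(after_step_17 == original)
```

After the second swap the model is back at the original configuration, which mirrors the note in Step 17.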

Note: The SRM Failback steps 1 through 20 outlined in this chapter did not detail the following:

• Site Pairing from Site B back to Site A via the SRM Connection wizard: This step is not
required as SRM maintains a bi-directional relationship between the paired sites, and therefore
the Connection workflow only needs to be completed once for Site A and Site B to ensure each
site is aware of each VC and SRM instance in the respective sites.
• SRM License transfer between Site A and Site B: The SRM failback steps did not discuss
the transfer of SRM licenses between Site A and Site B that should be completed to ensure you
are in compliance with the SRM EULA.
• DNS Updates: SRM 1.0 does not provide a mechanism to update DNS. DNS updates will need
to be completed by you as the virtual machines are moved between Site A and Site B with
SRM and undergo IP address changes to accommodate disparate networks in Site A and Site
B, should they not be joined by Stretched VLANs.

Figure 6.14 shows a summary view of the SRM configuration of Site A which has now been reverted
back to the original protected site. In addition to the two protection groups Protection Group 1 and
Protection Group 2 which are associated with the recovery plan Recovery Plan 3 – Complete Site
Failover in the designated recovery site – Site B, we also have the recovery plan Failback Recovery
Plan 3 listed which was used to enable the failback procedure outlined in this chapter.

Figure 6.14


SRM Failback Checklist

To aid with Failback after a SRM Recovery please refer to the SRM Failback Checklist below which
provides a high level summary of Failback steps 1 through 20 which are detailed in this chapter.

Step #  Site           Description                                                            Yes / No

1       Site B         Protected VMs recovered to Site B are no longer being used and
                       can be powered down
2       Site B         Power down the Protected VMs in Site B
3       Site B         Create a list of all the Protected VMs that were recovered to Site B
4       Site B         Perform a cleanup of the directory in Site B that contained the VM
                       configuration files created during protection group creation in Site A
5       Site A         Connect to the VC instance in Site A and delete PG 1
6       Site A         Connect to the VC instance in Site A and perform a remove from
                       inventory operation on all the protected VMs in Site A that were
                       recovered to Site B
7       Storage Work   Work with your Storage team to complete a storage configuration
                       change ('personality swap') whereby the Source LUN is now
                       associated with Site B and the Target LUN is associated with
                       Site A. Refer to Figure 6.5. Rescan the ESX servers at the protected
                       and recovery site to ensure they become aware of the underlying
                       storage changes.
8       Site B         Complete the Array Manager configuration wizard in Site B, which
                       now has the Source LUN configured in Site B and the Target LUN
                       configured in Site A
9       Site B         Configure the Inventory Preferences in Site B. These inventory
                       preferences will be assigned to the protected VMs when they are
                       restarted in Site A after the failback
10      Site B         Connect to the VC instance in Site B and configure PG 2
11      Site A         Connect to the VC instance in Site A and configure RP 2 in Site A
12      Site A         Using SRM, complete the failback of the original protected VMs back
                       to Site A. This is accomplished by performing a Recovery against
                       RP 2. Figure 6.10 depicts the storage configuration after the
                       Recovery completes
13      Site A         Shut down all of the protected VMs in Site A that were failed back
                       from Site B during the SRM Recovery operation performed in step 12
14      Site A         Perform a cleanup of the directory in Site A that contained the VM
                       configuration files created during protection group creation in Site B
15      Site B         Connect to the VC instance in Site B and delete PG 2 that was
                       created in Site B in step 10
16      Site B         Connect to the VC instance in Site B and perform a remove from
                       inventory operation on all the protected VMs in Site B that were
                       recovered to Site A
17      Storage Work   Work with your Storage team to complete a second storage
                       configuration change ('personality swap') whereby the Source LUN is
                       now re-associated with Site A and the Target LUN is re-associated
                       with Site B along with the Clone LUN, as depicted in Figure 6.11.
                       Rescan the ESX servers at the protected and recovery site to ensure
                       they become aware of the underlying storage changes.
18      Site A         Create PG 3 in Site A for the protected VMs
19      Site B         Re-associate PG 3 from step 18 in Site A with RP 1 in Site B
20      Site B         Complete a final 'Test' (a simulated failover) against RP 1 to ensure
                       that Site A is protected and ready for any event that may necessitate
                       a Recovery via SRM to Site B should a disaster be declared
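
The checklist can also be tracked as data during an evaluation. The following is a minimal sketch, not part of SRM: the names FAILBACK_STEPS and next_step are hypothetical, the step and site pairs are transcribed from the table above, and the rule that steps must be completed strictly in order is an assumption based on the numbered sequence.

```python
from typing import Optional

# (step, site) pairs transcribed from the SRM Failback Checklist.
FAILBACK_STEPS = [
    (1, "Site B"), (2, "Site B"), (3, "Site B"), (4, "Site B"),
    (5, "Site A"), (6, "Site A"), (7, "Storage Work"),
    (8, "Site B"), (9, "Site B"), (10, "Site B"),
    (11, "Site A"), (12, "Site A"), (13, "Site A"), (14, "Site A"),
    (15, "Site B"), (16, "Site B"), (17, "Storage Work"),
    (18, "Site A"), (19, "Site B"), (20, "Site B"),
]


def next_step(completed: set[int]) -> Optional[tuple[int, str]]:
    """Return the next (step, site) to perform, or None when all are done.

    Assumption: the checklist is followed in order, so a ValueError is
    raised if a later step is marked done while an earlier one is open.
    """
    for step, site in FAILBACK_STEPS:
        if step not in completed:
            later_done = sorted(s for s in completed if s > step)
            if later_done:
                raise ValueError(
                    f"step {step} is open but {later_done} are marked done"
                )
            return step, site
    return None


if __name__ == "__main__":
    # Steps 1 through 6 answered "Yes": the storage change is next.
    print(next_step({1, 2, 3, 4, 5, 6}))
```

Keeping the site next to each step makes it easy to see which team (Site A, Site B, or Storage) owns the next action.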

Chapter 7: SRM Alarms and Site Status Monitoring

This chapter will provide an overview of some of the SRM Alarms that will be generated due to certain
types of failures or conditions that may occur at the protected or recovery site. Awareness of the SRM
alarms is an important part of understanding how SRM works across the protected and recovery sites.
During the SRM product evaluation it is recommended that where possible and without impact
to your production environment, failures or conditions be created in the protected and recovery
site that will result in the generation of SRM alarms. The generation of these SRM alarms will serve
as validation that SRM is monitoring both the protected and recovery site correctly.

Each SRM server monitors the CPU utilization, disk space, and memory consumption of the guest on
which it is running, and also maintains a heartbeat with its peer SRM server. VC events are sent if any
of these measures falls outside of configured bounds.
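
The bounds check described above can be pictured with a small sketch. This illustrates the idea only and is not SRM's implementation: the metric names, the threshold values, and the Bound and check_bounds names are all hypothetical, while the event names are taken from the list of VC events later in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bound:
    metric: str          # hypothetical metric key, e.g. "cpu_percent"
    limit: float         # hypothetical threshold value
    event: str           # event name raised when the bound is violated
    below: bool = False  # True: violated when the value drops below limit


# Hypothetical bounds; real limits come from the SRM configuration.
BOUNDS = [
    Bound("cpu_percent", 90.0, "CPU use exceeded limit"),
    Bound("disk_free_mb", 1024.0, "Disk Space Low", below=True),
    Bound("memory_free_mb", 512.0, "Memory low", below=True),
]


def check_bounds(sample: dict[str, float], bounds=BOUNDS) -> list[str]:
    """Return the events for every measure that falls outside its bound."""
    events = []
    for b in bounds:
        value = sample.get(b.metric)
        if value is None:
            continue  # metric not reported in this sample
        violated = value < b.limit if b.below else value > b.limit
        if violated:
            events.append(b.event)
    return events


if __name__ == "__main__":
    print(check_bounds({"cpu_percent": 95.0,
                        "disk_free_mb": 2048.0,
                        "memory_free_mb": 100.0}))
```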

SRM will support the configuration of event-triggered alarms so that you can associate a notification
action with any given SRM Alarm Event. These alarms are configured via the SRM UI.

SRM will support the following alarm notification actions:

• Send e-mail to specified address
• Send SNMP trap to VC trap receivers
• Execute specified command on VC host

Please refer to Chapter 9 – Alerting and Monitoring in the Administrators Guide for Site Recovery
Manager, which details how to set up the alarm actions listed above.

Failure of either site generates events which can be associated with VC alarms:

• Problems with the local site (e.g., resource constraints)
• Problems with the remote site (e.g., unable to ping the remote site, which may indicate a disaster)

Remote site failure is reflected in the SRM Alarm Events and will not automatically trigger a
recovery. This must be initiated manually.

SRM will raise VC events for the following conditions:

• Disk space low.
• CPU use exceeded limit.
• Memory low.
• Remote Site not responding.
• Remote Site heartbeat failed.
• Recovery Plan Test started, ended, succeeded, failed, or cancelled.
• Virtual Machine Recovery started, ended, succeeded, failed, or reports a warning.

As a starting point during the SRM evaluation, we recommend you complete the Action setup for the
SRM Alarm Events listed below for the protected and recovery sites. You should be able to trigger
these events without impacting your production environment, so that you see firsthand how SRM
responds and notifies you when subjected to one of the failure events listed below.

• Remote Site Down
• Remote Site Ping Failed
• Replication Group Removed
• Recovery Plan Destroyed
• License Server Unreachable
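
Before configuring the Actions in the SRM UI, it can help to write the intended event-to-action plan down and review it. The sketch below is a planning aid only and does not call any SRM or VirtualCenter API. The action labels ("email", "snmp_trap", "run_command") are hypothetical shorthand for the three notification actions listed earlier, and review_plan is a hypothetical helper name.

```python
# Hypothetical labels for the three notification actions named in this chapter.
SUPPORTED_ACTIONS = {"email", "snmp_trap", "run_command"}

# SRM Alarm Events recommended as a starting point for the evaluation.
RECOMMENDED_EVENTS = [
    "Remote Site Down",
    "Remote Site Ping Failed",
    "Replication Group Removed",
    "Recovery Plan Destroyed",
    "License Server Unreachable",
]


def review_plan(plan: dict[str, list[str]]) -> dict[str, list[str]]:
    """Report recommended events with no action and any unknown action labels."""
    uncovered = [e for e in RECOMMENDED_EVENTS if not plan.get(e)]
    unknown = sorted(
        {a for actions in plan.values() for a in actions
         if a not in SUPPORTED_ACTIONS}
    )
    return {"uncovered_events": uncovered, "unknown_actions": unknown}


if __name__ == "__main__":
    plan = {
        "Remote Site Down": ["email", "snmp_trap"],
        "Remote Site Ping Failed": ["email"],
        "Recovery Plan Destroyed": ["pager"],  # not one of the three actions
    }
    print(review_plan(plan))
```

Run the review for each site, since the Actions are set up separately at the protected and recovery sites.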


Figure 7.1

As you become more familiar with SRM and its associated workflows, which allow you to Test your
recovery plans as well as Run a recovery plan to fail over services from your protected site to your
recovery site, we recommend that you work through the list of SRM Alarm Events, which are accessed
via the Alarms tab as depicted in Figure 7.1, and enable the appropriate notification Actions for any
additional SRM Alarm Events that you deem important for your environment.


Chapter 8: SRM Roles and Privileges


This chapter will provide an overview of the SRM roles and the types of SRM privileges that can be set.
Authorization in SRM uses the same authorization model as VirtualCenter Server.

Figure 8.1 shows the default SRM roles, which become available after the SRM plug-in has been
installed and enabled. To access these roles, click the Administration icon in the toolbar and
then click the Roles tab to see a list of all available roles. These default SRM roles let you
delegate control at a very granular level.

Figure 8.1

There are two sets of roles. The first set contains the roles required for the primary site user to
administer protection; these SRM roles are prefixed by Protection. The second set contains the roles
required for the secondary site user to administer recovery; these SRM roles are prefixed by
Recovery.

Protection Side SRM Roles

• Protection Virtual Machine Administrator: This role should be assigned on the protected
Virtual Machine object in the VC inventory. It grants the associated user the ability to set up and
modify the protection characteristics of the protected virtual machine.

• Protection SRM Administrator: This role should be assigned on the Service Instance object
in the primary SRM inventory. It grants the associated user the ability to pair two sites and to
configure inventory mappings and SAN arrays.

• Protection Groups Administrator: This role should be assigned on the Primary
Configuration/Protection Service object in the SRM inventory. It grants the associated user the
ability to create and modify protection profiles/groups.

Recovery Side SRM Roles

• Recovery Inventory Administrator: This role should be assigned on the root of the VC
inventory. It grants the associated user the ability to view customization specifications existing
on the secondary site.

• Recovery Datacenter Administrator: This role should be assigned on the Datacenter object
in the VC inventory where the VMs will be recovered. It grants the associated user the ability to
view available datastores and perform recovery (shadow) VM customizations.

• Recovery Host Administrator: This role should be assigned on the Host or DRS cluster object
in the VC inventory where the VM will be recovered. It grants the associated user the ability to
configure VM components during recovery.

• Recovery Virtual Machine Administrator: This role should be assigned on the Folder and
Resource Pool objects in the VC inventory where the recovery (shadow) VMs are to be placed.
It grants the associated user the ability to create and add shadow VMs to the resource pool and
the folder, as well as the ability to reconfigure and customize the shadow VMs at runtime and
during the process of recovery.

• Recovery SRM Administrator: This role should be assigned on the Service Instance object in
the secondary SRM inventory. It grants the associated user the ability to configure SAN arrays
and create protection profiles.

• Recovery Plans Administrator: This role should be assigned on the Secondary
Configuration/Recovery Service object in the SRM inventory. It grants the associated user the
ability to reconfigure protection and shadow VMs and to set up and run recovery.

Note: VirtualCenter already defines a Read-Only system role, which can be used to grant users the
ability to view the Site Recovery Manager service. In addition, the Administrator role can be used to
grant a user complete control over both the protection and recovery SRM components.
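
When planning role delegation, the role-to-object assignments above can be captured in a simple lookup table. This is a planning sketch only and makes no calls to SRM or VirtualCenter: ROLE_ASSIGNMENTS and roles_for are hypothetical names, and the object descriptions are paraphrased from the role lists in this chapter.

```python
# Default SRM role -> inventory object on which it should be assigned.
ROLE_ASSIGNMENTS = {
    "Protection Virtual Machine Administrator":
        "protected Virtual Machine object (VC inventory)",
    "Protection SRM Administrator":
        "Service Instance object (primary SRM inventory)",
    "Protection Groups Administrator":
        "Primary Configuration/Protection Service object (SRM inventory)",
    "Recovery Inventory Administrator":
        "root of the VC inventory",
    "Recovery Datacenter Administrator":
        "Datacenter object (VC inventory)",
    "Recovery Host Administrator":
        "Host or DRS cluster object (VC inventory)",
    "Recovery Virtual Machine Administrator":
        "Folder and Resource Pool objects (VC inventory)",
    "Recovery SRM Administrator":
        "Service Instance object (secondary SRM inventory)",
    "Recovery Plans Administrator":
        "Secondary Configuration/Recovery Service object (SRM inventory)",
}


def roles_for(side: str) -> list[str]:
    """List the default roles for 'protection' or 'recovery' by name prefix."""
    prefix = side.capitalize() + " "
    return [role for role in ROLE_ASSIGNMENTS if role.startswith(prefix)]


if __name__ == "__main__":
    for role in roles_for("recovery"):
        print(f"{role}: assign on {ROLE_ASSIGNMENTS[role]}")
```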


SRM also allows you to create custom SRM roles: clone one of the default SRM roles, then edit the
clone to select which privileges should be associated with the custom role. Figure 8.2 shows a
custom SRM role and all the privileges that can be selected when creating it.

Figure 8.2

Conclusion

Site Recovery Manager will leverage your VMware Infrastructure to make disaster recovery:

• Rapid - by automating the disaster recovery process for your virtual machines and eliminating the
complexities of traditional physical disaster recovery.

• Reliable - by ensuring proper execution of the recovery plan and by enabling easier, more
frequent tests in an isolated environment without impacting services in the protected site.

• Manageable - by letting you centrally manage recovery plans and make plans dynamic to match a
dynamic virtualized environment.

• Affordable - by utilizing recovery site infrastructure and reducing management costs.

Site Recovery Manager will enable you to:

• Expand disaster recovery protection - now any workload in a virtual machine can be protected
with minimal incremental effort and cost.

• Reduce time to recovery - as soon as a disaster is declared, SRM allows for the recovery of
protected virtual machines to the designated recovery site with a few mouse clicks.

• Increase reliability of recovery - replication of system state ensures your protected virtual
machines have all they need to start up in the recovery site. Hardware independence, which is
realized through your VMware Infrastructure, eliminates failures due to different hardware.

• Easier and more frequent testing – SRM enables you to test your recovery plan in an isolated
environment without impacting services in the protected site while using the actual failover
sequence that will be executed during a real disaster.
