Disaster Tolerant Solutions with HPE 3PAR Remote Copy

Technical white paper

Contents
Overview
Data replication and disaster tolerant solution challenges
Disaster tolerant solution metrics
Remote Copy Theory of Operations
HPE 3PAR Remote Copy replication modes
Synchronous mode
Periodic asynchronous mode
Asynchronous streaming mode
Remote Copy Link Layer support
Remote Copy connections
Latency support with Remote Copy
Remote Copy over Native IP Adapters (RCIP)
Remote Copy over Fibre Channel
Choosing between RCIP and FCIP for long distance replication over an IP network
Pros and Cons of RCIP versus FCIP
RCIP advantages
RCIP disadvantages
FCIP advantages
FCIP disadvantages
Remote Copy topologies
One-to-one topology
M-to-N topology
Synchronous Long Distance (SLD) topology
Three Data Center Peer Persistence
Sizing Remote Copy Solutions
System sizing considerations when using Remote Copy
Data replication link sizing
Sizing a Solution Using Synchronous Mode
Periodic Asynchronous Mode
Asynchronous Streaming Mode
Remote Copy functionality
Remote Copy coordinated snapshots
Auto recovery policy
Creatercopygroup and setrcopygroup options for Asynchronous Streaming mode
Auto synchronize policy
“snap_freq” subcommand for Asynchronous Streaming Remote Copy groups
“period” subcommand behavior for Asynchronous Streaming Remote Copy groups
Converting a 1:1 Remote Copy Topology into a Synchronous Long Distance (SLD) Topology
Steps for Converting from an existing 1:1 Remote Copy Configuration to an SLD Remote Copy Configuration
Creating Remote Copy Targets for the SLD Configuration
Admitting the new Remote Copy targets to the existing Remote Copy group
Converting an SLD configuration back to a 1:1 configuration
Remote Copy failover options
How Remote Copy recovers synchronous Remote Copy groups following replication link failure
Synchronous replication during normal operation
Description of synchronous operation during Remote Copy link failure
Restart of Remote Copy
Detail of the Resynchronization process
How Remote Copy recovers Periodic Asynchronous Remote Copy groups following replication link failure
Description of Periodic Asynchronous replication during normal operation
Description of operation during Remote Copy link failure
Detail of the Resynchronization process
How Remote Copy recovers Asynchronous Streaming Remote Copy groups following replication link failure
Description of Asynchronous Streaming replication during normal operation
Description of operation during WAN failure
Conversion options for VVs in a Remote Copy group
Three conversion solution scenarios are available for VVs in a Remote Copy group
Scenario 1: Non Peer Persistence solutions with no application downtime
Scenario 2: Conversion with application downtime
Scenario 3: Conversion for solutions using Peer Persistence
Process for converting non Peer Persistence Remote Copy VVs with no application downtime (Requires a Remote Copy full copy synchronization)
Process for converting Remote Copy VVs without requiring a full copy synchronization (Avoids a Remote Copy full synchronization)
Cross-product interoperability

Overview
Today IT organizations like yours are faced with the difficult task of satisfying the diverse disaster tolerant needs of the entire enterprise.
Business, governmental, and industry-driven requirements compel the need to store more data and make it continuously available. With the
increased need for data availability comes the increased demand for fast disaster recovery. An enterprise that is unable to recover its data assets
quickly after a disaster may be at risk for regulatory action, or, worse yet, an inability to continue business. How do IT organizations protect more
applications and data than ever without commensurate increases in human and capital resources?

HPE 3PAR StoreServ Storage addresses this challenge by offering a powerful yet simple solution for remote data replication that is the
foundation of a properly designed and deployed disaster tolerant solution: HPE 3PAR Remote Copy, storage replication software.

Data replication and disaster tolerant solution challenges


Following a natural or human-induced disaster that drastically affects day-to-day operations, businesses must continue to function. Compliance
with business standards, industry trends, or federal regulations may place additional requirements on an enterprise looking to create or expand
its disaster recovery capabilities. The maximum Recovery Point Objective (RPO) the enterprise can tolerate is one such important requirement.

For some organizations, adequate funding for disaster recovery is difficult to obtain because it may be perceived as an added expense for a very
limited subset of corporate data. Clearly articulating how and why disaster recovery is necessary to meet the requirements put in place by management or by government regulation is paramount, and it helps ensure that the deployed solution meets expectations for the defined RPO and Recovery Time Objective (RTO).

Many storage administrators believe that by simply replicating data from the primary data center to a backup data center, they have fulfilled the
enterprise’s requirement for disaster tolerance. This couldn’t be farther from the truth. A proper disaster tolerant solution is a combination of technologies, software, and processes designed to meet a defined goal for RPO and RTO, not just a technology that replicates data from one location to another.

Most probably, you’ll agree that planning and implementing a disaster recovery solution is one of the most complex, time-consuming, and
expensive projects that any enterprise will undertake. Designing one that meets a very small RPO can make the objective even more daunting.

Disaster tolerant solution metrics


The primary metrics on which a disaster tolerant solution is designed and measured are Recovery Point Objective (RPO) and Recovery Time
Objective (RTO). RPO defines the maximum amount of data that can be lost in the event a disaster occurs, while RTO defines the maximum amount of time that may elapse following a disaster before the application is back online and processing new transactions.

RPO is generally defined as an amount of time and not a given quantity of data. For example, an RPO of two hours would guarantee that
following a disaster that occurs at 2 pm, once recovery completes at the disaster recovery site, all transactions that were committed up to and
including 12 pm would be present in the replicated copy of a database. The recovered copy could, in fact, contain transactions committed after 12 pm, but the defined two-hour RPO guarantees that transactions committed up to 12 pm will always be present.
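To make the arithmetic concrete, here is a minimal sketch (illustrative only; the function name and timestamps are placeholders of ours, not from this paper) that applies the two-hour RPO example above.

    from datetime import datetime, timedelta

    def guaranteed_recovery_point(disaster_time: datetime, rpo: timedelta) -> datetime:
        # Latest commit time guaranteed to be present in the replicated copy
        # after recovery, per the RPO definition above.
        return disaster_time - rpo

    # Disaster at 2 pm with a two-hour RPO: every transaction committed up to
    # and including 12 pm is guaranteed to be present after recovery.
    disaster = datetime(2024, 1, 15, 14, 0)
    print(guaranteed_recovery_point(disaster, timedelta(hours=2)))  # 2024-01-15 12:00:00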

RTO on the other hand is a definition of the amount of downtime that may elapse following a disaster before the database needs to be up and
running, and consuming new transactions. In the vast majority of disaster tolerant designs RTO is secondary to the amount of data loss that can
be tolerated. A good disaster tolerant solution will have some type of automatic recovery mechanism, such as clustering, which is integrated with
the data replication software to both automate and limit the amount of downtime suffered following a disaster. In most cases RTO is driven by
the amount of time it takes to initiate a failover of the database to the disaster recovery data center, restart the database, and have it complete crash recovery, although data recovery in the disaster recovery data center (applying DB logs, promoting array snapshots, etc.) can also contribute to the RTO.

Even though a very small RTO is not critical for many enterprises (for some it is, but generally it is not), the maximum amount of data that can be lost in the event of a disaster, the RPO, is very important. At the very core of any disaster tolerant solution is the ability to ensure that data loss will not exceed the defined RPO for the solution following recovery from a disaster. For enterprises that are very concerned about RTO, HPE 3PAR offers solutions like Peer Persistence that help minimize both RPO (zero) and RTO (a few seconds of application interruption).

Remote Copy Theory of Operations


HPE 3PAR Remote Copy replication modes
HPE 3PAR Remote Copy storage replication software offers a full set of features that can be used to design disaster tolerant solutions requiring an RPO anywhere from zero, to seconds, to minutes, hours, days, or longer. Users can choose between the three data replication modes offered (synchronous, Asynchronous Streaming, or Periodic Asynchronous) to design the most cost-effective solution that meets their requirements for RPO and RTO.

Synchronous mode
In synchronous mode, host-initiated write data is mirrored to write cache on both the primary and the secondary StoreServ arrays before the
write completion is acknowledged to the host. On the primary StoreServ array, data is mirrored across the cache of two nodes. The write request
is then sent to the backup StoreServ array via a communication link.

The backup StoreServ array also mirrors the data in its cache (again, across two nodes) and then sends an acknowledgement to the primary system. The write I/O completion is acknowledged to the server only after the remote array’s acknowledgement is received by the primary array (Figure 1). As with all synchronous replication solutions, Synchronous Remote Copy provides an RPO of zero, or no data loss.

Figure 1. Synchronous mode

When used with an RCFC (Remote Copy over Fibre Channel) transport, HPE 3PAR Remote Copy Synchronous storage replication software
utilizes a patented protocol that only requires three trips across the FC network to replicate data vs. the standard double round trip SCSI protocol
found in most implementations. Synchronous Remote Copy does this by having the secondary array post a number of SCSI read requests to the
primary array. As host I/Os are received by the primary array it simply responds to the previously posted read requests from the secondary array,
hence saving one of the trips found in a normal SCSI write request.
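As a rough illustration of why the trip count matters, the sketch below models host write latency under synchronous replication as the local mirrored-cache write time plus the one-way network trips completed before the remote acknowledgement arrives. This is a simplified model of our own (it ignores remote array service time and queuing) rather than an HPE formula, and the latency figures are placeholders.

    def sync_write_latency_ms(local_write_ms: float, one_way_link_ms: float,
                              trips: int = 3) -> float:
        # trips=3 approximates the RCFC protocol described above (pre-posted
        # reads from the secondary array); trips=4 approximates a standard
        # double-round-trip SCSI write.
        return local_write_ms + trips * one_way_link_ms

    # Example: 0.3 ms local mirrored write, 2 ms one-way link latency
    print(sync_write_latency_ms(0.3, 2.0))      # 6.3 ms with the three-trip protocol
    print(sync_write_latency_ms(0.3, 2.0, 4))   # 8.3 ms with a double round trip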

Synchronous Remote Copy is supported on all network transports provided by Remote Copy: RCIP, RCFC, and FCIP. The maximum round trip latency supported by Remote Copy when configured for synchronous replication is dependent on the HPE 3PAR OS release and the transport chosen. Please contact your HPE representative for details on the maximum latency supported for your release of HPE 3PAR OS.

Note
Please contact your HPE representative for details on the maximum latency your version of HPE 3PAR OS supports

Periodic asynchronous mode


Periodic Asynchronous mode is asynchronous replication that is ideal for environments that need to replicate data where the write I/O service
times would be too large if the data were being replicated synchronously and where an RPO of 10 minutes or greater can be tolerated. Periodic
Asynchronous mode insulates the host write I/Os from any replication latency resulting from the link latency or speed and from the target array
write IO latency. Periodic Asynchronous mode also makes excellent use of the replication transport by only replicating the last write to a given
block during the delta resync interval.

The network throughput requirements when using Periodic Asynchronous mode are generally more relaxed than they would be with
synchronous or Asynchronous Streaming mode (Asynchronous Streaming mode is covered in the next section). This is because with Periodic
Asynchronous mode the replication link speed can be sized close to the average data generation rate that occurs during the largest delta resync
period—providing cost saving on the replication network.

Figure 2. Periodic asynchronous mode

In Periodic Asynchronous mode, host writes are completed on the primary array and the host write is acknowledged as soon as the data is
mirrored across two nodes. The volumes on the primary array and volumes on the secondary array are then resynchronized “periodically”—when
scheduled or when resynchronization is manually initiated via the syncrcopy command.

In this example (Figure 2), what the user gets on the secondary array would be an I/O consistent copy of the source data as it looked at 12:10
initially and then again at 12:20. The process would then repeat itself 10 minutes later (in this example, the resync interval has been set to 10
minutes) and the user will get an I/O consistent copy of the source data as it looks at 12:30, 12:40, and so on. The Recovery Point Objective of
the solution is 2X the delta resync interval chosen, which for this example would be an RPO of 20 minutes (delta resyncs occur every 10 minutes).

When using Periodic Asynchronous Remote Copy, choose the largest delta resync interval practical that will meet your solution RPO. The larger the delta resync interval, the smaller the replication network (in megabits per second) needs to be to match the “average” data generation rate of the replicated data. This saves overall network cost.
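The sketch below works through the two rules of thumb above: the delivered RPO is twice the chosen delta resync interval, and the link only needs to move the average amount of data changed during the largest resync window. The helper names and the 30 GB example figure are illustrative assumptions, not values from this paper.

    def periodic_async_rpo_minutes(resync_interval_min: float) -> float:
        # RPO of a Periodic Asynchronous solution is 2X the delta resync interval.
        return 2 * resync_interval_min

    def min_link_mbps(changed_gb_per_window: float, resync_interval_min: float) -> float:
        # Approximate link speed (Mbit/s) needed to move the average amount of
        # changed data within one resync window (decimal GB assumed).
        bits = changed_gb_per_window * 8 * 1000**3
        return bits / (resync_interval_min * 60) / 1e6

    print(periodic_async_rpo_minutes(10))   # 20-minute RPO for a 10-minute interval
    print(min_link_mbps(30, 10))            # 400.0 Mbit/s to move 30 GB per 10-minute window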

Asynchronous streaming mode


For synchronous storage replication software solutions, as link latency increases so does the IO latency between a host write and array IO
acknowledgement. In most cases, this puts a practical limitation on the link latency that can be used for synchronous replication. With the release
of HPE 3PAR OS v3.3.1, support for the new HPE 3PAR Asynchronous Streaming Remote Copy is available. It is an excellent option where a very
low RPO is desired (120 seconds or less) but where the write I/O latency associated with a synchronous replication solution cannot be tolerated.

Asynchronous Streaming Remote Copy is, as the name implies, asynchronous replication of data between two HPE 3PAR arrays, but it differs from Periodic Asynchronous Remote Copy: instead of replicating data periodically at a defined delta resynchronization interval (for example, every 10 minutes), Asynchronous Streaming Remote Copy constantly places I/O data from the primary array onto the replication link for delivery at the secondary array. Asynchronous Streaming Remote Copy does this by holding data in cache on the primary
array while it is being replicated to the secondary array. Once receipt of the data at the secondary array has been confirmed to the primary, the
cache space on the primary array holding the data is freed up. This continuous “streaming” of the data between arrays allows solutions based on
Asynchronous Streaming Remote Copy to provide RPOs measured in seconds or even less (120 seconds maximum). Asynchronous Streaming
Remote Copy will allow a maximum amount of data that is equal to either 20% of the data cache on the Primary array or 120 seconds worth of
data to be queued on the Primary array. If either of these limits is exceeded, Asynchronous Streaming will start to suspend Remote Copy groups.

With HPE 3PAR StoreServ OS 3.3.1, many Asynchronous Streaming Remote Copy solutions generally require the replication link speed to be sized to within 75–80 percent of the maximum data generation rate. This ensures cache on the primary array is not saturated and a very small RPO is delivered.
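The back-of-the-envelope sketch below estimates how long a write burst can exceed the replication link speed before the queued data reaches either the 20%-of-data-cache or the 120-seconds-of-data limit described above and groups begin to suspend. How the two limits combine here is our own assumption for illustration, not HPE sizing guidance, and the example figures are placeholders.

    def streaming_burst_headroom_seconds(gen_rate_mbps: float, link_rate_mbps: float,
                                         data_cache_gb: float) -> float:
        # The queue grows at (generation rate - link rate) while a burst lasts.
        backlog_mbps = gen_rate_mbps - link_rate_mbps
        if backlog_mbps <= 0:
            return float("inf")                              # link keeps up; nothing queues
        cache_limit_mbit = data_cache_gb * 1024 * 8 * 0.20   # 20% of primary data cache
        time_limit_mbit = 120.0 * gen_rate_mbps              # 120 seconds worth of data
        return min(cache_limit_mbit, time_limit_mbit) / backlog_mbps

    # Example: 1,500 Mbit/s bursts over a 1,000 Mbit/s link, 768 GB of data cache
    print(streaming_burst_headroom_seconds(1500, 1000, 768))   # 360.0 seconds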

Figure 3. Asynchronous streaming mode

Remote Copy Link Layer support


In order to satisfy the varying business or technical needs of users, Remote Copy offers two native methods of connectivity: Built-in dedicated
Gigabit or 10GbE (native 10GbE on HPE 3PAR 20000 series and native 1GbE on HPE 3PAR 7000, 8000 and 10000 series arrays and 10GbE
with the optional combined 10GbE/Fibre Channel adapter on HPE 3PAR 8000 series) to support RCIP. HPE 3PAR Remote Copy also supports
native Fibre Channel over optical fiber to support RCFC. With suitable FC-IP routers, FCIP is also supported; with FCIP, Remote Copy over Fibre Channel can utilize a SAN fabric that is extended over an IP network to provide long-distance Remote Copy support over IP.
• Built-in 1GbE RCIP adapters on 7000, 10000, and 8000 series arrays (not supported for use with Asynchronous Streaming mode)
• Built-in 10GbE RCIP adapters on 20000 series arrays
• Optional four-port adapters on the 8000 series arrays with two 10GbE and two FC ports (or two iSCSI ports)

Remote Copy connections


Remote Copy supports multiple link layer connections per node.
• Four RCFC or FCIP connections per node (for nodes configured with enough FC ports)
• One RCIP connection per node

It is possible for a single eight-node HPE 3PAR StoreServ array to have up to 40 Remote Copy links:
• 32 combined RCFC and FCIP connections (four per node) + 8 RCIP connections (one per node) = 40 total Remote Copy connections
• Sufficient network throughput between arrays must be provided to support the Remote Copy links configured

The Remote Copy connections on an array can all be used between a single pair of HPE 3PAR StoreServ arrays in a one-to-one topology or they
can be used to connect the array to multiple HPE 3PAR StoreServ arrays together in an “M-to-N” or a synchronous long distance (SLD) or 3-Data
Center Peer Persistence topology (more on these topologies in the section titled “Remote Copy topologies”). For a single pair of eight-node
HPE 3PAR arrays, if both arrays have enough ports to support 40 connections then all 40 connections may be used. Please see the HPE 3PAR
Feature and Availability Matrix (FA-Matrix) on the HPE Single Point of Configuration Knowledge (SPOCK) for details on the maximum number of
Remote Copy connections supported.

Please see h20272.www2.hpe.com/spock/ for details on HPE SPOCK
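As a quick illustration of the per-node limits above, the helper below (an illustrative function of our own) computes the per-array ceiling; the actual number achievable also depends on available FC ports and the limits published in the FA-Matrix.

    def max_remote_copy_links(node_count: int) -> dict:
        # Up to four RCFC/FCIP connections and one RCIP connection per node.
        return {
            "rcfc_fcip": node_count * 4,
            "rcip": node_count,
            "total": node_count * 5,
        }

    print(max_remote_copy_links(8))   # {'rcfc_fcip': 32, 'rcip': 8, 'total': 40}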

Latency support with Remote Copy


The maximum latency supported on the network used by Remote Copy has increased significantly with the release of HPE 3PAR OS 3.3.1.

Please see the HPE 3PAR Feature and Availability Matrix (FA-Matrix) on the HPE Single Point of Configuration Knowledge (SPOCK) for details
on the maximum supported latency for Remote Copy

Please see h20272.www2.hpe.com/spock/ for details on HPE SPOCK

Remote Copy over Native IP Adapters (RCIP)


Remote Copy over IP (RCIP) is a native IP implementation of Remote Copy over Ethernet. Every HPE 3PAR StoreServ array comes with a
standard Gigabit or 10GbE (10GbE on HPE 3PAR 20000 series) port on every node for use with RCIP. On the HPE 3PAR 8000 series array an
optional combined 10GbE/Fibre Channel adapter can be installed to provide 10GbE RCIP support. RCIP is ideal for short, medium, and long-haul
replication. It is most often used by organizations that require disaster recovery for small to medium environments requiring moderate data
generation rates.

RCIP uses multiple links between arrays (minimum of two and up to eight) to maximize bandwidth and ensure availability. By offering native IP
replication RCIP can be quicker and cheaper to implement than solutions that don’t offer native IP replication and require bridging FC
connections over a WAN via FC-IP routers. RCIP solutions can also be easier to support and troubleshoot as they are simpler to install and
manage than solutions that bridge Fibre Channel over IP.

All replication modes are supported on the RCIP interfaces; however, for Asynchronous Streaming Remote Copy using RCIP, only the 10GbE RCIP adapters found on the HPE 3PAR 8000 and 20000 series arrays are supported. Asynchronous Streaming mode is not supported on the 1GbE RCIP adapters found on the 7000/8000/10000 series arrays.

Note
Check with your HPE representative in regards to Asynchronous Streaming Remote Copy support via RCIP.

Note
Asynchronous Streaming mode is not supported with the 1GbE RCIP adapters found on the HPE 3PAR 7000, 8000, and 10000 series arrays. Only 10GbE RCIP adapters are supported.

When implementing solutions based on RCIP it is not required but is highly recommended (and a best practice) that dedicated network
bandwidth, not shared with other protocols like HTTP, ftp, etc., be allocated to Remote Copy (or a dedicated VLAN carved out of a site network).
If the RCIP Remote Copy solution is deployed on an IP network that is shared with other IP protocols, there is no way to ensure the solution will
meet the defined performance requirements for a synchronous replication solution or defined RPO when using asynchronous replication if RCIP
cannot get all the network bandwidth it requires when needed.

For synchronous solutions, it is possible that non-Remote Copy traffic using bandwidth on the network will result in higher than expected write I/O
latencies for the host I/Os being replicated across the shared network. For solutions based on Periodic Asynchronous Remote Copy, sharing the
network bandwidth with other protocols may prevent data replication within the defined resync interval. This results in missing the RPO target of the
solution (system log reporting will also send out messages that the delta resync interval has been exceeded). With Asynchronous Streaming
solutions not having a deterministic network throughput for Remote Copy can result in the unexpected suspension of Remote Copy groups.

While RCIP is compatible with third-party WAN networking solutions that optimize network bandwidth by compressing data on the network (making a 100Mbps network look like a 400Mbps network, for example, by compressing 400Mbps of data so it fits on the 100Mbps link), the nature of the data being replicated can reduce the effectiveness of these solutions (in some cases significantly). For example, if the data being replicated is poorly compressible or not compressible at all, the WAN optimizer will not provide the network bandwidth the solution was sized to expect, resulting in the solution not achieving its designed-for RPO or, in the case of a synchronous replication solution, resulting in high write I/O latencies for the server. If either of these situations occurs it is not caused by Remote Copy’s behavior; it is a result of the nature of the data being replicated and whether the WAN optimizer can compress that data well or not. Poorly compressible data, when used with a WAN accelerator, can also result in other problems for Remote Copy, including dropped Remote Copy heartbeat messages and/or TCP retry messages showing up in the system log. These can occur when the WAN accelerator begins to drop IP packets as its buffers overflow.
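The arithmetic below illustrates the point: a WAN optimizer only delivers its rated gain when the replicated data actually compresses. The function and the ratios used are illustrative assumptions, not measured values.

    def effective_wan_mbps(link_mbps: float, compression_ratio: float) -> float:
        # compression_ratio is what the optimizer achieves on the replicated data:
        # 4.0 makes a 100 Mbit/s link behave like 400 Mbit/s; 1.0 means the data
        # is incompressible and the link delivers only its native speed.
        return link_mbps * compression_ratio

    print(effective_wan_mbps(100, 4.0))   # 400.0 - highly compressible data
    print(effective_wan_mbps(100, 1.0))   # 100.0 - incompressible data; the sized-for RPO is at risk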

In summary, RCIP supports:


• Load balancing across all available RCIP links configured for the same replication mode between a pair of HPE 3PAR StoreServ arrays
• 1GbE and/or 10GbE connection between HPE 3PAR StoreServ systems through LAN/WAN switches
– 10GbE only available with 8000 and 20000 model arrays
• Synchronous, Periodic Asynchronous and Asynchronous Streaming modes
– Asynchronous Streaming mode is only supported on 10GbE RCIP interfaces
• Maximum of one RCIP connection per node (up to eight RCIP connections on an 8-node array)
• Can be used in conjunction with RCFC and FCIP
• Supported in all Remote Copy topologies

Remote Copy over Fibre Channel


For customers that choose Fibre Channel connectivity between arrays HPE 3PAR StoreServ Storage offers Remote Copy over Fibre Channel
connectivity (RCFC). RCFC is most often used for shorter distance solutions such as a campus or a metropolitan area although it can be bridged
across an IP network (FC-IP) to provide a long distance replication solution.

RCFC uses Fibre Channel connections (at least two) between arrays for availability as well as for increasing total available bandwidth for
replication. It provides the flexibility to use connections across any HPE approved Fibre Channel fabric to create multiple hops between arrays
(See the HPE SAN Design Guide on SPOCK for details on supported SAN solutions). These hops can include any HPE fabric vendor-approved
connectivity such as FC ISLs between buildings, fabric extension via long-haul ISLs, and more. The ISLs can be provided through long-wavelength
GBICs between switches and with wave division multiplexing solutions such as DWDM or CWDM that provide extended ISLs.

Note
Refer to the HPE SAN Design Guide for details on supported SAN fabric switches, GBICs, and distances. The HPE SAN Design Guide can be
accessed via SPOCK: h20272.www2.hpe.com/spock/

In summary, RCFC supports:


• Load balancing across available links between a pair of HPE 3PAR arrays for links configured with the same replication mode (synchronous,
asynchronous streaming, or periodic asynchronous)
• Any HPE 3PAR supported SAN infrastructure as defined in the HPE SAN Design Guide
• Fibre Channel SAN networks that are extended over IP using HPE approved FC-IP routers (referred to as FCIP)
• Synchronous, asynchronous streaming, and periodic asynchronous replication modes
• Solutions requiring the replication of large amounts of data
• Up to four RCFC links per node (maximum of 32 per array)
• All Remote Copy topologies

Choosing between RCIP and FCIP for long distance replication over an IP network
As we have discussed HPE 3PAR Remote Copy storage replication software offers two separate link-level interfaces for replicating data between
arrays:
• RCIP: Provided through the native 1GbE or 10GbE Ethernet ports found on every node of all HPE 3PAR arrays, as well as through the combined four-port 10GbE/FC adapters supported on the HPE 3PAR 8000 series
• RCFC: Provided through native Fibre Channel ports on an HPE 3PAR array that has been configured as an “initiator” port so that it can be
used by Remote Copy to replicate data to an FC port on another HPE 3PAR array

In addition to 100% optical connections via Fibre Channel switches for use by the RCFC link-level interfaces, HPE 3PAR arrays also support Remote Copy RCFC connectivity over Fibre Channel ISLs extended via FC-IP routers, which HPE 3PAR refers to as FCIP. In actuality, Remote Copy cannot differentiate between a 100% optical Fibre Channel SAN fabric and a Fibre Channel SAN fabric using an FC-IP router; both appear as a Fibre Channel port configured for Remote Copy. This means the RCFC protocol used by Remote Copy does not adjust and is not adjustable for solutions using a Fibre Channel fabric containing FC-IP ISLs.

With FCIP an FC-IP router connects two separate SAN fabrics together via a “virtual” ISL that is provisioned over an IP network. This type of
FC-IP routing is generally done to provide long distance ISL support between SAN fabrics although many customers use it in environments
where they are purchasing a lambda connection (channel) over a DWDM or CWDM optical based network and they want to have multiple
protocols share the lambda. Rather than purchase one lambda for Fibre Channel and a second for IP, etc., they will purchase a single lambda and
use it to support an IP network between sites. This allows the customer to run multiple different protocols like FC-IP, HTTP, FTP, etc. over the
single very fast lambda. In these cases if an array does not support native Ethernet replication Fibre Channel ports can be bridged to the IP
network via an FC-IP router and the lambda can then be used to support FC connections between SANs at different sites.

FC-IP routing when used to provide an FCIP Remote Copy solution with HPE 3PAR is a 100% supported configuration. Such solutions are stated
as being supported on IP networks with up to 120ms of latency delay (for Periodic Asynchronous replication) but such a network would not
always be considered ideal for an FCIP solution. FCIP use is recommended for environments where the IP replication network bandwidth is not
constrained (it is close to the speed of the RCFC interface), where the amount of replication traffic is expected to be relatively high, and where the network latency is limited. Keep in mind that an FCIP connection is using a 4Gb, 8Gb, or 16Gb FC port on the HPE 3PAR array.
Bridging that to a relatively small replication network (1Gbps or slower) because only a limited workload needs to be replicated or fanning in
multiple RCFC connections to a single FC-IP router is not the best use of Fibre Channel array port resources and puts serious constraints on the
RCFC protocol. In these cases FCIP is not recommended.

Where latency and bandwidth over the IP network are concerns, and where the anticipated amount of data to be replicated is limited, using the
native onboard network adapter, RCIP, can be a better choice than FCIP. For environments where network latency and bandwidth are not concerns (bandwidth is close to the FC interface speed) and the amount of data to be replicated is high, FCIP can be a better choice than RCIP.
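The sketch below turns this guidance, Table 1, and the pros and cons that follow into a rough first-pass check. The thresholds, the roughly 15% FCIP protocol-conversion allowance, and the function itself are illustrative assumptions rather than formal HPE sizing rules.

    def suggest_transport(required_mbps: float, network_mbps: float,
                          fcip_overhead: float = 0.15) -> str:
        # RCIP suits constrained networks or modest per-node replication rates
        # (roughly 800 Mbit/s per node on 1GbE RCIP); FCIP suits unconstrained,
        # high-throughput networks, minus its protocol-conversion overhead.
        if required_mbps <= 800 or network_mbps <= 1000:
            return "RCIP"
        fcip_usable_mbps = network_mbps * (1 - fcip_overhead)
        return "FCIP" if required_mbps <= fcip_usable_mbps else "add bandwidth or relax the RPO"

    print(suggest_transport(400, 1000))    # RCIP
    print(suggest_transport(2500, 4000))   # FCIP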

Table 1. RCIP versus FCIP versus RCFC

• RCIP: Limited replication network bandwidth or low data replication rates; solutions requiring <= 800Mbps (80MBps) per node of replication throughput; solutions requiring > 2000Mbps (200MBps) per node of replication throughput [1]
• FCIP: High/unconstrained network bandwidth and high data replication rates; solutions requiring > 2000Mbps (200MBps) per node of replication throughput
• RCFC: High/unconstrained network bandwidth and high data replication rates; replication over native optical FC fiber; solutions requiring > 2000Mbps (200MBps) per node of replication throughput

Pros and Cons of RCIP versus FCIP


RCIP advantages
• A native Ethernet interface is provided on every HPE 3PAR node dedicated to Remote Copy, not consuming an HBA slot or FC port. 10Gbps
provides higher replication throughput but may require the installation of a 10Gbps Ethernet/FC combination card in the HBA slot on at least
two nodes (add-on cards are necessary to get 10GbE RCIP on HPE 3PAR 8000 series arrays; 10GbE RCIP is not available on 7000 and 10000 series arrays). [2]
• RCIP based replication saves FC ports for other uses such as host connect, Peer Motion, online import, Federation.
• Better network congestion control for slow networks and networks with high latency. RCIP can be tuned to the speed and latency of the
replication network so that it makes the best use of the available network resources that the RCIP interface is connected to.
• Throughput limiting option: For environments where a customer wants to limit how much network bandwidth Remote Copy consumes.
• Troubleshooting issues between RCIP and the network are simplified compared to an FCIP solution where FC-IP routers are part of the
configuration.
• Simpler, more robust solution. HPE 3PAR nodes connect directly into the IP network just like a server would.
• Supports a 2:1 or a 1:2 fan-in/fan-out configuration (i.e., a single pair of RCIP ports on one array can have a Remote Copy relationship with up to two other arrays).

RCIP disadvantages
• Maximum of about 800Mbps (80MBps) of replication throughput per node for the 1GbE RCIP interfaces if the replication network provides 1Gbps of network throughput.

FCIP advantages
• Can support replication throughput rates greater than 80MBps, assuming the network supports the desired rate.

FCIP disadvantages
• Requires installation and associated configuration, setup and maintenance of the FC-IP routing equipment.
• Leads to the loss of about 15% of the IP network bandwidth due to protocol conversion and data conversion overhead.
• More complex solution to troubleshoot when trying to determine if an issue is with the network or with Remote Copy.
• Reduces overall solution reliability by inserting additional components into the solution. The FC-IP routing equipment requires separate HA
hardening since it is not an integrated part of the 3PAR array.
• 1:1 port relationship (no fan-in or fan-out like that available with RCIP); two FCIP ports are required to replicate data to a single target array.

[1] 10Gbps RCIP on HPE 3PAR 8000 and 20000 series arrays only.
[2] 8000 series arrays only. 20000 series arrays come with a native 10Gbps Ethernet interface per node for RCIP.

Remote Copy topologies


HPE 3PAR Remote Copy software can be deployed in different topologies depending on customer needs. Remote Copy supports M-by-N
topologies (where M and N are >=1 and <=5). It also supports a topology we refer to as Synchronous Long Distance (SLD) that uses
synchronous replication mode and Periodic Asynchronous replication mode to replicate a Remote Copy group to two separate secondary arrays
to potentially provide a zero RPO at a distant location from the production array. The M-by-N topology is very flexible and allows for some very
complex Remote Copy solutions to be deployed.

One-to-one topology
Remote Copy’s simplest topology is a one-to-one configuration (it’s an M-to-N where M = N = 1) where a single pair of HPE 3PAR StoreServ
arrays replicate volumes between one another (Figure 4). This topology supports disaster tolerant and cluster-based scenarios (including Peer
Persistence) between two geographically distinct data centers if desired. HPE 3PAR StoreServ Storage supports bidirectional replication in a
one-to-one topology. Synchronous, Asynchronous Streaming, and Periodic Asynchronous replication modes may be used simultaneously
between two arrays in a one-to-one configuration, albeit for different Remote Copy groups and on different Remote Copy transport links.
When mixing different replication modes between two StoreServ arrays different physical transports must be used for each replication mode. For
example, RCIP for the Periodic Asynchronous Remote Copy groups, RCFC for Synchronous Remote Copy groups, and FCIP for Asynchronous
Streaming Remote Copy groups. It is possible to have two different replication modes, say Synchronous and Asynchronous Streaming, both use
RCFC but the RCFC links for each mode must be configured as different Remote Copy targets when created with the admitrcopylink command.

Figure 4. One-to-one bidirectional Remote Copy

M-to-N topology
In an M-to-N topology any HPE 3PAR StoreServ array in the topology can have a bidirectional Remote Copy relationship with up to four other HPE 3PAR StoreServ arrays. The relationship between any pair of arrays in the M-to-N can be simultaneously synchronous and Periodic Asynchronous, can be bidirectional, and can use RCIP, RCFC, or FCIP link layer connections. Asynchronous Streaming is supported between any pair of arrays in an M-to-N topology; however, any given array can replicate in Asynchronous Streaming mode with at most one other array. In an M-to-N topology, Remote Copy groups are only replicated between a single pair of arrays; there is no support for a Remote Copy group to be replicated from one primary array to two separate secondary arrays. This can be accomplished only with a Synchronous Long Distance topology (see below for details). Figures 5 and 6 show example M-to-N topologies.

Figure 5. Three StoreServ arrays in an M-to-N configuration (only one of these Remote Copy relationships can be in asynchronous streaming mode)

Figure 6. Five HPE 3PAR StoreServ arrays in an any-to-any M-to-N configuration (notice that all relationships can be bidirectional) with asynchronous streaming mode supported
between two array pairs

Synchronous Long Distance (SLD) topology


A Remote Copy Synchronous Long Distance (SLD) topology allows volumes in a Remote Copy volume group to be replicated from one Primary
StoreServ array to two different Secondary StoreServ arrays. It does this by replicating data synchronously between two StoreServ arrays, the
“source” and “sync target” arrays while simultaneously replicating the same data via periodic asynchronous mode between the “source” and a
third array, the disaster recovery, or “async target” array. Asynchronous Streaming Remote Copy is not supported in SLD topologies.

The user has the option of utilizing the two sync arrays in an SLD topology in an active-active manner where both arrays are both Primary and
Secondary for different Remote Copy groups, failing over between them if and when a data center failure dictates a failover is necessary and
resuming operations on the “sync target array”. This provides a failover solution that delivers an RPO equal to zero due to the synchronous
nature of the replication that occurs between the sync arrays. On failover to a sync target array, the passive periodic asynchronous link between
that array and the async target array becomes active. Any data that was replicated to the sync target but that has not yet made it to the async
target array is sent to the async target array by the sync target following the failover. This brings the async target array up to date with the last
write that occurred on the sync target. Operations then continue in the sync target data center and it continues to replicate data to the async
target array.

The user also has the option, when a data center failure dictates that a failover is necessary, of failing over to and resuming operations on the
async target array rather than on the sync target. This can be done once data that was replicated on the sync target but that has not yet made it
to the async target is replicated to the async target array. Once the async target array is consistent with the state of the sync target array,
operations continue on the async target array with no data loss (RPO = 0). This allows the sync target array to operate in a lights out datacenter
without servers where its only responsibility is to bring the async target up to date in the event the sync source array fails.

This failover also results in the periodic asynchronous link between the sync target and async target array being reversed so updates to the
async target array are replicated back to the sync target array, albeit in periodic asynchronous mode. Used in this manner, a Synchronous Long Distance topology can deliver an RPO of zero at the async target site except in those cases where a regional disaster has rendered both the sync
source and sync target arrays down simultaneously. In this case the RPO delivered at the asynchronous periodic target array will be something
greater than zero.

Starting with the release of HPE 3PAR OS 3.1.2, bidirectional replication between the two “sync arrays” is supported. This means that Remote Copy can support multiple SLD configurations across the three arrays that are set up in an SLD topology (Figure 7). Also, you can have other separate Remote Copy groups that are not part of an SLD configuration replicating synchronously between the two sync arrays in the topology or via Periodic Asynchronous mode between the sync arrays and the periodic target array.

Figure 7. SLD mode: Long-distance replication with the potential for zero data loss

Three Data Center Peer Persistence


Starting with HPE 3PAR OS release 3.3.1 three data center Peer Persistence is supported. Three data center Peer Persistence is in essence an
SLD configuration where the Remote Copy groups are configured for Peer Persistence between the “sync source” and “sync target” arrays.

Three Data Center Peer Persistence (3DC-PP) is a combination of a high availability Remote Copy configuration (Peer Persistence) and a DR
configuration (Synchronous Long Distance). You can in fact run separate non Peer Persistence SLD Remote Copy groups on the same three
arrays. There are some operational differences in regards to failover with 3DC-PP compared to a standard SLD solution.

For the synchronous arrays in a three data center Peer Persistence configuration each metro cluster host is connected to both arrays via
redundant SAN fabrics and the VVs are exported using an ALUA host persona attribute. This is the way hosts have always been connected in a
Peer Persistence configuration and is not new for three data center Peer Persistence. Connecting a server to both arrays should not be done on a
standard SLD configuration (or any non-Peer Persistence synchronous Remote Copy configuration for that matter). Hosts at the third async
target site are connected to the Periodic Asynchronous target via their own separate set of redundant fabrics. Hosts at the third site should not
be connected to either of the synchronous arrays at the primary sites. The I/O paths for a given volume are ALUA “Active” Read/Write on the
array where the “Primary” copy of the volume resides. Volume I/O paths on the synchronous “Secondary” array are in ALUA “Standby” mode.
Volume I/O paths to the Periodic Asynchronous target are “Active” Read Only.

With Peer Persistence a Quorum Witness (QW) at some “other” location (not co-located with either of the synchronous arrays but it may reside
with the Periodic Asynchronous target array) enables automatic failover between the synchronous “Primary” and “Secondary” arrays when
necessary to respond to a failure. Only manual failover to the Periodic Asynchronous site is supported (no automatic failover to this array). 3DC-PP is enabled using a new Remote Copy group policy introduced in 3.3.1 (policy name: mt_pp).

See the HPE 3PAR Peer Persistence White Paper for more details on support for Three Data Center Peer Persistence

Sizing Remote Copy Solutions


System sizing considerations when using Remote Copy
When sizing HPE 3PAR arrays that will be used to replicate data using Remote Copy, care must be taken to account for the additional IOPS and
workload that Remote Copy will impose on the arrays. This means that in addition to being sized to service the native workload generated by
servers connected to it, the secondary array in a Remote Copy relationship must be sized to also service the additional write I/Os for the
replicated data coming from the primary array. This is true for all replication modes—synchronous, asynchronous streaming, and periodic
asynchronous.

Additionally, both arrays must be sized to account for the IOPS resulting from snapshots created by Remote Copy in addition to user-created
snapshots. In Periodic Asynchronous mode, snapshots are created on both the primary and the secondary arrays every “period” interval defined
for a Remote Copy group. This means that if a 10-minute “period” is specified for a Periodic Asynchronous Remote Copy group, the primary and
secondary arrays will both create snapshots for all the VVs in that Remote Copy group every 10 minutes. The arrays must be sized to account
for any user-created snapshots and the additional back-end disk I/Os that occur due to both Remote Copy and user-created snapshots. The
arrays must also be sized for the additional read workload applied to the primary array and the write workload applied to the secondary array
when a Periodic Asynchronous delta resync is occurring.

If you are using Asynchronous Streaming mode, Remote Copy will create coordinated snapshots at every “snap_freq” interval (the default is once an
hour). These coordinated snapshots result in additional back-end I/Os on both the primary and secondary arrays. These additional I/Os must
be accounted for when sizing the arrays.

In synchronous mode, Remote Copy only creates snapshots if the replication links fail or if a Remote Copy group is suspended, so the impact of
snapshots is not as pronounced as it is in the asynchronous replication modes, although user-initiated coordinated snapshots need to be
considered for all replication modes.

If the CPG specified on either array for the Remote Copy snapshots is a CPG using nearline (NL) drives, performance will suffer because the
copy-on-write (COW) I/Os generated by the snapshots may not be serviced fast enough given the NL drives’ limited performance. Also, any snapshot
promotes that Remote Copy may perform will take longer if the snapshots are on nearline drives, which can affect the Recovery Time Objective
(RTO) of the solution. For this reason, HPE recommends that the administrator not use nearline drives for snapshots on any volumes replicated
via Remote Copy.

Note
HPE recommends that nearline drives not be used for the snapshots of volumes replicated with Remote Copy

Data replication link sizing


In any disaster tolerant solution based on live data replication, the network used to replicate data between the primary and the disaster recovery
sites is a key part of the solution. It may be the case that the solution is active-active, where both sites run production and are expected to back
one another up; in that case there is no single “primary” and “secondary” site, because both sites are primary for some Remote Copy groups and
secondary for other Remote Copy groups. The replication network’s speed has a direct effect on the RPO of the solution, as well as on the total recurring
cost of the solution. It must be sized properly to ensure smooth operation of the entire solution at all times and at the best cost to the enterprise.

Sizing a Solution Using Synchronous Mode


Most solutions replicating data synchronously require a replication link whose speed is sized to 130% of the maximum data generation rate of
the writes to the volumes being replicated. The added network throughput is needed to absorb the impact of the initial synchronization of volumes and of
resynchronizations following link failures or array failover scenarios. For example, if there is a peak data generation rate of 400 MB/s that lasts a few
minutes during the day, yet the average data generation rate for the rest of the day is closer to 250 MB/s, a replication link of close to 520 MB/s will
be required to prevent server I/O latency spikes when the 400 MB/s I/O spike occurs while a resynchronization is occurring at the same time.
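
To make the arithmetic behind this example explicit (illustrative figures only):

Required link throughput ≈ 1.3 × peak write rate = 1.3 × 400 MB/s = 520 MB/s

At that size, the 400 MB/s peak consumes roughly 400 ÷ 520 ≈ 77% of the link, which is consistent with the 75–85% guideline in the best practice below.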

Best Practice: The replication link deployed for a synchronous replication solution should be sized such that 75–85% of the network throughput is
equivalent to the maximum IO rate spike that may be generated by the server. This ensures there is enough network throughput for resynchronizations
should they become necessary.

If the link throughput is not sufficient, I/Os will queue up waiting to get onto the link, and this waiting will manifest itself as additional write I/O
latency suffered by the server. Think of it like a freeway with only two lanes versus a freeway with six lanes. For a given spike in traffic trying to
get onto the freeway, you will wait in line much longer to get onto the two-lane freeway than to get onto the six-lane freeway simply because the
six-lane freeway can move more traffic.

Figure 8. Effect of replication link speed on synchronous replication when the data generation rate exceeds the replication link speed

Periodic Asynchronous Mode


For solutions using Periodic Asynchronous replication, the replication link can be sized to the average data generation rate for the delta resync period
in which the most data is generated. For example, if the solution is configured for a period value of 10 minutes, break the 24-hour day down into
10-minute periods; if, in the worst case, 100 MB/s on average is generated during one of these 10-minute periods, then the replication link needs
to be sized to 100 MB/s minimum to ensure the data from this worst-case period can be replicated within 10 minutes, before the next resync
period starts.
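
Expressed as a simple rule of thumb derived from the example above:

Minimum link throughput ≥ (data written during the busiest period) ÷ (period length)

In the example, 100 MB/s sustained for a 10-minute period is 100 MB/s × 600 s = 60,000 MB of data, which a 100 MB/s link can ship in the same 600 seconds, so the following resync period is not delayed.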

If the data cannot be replicated within the defined period, an alert indicating that the Remote Copy group did not meet its resync interval will be
logged in the system log if the “over_per_alert” policy is set for the Remote Copy group. Receiving an alert in the system log because all the
data was not replicated within the defined resync interval does not mean that the data is compromised in any way from an I/O consistency point
of view. It simply means the RPO for the solution was exceeded for the delta period for which the alert was generated.
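
The policy itself is set per Remote Copy group. Following the “setrcopygroup pol” pattern used for the other group policies in this paper, enabling the alert would look something like this (treat the exact form as an assumption to be confirmed in the CLI Reference Guide):

setrcopygroup pol over_per_alert <group_name>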

Asynchronous Streaming Mode


With HPE 3PAR Asynchronous Streaming Remote Copy, solutions can be designed to meet requirements with RPO definitions measured in
seconds (up to 120 maximum) without imposing negative performance implications on the servers whose data is being replicated.

With Asynchronous Streaming Remote Copy properly sizing the replication link becomes especially important as the solution is expected to
provide for an RPO measured in seconds rather than multiple minutes or hours. Delivering an RPO measured in a handful of seconds requires a
replication link whose speed is very close to the write rate of the data that is being generated by the servers so that Remote Copy can ship data
at close to the rate it is being generated.

This means that, like a synchronous replication solution, a solution based on Asynchronous Streaming Remote Copy will require replication links
that are sized very close to 130% of the maximum write data generation rate expected. Asynchronous Streaming Remote Copy holds the data being
replicated in cache on both the primary and secondary arrays, so it is important to ensure that this data does not consume too much cache
on either array. If too much cache is consumed on the primary array, Asynchronous Streaming Remote Copy will suspend Remote Copy groups in
reaction to the overconsumption. Having properly sized data replication links prevents the overconsumption of cache on the
primary array and ensures the constant and smooth streaming of data between the arrays, which in turn ensures the target RPO is met.

Figure 9 (a). Effect of replication link speed on Asynchronous Streaming Remote Copy cache consumption

Figure 9 (b). Effect of replication link speed on Asynchronous Streaming Remote Copy cache consumption

An Asynchronous Streaming Remote Copy Sizing tool is available to aid in properly sizing the replication links used in an Asynchronous
Streaming Solution. Please contact your HPE representative for more information on the Asynchronous Streaming Remote Copy Sizer.

Remote Copy functionality


Remote Copy coordinated snapshots
Starting with HPE 3PAR OS 3.2.2, Remote Copy storage replication software now supports coordinated snapshots for VLUNs being replicated
synchronously, with Asynchronous Streaming mode and with Periodic Asynchronous mode. Prior versions of the HPE 3PAR OS only supported
coordinated snapshots for Remote Copy groups that were being replicated in synchronous mode.

Coordinated snapshots allow the storage administrator, with a single “createsv -rcopy” command, to create snapshots of the VVs in a Remote
Copy group on both the source array and on the target array that represent the exact same point in time. By quiescing a database before
creating coordinated snapshots, the storage administrator can create transactionally consistent snapshots on the primary and secondary arrays that are
identical and suitable for database backup or other work.
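
As a generic sketch of the command shape (the snapshot and volume names are placeholders, and additional createsv options such as -ro or -exp can be added as needed; confirm the options for your OS level in the CLI Reference Guide):

createsv -rcopy <snapshot_name> <base_VV_name>

When issued for volumes in a started Remote Copy group, matching point-in-time snapshots are created on both the source and the target arrays, as described above.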

Auto recovery policy


For Remote Copy groups, HPE 3PAR Remote Copy software provides the ability to set an “auto recovery” policy on each Remote Copy volume
group. This policy does not define whether replication groups should be reversed and replication resumed automatically following a Remote
Copy group failover; rather, it defines whether or not replication should automatically resume when the replication links return following a
complete link failure. The default behavior is no_auto_recover, which prevents an automatic restart on recovery from a complete link failure.
The default behavior allows the administrator to ensure that the network issue that caused the links to fail in the first place is adequately
resolved before resuming replication. In the case of an unstable WAN link, with this default policy in place, Remote Copy will not automatically restart Remote
Copy groups and therefore will not attempt multiple restarts across an unreliable network.

The administrator can override the default setting for “auto recovery” by choosing auto_recover via the setrcopygroup CLI command or via the
HPE 3PAR SSMC, which enables the automatic restart of a Remote Copy group once the link between the local and remote site is
recovered. This mode is useful when the administrator does not wish to restart the Remote Copy operations manually after a link failure
and recovery.
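
For reference, the policy is toggled per group using the “pol” subcommand pattern shown later in this paper (the group name is a placeholder):

setrcopygroup pol auto_recover <group_name>
setrcopygroup pol no_auto_recover <group_name>

The first command enables automatic restart after link recovery; the second restores the default behavior.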

Creatercopygroup and setrcopygroup options for Asynchronous Streaming mode


With the addition of Asynchronous Streaming Remote Copy there is a new replication mode associated with the “creatercopygroup” command.
The “async” option has been added to specify that Asynchronous Streaming mode is desired. The setrcopygroup command also supports the new
“async” replication mode for cases where the storage administrator wants to set Asynchronous Streaming mode for an existing
Remote Copy group. Switching replication modes for a group requires that the group first be stopped: the group is stopped via the
“stoprcopygroup” command, the mode is then changed with the “setrcopygroup” command, and the group is then restarted with the
“startrcopygroup” command.
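
A minimal sketch of both paths follows; the group and target names are placeholders, and the exact argument order for the mode subcommand should be verified in the HPE 3PAR Command Line Interface Reference Guide:

creatercopygroup <group_name> <target_name>:async

stoprcopygroup <group_name>
setrcopygroup mode async <target_name> <group_name>
startrcopygroup <group_name>

The first command creates a new group directly in Asynchronous Streaming mode; the remaining three commands show the stop, mode change, and restart sequence described above for an existing group.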

Auto synchronize policy


The auto_synchronize policy is available with HPE 3PAR OS 3.2.2 MU2 and later. When the auto_synchronize policy is set, the HPE 3PAR
Remote Copy system automatically recovers and synchronizes all volumes in the group after a system failover, for either automatic or manual
failover scenarios. For example, the HPE 3PAR StoreServ Storage at a failed site automatically recovers and synchronizes all volumes in the
group after system recovery and initialization are complete and after Remote Copy communication links have been re-established. Use the
setrcopygroup pol auto_synchronize <group_name> command to set the automatic synchronization policy. In addition, this policy also allows the
failover command to be used when Remote Copy groups are started and online. It is no longer necessary to stop the groups before initiating a
failover command to the secondary system.

Note
Running Remote Copy groups in different replication modes on the same set of targets is not supported

“snap_freq” subcommand for Asynchronous Streaming Remote Copy groups


When using the “setrcopygroup” command for a Remote Copy group running in Asynchronous Streaming mode, there is a new “snap_freq”
subcommand that specifies how frequently Remote Copy should automatically take coordinated snapshots of the VLUN members of the
group. For groups in Asynchronous Streaming mode, Remote Copy will, on a regular basis (the default is once an hour), create its own coordinated
snapshots. These snapshots are used as resync points should a delta resync become necessary for any reason.

Remote Copy will utilize these coordinated snapshots to get the primary and secondary arrays back into synchronization, if necessary, following a
failure. The default interval for creating these automatic coordinated snapshots is once an hour, but this interval can be increased or decreased
with the “snap_freq” subcommand of the “setrcopygroup” command. Take care if the “snap_freq” interval is reduced from the default one hour
to a smaller, more frequent value: creating snapshots too frequently (especially for Remote Copy groups containing a lot of VLUNs) can add a
substantial load to the array and can result in the suspension of Asynchronous Streaming Remote Copy groups.

Note
HPE recommends the default snap_freq value of one hour not be reduced.
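
If the interval does need to be changed (for example, lengthened), the snap_freq subcommand is applied per target and group. A sketch follows; the value format and argument order should be confirmed in the CLI Reference Guide:

setrcopygroup snap_freq 2h <target_name> <group_name>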

“period” subcommand behavior for Asynchronous Streaming Remote Copy groups


For Remote Copy groups set for Asynchronous Streaming mode, the “period” subcommand operates differently than it does for groups in
Periodic Asynchronous mode. With Periodic Asynchronous Remote Copy, the “period” subcommand specifies how frequently the VLUNs on the
secondary array should be delta-resynched with the VLUNs on the primary array. A value of zero specifies they will never be delta-resynched
(a manually initiated delta-resynch is required).

The “period” subcommand serves a different purpose with Asynchronous Streaming Remote Copy. When a Remote Copy group is started in
Asynchronous Streaming mode (“async” mode), the “period” subcommand serves two purposes. First, if Remote Copy determines that it needs to
suspend Asynchronous Streaming Remote Copy groups in reaction to an issue causing excessive cache consumption, such as replication
link throughput degradation, the “period” subcommand specifies the order in which Remote Copy will choose groups to suspend.
Groups with larger “period” values will be suspended first and will be restarted last versus groups with smaller “period” values. This
provides the storage administrator with some QoS control over how Remote Copy behaves if the replication environment degrades for some
reason and groups need to be suspended as a result. The administrator can assign smaller period values to important groups and larger period
values to less important groups, giving priority to the more important groups.

The second purpose of the “period” subcommand is to specify how long Remote Copy waits before trying to automatically restart an
Asynchronous Streaming group. If, after waiting the defined “period”, Remote Copy determines that resources are sufficient to restart the group,
the group will be started. If, after waiting the “period” value, Remote Copy decides resources are still insufficient to restart the group, it will wait
another full period before trying to restart the group again. If a group is defined to have a period value of zero, it will be the first group to be
suspended if necessary, and Remote Copy will not attempt to restart this group automatically; a manual restart of the group will be required.

Note
Assign lower period values to important Remote Copy groups and higher period values to less important groups

BEST PRACTICE: Do not assign the same period value to more than one group.

Note
An Asynchronous Streaming Remote Copy group with a “period” value of zero (0) will be suspended first and will not automatically restart.
A manual restart of the group will be required if it becomes suspended.
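
As an illustration of using “period” as a priority mechanism for Asynchronous Streaming groups (the target and group names are placeholders, and the value format should be confirmed in the CLI Reference Guide):

setrcopygroup period 60s <target_name> rcgroup_critical
setrcopygroup period 300s <target_name> rcgroup_reporting

With these hypothetical settings, rcgroup_reporting would be suspended before rcgroup_critical if cache pressure forces Remote Copy to suspend groups, and rcgroup_critical would be restarted first.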

Converting a 1:1 Remote Copy Topology into a Synchronous Long Distance (SLD) Topology
If you have an existing 1:1 Remote Copy topology and want to convert it to a Synchronous Long Distance (SLD) or Three Data Center Peer
Persistence topology, this can be accomplished fairly easily. Keep in mind that in an SLD topology there are three arrays in total, with the Primary
array replicating the Remote Copy group synchronously to the synchronous target and via Periodic Asynchronous mode to the asynchronous
target. In the examples that follow we have three arrays:
• Freatc7400c—This is the synchronous source array in the configuration. It replicates data synchronously to array S1901 (sync target)
and will replicate via Periodic Asynchronous mode to array S1902 (async target).
• S1901—This is the “synchronous target” system for synchronous replication.
• S1902—This is the system we are adding to the existing 1:1 configuration that already exists between Freatc7400c and S1901. It will become
the asynchronous target for both Freatc7400c and for S1901.

To modify a 1:1 into an SLD a third array must be added into the 1:1 topology.

Figure 10. Adding a third array to an existing 1:1 topology to create a Synchronous Long Distance topology

Steps for Converting from an existing 1:1 Remote Copy Configuration to an SLD Remote Copy Configuration
The first step in converting a 1:1 Remote Copy configuration into an SLD configuration is to configure the RCFC or RCIP ports to be used in the
planned SLD configuration.

Note
Please refer to the HPE 3PAR Remote Copy User Guide for details on how to configure RCIP or RCFC ports for use with Remote Copy.

Once the Remote Copy ports have been properly configured, define the new Remote Copy targets between the existing arrays and the new array
being added to the configuration. In our example we will be using RCIP as the transport between the two existing arrays and the new array being
added to the configuration, although RCFC can be used if desired.

Currently we have a Remote Copy group defined between arrays Freatc7400c and S1901 using RCFC as the transport. We will be adding array
S1902 to the configuration using RCIP so we need to create the following targets:
Freatc7400c → S1902

S1902 → Freatc7400c

S1901 → S1902

S1902 → S1901

Here is the current Remote Copy configuration for the three arrays Freatc7400c, S1901 and S1902 (Note S1902 currently has no Remote Copy
configuration associated with it):

Freatc7400c:

S1901:

S1902:

Creating Remote Copy Targets for the SLD Configuration


Once the RCIP or RCFC ports have been configured for use by Remote Copy you can then create the new Remote Copy targets for the SLD
configuration.

If you are using RCFC transports between the existing arrays and the new array, the syntax for the creatercopytarget command is:

creatercopytarget <target_name> RCFC <node_WWN> <N:S:P>:<WWN> <N:S:P>:<WWN>

If you are using RCIP as the transport, the syntax is:

creatercopytarget <target_name> IP <N:S:P>:<IP_addr> <N:S:P>:<IP_addr>


Please refer to the HPE 3PAR Command Line Interface Reference Guide for details on the “creatercopytarget” CLI command.

The following steps show how to add RCIP Remote Copy targets to the existing 1:1 Remote Copy configuration in our example. It doesn’t matter
whether you are converting a Periodic Asynchronous 1:1 or a synchronous 1:1 to SLD; the steps are the same. For this example we are
adding RCIP targets between arrays Freatc7400c and S1902 and between arrays S1901 and S1902.

1. Create a target from Freatc7400c to S1902


creatercopytarget S1902 IP 0:3:1:10.10.1.30 1:3:1:10.10.1.31

2. Create target from S1902 to Freatc7400c


creatercopytarget Freatc7400c IP 0:3:1:10.10.1.10 1:3:1:10.10.1.11

3. Create target from S1901 to S1902


creatercopytarget S1902 IP 0:3:1:10.10.1.30 1:3:1:10.10.1.31

4. Create target from S1902 to S1901
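
The command issued on S1902 follows the same pattern as the previous steps and uses the same local RCIP ports on S1902 shown in step 2. The RCIP addresses assigned to S1901 are not listed in this example, so placeholders are shown for them:

creatercopytarget S1901 IP 0:3:1:<S1901_node0_IP> 1:3:1:<S1901_node1_IP>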



Admitting the new Remote Copy targets to the existing Remote Copy group
Once the necessary Remote Copy targets have been created, they can be added to the existing Remote Copy group from the original
configuration. The new targets must be added to the Remote Copy group on the source array in the original configuration. When the new targets
are admitted, Remote Copy will automatically create a new Remote Copy group on the new target array. The VVs from the target array will be
added to the new group, but will not start syncing until a startrcopygroup is issued. You do not have to stop the Remote Copy group when
executing the admitrcopytarget command. Note that in this example only the new async target needs to be added to the group; Remote
Copy will automatically add the targets between the sync target and the async target.
admitrcopytarget <target_name> <mode> <group_name> [<pri_VV_name>:<sec_VV_name>]...

Note
All of the virtual volumes contained in the existing group have to be specified to the admitrcopytarget command. You cannot execute
admitrcopytarget specifying just some of the VVs from the existing Remote Copy group and then add the remainder later with the admitrcopyvv
command.

Here is an example of the command issued on the array named Freatc7400c. This is adding Periodic Asynchronous target, S1902, to the
existing Remote Copy group “group1”.

Freatc7400c cli% admitrcopytarget S1902 periodic group1 WWAS-Lab-group1vv.0:WWAS-Lab-group1vv.0 WWAS-Lab-group1vv.1:WWAS-Lab-group1vv.1

Notice that Remote Copy adds all the targets between the three arrays automatically when the admitrcopytarget command is used to add the target
between the source array and the new target array.

Remote Copy adds the targets and VVs on all three systems in the Synchronous Long Distance configuration.

Converting an SLD configuration back to a 1:1 configuration


Converting an SLD configuration back to a 1:1 configuration is as simple as dismissing the target for the array you no longer want data replicated
to in the SLD configuration. To remove an array from the SLD configuration for a Remote Copy group, dismiss the target corresponding to
the array you want to remove from the Remote Copy group with the dismissrcopytarget command.
dismissrcopytarget [options] <target_name> <group_name>

In this example we will remove the array corresponding to target 3PAR-B for the Remote Copy group “rcg1” from the SLD configuration:
1. First stop the group you want to remove the target for.
3PAR-A> stoprcopygroup -f rcg1
2. Dismiss the target that corresponds to the array we want to remove. In this case we are removing the array corresponding to the target
named “3PAR-B” from the group “rcg1”. Data for this group will no longer be replicated to target “3PAR-B”.
3PAR-A> dismissrcopytarget -f 3PAR-B rcg1

Dismissing target 3PAR-B from group rcg1.


Target 3PAR-B has been dismissed from group rcg1.
3. Restart Remote Copy group “rcg1”.
3PAR-A> startrcopygroup rcg1

Remote Copy failover options


In the normal course of operation, if an outage occurs on the primary array, a failover to the secondary array must occur for the volumes on the
secondary to become writeable (setrcopygroup failover). In most cases the Remote Copy group must be stopped before a failover can be
conducted (there are exceptions for Peer Persistence and the new “auto_synchronize” policy implemented in HPE 3PAR OS 3.3.1). After a
failover the roles of the Remote Copy group change from “primary” and “secondary” to “secondary-rev” and “primary-rev”. This indicates that the
previously “primary” array is now a secondary (target) array but in a reversed state (secondary-rev) due to the failover, and the former “secondary”
array is now writeable as a primary (source) array but in a reversed state (primary-rev) due to the failover. While in this state a failover of the
group is not possible. To fail back, the Remote Copy group must first be “recovered” (setrcopygroup recover) and then “restored” (setrcopygroup
restore). This causes the group to resync data from the “primary-rev” array back to the “secondary-rev” array and then switches the roles back to
the original “primary” and “secondary” roles that the group was in before the initial failover.
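
As a sketch of the corresponding command sequence (the group name is a placeholder, the commands are typically issued on the secondary/failover array, and the group must first be stopped except where the Peer Persistence or auto_synchronize exceptions noted above apply):

setrcopygroup failover <group_name>

Then, to fail back once the original primary array is available again:

setrcopygroup recover <group_name>
setrcopygroup restore <group_name>

Taken together, these do what is described above: failover makes the secondary copies writeable, recover resynchronizes data from the “primary-rev” array back to the “secondary-rev” array, and restore returns the group to its original “primary” and “secondary” roles.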

Quite often, following a failover, it is desired not to fail back to the original primary array but to have the “primary-rev” array become “primary”
and the “secondary-rev” array become “secondary”. In this state the Remote Copy group is once again in a state where a simple failover of the
group can occur. To get to this state, the setrcopygroup “reverse” operation must be used. Execute “setrcopygroup reverse -natural” to switch
the “primary-rev” array to “primary” and the “secondary-rev” array to “secondary”.

Note
See the “auto_synchronize” policy on how to set a Remote Copy group to automatically reverse roles and synchronize following a failover.

If the Remote Copy group was stopped and a failover was executed, without stopping the servers connected to the “primary” array, in order to test DR
recovery on the secondary, then once the DR recovery test is complete execute “setrcopygroup reverse -local -current” on the “primary-rev” array to
convert it to “secondary”. An automatic resync of data from the “primary” array to the “secondary” array will ensue.

The following entries explain the options that can be specified to the “setrcopygroup reverse” command and their effect on the state of the
Remote Copy group.

-local
The -local option only applies to the “reverse” operation, and then only when the -natural or -current options to the “reverse” operation are also
specified. Specifying -local with the “reverse” operation and an associated -natural or -current option will only affect the array where the
command is issued and will not be mirrored to any other arrays in the Remote Copy configuration.
Technical white paper Page 29

-natural
Specifying the -natural option with the “reverse” operation changes the role of the groups but not the direction of data flow between the groups
on the arrays. For example, if the roles of the groups are “primary” and “secondary”, issuing the -natural option with the “reverse” operation will
result in the roles of the groups becoming “primary-rev” and “secondary-rev” respectively. The direction of data flow between the groups is not
affected (data flows from the primary-rev array to the secondary-rev array), only the roles. Since the -natural option does not change the
direction of data flow between groups, it does not require that the groups be stopped.

-current
Specifying the -current option with the “reverse” operation changes both the role and the direction of data flow between the groups. For
example, if the roles of the groups are “primary” and “secondary”, issuing the -current option with the “reverse” operation will result in the roles of
the groups becoming “secondary-rev” and “primary-rev” respectively, and the direction of data flow between the groups is reversed. Since the
-current option actually reverses the direction of data replication, it requires that the group be stopped.

Both the -natural and -current options must be used with care to ensure the Remote Copy groups do not end up in a non-deterministic state
(such as “secondary”, “secondary-rev”) and to ensure data loss does not occur by inadvertently changing the direction of data flow and
resyncing old data on top of newer data.

With HPE 3PAR OS 3.3.1, failover has been made easier via the new auto_synchronize group policy.

New Group Policy: “auto_synchronize”

The default value for this group policy is “off”, for backward compatibility and online upgrades. Setting this group policy automatically
synchronizes Remote Copy group(s) following a group failover command (setrcopygroup failover). It supports both automatic transparent
failovers and manual failovers. If this policy is set for a group, the group is automatically synchronized and started following the “setrcopygroup
failover” command if the Remote Copy links are up and both 3PAR systems are online. As part of the failover process the replication roles are
reversed automatically (primary becomes secondary, secondary becomes primary) rather than the roles being left as “primary-rev” and
“secondary-rev”, as they are when the “auto_synchronize” group policy is not set. Setting this policy also allows for a failover while the Remote Copy group is
started (there is no need to stop the Remote Copy group before failing over). With this policy turned on, the use of the “recover”, “restore”, and “reverse”
commands is no longer required to effect failover and failback of a Remote Copy group. Please refer to the Remote Copy User Guide for
detailed information on the auto_synchronize Remote Copy group policy.

How Remote Copy recovers synchronous Remote Copy groups following replication link failure
Synchronous replication during normal operation
The defined behavior of the 3PAR synchronous Remote Copy feature during normal operation is as follows:

1. The host sends a write request to the primary storage system.
2. HPE 3PAR Remote Copy writes the data to the cache of two nodes on the primary system. This redundancy enables synchronization to continue if one node fails before the write is replicated.
3. HPE 3PAR Remote Copy then sends the write request through a communication link to the backup system.
4. HPE 3PAR Remote Copy writes the replicated data to the cache on two nodes on the backup storage system.
5. The backup HPE 3PAR storage system sends an acknowledgment to the primary system that replication is complete.
6. After the active cache update on the primary system is complete and the primary system receives the backup system’s acknowledgment, the primary array sends an acknowledgment of the write to the host. I/O to the host is complete.

Description of synchronous operation during Remote Copy link failure


For synchronous volume groups, when all links between a Primary and Secondary array fail:

• HPE 3PAR Remote Copy continues to allow writes to the primary volume group after snapshots of the primary volumes have been taken. Therefore, applications that are writing data remain active even though the primary and secondary volume groups stop and go out of synchronization.

• Remote Copy stops replication from the primary to the backup system.

• The offset and length of any I/Os that were sent to the secondary but that failed to be reported as complete on the secondary virtual volumes, because of the link failures, are recorded in non-volatile memory (the failed-IO list).

The system records failed I/O replication separately in a failed-IO list because Remote Copy creates the snapshots of the primary virtual volumes
on the primary system after I/O is written to the primary virtual volumes, but before Remote Copy can confirm that the in-flight write I/O to the
secondary virtual volumes actually completed. These recorded failed I/O operations are then completed (resent) when the replication links
return, before the snapshots are used to resynchronize the volumes.

Restart of Remote Copy


When the Remote Copy group is restarted, the 3PAR array:

• Restarts replication for the volume groups.

• Resends I/O from the failed-IO list to the secondary virtual volumes.

• Uses the snapshot of the primary volumes that was taken at the time the group was stopped to start a delta resync to the target array, sending the difference data needed to get the volumes back in sync.

Detail of the Resynchronization process


During the resynchronization process, replication resumes in a mixed mode of resync data and synchronous I/O, depending on the new write I/Os
received from the servers writing to the VVs in the Remote Copy group. Snapshots are taken of the target VVs before resynchronization begins. The
difference data from the snapshots taken when the links failed is sent from the source array to the target array. All new I/Os are applied to the
source array and are also replicated to the target array during the resynchronization process.

Once all of the difference data from the snapshots on the source array has been sent to the target array, the data on the two arrays is once again
back in sync. During this period of resynchronization the 3PAR may throttle the resynchronization I/O in order to prioritize the new host write I/O
and optimize write I/O service times on the server. This behavior cannot be configured or adjusted in any way.

There may be WAN link contention between the host write I/Os (where the data and the acknowledgement are transmitted between the primary and
backup 3PAR arrays) and the resynchronization I/O. If the WAN link is too small or the new write I/O workload is too large, a negative impact
on new host write I/O latencies may be suffered. The impact of this contention on the primary array performance will be more significant where:
• The primary 3PAR array is running close to the maximum achievable performance based on the controller and drive configuration
• The WAN link utilization is high, and/or the latency is high during normal operation (i.e., before the WAN link failure occurred)
• The utilization of replication ports on the 3PAR array is high during normal operation (i.e., before the WAN link failure occurred)

How Remote Copy recovers Periodic Asynchronous Remote Copy groups following replication
link failure
Description of Periodic Asynchronous replication during normal operation
In periodic asynchronous mode, host writes are completed on the primary array and the host write is acknowledged to the server. The volumes
on the primary array and volumes on the secondary array are then resynchronized “periodically”—when scheduled or when resynchronization is
manually initiated via the syncrcopy command. Resynchronization is accomplished by taking the difference between a new delta snapshot
taken when resynchronization is requested and an old delta snapshot taken at the last resynchronization interval. This “delta data” represents
the changes that have occurred to the VVs on the primary array during the resynchronization interval. This delta data is sent to the secondary
array and is applied to the VVs in the Remote Copy group. Once all of the delta data from the Primary array has been applied to the Secondary
VVs the Secondary VVs are then in an IO consistent state that looks exactly like the primary VVs did when the newest delta snapshot was taken.

Description of operation during Remote Copy link failure


During a WAN failure, Periodic Asynchronous Remote Copy continues to operate in the same manner it does without a WAN failure, except that
the delta resync interval is extended until the WAN links recover. Host I/Os are accepted at the primary array and are acknowledged immediately
to the server. The main difference during a WAN failure is that Remote Copy cannot replicate the current delta data and must wait until the WAN
links become operational again. This in essence extends the current delta resync interval until the WAN links return and the current delta set can
be replicated to the target array. Any delta resync that was in progress when the WAN links failed will resume and be completed, followed
immediately by a new delta resync if the defined delta resync interval was exceeded.

Detail of the Resynchronization process


When the Remote Copy links recover from a failure, the Periodic Asynchronous Remote Copy groups will
automatically restart resynchronization if the “auto_recover” policy is set for the group. If the “no_auto_recover” policy is set, the administrator
must manually restart the Remote Copy group for resynchronization to occur. Resynchronization following a failure of the Remote Copy links is
no different than a standard resynchronization performed on the defined resynchronization interval for the group, except that the resynchronization
following the link-down event may take longer than a standard resynchronization due to a large amount of data having queued up.

If, during resynchronization in asynchronous periodic mode, the primary system fails or all of the Remote Copy links fail, Remote Copy behaves in
the following manner for the volumes in the Remote Copy target volume group:
• For all volumes in the Remote Copy volume group that completed resynchronizing before the failure, 3PAR Remote Copy takes no action on
these volumes and retains all pre-synchronization snapshots for these volumes.
• For all volumes in the Remote Copy volume group that were in the process of resynchronizing at the time of the failure, but that did not
complete resynchronizing, 3PAR Remote Copy automatically promotes the pre-synchronization snapshot for all of these volumes.
• For all volumes in the Remote Copy volume group that had not started resynchronization at the time of failure, 3PAR Remote Copy does
nothing.

This means that if, at the time of the failure, some of the volumes in the volume group successfully synchronized before the failure, and some
volumes did not finish resynchronizing before the failure, then the volumes that had not completed resynchronization will be promoted to the
last recovery point and volumes that had completed their resynchronization will be left unchanged at the current recovery point. At this point in
time, the volumes on the target array are not in an I/O consistent state. Any snapshots taken from these targets will not be usable. To make the
volumes I/O consistent again, one of two actions must be performed:
• The Remote Copy volume group must be restarted after the failure has been recovered from, at which time a new resynchronization will occur,
resulting in all the volumes becoming I/O consistent with one another at the new resynchronization point in time.
• The Remote Copy volume group must be used for recovery, by means of a failover, at which time all of the volumes whose snapshots were not
promoted following the failure (the ones that completed synchronization) will have their pre-synchronization snapshots promoted and all the
volumes in the volume group will then revert to their I/O consistent pre-synchronization state.

How Remote Copy recovers Asynchronous Streaming Remote Copy groups following
replication link failure
Description of Asynchronous Streaming replication during normal operation
Asynchronous Streaming Remote Copy is, as the name implies, asynchronous replication of data between two HPE 3PAR arrays. It differs
from Periodic Asynchronous Remote Copy in that, instead of replicating data periodically at a defined resynchronization interval, Asynchronous
Streaming Remote Copy constantly places I/O data from the primary array onto the replication link for delivery to the secondary array and
applies the data in 100 ms “delta-sets”. Asynchronous Streaming Remote Copy does this by holding data in cache on the primary array while it is
being replicated to the secondary array. Once receipt of the data for an I/O has been confirmed by the secondary array to the primary array, the
cache space holding the data is freed up on the primary array.

Description of operation during WAN failure


Asynchronous Streaming Remote Copy operates very much like Periodic Asynchronous mode during a WAN failure. Coordinated snapshots
(think of the delta resync snapshots used by Periodic Asynchronous mode) are created by default once every hour (this interval can be changed, but it is
recommended that it be left at one hour). Following a WAN failure, any incomplete delta sets on the target array are discarded (the data on the
target is I/O consistent at the point in time the last 100 ms delta set was applied). When the replication links recover, the delta data
between the prior coordinated snapshot and a new snapshot of the base volume is sent to the target array (this is just like a delta resynchronization
for a Periodic Asynchronous Remote Copy group). While this delta data is being replicated, all new host I/Os are also sent to the target array
(using the same 100 ms delta set interval described above), and once the delta resync has completed the arrays return to streaming
asynchronous mode, delivering very small RPOs.

Conversion options for VVs in a Remote Copy group


What does it mean to “convert” a VV? Converting a volume means changing it from one volume type to another. For example,
a thinly provisioned volume (TPVV) can be converted to a dedupe volume (TDVV), or a fully provisioned volume (FPVV) may be converted to
a thinly provisioned volume (TPVV). Volumes on an HPE 3PAR array can be converted from any type to any other type by using the “tunevv”
CLI command or the SSMC. The volume types supported on HPE 3PAR are:
• Fully Provisioned without snap space (FPVV)
• Fully Provisioned with snapshot space (CPVV)
• Thinly Provisioned (TPVV)
• Thin dedupe (TDVV)
• Thin compressed (TCPP)
• Thin dedupe and compressed (DECO)
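
As an illustration of the conversion operation itself, converting a fully provisioned volume to a thinly provisioned volume with tunevv might look like the following sketch (the CPG and volume names are placeholders; analogous tunevv options exist for the other volume types, and the options available depend on the HPE 3PAR OS level):

tunevv usr_cpg <CPG_name> -tpvv <VV_name>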

When a volume that has a snapshot associated with it is converted from one volume type to another any snapshots associated with the volume
being converted remain with the source volume following the conversion and are not available to or usable by the new converted volume. This
poses problems when converting volumes in a Remote Copy group from one volume type to another.

Today HPE 3PAR does not support converting a volume while it is in a Remote Copy group. To convert a volume that is in a Remote Copy group,
the Remote Copy group must first be stopped, the volume must be dismissed from the Remote Copy group via the “dismissrcopyvv”
command (or via the SSMC), and it can then be converted. Once the volume is converted it can be put back into the Remote Copy group via
the “admitrcopyvv” CLI command (or the SSMC) and the Remote Copy group can be restarted.

Since the group has to be stopped to dismiss a volume from it, the user would normally specify the “-keepsnap” option with the “dismissrcopyvv”
command to create a snapshot of the volume at the moment it is dismissed from the group, so that when the volume is readmitted into the Remote
Copy group (admitrcopyvv) the snapshot generated by the -keepsnap option can be specified and only a delta resync of the VV has to occur (i.e., only
changes to the volume that occurred between the time it was dismissed from the Remote Copy group and then readmitted to the Remote Copy
group have to sync). But because any snapshot generated by the -keepsnap option cannot be used with the new converted VV following the
conversion, Remote Copy cannot use that snapshot to perform a delta resync. Since the “-keepsnap” snapshot cannot be used to perform a delta
resync of the volume, a full copy synchronization of the VV has to occur when the volume is readmitted into the Remote Copy group. This is a
problem especially in the case where the replication link between arrays is small, as the full copy synchronization can potentially take a very long
time. It also exposes the solution to an extended RPO until the full copy synchronization has completed.

There are, however, processes that can be followed that allow the conversion of volumes without requiring a full copy synchronization when the VV
is readmitted into the Remote Copy group, but these generally require at least some application downtime. If application downtime cannot be
tolerated at all, then potentially a full copy synchronization will be necessary. There are basically three solution scenarios to choose from when converting VVs for a Remote Copy group.

Three conversion solution scenarios are available for VVs in a Remote Copy group
There are basically three different conversion solution scenarios for customers using Remote Copy.
• Scenario 1: Non Peer Persistence solutions where no application downtime is acceptable.
– Requires a full synchronization following volume conversion
• Scenario 2: Migration where application downtime can be tolerated.
– Will not require a full synchronization of the volume following conversion, but failover and potentially failback are required
• Scenario 3: If the solution is using Peer Persistence, the VVs in the Remote Copy group can be converted without application downtime and
without requiring a full copy synchronization.

Scenario 1: Non Peer Persistence solutions with no application downtime


For solutions that are not using Peer Persistence there is no way to migrate the VVs in a Remote Copy group with absolutely zero downtime and
not have to suffer a full copy synchronization. Any solution not using Peer Persistence that requires no application downtime whatsoever will
have to use a process that requires a full resynchronization of all Remote Copy VVs following the VV conversion.

Note
No application downtime is necessary to convert the VVs in a Peer Persistence Remote Copy group although RPO is exposed and the ability to
failover transparently is not available during the conversion.

In this scenario the application remains 100% online during the entire migration process. It does, however, require a Remote Copy full copy
synchronization following the migration. If you CANNOT tolerate a full synchronization of the VVs in the Remote Copy group due to bandwidth
constraints, then Scenario 2, which includes planned downtime, has to be considered.

No data is replicated to the disaster site for the duration of the Primary and Secondary VVs’ conversion when using this migration scenario. This
exposes the solution to potentially significant data loss if a disaster should occur to the Primary data center during the VV migration process.

Scenario 2: Conversion with application downtime


At a minimum this migration scenario requires a failover of the Remote Copy groups at least once, and potentially twice, depending on the
operating environment. It assumes the customer’s DR solution is in good working order and that both data centers can run an application while
its VVs are being converted. If your DR solution is viable and can support a failover of the Remote Copy groups to the secondary site, OR if you cannot
tolerate the time it takes to perform a full resynchronization of the Remote Copy groups following migration, then one of these options should be
considered:
• If you cannot tolerate a full synchronization of the Remote Copy groups following the migration then stop the application as part of the
conversion process. This can require SIGNIFICANT downtime for the applications as they must stay offline for the entire duration of the
migration to avoid a full synchronization.
• Failover the application as part of the migration process. The applications can remain online during the migration with minimal downtime
suffered as a result of the failover ASSUMING you have appropriate hardware to run the environment in a failed-over state. With Peer
Persistence configurations this scenario can be used with no application downtime.

Scenario 3: Conversion for solutions using Peer Persistence


If the solution is using Peer Persistence it is possible to convert the volumes in the Remote Copy group without any application downtime and
without having to perform a full synchronization following the conversion.

Process for converting non Peer Persistence Remote Copy VVs with no application downtime (Requires a
Remote Copy full copy synchronization)
This process explains how to convert VVs in a Remote Copy group without suffering any application downtime. Use of this process REQUIRES
a full copy synchronization of the VVs following the conversion.
1. Stop the Remote Copy group containing the VVs to be converted (stoprcopygroup) and ensure a final synchronization of the group has
occurred.
2. Dismiss the VVs to be converted from the Remote Copy group and keep a snapshot just in case (dismissrcopyvv -keepsnap).
3. Convert both the Primary and/or Secondary volumes (as necessary) to the new volume format desired (tunevv).

Note
Since a full synchronization of the volumes will be required, it is not necessary to convert the secondary VV to the new desired format. Simply
delete the existing secondary VV and recreate it in the desired format.

Note
Compressed and DECO VVs are not supported with Asynchronous Streaming at 3.3.1GA

4. Once all of the conversions have completed, readmit the volumes back into the Remote Copy group they belong to and DO NOT specify
the “-nosync” option to the admitrcopyvv command. (admitrcopyvv <VV_name> <group_name> <target_name>:<sec_VV_name>)

5. Start the Remote Copy group (startrcopygroup)

Note
At this time a full copy resync of the volume will occur.

6. The old VVs and snapshots may be removed
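
Putting the steps together, the whole sequence might look like the following sketch; the group, volume, target, and CPG names are placeholders, and the tunevv option shown (conversion to a thinly provisioned volume) is just one example:

stoprcopygroup <group_name>
dismissrcopyvv -keepsnap <VV_name> <group_name>
tunevv usr_cpg <CPG_name> -tpvv <VV_name>
admitrcopyvv <VV_name> <group_name> <target_name>:<sec_VV_name>
startrcopygroup <group_name>

Because the -nosync option is not specified on admitrcopyvv, a full copy synchronization of the volume begins when the group is restarted, as noted above.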

Process for converting Remote Copy VVs without requiring a full copy synchronization (Avoids a Remote Copy
full synchronization)
There are two methods for avoiding a full copy resynchronization following the conversion of volumes in a Remote Copy group. In the first
method the application is completely stopped before the VVs are dismissed from the Remote Copy group and remains stopped until the entire
VV conversion process has completed; then the volumes are placed back into the Remote Copy group. This process can require SIGNIFICANT
application downtime. A side benefit of this method is that in the event of a disaster (loss of the Primary data center during the migration), there is
no exposure to data loss, because no new data is being generated and the Secondary contains all of the customer’s data (small compensation given
that the application may be offline for an extended period of time). This process allows the volumes on both the Primary and Secondary arrays to
be converted simultaneously and hence can reduce the end-to-end time for the total conversion.
Method 1:
1. Stop the application using the Remote Copy group containing the VVs to be converted

WARNING
It is imperative to ensure the application is completely stopped and that no server is performing I/O to the volumes from the Remote Copy group.
If the application using the VVs in the Remote Copy group is not completely stopped, then undetected data inconsistency will result on the target
VVs when the Remote Copy group is restarted.

2. Stop the Remote Copy group containing the VVs to be migrated (stoprcopygroup) and ensure a final synchronization of the group has occurred
(showrcopy -d)
3. Dismiss the VVs to be converted from the Remote Copy group (dismissrcopyvv)
4. Convert the volumes to the desired volume type on both the Primary and Secondary arrays (tunevv)
5. After the volume conversions complete, readmit the volumes back into the Remote Copy group specifying the “-nosync” option.
(admitrcopyvv -nosync <VV_name> <group_name> <target_name>:<sec_VV_name>)
The -nosync option tells Remote Copy that no synchronization is to occur between the Primary and the Secondary volumes being admitted
into the Remote Copy group; they in fact contain exactly the same data and are already in synchronization with one another. We can
specify the -nosync option because the volumes were not changed by the application (i.e., no new writes occurred to them) between the time
they were dismissed from the Remote Copy group and the time they are readmitted back into the group (i.e., while they were being converted),
because the application was stopped the entire time.

WARNING
If the application is run before the conversion completes and the VVs are readmitted to the Remote Copy group the “-nosync” option should not
be used when the volumes are readmitted into the Remote Copy group. A full copy synchronization will be necessary to ensure IO consistency
between the arrays.

Note
Compressed VVs are not supported with Asynchronous Streaming at 3.3.1GA

6. Restart the Remote Copy group (startrcopygroup)



7. Restart the application


8. Remove the old VVs and associated snapshots

Method 2:

The second method for migrating volumes in a Remote Copy group, where a full copy synchronization cannot be tolerated and the application
cannot be stopped for an extended period of time, is a bit more complex. It requires at least one and perhaps two periods of downtime while the
application is failed over between arrays. Unlike the first process, where the applications may be down for an extended period of time, with this
process the applications only need to be down long enough to fail them over to the Secondary array. This method also requires a second limited
outage if the applications are failed back to the original primary data center once the volume conversion process is complete. This process
requires appropriate hardware at the secondary data center to run the production workloads while the volumes on the Primary array are being
converted. If there isn’t sufficient hardware to run the production workload after failing over to the secondary data center, then one of the other
migration options offered (application offline with no synchronization, or application online with a full resynchronization) will have to be considered.

One drawback of this option is that it exposes the solution to potential data loss if a disaster should occur between the time the Remote Copy
groups are stopped and the time they have completed resynchronizing once restarted. The amount of time the solution is exposed is
nondeterministic, as it is highly dependent upon how long it takes to convert the volumes once the Remote Copy group has been stopped, and
this is affected by many factors such as the number of volumes, volume size, array loading, etc.

If the two arrays are operated in an active/active fashion where both arrays are receiving host IOs for different Remote Copy groups being
replicated to the “other” array then this process can be used to convert VVs from different Remote Copy groups on both arrays simultaneously if
so desired. The VVs being converted are always those contained in the “Secondary” Remote Copy group on the target array.

Note
The volumes being converted are the volumes on the “Secondary” or “Secondary-rev” array for the Remote Copy group. DO NOT convert the
“Primary” or “Primary-rev” volumes as this will result in a full copy synchronization being required.

If it is desired to run the conversion on only one array at a time in environments where both arrays may be active, there is the option of failing all
Remote Copy groups over to one array, making them all “Primary” and/or “Primary-rev”, before starting the migration and then converting all the
“Secondary” and “Secondary-rev” Remote Copy group VVs on that array. Once complete, the Remote Copy groups are all failed over to the “other”
array (or recovered and restored) and the process is then repeated for the groups when they are “Secondary” and/or “Secondary-rev” on that
array. The applications remain operational during most of this process, with only short outages during Remote Copy group failover.
1. Stop the Remote Copy group containing the VVs to be migrated (stoprcopygroup)
2. If the servers have connectivity to the secondary array but are NOT in a Peer Persistence configuration (note: this is NOT a recommended or
Best Practice configuration, but it is possible for an environment to be configured this way), unexport the VVs in the
Remote Copy group on the Secondary array (removevlun)
This step is necessary because the VVs on the Secondary array will transition to writeable mode when they are dismissed from the Remote
Copy group, and this will render them writable to the servers if they are not first unexported. If the server has access to the VVs on both
arrays simultaneously and writes data to both, data inconsistency will result
3. Dismiss the volumes to be migrated from the Remote Copy group and retain a snapshot (dismissrcopyvv -keepsnap)
4. Convert the volumes to the desired volume type on the Secondary or Secondary-rev array (tunevv). DO NOT convert the volumes if they
are on the Primary or Primary-rev array
5. Once the conversions are complete, re-admit the volumes to the Remote Copy group on the Primary array, specifying the snapshots created
in step 3 when the VVs were dismissed along with the converted volume on the secondary array (admitrcopyvv <VV_name:snap>
<group_name> <target_name>:<sec_VV_name>)
6. Start the Remote Copy group (startrcopygroup)
7. Export the volumes on the Secondary array to the servers if they were unexported in step 2 (createvlun)
8. Allow the Remote Copy group to finish delta-resynching (showrcopy -d); a consolidated command-line example of steps 1–8 is shown after this procedure
9. Once the Remote Copy group has finished the delta-resynch, execute the steps appropriate for your environment to effect a failover or
switchover of the Remote Copy group. This includes recovery of the applications on the servers in the failover data center (this happens
automatically with Peer Persistence). The failover causes the “Primary” group to become “Secondary-rev” and the “Secondary” group to become
“Primary-rev”, except for Peer Persistence, where the groups become “Primary” and “Secondary”. If the “auto_synchronization” policy is set
for the group, Remote Copy will automatically return the groups to the “Primary” and “Secondary” roles following the failover and
resynchronization of the groups. Please see the section on “Auto Synchronization” for details on the auto_synchronization group policy

Note
Failover of the application may involve stopping the application servers, manually failing over the group, and then restarting the applications
manually on servers in the failover data center. It may involve using cluster software such as CLX, Serviceguard, or MetroCluster to fail over
applications from one data center to another. It may involve processes that are not documented in this migration guide. What is important is that
the application is failed over to the alternate data center (the Remote Copy group is failed over to the Secondary array) and hence to the target
array in that data center.

10. Repeat steps 1–8 and convert the VVs for the “Secondary” or “Secondary-rev” Remote Copy group
11. Fail all applications back to their original data center or whatever location is desired
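
For reference, the following is a minimal command-line sketch of steps 1 through 8 for a single volume. The group name (rcgrp_sales), volume
name (vv_sales01), retained snapshot name (shown as a placeholder), remote target name (3par_dr), host name (host01), LUN number, and
destination CPG (SSD_r6) are all hypothetical, and the exact tunevv options depend on the destination volume type and the HPE 3PAR OS release
in use. Lines beginning with # are annotations rather than CLI input.

   # Step 1: stop the Remote Copy group
   stoprcopygroup rcgrp_sales

   # Step 2, on the Secondary array: unexport the VV if the servers can reach that array
   removevlun vv_sales01 10 host01

   # Step 3, on the Primary array: dismiss the VV from the group and retain a snapshot
   dismissrcopyvv -keepsnap vv_sales01 rcgrp_sales

   # Step 4, on the Secondary array: convert the VV, for example to a thin-provisioned
   # VV in the CPG named SSD_r6
   tunevv usr_cpg SSD_r6 -tpvv vv_sales01

   # Step 5, on the Primary array: re-admit the VV, specifying the snapshot retained in
   # step 3 (the name shown here is a placeholder) and the converted secondary VV
   admitrcopyvv vv_sales01:vv_sales01_keepsnap rcgrp_sales 3par_dr:vv_sales01

   # Steps 6 and 7: restart the group and re-export the VV on the Secondary array
   startrcopygroup rcgrp_sales
   createvlun vv_sales01 10 host01

   # Step 8: monitor until the delta resynchronization completes
   showrcopy -d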

Method 3:

Method 3 is very similar to Method 2 except that it is used for Remote Copy groups in a Peer Persistence configuration, where the Remote Copy
group can be failed over with no interruption to host IO.

As with Method 2, a drawback of this option is that it exposes the solution to potential data loss if a disaster should occur between the time the
Remote Copy groups are stopped and the time they have completed resynchronizing once restarted. The length of this exposure window is
nondeterministic because it depends on how long it takes to convert the volumes once the Remote Copy group has been stopped and how long
they take to resync once restarted, which in turn is affected by factors such as the number of volumes, volume size, and array loading.

If the two arrays are operated in an active/active fashion, where both arrays receive host IOs for different Remote Copy groups being replicated
to the “other” array, then this process can be used to convert VVs from different Remote Copy groups on both arrays simultaneously if so desired.
The VVs being converted are always those contained in the “Secondary” Remote Copy group on the target array.

Note
The volumes being converted are the volumes on the “Secondary” or “Secondary-rev” array for the Remote Copy group. DO NOT convert the
“Primary” or “Primary-rev” volumes as this will result in a full copy synchronization being required.

If it is desired to run the conversion on only one array at a time in environments where both arrays may be active, all Remote Copy groups can
first be switched over so that a single array holds the “Primary” role for every group; the “Secondary” group VVs, which then all reside on the
other array, are converted first. Once complete, the Remote Copy groups are all switched over to the “other” array and the process is repeated
for the groups that are now “Secondary” on the first array. The applications remain operational during all of this process.
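
As a hedged sketch of this consolidation (the group name rcgrp_sales is hypothetical), a Peer Persistence group can be transparently reversed
with the switchover subcommand, issued on the array where the group currently holds the Primary role, and repeated for each group whose
direction needs to change before the conversions begin; host IO continues without interruption. Lines beginning with # are annotations rather
than CLI input.

   # Confirm which groups currently hold the Primary role on this array
   showrcopy groups

   # Transparently switch the replication direction of a Peer Persistence group
   setrcopygroup switchover rcgrp_sales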
1. Stop the Remote Copy group containing the VVs to be converted (stoprcopygroup)
2. On the Secondary array, unexport the VVs in the Remote Copy group (removevlun)
This step is necessary because the VVs on the Secondary array will transition to ALUA Active or writeable mode when they are dismissed
from the Remote Copy group, and this will render them writable to the servers if they are not first unexported. If a server has access to the
VVs on both arrays simultaneously and writes data to both, data inconsistency will result
3. Dismiss the volumes to be converted from the Remote Copy group and retain a snapshot (dismissrcopyvv -keepsnap)
4. Convert the volumes to the desired volume type on the Secondary array (tunevv). DO NOT convert the volumes if they are on the Primary array
5. Once the conversions are complete, re-admit the volumes to the Remote Copy group on the Primary array, specifying the snapshots created
in step 3 when the VVs were dismissed along with the converted volumes on the Secondary array (admitrcopyvv <VV_name>:<snapshot_name>
<group_name> <target_name>:<sec_VV_name>)
6. Start the Remote Copy group (startrcopygroup)
7. Export the volumes on the Secondary array to the servers (createvlun)
8. Allow the Remote Copy group to finish delta-resynching (showrcopy -d)
9. Once the Remote Copy group has finished the delta-resynch, execute the steps appropriate for your environment to effect a switchover of
the Remote Copy group. This causes the Primary and Secondary arrays to switch places and become Secondary and Primary, respectively
10. Repeat steps 1–8 to convert the “Secondary” Remote Copy group VVs
11. Fail all applications back to their original data center, or to whatever location is desired (a brief verification sketch follows this procedure)
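
After the second pass, it can be worth confirming that the converted volumes report the intended provisioning type and that every group is
synchronized and back in its intended role. This is a minimal sketch with a hypothetical volume name; the exact output columns vary by
HPE 3PAR OS release. Lines beginning with # are annotations rather than CLI input.

   # Display a converted volume; the Prov column should show the new volume type
   showvv vv_sales01

   # Confirm group roles and synchronization status
   showrcopy groups
   showrcopy -d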

Cross-product interoperability
HPE 3PAR Remote Copy storage replication software is common to and fully interoperable across the entire HPE 3PAR StoreServ product family,
which consists of four different classes of HPE 3PAR StoreServ systems.

Learn more at:
hpe.com/us/en/product-catalog/storage/storage-software/pip.storage-software.5044771.html


© Copyright 2011–2013, 2015, 2017 Hewlett Packard Enterprise Development LP. The information contained herein is subject to
change without notice. The only warranties for Hewlett Packard Enterprise products and services are set forth in the express warranty
statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty.
Hewlett Packard Enterprise shall not be liable for technical or editorial errors or omissions contained herein.

4AA3-8318ENW, September 2017, Rev. 6