
White Paper

VMware Virtual SAN with Cisco Unified Computing System Reference Architecture
August 2014

Contents
Introduction .............................................................................................................................................................. 2
VMware Virtual SAN ............................................................................................................................................. 3
Cisco Unified Computing System.......................................................................................................................... 5
VMware Virtual SAN with Cisco UCS Architecture ............................................................................................... 7
Cisco UCS Configuration ...................................................................................................................................... 8
VMware Virtual SAN Configuration ..................................................................................................................... 12
VMware Virtual SAN Availability and Manageability ........................................................................................... 14
Benchmarking VMware Virtual SAN on Cisco UCS ............................................................................................ 26
Cisco UCS with VMware Virtual SAN Ready Nodes............................................................................................ 35
Conclusion ............................................................................................................................................................. 35
Appendix A: IO Meter Custom Configuration Files............................................................................................. 36
Appendix B: VMware Virtual SAN Requirements ................................................................................................ 46
Appendix C: Ruby vSphere Console and VMware VSAN Observer .................................................................. 47
Appendix D: Cisco Part Numbers ........................................................................................................................ 48
Resources .............................................................................................................................................................. 49
Acknowledgements ............................................................................................................................................... 49

Introduction
Virtualization helps alleviate the pressure on IT departments by allowing organizations to do more with less through
higher asset utilization, rapid deployment of applications, and the capability to scale existing deployments with
greater efficiency and speed.

Virtualization improves the efficiency and optimization of data center assets, and it provides significant
advantages including:

Consolidation: Server virtualization enables IT to provision and deploy multiple virtual machines rapidly
from fewer physical servers. This capability helps reduce or eliminate underutilized server and storage
hardware, software, and infrastructure.
Operation agility: Virtualization supports dynamic IT environments that both respond to problems and
anticipate increased demands with features such as automated virtual machine reconfiguration, flexible
resource control, and automated migration.
Business continuity: Virtualization provides business continuity and IT disaster recovery capabilities using
geographically dispersed clustering, remote management, and features such as live backup to reduce
potential data loss.

Virtualization also provides effective solutions that improve system manageability, reduce energy consumption, and
increase the capacity of data center facilities, thereby lowering the total cost of ownership (TCO).

When storage workloads are virtualized, they function differently than storage workloads in the physical world:

With virtual workloads, the relationship between applications and storage is N:1, rather than 1:1 as in a
physical environment.
Virtual workloads are more mobile than physical workloads, with capabilities such as VMware Storage
vMotion providing the flexibility to move workloads within the data center as required.
The I/O operations per second (IOPS) requirements for virtualized applications are more random than the
requirements for applications hosted in a physical world, which are more sequential.

In addition, emerging application trends for cloud computing and mobility initiatives are scale-out in nature and
require the underlying storage solution to be built with specific constructs such as performance or scalability
in mind.

These features, plus the emergence of flash-memory storage solutions, including all flash-memory arrays, hybrid
flash memory, and server-side flash memory, have created opportunities for new storage architectures, providing
customers with a great deal of choice.

VMware vSphere, as the first layer of software between the underlying infrastructure and the virtualized
applications, has inherent knowledge of the applications' requirements of the underlying storage systems and a
global view of the available infrastructure resources, allowing it to meet those requirements. This environment
provides a unique opportunity for a hypervisor-converged storage solution such as VMware Virtual SAN.

Building VMware Virtual SAN as a hypervisor-converged solution on industry-leading data center infrastructure
founded on the Cisco Unified Computing System (Cisco UCS) and VMware vSphere platforms allows customers
to continue to benefit from the extensive history of collaboration and joint innovation in the virtualized data center
between Cisco and VMware. This differentiated solution is designed to deliver a scalable virtual and physical
architecture, providing superior manageability, security, performance, and cost savings.

This document provides a reference architecture for VMware Virtual SAN on Cisco UCS, including configuration
details for building the joint solution, benchmark measurements, and manageability and availability aspects for
operating a VMware Virtual SAN on Cisco UCS environment.

VMware Virtual SAN


VMware Virtual SAN is a hypervisor-converged storage solution that is fully integrated with VMware vSphere.
VMware Virtual SAN combines storage and computing for virtual machines into a single device, with storage
provided within the hypervisor, instead of using a storage virtual machine running alongside other virtual machines.
VMware Virtual SAN aggregates locally attached disks in a VMware vSphere cluster to create a storage solution,
also called a shared datastore, which can be rapidly provisioned from VMware vCenter during virtual machine
provisioning operations.

VMware Virtual SAN is an object-based storage system that is designed to provide virtual machine-centric storage
services and capabilities through a storage policy-based management (SPBM) platform. SPBM and virtual
machine storage policies are solutions designed to simplify virtual machine storage placement decisions for
VMware vSphere administrators.

VMware Virtual SAN is fully integrated with core VMware vSphere enterprise features such as VMware vSphere
vMotion, High Availability (HA), and Distributed Resource Scheduler (DRS). Its goal is to provide both high
availability and scale-out storage functions. It can also be considered in the context of quality of service (QoS)
because virtual machine storage policies can be created to define the levels of performance and availability
required on a per-virtual machine basis.

As shown in Figure 1, VMware Virtual SAN shared datastore is constructed with a minimum of three VMware ESXi
hosts, each containing at least one disk group with at least one solid-state drive (SSD) and one magnetic drive,
and up to seven magnetic drives per disk group and up to five disk groups per host. The VMware virtual machine
files are stored on the magnetic drive, and the SSD handles read caching and write buffering. The disk group on
each host is joined to a single network partition group, shared and controlled by the hosts.

Figure 1. VMware Virtual SAN Cluster

The size and capacity of the VMware Virtual SAN shared datastore are dictated by the number of magnetic disks
per disk group in a VMware vSphere host and by the number of VMware vSphere hosts in the cluster. VMware
Virtual SAN is a scale-out solution in which more capacity and performance are obtained by adding more disks to a
disk group, adding more disk groups to a host, and adding more hosts to the cluster.

With VMware Virtual SAN, SPBM plays a major role in the way in which administrators can use virtual machine
storage policies to specify a set of required storage capabilities for a virtual machine, or more specifically, a set of
requirements for the application running in the virtual machine.

The following VMware Virtual SAN datastore capabilities are surfaced up to VMware vCenter Server:

Number of Failures to Tolerate


Number of Disk Stripes per Object
Flash Read Cache Reservation
Object Space Reservation
Force Provisioning

For more information about the capabilities and features of VMware Virtual SAN, please refer to [1].

Cisco Unified Computing System
Cisco UCS is the fastest growing next-generation data center computing solution that unifies computing,
networking, management, virtualization, and storage access into a cohesive system. By converging data center
silos into a single unified system, Cisco UCS increases business agility and improves business continuity, thereby
lowering TCO and providing the following benefits:

Less infrastructure and more intelligent servers: This unique architecture enables end-to-end server
visibility, management, and control in both bare-metal and virtual environments and facilitates the move to
cloud computing and IT-as-a-service (ITaaS) with fabric-based infrastructure.
Consolidated resources with Cisco UCS servers: Cisco UCS servers allow dramatic reduction in the
number of devices an organization must purchase, cable, configure, power, cool, and secure. Cisco UCS
servers optimize virtualized environments across the entire system. Cisco servers can support traditional
operating systems and application stacks in physical environments.
Accelerated server deployment: The smart, programmable infrastructure of Cisco UCS simplifies and
accelerates enterprise-class application and service deployment in bare-metal, virtualized, and cloud
computing environments. With Cisco UCS unified model-based management, administrators can configure
hundreds of servers as quickly as they can configure a single server.
Simplified management: Cisco UCS offers simplified and open management with a large partner
ecosystem using Cisco UCS Manager.

The unified management capabilities provided by Cisco UCS Manager, integrated into Cisco UCS, offer
administrators flexibility and simplicity. Administrators can manage physical infrastructure similar to the way that
they manage virtual infrastructure. Cisco UCS applies familiar, critical virtualization concepts such as templates,
policies, and stateless computing to the physical infrastructure. The result is a model-based management system
that simplifies and automates administration, accelerates deployment and scaling, and reduces the likelihood of
configuration errors that can cause downtime and long troubleshooting efforts in physical network, computing, and
storage infrastructure.

The system integrates a low-latency, lossless 10 Gigabit Ethernet unified network fabric with enterprise-class, x86-
architecture servers. The system is an integrated, scalable, multichassis platform in which all resources participate
in a unified management domain (Figure 2).

Figure 2. Cisco Unified Computing System

The system helps reduce TCO by automating element-management tasks through the use of service profiles that
enable just-in-time provisioning. Service profiles increase business agility by quickly aligning computing resources
with rapidly changing business and workload requirements.

Cisco UCS C-Series Rack Servers


Cisco UCS C-Series Rack Servers are designed for both performance and expandability over a wide range of
storage-intensive infrastructure workloads ranging from big data to collaboration.

Cisco UCS C-Series Rack Servers provide the following benefits:

Form-factor-independent entry point into Cisco UCS


Simplified and fast deployment of applications
Extension of unified computing innovations and benefits to rack servers
Increased customer choice with unique benefits in a familiar rack package
Reduced TCO and increased business agility

Several Cisco UCS C-Series Rack Server models are available. Each model is optimized for particular types of
deployments to address different workload challenges through a balance of processing, memory, I/O, and internal
storage resources.

The Cisco UCS with VMware Virtual SAN solution architecture is built on Cisco UCS C240 M3 Rack Servers. The
Cisco UCS C240 M3 is a high-density, enterprise-class, 2RU rack server designed for computing, I/O, storage, and
memory-intensive standalone and virtualized applications. It offers these benefits:

Suitable for nearly all storage-intensive, 2-socket applications


Unique Cisco UCS Virtual Interface Card (VIC) 1225: 2 x 10 Gigabit Ethernet PCI Express (PCIe) cards that
can support up to 256 PCIe virtual interfaces
Exceptional building block and entry point for Cisco UCS
Continual innovations from Cisco in server technology and at all levels of Cisco UCS

The Cisco UCS C240 M3 offers up to 768 GB of RAM, 24 drives, and four 1 Gigabit Ethernet LAN interfaces built
into the motherboard to provide outstanding levels of internal memory and storage expandability along with
exceptional performance.

The Cisco UCS C240 M3 supports:

Up to two Intel Xeon processor E5-2600 or E5-2600 v2 CPUs


Up to 768 GB of RAM with 24 DIMM slots
12 Large Form-Factor (LFF) or 24 Small Form-Factor (SFF) SAS, SATA, or SSD drives for workloads
demanding a large amount of internal storage
5 PCIe Generation 3 (Gen 3) slots and four 1 Gigabit Ethernet LAN interfaces on the motherboard
Trusted Platform Module (TPM) for authentication and tool-free access

VMware Virtual SAN with Cisco UCS Architecture


Figure 3 shows the VMware Virtual SAN on Cisco UCS C240 M3 solution as built at the Cisco lab in San Jose, California.

Figure 3. VMware Virtual SAN with Cisco UCS Environment Details

This detailed architecture for the Cisco UCS with VMware Virtual SAN solution consists of the components listed
in Table 1.

Table 1. Cisco UCS with VMware Virtual SAN Architecture

Cisco UCS: 8 Cisco UCS C240 M3 Rack Servers (x86 servers), each with:
  2 Intel Xeon processor E5-2643 CPUs
  24 8-GB 1600-MHz DDR3 RDIMMs, PC3-12800, dual rank, 1.35V
  21 Seagate 1-TB SATA disks
  3 SAMSUNG 800-GB SAS SSDs
  1 LSI MegaRAID SAS 9271CV-8i controller
  1 Cisco UCS VIC 1225 CNA
  2 Cisco Flexible Flash (FlexFlash) cards

VMware software: VMware vCenter 5.5 Update 1 and VMware ESXi 5.5 Update 1

Fabric interconnects: 2 Cisco UCS 6248UP 48-Port Fabric Interconnects

Each host is configured with three disk groups, each composed of one 800-GB SSD and seven SATA disks. This
configuration uses the maximum of 24 slots available on the Cisco UCS C240 M3 Rack Server. In this 8-node
cluster, the 168 magnetic disks provide a total VMware Virtual SAN datastore capacity of 152.40 terabytes (TB),
calculated as follows:
3 disk groups per host x 7 magnetic disks per disk group x 930 GB per disk x 8 hosts = 152.57 TB of raw capacity
152.57 TB raw capacity - (168 x 1 GB) of metadata overhead = 152.40 TB of usable raw capacity
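The same arithmetic can be expressed as a short script. The following is a minimal sketch, assuming the per-disk usable capacity of 930 GB and roughly 1 GB of metadata overhead per disk cited above, with 1 TB treated as 1024 GB; it is not an official sizing tool.

# Minimal sketch of the raw-capacity arithmetic shown above.
# Assumptions: 930 GB usable per magnetic disk, ~1 GB of VMware Virtual SAN
# metadata overhead per disk, and 1 TB = 1024 GB, matching the figures cited.

def vsan_raw_capacity_tb(hosts, disk_groups_per_host, disks_per_group,
                         gb_per_disk=930, metadata_gb_per_disk=1):
    disks = hosts * disk_groups_per_host * disks_per_group
    raw_gb = disks * gb_per_disk
    usable_gb = raw_gb - disks * metadata_gb_per_disk
    return raw_gb / 1024, usable_gb / 1024

raw_tb, usable_tb = vsan_raw_capacity_tb(hosts=8, disk_groups_per_host=3,
                                         disks_per_group=7)
print("raw: %.2f TB, usable: %.2f TB" % (raw_tb, usable_tb))
# Prints approximately: raw: 152.58 TB, usable: 152.41 TB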

In this architecture, the maximum number of disk groups per host is configured to showcase the maximum VMware
Virtual SAN datastore capacity available on the Cisco UCS C240 M3 in an 8-node cluster. Customers should build
the VMware Virtual SAN environment based on the VMware Virtual SAN Ready Nodes information described in
the section Cisco UCS with VMware Virtual SAN Ready Nodes.

For networking, Cisco Data Center Virtual Machine Fabric Extender (VM-FEX) on a VMware vSphere distributed
switch is used with a dedicated VMkernel network interface card (NIC) used for VMware Virtual SAN control and
data traffic. This dedicated VMkernel NIC is a VMware Virtual SAN requirement, and multicasting needs to be
enabled for the VMware Virtual SAN Layer 2 network. In addition, two VMkernel ports are configured for
management and VMware vMotion traffic.

Cisco UCS Configuration


In this configuration, VMware ESXi is booted from the on-board Cisco FlexFlash SD cards [2], as shown here.

Note: In this lab, a single FlexFlash SD card was used; however, the recommended configuration is
two Cisco FlexFlash SD cards in a RAID configuration to help ensure reliability through redundancy in a
production environment.

The Cisco FlexFlash SD card configuration is performed through a local disk policy that is applied to the service
profile, as shown here.

VMware Virtual SAN Storage Controller


VMware Virtual SAN supports storage controllers in two modes: pass-through mode and RAID0 mode. An
important consideration when choosing a storage controller for VMware Virtual SAN is whether it supports pass-
through mode, RAID0 mode, or both. For Cisco UCS, both types of controllers are supported, as listed in the
VMware Virtual SAN Compatibility Guide [3].

As a best practice, use the LSI MegaRAID SAS 9271CV-8i controller (Cisco UCS-RAID-9271-AI). This controller
achieves higher performance compared to other controllers because of its larger (1024) queue depth. Controllers
with a queue depth of less than 256 are not supported with VMware Virtual SAN as discussed in the VMware
knowledge base.

When VMware Virtual SAN is implemented with pass-through controllers, VMware Virtual SAN accesses the drives
directly, and RAID configuration is not necessary. When VMware Virtual SAN is implemented with controllers that
do not support pass-through mode, a virtual RAID 0 drive must be created for each physical HDD that VMware
Virtual SAN will use. Follow these steps to configure virtual RAID 0 with the LSI MegaRAID SAS 9271CV-8i controller:

1. Download the LSI StorCLI software from the LSI website [4] and install it on the VMware ESXi server.
2. Execute StorCLI commands from the VMware ESXi console or through SSH. Use this command to create
a RAID 0 virtual disk for each of the individual disks:
/storcli /c0 add vd type=RAID0 name=vd1 drives=<enclosure number>:1
3. Configure VMware ESXi to mark the SSDs, because ESXi cannot identify SSDs abstracted behind a RAID
controller. Use the following command for each SSD device (a small helper that generates these per-disk
commands is sketched after these steps):
esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL --device=<device name> --option="enable_local enable_ssd"

Service Profile Configuration


The main configurable parameters of a service profile are summarized in Table 2.

Table 2. Service Profile Parameters

Server hardware
  UUID: Obtained from defined UUID pool
  MAC addresses: Obtained from defined MAC address pool
  Worldwide port name (WWPN) and worldwide node name (WWNN): Obtained from defined WWPN and WWNN pools
  Boot policy: Boot path and order
  Disk policy: RAID configuration
Fabric
  LAN: Virtual NICs (vNICs), VLANs, and maximum transmission unit (MTU)
  SAN: Virtual host bus adapters (vHBAs) and VSANs
  Quality-of-service (QoS) policy: CoS for Ethernet uplink traffic
Operation
  Firmware policy: Current and backup versions
  BIOS policy: BIOS version and settings
  Statistics policy: System data collection
  Power-control policy: Blade server power allotment

For Cisco UCS service profiles for hosts in a VMware Virtual SAN cluster, you should configure the policies shown
here. This configuration does not include all Cisco UCS service profile settings. The settings shown here are
configurations that are specific to an implementation with VMware Virtual SAN.

BIOS Policy
The following screen shows the BIOS policy configured for the VMware Virtual SAN environment. It is aimed at
achieving high performance.

The BIOS policy configuration includes:

Processor
Power technology = Custom
Enhanced Intel Speedstep = Enabled
Intel Turbo Boost = Enabled
Processor power state C6 = Disabled
Processor power state C1 enhanced = Disabled
Energy performance = Performance
Memory
Low-voltage DDR mode = Performance mode

Boot Policy
The boot policy is created with a Secure Digital (SD) card as the preferred boot option after the local CD or DVD
boot option.

vNIC Template
Two VMware Virtual SAN vNIC templates are configured: one each for Fabric A and Fabric B. Jumbo frames are
enabled to improve throughput and CPU efficiency, and the MTU size is set to 9000.

The QoS policy is set with a priority level of Platinum to help ensure that the VMware Virtual SAN traffic receives
the highest traffic classification levels. Within the QoS system class, the enabled options include:

Class-of-service (CoS) value of 5


Weight value of 10
50 percent weight
MTU value of 9000
Multicast optimized

The network control policy is set to Cisco Discovery Protocol enabled.

The dynamic vNIC connection policy is applied with an adapter policy of VMware, as shown here.

VLANs
A dedicated VLAN is recommended for the VMware Virtual SAN VMkernel NIC, and multicast is required within
the Layer 2 domain. This setting is configured as part of the VLAN as a multicast policy with snooping enabled.

VMware Virtual SAN Configuration


VMware Virtual SAN is a VMware ESXi cluster-level feature that is configured using the VMware vSphere
Web Client. The first step in enabling the VMware Virtual SAN feature is to select one of the two modes of
disk group creation:

Automatic: Enable VMware Virtual SAN to discover all the local disks on the hosts and automatically add
the disks to the VMware Virtual SAN datastore.
Manual: Manually select the disks to add to the VMware Virtual SAN shared datastore.

The creation of disk groups and the overall health of the disks can be verified in the Disk Management view for the
cluster, shown here.

When virtual machines are provisioned on the VMware Virtual SAN datastore, the storage policies are applied
through the Virtual Machine Creation Wizard, shown here.

These storage policies are tied to the storage requirements for each virtual machine.

New virtual machine storage policies can be created by choosing Rules and Profiles and then VM Storage
Policies. Virtual machine policies can be applied using the Manage VM Storage Policies tab by choosing Virtual
Machine and then VM Storage Policies, as shown here.

These policies are used to provide different levels of availability and performance for virtual machines. You
can and should use different policies for different types of virtual machines within the same cluster to meet
application requirements.

Table 3 lists the virtual machine storage policy requirements in VMware Virtual SAN.

Table 3. Virtual Machine Storage Policy Requirements

Number of disk stripes per object: Defines the number of magnetic disks across which each replica of a storage
object is distributed. Default: 1. Maximum: 12.
Flash-memory read-cache reservation: Defines the flash-memory capacity reserved as the read cache for the
storage object. Default: 0%. Maximum: 100%.
Number of failures to tolerate: Defines the number of host, disk, and network failures a storage object can
tolerate; for n failures tolerated, n+1 copies of the object are created, and 2n+1 hosts contributing storage are
required. Default: 1. Maximum: 3 (in an 8-host cluster).
Forced provisioning: Determines whether the object is provisioned even if currently available resources do not
meet the virtual machine storage policy requirements. Default: Disabled. Maximum: Enabled.
Object-space reservation: Defines the percentage of the logical size of the storage object that needs to be
reserved (thick provisioned) upon virtual machine provisioning; the remainder of the storage object is thin
provisioned. Default: 0%. Maximum: 100%.
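As a rough illustration of how these policy values translate into object layout, the following sketch applies the formulas stated in Table 3 (n failures tolerated means n+1 copies and 2n+1 contributing hosts) together with the stripe setting. It is a simplified model for reasoning about policies, not the actual VMware Virtual SAN placement logic, and it ignores witness components.

# Simplified model of how FTT and stripe width translate into layout
# requirements, per the formulas in Table 3. Witness components and any
# additional splitting of large objects are ignored in this sketch.

def layout_requirements(failures_to_tolerate, stripes_per_object=1):
    copies = failures_to_tolerate + 1              # n + 1 replicas
    min_hosts = 2 * failures_to_tolerate + 1       # 2n + 1 contributing hosts
    data_components = copies * stripes_per_object  # each replica is striped
    return {"replicas": copies,
            "minimum_contributing_hosts": min_hosts,
            "data_components": data_components}

# Example: the default policy (FTT = 1, 1 stripe) needs 2 replicas on at
# least 3 hosts; FTT = 3 with 2 stripes needs 4 replicas across 7 hosts.
print(layout_requirements(1))
print(layout_requirements(3, stripes_per_object=2))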

Default storage policies are recommended for Tier 2 and 3 workloads. After the VMware Virtual SAN policies and
datastore are configured and the environment is operational, several availability and maintenance scenarios could
arise. These VMware Virtual SAN availability and manageability details are described in the next section.

VMware Virtual SAN Availability and Manageability


VMware Virtual SAN is fully integrated with VMware vSphere advanced features, such as VMware vSphere
vMotion, DRS, and High Availability, to provide the best level of availability for the virtualized environment.

For redundancy, VMware Virtual SAN uses the concept of distributed RAID, by which a VMware vSphere cluster
can contend with the failure of a VMware vSphere host or a component within a host. For example, the cluster can
contend with the failure of magnetic disks, flash-memory-based devices, and network interfaces while continuing to
provide complete capabilities for all virtual machines.

In addition, availability is defined on a per-virtual machine basis through the use of virtual machine storage
policies. Through the use of virtual machine storage policies along with VMware Virtual SAN distributed RAID
architecture, virtual machines and copies of their contents are distributed across multiple VMware vSphere hosts in
the cluster. In the event of a failure, a failed node does not necessarily need to migrate data to a surviving host in
the cluster.

The VMware Virtual SAN data store is based on object-oriented storage. In this approach, a virtual machine on the
VMware Virtual SAN is made up of these VMware Virtual SAN objects:

The virtual machine home, or namespace, directory


A swap object (if the virtual machine is powered on)
Virtual disks or virtual machine disks (VMDKs)
Delta disks created for snapshots (each delta disk is an object)

The virtual machine namespace directory holds all virtual machine files (.vmx files, log files, etc.), excluding
VMDKs, delta disks, and swap files, all of which are maintained as separate objects.

This approach is important to understand because it determines the way that objects and components are built
and distributed in VMware Virtual SAN. For instance, there are soft limitations, and exceeding those limitations
can affect performance.

In addition, witnesses are deployed to arbitrate between the remaining copies of data should a failure occur
within the VMware Virtual SAN cluster. The witness component helps ensure that no split-brain scenarios occur.

Witness deployment is not predicated on any failures-to-tolerate (FTT) or stripe-width policy settings.
Rather, witness components are defined as primary, secondary, and tiebreaker and are deployed based on the
following rules (a simplified sketch of these rules appears after this list):

Primary witnesses: Primary witnesses require at least (2 x FTT) + 1 nodes in a cluster to tolerate the FTT
number of node and disk failures. If the configuration does not have the required number of nodes after all
the data components have been placed, the primary witnesses are placed on exclusive nodes until the
configuration has (2 x FTT) + 1 nodes.
Secondary witnesses: Secondary witnesses are created to help ensure that each node has equal voting
power toward a quorum. This capability is important because each node failure needs to affect the quorum
equally. Secondary witnesses are added to allow each node to receive an equal number of components,
including the nodes that hold only primary witnesses. The total count of data components plus witnesses on
each node is equalized in this step.
Tiebreaker witnesses: After primary witnesses and secondary witnesses have been added, if the
configuration has an even number of total components (data and witnesses), then one tiebreaker witness is
added to make the total component count an odd number.
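The three rules above can be approximated in a few lines of code. The following is a simplified sketch of the described behavior, assuming the number of data components already placed on each node is known; it is illustrative only and is not the actual VMware Virtual SAN witness algorithm.

# Simplified sketch of the primary/secondary/tiebreaker witness rules
# described above. data_components_per_node maps a node name to the number
# of data components already placed on it (an assumption of this model).

def add_witnesses(data_components_per_node, ftt, all_nodes):
    placement = dict(data_components_per_node)

    # Primary witnesses: place witnesses on exclusive nodes until the
    # configuration spans (2 x FTT) + 1 nodes.
    required_nodes = 2 * ftt + 1
    spare_nodes = [n for n in all_nodes if n not in placement]
    while len(placement) < required_nodes and spare_nodes:
        placement[spare_nodes.pop(0)] = 1

    # Secondary witnesses: equalize the component count on every
    # participating node so each node carries equal voting power.
    target = max(placement.values())
    for node in placement:
        placement[node] = target

    # Tiebreaker witness: make the total component count odd.
    if sum(placement.values()) % 2 == 0:
        placement[sorted(placement)[0]] += 1
    return placement

# Example: an FTT = 1 object with one replica each on host1 and host2, in a
# 4-node cluster, picks up a witness on a third host (3 components in total).
print(add_witnesses({"host1": 1, "host2": 1}, ftt=1,
                    all_nodes=["host1", "host2", "host3", "host4"]))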

The following sections describe the VMware Virtual SAN datastore scenarios for providing for resiliency and
availability while performing day-to-day operations.

Host Maintenance Mode


For planned operations, VMware Virtual SAN provides three host maintenance mode options: Ensure
Accessibility, Full Data Migration, and No Data Migration.

Ensure Accessibility
The Ensure Accessibility option is the default host maintenance mode option. With this option, VMware Virtual SAN
helps ensure that all accessible virtual machines on the host remain accessible when the host is either powered off
or removed from the cluster.

With this option, typically only partial data evacuation is required. Select Ensure Accessibility to remove the host
from the cluster temporarily, such as to install upgrades and then to return the host to the same cluster. Do not use
this option to permanently remove the host from the cluster.

Full Data Migration


When Full Data Migration is selected, VMware Virtual SAN moves all of the host's data to other hosts in the
cluster and then maintains or fixes availability compliance for the affected components. The Full Data Migration
option results in the largest amount of data transfer, and this migration consumes the most time and resources.

Select this option only when the host needs to be migrated permanently. When evacuating data from the last host
in the cluster, make sure that you migrate the virtual machines to another datastore and then put the host in
maintenance mode.

Note: The host cannot enter maintenance mode if a virtual machine object that has data on the host is not
accessible and cannot be fully evacuated.

With VMware Virtual SAN, placing a host in maintenance mode with Full Data Migration causes the virtual machine
objects to be transferred to a different host. This migration is in addition to any virtual machines that were
proactively migrated by administrators, because the host may have disk objects of virtual machines that reside on
other hosts.

This transfer can be verified by using the vsan.resync_dashboard 10.0.108.15 -r 0 Ruby vSphere Console (RVC)
command, which shows the data being migrated as shown on the following screen.

For more information about RVC, refer to Appendix C.

No Data Migration
When No Data Migration is selected, VMware Virtual SAN does not evacuate any data from this host. If the host is
powered off or removed from the cluster, some virtual machines may become inaccessible.

VMware Virtual SAN Failure Simulations
During ongoing operations of a VMware Virtual SAN environment, in some cases either an individual disk failure
or a host failure may affect virtual machine availability based on the storage policies applied. This section
simulates these failure scenarios to provide an understanding of how VMware Virtual SAN maintains storage
data that is highly available under different conditions. This discussion provides guidance when applying
different levels of policies.

Magnetic Disk Failure Simulation


In a VMware Virtual SAN environment, if the magnetic disk that stores a virtual machine object fails, VMware
Virtual SAN waits 60 minutes as its default repair-delay time before it rebuilds the disk object. If the failure is
transient and the failed magnetic disk recovers prior to the end of this wait time, rebuilding is not required. The
VMware Virtual SAN simply mirrors the disk objects from the replica.

Note that you can modify the VMware Virtual SAN default 60-minute repair-delay time [5].
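The timer behavior just described can be summarized in a small decision function. The following is a simplified model assuming the 60-minute default repair delay referenced in [5]; it does not reproduce the actual VMware Virtual SAN state machine.

# Simplified model of the repair-delay behavior described above. The
# 60-minute value is the default repair delay; treat it as an assumption.

REPAIR_DELAY_MINUTES = 60

def repair_action(minutes_since_failure, device_recovered):
    if device_recovered:
        # Transient failure: the recovered disk is brought back up to date
        # from the replica; no full rebuild is required.
        return "resync disk objects from the replica"
    if minutes_since_failure < REPAIR_DELAY_MINUTES:
        return "wait (data is served from the replica in the meantime)"
    # The delay expired and the device is still absent: rebuild the affected
    # components elsewhere in the cluster from the surviving replica.
    return "rebuild disk objects on a different disk"

print(repair_action(15, device_recovered=False))   # wait
print(repair_action(45, device_recovered=True))    # resync from the replica
print(repair_action(75, device_recovered=False))   # rebuild on a different disk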

For this simulation, summarized in Table 4, object placements for a test virtual machine are configured with FTT =
1 and default storage policies. The magnetic disk residing on the host using IP address 10.0.108.15 is unplugged
from the server. (See the Non-SSD Disk Name and Non-SSD Disk UUID values for 10.0.108.15 highlighted in
the table.)

Table 4. Magnetic Disk Failure Simulation: Repair-Delay Time Not Reached

RAID 1
Component (Active), host 10.0.108.12
  SSD: Local LSI Disk (naa.600605b00730cf60ff00009a0953093c); SSD UUID: 52426b0a-638f-3267-671b-a6227321ccbc
  Non-SSD: Local LSI Disk (naa.600605b00730cf60ff00009c09766fdf); Non-SSD UUID: 52c2abc0-66dd-8aa5-542f-74ba3bbfe003
Component (Active), host 10.0.108.15
  SSD: LSI Disk (naa.600605b00729ea601ad06df020766fb3); SSD UUID: 5263c975-e1a7-fac7-a979-b5bfc3dededd
  Non-SSD: Local LSI Disk (naa.600605b00729ea601ad06df820ef02e7); Non-SSD UUID: 526aabec-434b-ce73-b6dd-6482e1535b11
Witness (Active), host 10.0.108.13
  SSD: Local LSI Disk (naa.600605b0073a8020ff00009e098b7f65); SSD UUID: 52200aeb-e555-5bd8-2b46-d1921ca1e31b
  Non-SSD: Local LSI Disk (naa.600605b0073a8020ff000098092fb8d1); Non-SSD UUID: 52173f38-2a73-10e4-31b4-6be150a82cd9

This simulation demonstrates a single disk failure scenario that keeps the virtual machine active and accessible
because the failure is tolerable, as per FTT = 1. For the virtual machine storage policy, the disk compliance status
is Not Compliant because the disk cannot tolerate any additional faults, as shown in the following screen.

Another way to check the disk object information is by using the RVC command vsan.disk_object_info. In this
case, one of the disks is not found, as shown in the following screen.

After the repair-delay time is reached, VMware Virtual SAN rebuilds the disk objects from the replica and then uses
a different disk, as shown in Table 5. (See the Non-SSD Disk Name and Non-SSD Disk UUID values for
10.0.108.15 highlighted in the table.)

Table 5. Magnetic Disk Failure Simulation: Repair-Delay Time Expired

RAID 1
Component (Active), host 10.0.108.12
  SSD: Local LSI Disk (naa.600605b00730cf60ff00009a0953093c); SSD UUID: 52426b0a-638f-3267-671b-a6227321ccbc
  Non-SSD: Local LSI Disk (naa.600605b00730cf60ff00009c09766fdf); Non-SSD UUID: 52c2abc0-66dd-8aa5-542f-74ba3bbfe003
Component (Active), host 10.0.108.15
  SSD: LSI Disk (naa.600605b00729ea601ad06df020766fb3); SSD UUID: 5263c975-e1a7-fac7-a979-b5bfc3dededd
  Non-SSD: Local LSI Disk (naa.600605b00729ea601ad06e0721cb3825); Non-SSD UUID: 528c8b04-f6c7-cea9-770d-e1832e97e531
Witness (Active), host 10.0.108.13
  SSD: Local LSI Disk (naa.600605b0073a8020ff00009e098b7f65); SSD UUID: 52200aeb-e555-5bd8-2b46-d1921ca1e31b
  Non-SSD: Local LSI Disk (naa.600605b0073a8020ff000098092fb8d1); Non-SSD UUID: 52173f38-2a73-10e4-31b4-6be150a82cd9

For the virtual machine storage policy, the rebuild provides a hard-disk compliance status of Compliant once again,
as shown in the following screen.

By using the vsan.disk_object_info RVC command on the new disk, the virtual machine object constructs are
found, as shown in the following screen.

SSD Failure Simulation


If an SSD in a VMware Virtual SAN disk group fails, the disk group becomes inaccessible, and the magnetic disks
in the disk group do not contribute to the VMware Virtual SAN storage.

As in the magnetic disk failure simulation, when an SSD fails, VMware Virtual SAN waits through the 60-minute
default repair-delay time before it rebuilds the affected virtual machine objects using a different SSD in the
event of a nontransient failure.

The SSD failure simulation exhibits this behavior on a test virtual machine that is configured using the default
storage policies and the disk placements shown in Table 6.

To simulate a hard failure, the SSD highlighted in Table 6, residing on the host with IP address 10.0.108.13, is
unplugged from the server. (See the SSD Disk Name and SSD Disk UUID values for 10.0.108.13 in the table.)

Table 6. SSD Failure Simulation: Repair Delay Time Not Reached

RAID 1
Component (Active), host 10.0.108.12
  SSD: Local LSI Disk (naa.600605b00730cf60ff00009a0953093c); SSD UUID: 52426b0a-638f-3267-671b-a6227321ccbc
  Non-SSD: Local LSI Disk (naa.600605b00730cf60ff00009c09766fdf); Non-SSD UUID: 52c2abc0-66dd-8aa5-542f-74ba3bbfe003
Component (Active), host 10.0.108.13
  SSD: Local LSI Disk (naa.600605b0073a8020ff0000a90a31aa58); SSD UUID: 524296e8-95d1-49a1-983f-59d1fea4ae16
  Non-SSD: Local LSI Disk (naa.600605b0073a8020ff00009709213120); Non-SSD UUID: 52fa14cf-4917-1d6b-bb68-01bb8c919c6e
Witness (Active), host 10.0.108.14
  SSD: Local LSI Disk (naa.600605b0072ca6e0ff00009809308c2e); SSD UUID: 521c910e-badb-891c-fa3b-efcfba0009cd
  Non-SSD: Local LSI Disk (naa.600605b0072ca6e0ff00009b0961a896); Non-SSD UUID: 52d5a898-56a4-3430-734b-d9f5c1165d5e

For the virtual machine storage policy, the absent SSD makes the hard disk with components placed on the SSD
not compliant with the FTT = 1 policy, as shown in the following screen. The virtual machine remains accessible
while the hard disk is in the Not Compliant state because the data is served from the replica.

While the SSD is unplugged, the corresponding disk group on the host also becomes unavailable, as shown in the
following screen.

Another way to check the virtual machine object information is by using the vsan.vm_object_info RVC command.
This command yields output about the layout of the disk objects associated with the virtual machine, as shown in
the following screen.

After the repair-delay time is reached, if the SSD failure continues to exist, VMware Virtual SAN rebuilds the virtual
machine layout using a different SSD, as shown in Table 7. (See the SSD Disk Name and SSD Disk UUID values
for 10.0.108.13 highlighted in the table.)

Table 7. SSD Failure Simulation: Repair-Delay Time Expired

RAID 1
Component (Active), host 10.0.108.12
  SSD: Local LSI Disk (naa.600605b00730cf60ff00009a0953093c); SSD UUID: 52426b0a-638f-3267-671b-a6227321ccbc
  Non-SSD: Local LSI Disk (naa.600605b00730cf60ff00009c09766fdf); Non-SSD UUID: 52c2abc0-66dd-8aa5-542f-74ba3bbfe003
Component (Active), host 10.0.108.13
  SSD: Local LSI Disk (naa.600605b0073a8020ff000095090d7333); SSD UUID: 525e6181-3532-3bdb-7c78-9a6ea6929832
  Non-SSD: Local LSI Disk (naa.600605b0073a8020ff0000b00a9f5be9); Non-SSD UUID: 52584447-6e24-fcad-ee5b-f5597cbdec5a
Witness (Active), host 10.0.108.14
  SSD: Local LSI Disk (naa.600605b0072ca6e0ff00009809308c2e); SSD UUID: 521c910e-badb-891c-fa3b-efcfba0009cd
  Non-SSD: Local LSI Disk (naa.600605b0072ca6e0ff00009b0961a896); Non-SSD UUID: 52d5a898-56a4-3430-734b-d9f5c1165d5e

For the virtual machine storage policy, the rebuild provides a hard-disk compliance status of Compliant once again,
as shown in the following screen.

Host Failure Simulation


When an unplanned host failure occurs, VMware vSphere advanced features, such as VMware High Availability,
protect the virtual machines by moving them to a different host in a VMware Virtual SAN cluster.

However, with VMware Virtual SAN, the host may also have additional disk data for virtual machines residing on
other hosts in the cluster. In this case, when a host failure occurs, VMware Virtual SAN rebuilds these disk objects
on a different disk within a disk group on one of the other available hosts in the cluster. A virtual machine with a
default fault-tolerance policy of 1 is still available for end users when a single host failure occurs because the
replica serves the objects.

For this host failure simulation, a host with IP address 10.0.108.13 is powered off from Cisco UCS Manager, as
shown in the following screen.

In addition, a test virtual machine with FTT = 1 is placed on host 10.0.108.12 with additional disk data residing on
hosts 10.0.108.12 and 10.0.108.13, as shown in Table 8. (See the SSD Disk Name, SSD Disk UUID, Non-SSD
Disk Name, and Non-SSD Disk UUID values for 10.0.108.13 highlighted in the table.)

Table 8. Host Failure Simulation: TestVM Components

RAID 1
Component (Active), host 10.0.108.12
  SSD: Local LSI Disk (naa.600605b00730cf60ff00009a0953093c); SSD UUID: 52426b0a-638f-3267-671b-a6227321ccbc
  Non-SSD: Local LSI Disk (naa.600605b00730cf60ff00009c09766fdf); Non-SSD UUID: 52c2abc0-66dd-8aa5-542f-74ba3bbfe003
Component (Active), host 10.0.108.13
  SSD: Local LSI Disk (naa.600605b0073a8020ff000095090d7333); SSD UUID: 525e6181-3532-3bdb-7c78-9a6ea6929832
  Non-SSD: Local LSI Disk (naa.600605b0073a8020ff0000b00a9f5be9); Non-SSD UUID: 52584447-6e24-fcad-ee5b-f5597cbdec5a
Witness (Active), host 10.0.108.14
  SSD: Local LSI Disk (naa.600605b0072ca6e0ff00009809308c2e); SSD UUID: 521c910e-badb-891c-fa3b-efcfba0009cd
  Non-SSD: Local LSI Disk (naa.600605b0072ca6e0ff00009b0961a896); Non-SSD UUID: 52d5a898-56a4-3430-734b-d9f5c1165d5e

When 10.0.108.13 is powered off, the physical disk placement reports Object not found because the host is
unavailable, as shown in the following screen.

Table 9. Components Unavailable During Host Failure Simulation

RAID 1
Component (Active), host 10.0.108.12
  SSD: Local LSI Disk (naa.600605b00730cf60ff00009a0953093c); SSD UUID: 52426b0a-638f-3267-671b-a6227321ccbc
  Non-SSD: Local LSI Disk (naa.600605b00730cf60ff00009c09766fdf); Non-SSD UUID: 52c2abc0-66dd-8aa5-542f-74ba3bbfe003
Component (Active), host 10.0.108.13
  SSD: Object not found; SSD UUID: 525e6181-3532-3bdb-7c78-9a6ea6929832
  Non-SSD: Object not found; Non-SSD UUID: 52584447-6e24-fcad-ee5b-f5597cbdec5a
Witness (Active), host 10.0.108.14
  SSD: Local LSI Disk (naa.600605b0072ca6e0ff00009809308c2e); SSD UUID: 521c910e-badb-891c-fa3b-efcfba0009cd
  Non-SSD: Local LSI Disk (naa.600605b0072ca6e0ff00009b0961a896); Non-SSD UUID: 52d5a898-56a4-3430-734b-d9f5c1165d5e

In this case, VMware Virtual SAN waits the default repair-delay time and then rebuilds the components onto a
different host in the cluster, 10.0.108.15, as shown in Table 10. (See the SSD Disk Name, SSD Disk UUID, Non-
SSD Disk Name, and Non-SSD Disk UUID values for 10.0.108.15 highlighted in the table.)

Table 10. Components Rebuilt on Different Host

RAID 1
Component (Active), host 10.0.108.12
  SSD: Local LSI Disk (naa.600605b00730cf60ff00009a0953093c); SSD UUID: 52426b0a-638f-3267-671b-a6227321ccbc
  Non-SSD: Local LSI Disk (naa.600605b00730cf60ff00009c09766fdf); Non-SSD UUID: 52c2abc0-66dd-8aa5-542f-74ba3bbfe003
Component (Active), host 10.0.108.15
  SSD: Local LSI Disk (naa.600605b00729ea601af1b56d2bac60cc); SSD UUID: 52c75a2b-ed5d-98d1-d845-7350626b557
  Non-SSD: Local LSI Disk (naa.600605b00729ea601af1b56c2ba091c4); Non-SSD UUID: 52c2abc0-66dd-8aa5-542f-74ba3bbfe003
Witness (Active), host 10.0.108.14
  SSD: Local LSI Disk (naa.600605b0072ca6e0ff00009809308c2e); SSD UUID: 521c910e-badb-891c-fa3b-efcfba0009cd
  Non-SSD: Local LSI Disk (naa.600605b0072ca6e0ff00009b0961a896); Non-SSD UUID: 52d5a898-56a4-3430-734b-d9f5c1165d5e

VMware Virtual SAN Network Redundancy Verification


The VMware Virtual SAN VMkernel network is configured with redundant Cisco Data Center VM-FEX virtual
networks connected to fabric interconnects A and B. These connections are configured as a PortChannel between
the two uplinks, as shown in the following screen.

A physical NIC failure is simulated by disabling vmnic2, as shown in the following screen.

To verify that the VMware Virtual SAN traffic is not disrupted, the physical port is disabled from Cisco UCS
Manager while a continuous vmkping to the VMware Virtual SAN IP address on the dedicated network runs, as
shown in the following screen.

Similar redundancy is also expected for the management network in a VMware Virtual SAN environment.

Benchmarking VMware Virtual SAN on Cisco UCS
VMware Virtual SAN is a scale-out storage solution that takes optimal advantage of the flash tier by using a
read cache and a write buffer. All read and write I/O is sent through the flash devices first, and writes are later
destaged to the magnetic disks.

For read caching, VMware Virtual SAN distributes a directory of cached blocks between the VMware vSphere
hosts in the cluster. The actual block that is read by the application running in the virtual machine may not be on
the VMware vSphere host on which the virtual machine is running. This feature reduces the I/O read latency in the
event of a cache hit.

Read caching enables a VMware vSphere host to determine whether a remote host has data cached that is not in
a local cache. In this case, the VMware vSphere host can retrieve cached blocks from a remote host in the cluster
over the VMware Virtual SAN network. If the block is not in the cache on any VMware Virtual SAN host, it is
retrieved directly from the magnetic disks.
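The read path just described can be summarized in a short sketch. The lookup order (local SSD cache, a remote host's cache found through the distributed directory, then the magnetic disks) follows the text above; the cache, directory, and network objects and their methods are illustrative placeholders, not VMware interfaces.

# Conceptual sketch of the VMware Virtual SAN read-cache lookup order
# described above. The cache/directory/network/disk objects passed in are
# illustrative placeholders, not real VMware APIs.

def read_block(block_id, local_cache, cache_directory, network, disks):
    data = local_cache.get(block_id)
    if data is not None:
        return data                           # hit in the local SSD read cache

    owner = cache_directory.lookup(block_id)  # distributed directory of cached
    if owner is not None:                     # blocks across the cluster
        # Hit on a remote host: fetch the block over the Virtual SAN network.
        return network.fetch(owner, block_id)

    # Miss on every host: read the block directly from the magnetic disks.
    return disks.read(block_id)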

The write cache performs as a nonvolatile write buffer, which reduces the latency for write operations. Because all
write operations go to SSD storage, VMware Virtual SAN helps ensure that a copy of the data exists elsewhere in
the cluster. All virtual machines deployed to VMware Virtual SAN inherit the default availability policy settings. This
feature helps ensure that at least one additional copy of the virtual machine data, including the write-cache
contents, is always available.

After write operations have been initiated by the application running inside the guest operating system, they are
sent in parallel to both the local write cache on the owning host and the write cache on the remote host. The write
operation must be committed to SSD on both hosts before it is acknowledged.

In the event of a host failure, this approach helps ensure that a copy of the data exists on another flash device in
the VMware Virtual SAN cluster, and that no data loss occurs. The virtual machine accesses the replicated copy of
the data on another host in the cluster through the VMware Virtual SAN network.
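Similarly, the write acknowledgment flow described above can be sketched as follows, assuming a default policy with one replica (FTT = 1). The host and write-buffer objects are illustrative placeholders; the point is that the guest's write is acknowledged only after both SSD write buffers have committed the data.

# Conceptual sketch of the write path described above, assuming FTT = 1
# (one replica). The host and write-buffer objects are placeholders.
import concurrent.futures

def write_block(block_id, data, owning_host, replica_host):
    # The write is sent in parallel to the local SSD write buffer and to the
    # replica host's SSD write buffer over the Virtual SAN network.
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        local = pool.submit(owning_host.write_buffer.commit, block_id, data)
        remote = pool.submit(replica_host.write_buffer.commit, block_id, data)
        # Acknowledge the guest only after both commits complete, so a copy
        # of the buffered data always exists on another flash device.
        local.result()
        remote.result()
    return "acknowledged"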

In determining the performance and scalability of a VMware Virtual SAN datastore, the following components play
critical roles:

VMware Virtual SAN storage policy configuration: The values specified for the storage policies have a
direct impact on the resulting performance. For example, higher performance can be achieved with an FTT
value of 0, although this setting does not provide redundancy for the virtual machines.
Type of workload: The IOPS capacity and overall performance achieved from the VMware Virtual SAN
datastore vary based on the type of workload.
Size of disk groups (scale up): Scaling up with VMware Virtual SAN is achieved by increasing the number
of disks in a disk group or by increasing the number of disk groups per host.
Size of VMware Virtual SAN cluster (scale out): Scaling out with VMware Virtual SAN is achieved by
increasing the number of hosts in a VMware Virtual SAN cluster.
Ratio of SSDs to magnetic disks: For VMware Virtual SAN, a 1:10 ratio of SSDs to magnetic disks is
recommended, based on anticipated storage capacity use (not taking into account SPBM policies); a small
sizing sketch follows this list.
Balanced cluster: Although VMware Virtual SAN supports the use of nodes within the cluster that do not
contribute to storage, you should keep the cluster as balanced as possible. A balanced cluster provides
better resiliency and performance than a highly unbalanced cluster with both contributing and
noncontributing nodes.
SSDs and magnetic disks: The sizes and specifications of the underlying SSDs and magnetic disks also
have a direct impact on the overall performance of the VMware Virtual SAN datastore.
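As referenced in the flash-ratio item above, a quick sanity check of the ratio can be scripted. The following is a minimal sketch that reads the 1:10 guideline as a flash-to-magnetic capacity ratio of roughly 10 percent and applies it to the per-host lab configuration in Table 1; this interpretation, the disk sizes used, and the helper itself are assumptions for illustration, not an official sizing calculator.

# Minimal sketch of the 1:10 flash-to-magnetic rule of thumb mentioned above,
# applied to the per-host lab configuration in Table 1. The 10 percent
# interpretation and disk sizes are assumptions; SPBM policies are ignored.

def flash_ratio(ssd_count, ssd_gb, hdd_count, hdd_gb):
    flash_gb = ssd_count * ssd_gb
    magnetic_gb = hdd_count * hdd_gb
    return flash_gb / float(magnetic_gb)

# Per host: 3 x 800-GB SSDs and 21 x ~930-GB magnetic disks.
ratio = flash_ratio(ssd_count=3, ssd_gb=800, hdd_count=21, hdd_gb=930)
print("flash:magnetic capacity ratio = %.2f" % ratio)
# Prints ~0.12, which is above the 1:10 (0.10) guideline.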

Benchmark Testing
Benchmarking tests for the Cisco UCS with VMware Virtual SAN solution were performed to provide guidance
about the expected baseline capacity for this solution. The actual results may vary for specific workloads.

VMware Virtual SAN on the Cisco UCS C240 M3 achieved linear scalability while scaling up from a 4-node to an 8-
node cluster for the following two types of IOPS benchmark tests:

4K, 100 percent read, 80 percent random

4K, 70 percent read and 30 percent write, 80 percent random

For the benchmarking, an average latency of 20 milliseconds (ms) was considered the threshold for
acceptable application response times in real-world scenarios. The maximum IOPS values obtained for a 4-node
and an 8-node cluster demonstrate that a VMware Virtual SAN datastore scales in a linear fashion, as shown in
Figures 4 and 5.

Figure 4. IOPS Capacity for 4K 100 Percent Read 80 Percent Random Workload

Figure 5. IOPS Capacity for 4K 70 Percent Read 30 Percent Write 80 Percent Random Workload

The benchmark testing was conducted using the I/O Analyzer tool [6], which generates Iometer I/O loads, to
measure VMware Virtual SAN datastore performance. The I/O Analyzer virtual machines were configured with
the following specifications:

4 x virtual CPUs (vCPUs) per I/O Analyzer


8 x 8-GB VMDKs per I/O Analyzer to simulate parallel worker threads
Custom configuration files (these configuration files are provided in Appendix A):

4K 100 percent read with 80 percent random, and 4K 70 percent read and 30 percent write with 80 percent random

100 Percent Read and 80 Percent Random

In this testing, the number of I/O Analyzer virtual machines per host was gradually increased, starting from a
single I/O Analyzer virtual machine per host, until the 20-ms latency values were reached. For read I/O testing,
3 I/O Analyzers per host were used, for a total of 12 I/O Analyzers for 4-node testing and 24 I/O Analyzers for
8-node testing.

With default storage policies, VMware Virtual SAN provides redundancy (FTT = 1) through the use of replica
images of the objects. For each VMDK on each I/O Analyzer, these objects are automatically placed on hosts in
the cluster. For example, if IOWKR1 is on Host1, its individual VMDK objects are placed on any of the hosts
within the cluster. Depending on the object placement, the IOPS on individual hosts will differ, as shown by the
data. No manual effort was made to change this placement to reflect the real-world VMware Virtual SAN object
placement scenario.

The 4-node and 8-node benchmarking that was used to obtain read IOPS values is shown in the
following sections.

4-Node Read Benchmark Testing Results


Figures 6 through 8 show the results.

Figure 6. 4-Node Read IOPS Chart from I/O Analyzer Results

Figure 7. 4-Node Read I/O for Single-Host IOPS and Latency Graph from VSAN Observer

Figure 8. 4-Node 100 Percent Read 80 Percent Random I/O CPU Utilization Graph

8-Node Read Benchmark Testing Results


Figures 9 through 11 show the results.

Figure 9. 8-Node Read IOPS Chart from I/O Analyzer Results

Figure 10. 8-Node Read I/O for Single-Host IOPS and Latency Graph from VSAN Observer

Figure 11. 8-Node 100 Percent Read 80 Percent Random I/O CPU Utilization Graph

70 Percent Read, 30 Percent Write, and 80 Percent Random


For read-write I/O testing, a single I/O Analyzer per host was sufficient to achieve the IOPS values at
20-ms latencies.

The 4-node and 8-node benchmarking that was used to obtain Read-Write IOPS values is shown in the
following sections.

4-Node Read-Write Benchmark Testing Results


Figures 12 through 14 show the results.

Figure 12. 4-Node Read-Write IOPS Chart from I/O Analyzer Results

Figure 13. 4-Node Read-Write I/O for Single-Host IOPS and Latency Graph from VSAN Observer

Figure 14. 4-Node 70 Percent Read 30 Percent Write 80 Percent Random I/O CPU Utilization Graph

8-Node Read-Write Benchmark Testing Results


Figures 15 through 17 show the results.

Figure 15. 8-Node Read-Write IOPS Chart from I/O Analyzer Results

Figure 16. 8-Node Read-Write I/O for Single-Host IOPS and Latency Graph from VSAN Observer

Figure 17. 8-Node 70 Percent Read 30 Percent Write 80 Percent Random I/O CPU Utilization Graph

Cisco UCS with VMware Virtual SAN Ready Nodes
The Cisco UCS with VMware Virtual SAN solution can be built using either of the following options:

Build your own solution: Use the VMware Virtual SAN Compatibility Guide [3] to select the individual
components that are required to build the solution.
Choose a VMware Virtual SAN Ready Node: The VMware Virtual SAN Ready Node is a preconfigured
single-node or multiple-node Cisco server hardware configuration that is based on the recommended
configuration types for a VMware Virtual SAN solution.

Based on the VMware Virtual SAN Ready Node configuration that is used, Cisco provides Cisco UCS Solution
Accelerator Packs for VMware Virtual SAN. As an example, Table 11 shows the components in the Cisco UCS
Starter Kit for VMware Virtual SAN (UCS-VSAN-IVB-28TBP).

Table 11. Cisco UCS Starter Kit for VMware Virtual SAN (UCS-VSAN-IVB-28TBP)

Cisco hardware configuration:
2 x Cisco UCS 6248UP 48-Port Fabric Interconnects
4 x Cisco UCS C240 M3 Rack Servers, each with:
  CPU: 2 x 2.60-GHz Intel Xeon processors E5-2650 v2
  Memory: 8 x 16 GB (128 GB total)
  Cisco UCS VIC 1225
  HDD: 7 x 1-TB SATA 7200-rpm SFF
  SSD: 1 x 800-GB SAS SSD
  LSI MegaRAID SAS 9271CV-8i controller

Node expansion:
1 x Cisco UCS C240 M3 Rack Server with:
  CPU: 2 x 2.60-GHz Intel Xeon processors E5-2650 v2
  Memory: 8 x 16 GB (128 GB total)
  Cisco UCS VIC 1225
  HDD: 7 x 1-TB SATA 7200-rpm SFF

Disk-group expansion:
  HDD: 7 x 1-TB SATA 7200-rpm SFF
  SSD: 1 x 800-GB SAS SSD

Conclusion
Implementing VMware Virtual SAN on Cisco UCS allows customers to achieve great performance with a much
simpler management experience, with Cisco UCS Manager centrally managing the infrastructure and VMware
Virtual SAN integrated into VMware vSphere. Customers can continue to achieve the performance benefits of a
Cisco UCS solution for applications hosted in their VMware vSphere virtualized environments, with VMware
Virtual SAN as the hypervisor-converged storage solution.

Appendix A: IO Meter Custom Configuration Files

4k 70 Percent Read and 30 Percent Write, with 80 Percent Random


Version 2006.07.27
'TEST SETUP ====================================================================
'Test Description
4k_70Read_80Rand_cust
'Run Time
' hours minutes seconds
0 1 0
'Ramp Up Time (s)
0
'Default Disk Workers to Spawn
NUMBER_OF_CPUS
'Default Network Workers to Spawn
0
'Record Results
ALL
'Worker Cycling
' start step step type
1 1 LINEAR
'Disk Cycling
' start step step type
1 1 LINEAR
'Queue Depth Cycling
' start end step step type
1 32 2 EXPONENTIAL
'Test Type
NORMAL
'END test setup
'ACCESS SPECIFICATIONS =========================================================
'Access specification name,default assignment
4k; 70% Read; 80% Random, NONE
'size,% of size,% reads,% random,delay,burst,align,reply
4096,100,70,80,0,1,4096,0
'END access specifications
'MANAGER LIST ==================================================================
'Manager ID, manager name
1,IOA-manager
'Manager network address
127.0.0.1
'Worker
IOA-worker

'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
8388608,0
'End default target settings for worker
'Assigned access specs
4k; 70% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdb
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
8388608,0
'End default target settings for worker
'Assigned access specs
4k; 70% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdc
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker

'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
8388608,0
'End default target settings for worker
'Assigned access specs
4k; 70% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdd
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
8388608,0
'End default target settings for worker
'Assigned access specs
4k; 70% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sde
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1

'Disk maximum size,starting sector
8388608,0
'End default target settings for worker
'Assigned access specs
4k; 70% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdf
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
8388608,0
'End default target settings for worker
'Assigned access specs
4k; 70% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdg
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
8388608,0
'End default target settings for worker

'Assigned access specs
4k; 70% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdh
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
8388608,0
'End default target settings for worker
'Assigned access specs
4k; 70% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdi
'Target type
DISK
'End target
'End target assignments
'End worker
'End manager
'END manager list
Version 2006.07.27

4K 100 Percent Read, with 80 Percent Random


Version 2006.07.27
'TEST SETUP ====================================================================
'Test Description
4k_100Read_100Rand_cust
'Run Time
' hours minutes seconds
0 1 0

'Ramp Up Time (s)
0
'Default Disk Workers to Spawn
NUMBER_OF_CPUS
'Default Network Workers to Spawn
0
'Record Results
ALL
'Worker Cycling
' start step step type
1 1 LINEAR
'Disk Cycling
' start step step type
1 1 LINEAR
'Queue Depth Cycling
' start end step step type
1 32 2 EXPONENTIAL
'Test Type
NORMAL
'END test setup
'ACCESS SPECIFICATIONS =========================================================
'Access specification name,default assignment
4k; 100% Read; 80% Random, NONE
'size,% of size,% reads,% random,delay,burst,align,reply
4096,100,100,80,0,1,4096,0
'END access specifications
'MANAGER LIST ==================================================================
'Manager ID, manager name
1,IOA-manager
'Manager network address
127.0.0.1
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
0,0
'End default target settings for worker
'Assigned access specs
4k; 100% Read; 80% Random
'End assigned access specs
'Target assignments

'Target
sdb
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
0,0
'End default target settings for worker
'Assigned access specs
4k; 100% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdc
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
0,0
'End default target settings for worker
'Assigned access specs
4k; 100% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdd

'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
0,0
'End default target settings for worker
'Assigned access specs
4k; 100% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sde
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
0,0
'End default target settings for worker
'Assigned access specs
4k; 100% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdf
'Target type
DISK
'End target

'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
0,0
'End default target settings for worker
'Assigned access specs
4k; 100% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdg
'Target type
DISK
'End target
'End target assignments
'End worker
'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
0,0
'End default target settings for worker
'Assigned access specs
4k; 100% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdh
'Target type
DISK
'End target
'End target assignments
'End worker

'Worker
IOA-worker
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
16,DISABLED,1
'Disk maximum size,starting sector
0,0
'End default target settings for worker
'Assigned access specs
4k; 100% Read; 80% Random
'End assigned access specs
'Target assignments
'Target
sdi
'Target type
DISK
'End target
'End target assignments
'End worker
'End manager
'END manager list
Version 2006.07.27

Appendix B: VMware Virtual SAN Requirements
Table 12 lists VMware Virtual SAN prerequisites, and Table 13 lists limitations and recommendations.

Table 12. VMware Virtual SAN Prerequisites

Component Description
VMware vCenter Server Minimum Version 5.5 Update 1
VMware vSphere Minimum Version 5.5
Hosts Minimum 3 VMware ESXi hosts
Disk controller Requires one of the following:
SAS or SATA HBA
RAID controller; must function in either pass-through (preferred) or RAID 0 mode

Hard disk drives Minimum 1 SAS, NL-SAS, or SATA magnetic hard drive per host
Flash-memory-based devices Minimum 1 SAS, SATA, or PCIe SSD per host

Network interface cards (NICs) Minimum one 1-Gbps network adapter per host; 10 Gbps is recommended
Virtual switch VMware vSphere Standard Switch, vSphere Distributed Switch, or Cisco Data Center VM-FEX
VMkernel network A VMkernel port enabled for VMware Virtual SAN traffic on each host
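
Once hosts meet these prerequisites, basic Virtual SAN status can be verified from the VMware ESXi command line. The following is a minimal sketch using the esxcli vsan command namespace available on Virtual SAN-capable ESXi hosts; exact output varies by release.

esxcli vsan cluster get      # Shows whether the host has joined a Virtual SAN cluster and its node state
esxcli vsan network list     # Lists the VMkernel interfaces enabled for Virtual SAN traffic
esxcli vsan storage list     # Lists the SSDs and magnetic disks claimed by Virtual SAN on this host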

Table 13. VMware Virtual SAN Limitations and Recommendations

Limitations and Recommendations Description


Limitations Maximum 32 hosts per VMware Virtual SAN cluster
Maximum 5 disk groups per host
Maximum 7 magnetic disks per disk group
Maximum 1 SSD per disk group

Recommendations All cluster hosts share identical hardware configuration.


All cluster hosts have the same number of disk groups.
An SSD-to-magnetic-disk capacity ratio of 1:10 of the anticipated consumed storage capacity is required before the number of failures to tolerate (FTT) is considered; see the sizing sketch after this table.
Each cluster host has a single VMkernel NIC enabled for VMware Virtual SAN.
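
As a rough sizing sketch for the 1:10 guideline, assume a hypothetical host that is expected to hold about 8 TB of consumed virtual machine data before FTT copies are counted. Ten percent of 8 TB is approximately 800 GB of flash, which corresponds to the single 800-GB SAS SSD used per disk group in the configurations described in this paper. The 8-TB figure is for illustration only; size the flash tier against the consumed capacity anticipated for your own workloads.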

Appendix C: Ruby vSphere Console and VMware VSAN Observer
Ruby vSphere Console (RVC) [7] is a Linux console user interface for VMware ESXi and vCenter. RVC is installed on the VMware vCenter Server system and is required for running VSAN Observer commands.

VSAN Observer is part of RVC and is recommended for deeper visibility into a VMware Virtual SAN environment. For this solution, VSAN Observer was used extensively to monitor all performance results and maintenance and availability operations.

Table 14 lists the VSAN Observer commands that were used for this solution.

Table 14. VSAN Observer Commands

VSAN Observer Command Description


vsan.resync_dashboard 10.0.108.15 -r 0 Observe data migration while placing hosts in Full Migration maintenance mode.
vsan.disk_object_info Verify disk object information.
vsan.vm_object_info Verify virtual machine object information.

vsan.disks_info hosts/10.0.108.15 Obtain a list of disks on a specific host.


vsan.obj_status_report Obtain health information about VMware Virtual SAN objects. This command is helpful
in identifying orphaned objects.

vsan.reapply_vsan_vmknic_config Re-enable VMware Virtual SAN on VMkernel ports while performing network configuration-related troubleshooting.
vsan.observer {cluster name} -r -o -g /tmp -i 30 -m 1 Enable and capture performance statistics used for benchmark testing.
See also [8].

For a more comprehensive list of VSAN Observer commands, please refer to [9].
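
For reference, a typical VSAN Observer session follows the workflow sketched below. The vCenter Server address, single sign-on user, datacenter name, and cluster name are placeholders for illustration only; substitute the values for your environment. By default, the live VSAN Observer web interface is served on port 8010 of the system running RVC.

rvc administrator@vsphere.local@vcenter.example.com
cd /vcenter.example.com/Datacenter01/computers/VSAN-Cluster
vsan.observer . -r -o -g /tmp -i 30 -m 1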

Appendix D: Cisco Part Numbers
Table 15 provides Cisco part numbers for the components used in this solution.

Table 15. Cisco Ordering Information

Component Cisco Part Number


Cisco UCS C240 M3 server UCSC-C240-M3S
8-GB DDR3 1600-MHz RDIMM PC3-12800, dual rank, 1.35V UCS-MR-1X082RY-A
Seagate 1-TB SATA disk A03-D1TBSATA
Cisco UCS VIC 1225 CNA UCSC-PCIE-CSC-02
Cisco FlexFlash card UCS-SD-16G
Cisco UCS 6248UP fabric interconnect UCS-FI-6248UP

Resources
1. What's new in VMware Virtual SAN
2. Cisco FlexFlash: Use and manage Cisco Flexible Flash internal SD card for Cisco UCS C-Series
standalone rack servers
3. VMware Virtual SAN compatibility guide
4. LSI
5. Changing the default repair-delay time for a host failure in VMware Virtual SAN
6. IO Analyzer
7. Ruby vSphere Console (RVC)
8. Enabling or capturing performance statistics using VMware Virtual SAN Observer
9. VMware Virtual SAN quick monitoring and troubleshooting reference guide
10. Cisco UCS C240 M3 high-density rack server (SFF disk-drive model) specification sheet
11. Working with VMware Virtual SAN
12. VMware Virtual SAN Ready System recommended configurations
13. Enabling or capturing statistics using VMware Virtual SAN Observer for VMware Virtual SAN Resources

Acknowledgements
The following individuals contributed to this paper:

Bhumik Patel, Partner Architect, VMware

John Kennedy, Technical Marketing Engineer, Cisco Systems

Wade Holmes, Senior Technical Marketing Architect, VMware

Printed in USA C11-732332-00 08/14
