
Technical white paper

Adaptive Optimization for HP 3PAR StoreServ Storage
Configure multiple tiers of storage devices for maximum performance

Table of contents
Executive summary
Storage tiers: Opportunity and challenge
HP 3PAR Adaptive Optimization software
  Brief overview of volume mapping
  Adaptive optimization implementation
  Design tradeoff: Tiering vs. caching
  Configuration
  Tiering analysis algorithm
  Design tradeoff: Granularity of data movement
Results
Customer case study
Summary




Executive summary
New opportunities exist to optimize the cost and performance of storage arrays, thanks to the availability of a wide range of storage media such as solid-state drives (SSDs), high-performance hard disk drives (HDDs), and high-capacity HDDs. But these opportunities come with the challenge of exploiting them effectively and without increasing administrative burdens, because the tradeoffs for storage arrays differ from those of CPU memory hierarchies. This white paper explains some of the tradeoffs, describes the technology that adaptively optimizes storage on HP 3PAR StoreServ Storage, and illustrates its effectiveness with performance results.
Storage tiers: Opportunity and challenge
Modern storage arrays support multiple tiers of storage media with a wide range of performance, cost, and capacity characteristics, ranging from inexpensive (~$200 USD) 2 TB SATA HDDs that can sustain only about 75 input/output operations per second (IOPS) to expensive (~$500+ USD) 50–200 GB SLC/MLC flash memory-based SSDs that can sustain more than 4,000 IOPS. Volume RAID and layout choices enable additional performance, cost, and capacity options. This wide range of cost, capacity, and performance characteristics is both an opportunity and a challenge.
Figure 1. Autonomic tiering in HP 3PAR StoreServ Storage

The opportunity is that the performance and cost of the system can be optimized by correctly placing the data on different tiers: move the most active data to the fastest (and most expensive) tier and the idle data to the slowest (and least expensive) tier. The challenge, of course, is to do this in a way that minimizes the burden on storage administrators while also providing them with appropriate controls. Currently, data placement on different tiers is a task usually performed by storage administrators, and their decisions are often based not on application demands but on the price paid by the users. Without careful analysis, they may allocate storage based on available space rather than on performance requirements. At times, the HDDs with the largest capacity may also receive the highest number of accesses. But the largest HDDs are often the slowest, which can create significant performance bottlenecks.
There is an obvious analogy with CPU memory hierarchies. Although the basic idea is the same (use the smallest, fastest, most expensive resource for the busiest data), the implementation tradeoffs are different for storage arrays. While deep CPU memory hierarchies (first-, second-, and third-level caches; main memory; and finally paging store) are ubiquitous and have mature design and implementation techniques, storage arrays typically have only a single cache level (the cache on disk drives usually acts more like a buffer than a cache). Automatic tiering in storage arrays is a recent development and is not yet commonplace; the industry still has much to learn about it.
HP 3PAR Adaptive Optimization software
Brief overview of volume mapping
Before you can understand HP 3PAR Adaptive Optimization, it is important to understand volume mapping on HP 3PAR StoreServ Storage, as illustrated in Figure 2.
Figure 2. HP 3PAR Adaptive Optimization

HP 3PAR virtual volumes (VVs) are organized into volume families (or trees) consisting of a base volume at the root and optional copy-on-write (COW) snapshot volumes of the base VV or of other snapshot VVs in the tree.
Each volume family has three distinct data storage spaces: 1) user space for the base volume; 2) snap space for the copy-on-write data; and 3) admin space for the mapping metadata for the snapshots. If the base volume is fully provisioned, there is a direct, one-to-one mapping from the VV virtual address to the user space. If the base volume is thin-provisioned, only written space in the base volume is mapped to user space, and the mapping metadata is stored in the admin space, just as it is for COW snapshots. The unit of mapping for the snapshot COW or thin-provisioned VVs is a 16 KB page. Caching is done at the VV space level at a granularity of 16 KB pages.
Physical storage in HP 3PAR StoreServ Storage is allocated to the volume family spaces in units of logical disk (LD) regions.
The region size for the user and snap spaces is 128 MB, and the region size for the admin space is 32 MB.
Logical disk storage is striped across multiple RAID sets built from 256 MB allocation units of physical disks (PDs) known
as chunklets. Every RAID set within one LD has the same RAID type (1, 5, or 6), set size, and disk type (SSD, FC, or SATA Nearline [NL]). These parameters determine the LD characteristics in terms of performance, cost, redundancy, and
failure modes.
HP 3PAR StoreServ Storage is a cluster of controller nodes. The chunklets for one LD are allocated only from PDs with the
primary access path directly connected to the same node, known as the LD owner node. You can achieve system level data
striping by striping the volume family space across regions from LDs owned by different nodes. This ownership partitioning
is one reason why thin-provisioned volumes still contain a user space mapping in which each region maps to a dummy zero
LD with no physical storage.
A common provisioning group (CPG) is a collection of LDs. It contains the parameters for additional LD space creation,
which includes RAID type, set size, and disk type for chunklet selection, plus total space warning and limit points. Multiple VV
family spaces may be associated with a CPG from which they get LD space on demand. Therefore, the CPG is a convenient
way to specify a tier for adaptive optimization because it includes all of the necessary parameters and it permits
adaptive optimization to operate after the cache. (There is no reason to bring busy data that is in the controller cache into
high-performance storage below the cache.) An additional benefit of tiering at this level is that all three volume spaces, not
just user space, are candidates for adaptive optimization. In fact, measurements show that admin space metadata regions
are frequently chosen to be placed in the fastest tier.
Figure 2 illustrates the volume mapping for both non-tiered and tiered (adaptively optimized) volumes. For non-tiered VVs, each space (user, snap, or admin) is mapped to LD regions within a single CPG and therefore resides in a single tier. For tiered VVs, each space can be mapped to regions from different CPGs.
Finally, remember that although this mapping from VVs to VV spaces to LDs to chunklets is complex, the user is not
exposed to this complexity because the system software automatically creates the mappings.
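For readers who prefer code to prose, the following minimal Python sketch models the mapping hierarchy just described. The class and field names are illustrative assumptions, not HP 3PAR's internal data structures.

# Illustrative model of the VV -> VV space -> LD -> chunklet hierarchy.
# Names and fields are assumptions for illustration only.
from dataclasses import dataclass, field
from typing import List

CHUNKLET_MB = 256          # allocation unit carved from a physical disk (PD)
USER_SNAP_REGION_MB = 128  # LD region size for user and snap spaces
ADMIN_REGION_MB = 32       # LD region size for admin (metadata) space

@dataclass
class Chunklet:
    physical_disk: str     # PD the chunklet was allocated from
    owner_node: int        # node with the primary access path to that PD

@dataclass
class LogicalDisk:
    raid_type: str         # "RAID1", "RAID5", or "RAID6"; same for every RAID set in the LD
    disk_type: str         # "SSD", "FC", or "NL"
    owner_node: int        # chunklets come only from PDs attached to this node
    chunklets: List[Chunklet] = field(default_factory=list)

@dataclass
class CPG:
    # Common provisioning group: a collection of LDs plus the parameters
    # (RAID type, set size, disk type) used to grow LD space on demand.
    name: str
    raid_type: str
    disk_type: str
    lds: List[LogicalDisk] = field(default_factory=list)

@dataclass
class VVSpace:
    # One of the three volume-family spaces (user, snap, or admin). Each
    # region maps to an LD region; for a tiered volume, different regions
    # may come from LDs in different CPGs (tiers).
    kind: str                                               # "user", "snap", or "admin"
    region_to_cpg: List[str] = field(default_factory=list)  # CPG name per region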
The remainder of this white paper describes how this tiering is implemented and the benefits that can be expected.
Adaptive optimization implementation
In order to implement tiering, HP 3PAR Adaptive Optimization needs to do four things: (1) collect historical data of accesses
for all the regions in an array (this can be a lot of data); (2) analyze the data to determine the volume regions that should be
moved between tiers; (3) instruct the array to move the regions from one CPG (tier) to another; and (4) provide the user with
reports that show the impact of adaptive optimization.
HP 3PAR offers application software called System Reporter that runs on a host server and periodically collects detailed performance and space data from HP 3PAR arrays, stores the data in a database, and analyzes it. System Reporter can then generate Adaptive Optimization (AO) reports from a host, or the HP 3PAR StoreServ Storage array can generate AO reports from the HP 3PAR OS management console.
HP implemented adaptive optimization by enhancing System Reporter to collect region-level performance data, perform
tiering analysis, and issue region movement commands to the array as shown in Figure 3.
Figure 3. Adaptive optimization implementation using System Reporter
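The sketch below outlines this four-step cycle. The function names and the toy array object are placeholders invented for illustration; they are not the System Reporter API.

# A high-level sketch of the collect/analyze/move/report cycle.
def collect_region_stats(array):
    # (1) Sample per-region access counts from the array (stubbed here).
    return array.get("region_stats", {})

def analyze_tiers(history):
    # (2) Decide which regions to move between tiers, using the tiering
    # analysis algorithm described later in this paper (stubbed: no moves).
    return []

def adaptive_optimization_cycle(array, history):
    stats = collect_region_stats(array)
    history.append(stats)                  # keep history for analysis/reporting
    moves = analyze_tiers(history)
    for region, dst_cpg in moves:          # (3) issue region-move commands
        print(f"move region {region} -> CPG {dst_cpg}")
    return {"cycles": len(history), "moves": len(moves)}  # (4) report impact

# Example: one cycle against a toy in-memory "array".
report = adaptive_optimization_cycle({"region_stats": {}}, [])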

Design tradeoff: Tiering vs. caching
Traditional caching is an obvious choice for an algorithm to manage the different tiers of storage. In this case, data is copied
from slower tiers into the fastest tier whenever it is accessed, replacing older data by using a simple, real-time algorithm
such as least recently used (LRU). These caching algorithms have been extensively studied in the context of CPU-memory
hierarchies. However, disk storage tiers in an array are different from a typical memory hierarchy in several respects.
In memory hierarchies, the faster tiers are almost always much smaller than the slower tiers. Regions that are cached in the faster tier also occupy space on the slower tier, but the space duplicated on the slower tier is a small fraction of its total size. In contrast, on arrays, the total space of the mid-tier FC drives is often a significant fraction of the space on the slow-tier NL drives, and losing the duplicated space is generally not desirable.
Memory hierarchies require very fast response times, so it is not feasible to use complex analysis to figure out what
should be cached or replaced. Simple algorithms such as LRU are all that designers can afford. For storage tiers, it is
possible to devote time to more sophisticated analysis of access patterns to come up with more effective strategies than
simple LRU algorithms.
Memory hierarchies typically use different hardware resources (memory buses) for different tiers, and transferring data
between tiers may not significantly impact the available bandwidth to the fastest tier. Disk tiers may often share the same
resources (FC ports). Also, the bandwidth used while transferring data between tiers impacts the total backend bandwidth
available to the controllers.
For these reasons, HP chose to move regions between tiers instead of caching.
Configuration
Simple administration is an important design goal, which makes it tempting to automate adaptive optimization completely so that the administrator need not configure anything at all. However, analysis indicates that some controls are in fact desirable. Since HP 3PAR StoreServ Storage is typically used for multiple applications, often for multiple customers, HP allows administrators to create multiple adaptive optimization configurations so that they can use different configurations for different applications or customers. Figure 4 shows the settings for an adaptive optimization configuration.
Figure 4. Configuration settings

You can select CPGs for each of the tiers and also set a tier size if you want to limit the amount of space that the algorithm
will use in each tier. You can set a very large number if you do not want to limit the size available for any given tier. Note that
adaptive optimization will attempt to honor this size limit in addition to any warning or hard limit specified in the CPG.
Make sure to define tier 0 at a higher performance level than tier 1, which in turn should be higher performance than tier 2. For example, you might choose RAID 1 with SSDs for tier 0, RAID 5 with FC drives for tier 1, and RAID 6 with NL (SATA) drives for tier 2.
Best practice is to begin an Adaptive Optimization configuration with your application's CPG as tier 1; for example, tier 1 could be a CPG using your FC or SAS physical disks. Starting in the middle lets you add higher and lower tiers later with new CPGs, such as a tier 0 using SSDs or a tier 2 using NL drives. Alternatively, you could define the CPG tiers by RAID level, for example RAID 1 or RAID 5 for the upper tiers and RAID 6 for the lowest. The main point is to begin with the middle tier (tier 1) when configuring Adaptive Optimization for your application.
It is also important to specify the schedule on which a configuration will execute, along with the duration of the measurement period that precedes each execution. This allows the administrator to schedule data movement at times when its additional overhead is acceptable (for example, non-peak hours). You can also schedule when adaptive optimization should stop running before the next measurement period begins.
Plus, you can set a mode configuration parameter to one of three values:
1. Performance mode biases the tiering algorithm (described in the next section) to move more data into the faster tiers.
2. Cost mode biases the tiering algorithm to move more data into the slower tiers.
3. Balanced mode strikes a balance between performance and cost.
The mode configuration parameter does not change the basic flow of the tiering analysis algorithm; rather, it changes certain tuning parameters that the algorithm uses.
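To make these settings concrete, the sketch below expresses one such configuration as plain data, mirroring Figure 4. The field names and values are illustrative assumptions, not the actual HP 3PAR CLI or GUI parameter names.

# A hypothetical adaptive optimization configuration for one application.
ao_config = {
    "name": "vmware_cluster",
    "tiers": [
        {"tier": 0, "cpg": "SSD_r1", "size_limit_gb": 500},     # fastest
        {"tier": 1, "cpg": "FC_r5",  "size_limit_gb": 20000},
        {"tier": 2, "cpg": "NL_r6",  "size_limit_gb": 100000},  # slowest
    ],
    "mode": "Balanced",        # or "Performance" / "Cost"
    "schedule": "02:00",       # execute during off-peak hours
    "measurement_hours": 24,   # measurement duration preceding each execution
}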
Tiering analysis algorithm
The tiering analysis algorithm, which selects regions to move from one tier to another, considers the factors described in the following sections.
Space available in the tiers
If the space used in a tier exceeds the tier size (or the CPG warning limit), the algorithm will first try to move regions out of that tier into any other tier with available space, in an attempt to bring the tier's size below the limit. If no other tier has space, the algorithm logs a warning and does nothing. (Note that if the warning limit for any CPG is exceeded, the array will generate an alert.) If space is available in a faster tier, it chooses the busiest regions to move to that tier. Similarly, if space is available in a slower tier, it chooses the most idle regions to move to that tier. The average tier service times and average tier access rates are ignored when data is being moved because a tier's size limits have been exceeded.
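A minimal sketch of this overflow rule follows, assuming simplified tier records. The selection order and batching of the real implementation are not published, so this only illustrates the shape of the decision.

# Move regions out of a tier that has exceeded its size limit.
def relieve_overflow(tier, faster_tier, slower_tier):
    if not tier["regions"] or tier["used_gb"] <= tier["size_limit_gb"]:
        return []                                               # nothing to do
    moves = []
    by_accr = sorted(tier["regions"], key=lambda r: r["accr"])  # idlest first
    if faster_tier is not None and faster_tier["free_gb"] > 0:
        moves.append((by_accr[-1]["id"], faster_tier["name"]))  # busiest up
    if slower_tier is not None and slower_tier["free_gb"] > 0:
        moves.append((by_accr[0]["id"], slower_tier["name"]))   # idlest down
    # An empty list here corresponds to "log a warning and do nothing".
    return moves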
Average tier service times
Normally, HP 3PAR Adaptive Optimization tries to move busier regions in a slow tier into higher performance tiers. However, if a higher performance tier becomes overloaded (too busy), performance for regions in that tier may actually be worse than for regions in a slower tier. To prevent this, the algorithm does not move any regions from a slower to a faster tier unless the faster tier's average service time is lower than the slower tier's average service time by a certain factor (a parameter called svctFactor). There is an important exception to this rule, because service times are only significant if there is sufficient IOPS load on the tier. If the IOPS load on the destination tier is below another value (a parameter called minDstIops), then the destination tier's average service time is not compared with the source tier's; instead, an absolute threshold (a parameter called maxSvctms) is used.
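The guard can be summarized as a small predicate. This is a minimal sketch using the parameter names above; the default values shown are invented for illustration, since the paper does not publish them.

# Decide whether promotions into a faster tier are allowed at all.
def may_promote_into(dst_svctms, src_svctms, dst_iops,
                     svct_factor=1.5, min_dst_iops=50, max_svctms=20.0):
    if dst_iops < min_dst_iops:
        # Too little load on the destination tier for its measured service
        # time to be meaningful: fall back to an absolute threshold.
        return dst_svctms < max_svctms
    # Otherwise the faster tier's average service time must be lower than
    # the slower tier's by the svctFactor margin.
    return dst_svctms * svct_factor < src_svctms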
Average tier access rate densities
When not limited, as described above, by lack of space in tiers or by high average tier service times, adaptive optimization
computes the average tier access rate densities (a measure of how busy the regions in a tier are on average, calculated with
units of IOPS per gigabyte per minute) and compares them with the access rate densities of individual regions in each tier.
Then, it decides whether to move the region to a faster or slower tier.
We first consider the algorithm for selecting regions to move from a slower to a faster tier. For a region to be considered busy enough to move from a slower to a faster tier, its access rate density, accr(region), must satisfy these two conditions:
First, the region must be sufficiently busy compared to other regions in the source tier:
accr(region) > srcAvgFactorUp(Mode) * accr(srcTier)
where accr(srcTier) is the average access rate density of the source (slower) tier and srcAvgFactorUp(Mode) is a tuning parameter that depends on the mode configuration parameter. By selecting different values of srcAvgFactorUp for the performance, balanced, and cost modes, HP 3PAR Adaptive Optimization controls how aggressively the algorithm moves regions up to faster tiers.
Second, the region must meet one of two conditions: it must be sufficiently busy compared with other regions in the destination tier, or it must be exceptionally busy compared with the source tier regions. The second condition covers the case in which a very small number of extremely busy regions are moved to the fast tier, and the average access rate density of the fast tier would otherwise create too high a barrier for other busy regions to move up:

accr(region) > minimum((dstAvgFactorUp(Mode) * accr(dstTier)), (dstAvgMaxUp(Mode) * accr(srcTier)))
The algorithm for moving idle regions down from faster to slower tiers is similar in spirit, but instead of checking for access rate densities greater than some value, the algorithm checks for access rate densities less than some value:

accr(region) < srcAvgFactorDown(Mode) * accr(srcTier)
accr(region) < maximum((dstAvgFactorDown(Mode) * accr(dstTier)), (dstAvgMinDown(Mode) * accr(srcTier)))
HP makes a special case for regions that are completely idle (accr(region) = 0). These regions are moved directly
to the lowest tier.
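Putting the promotion and demotion rules together, here is a minimal sketch of the two tests using the parameter names above. The per-mode tuning values shown are invented placeholders; HP does not publish the actual numbers.

# accr values are access rate densities in IOPS per GB per minute.
def should_move_up(accr_region, accr_src, accr_dst,
                   src_avg_factor_up=2.0, dst_avg_factor_up=0.5,
                   dst_avg_max_up=10.0):
    busy_vs_source = accr_region > src_avg_factor_up * accr_src
    busy_vs_dest = accr_region > min(dst_avg_factor_up * accr_dst,
                                     dst_avg_max_up * accr_src)
    return busy_vs_source and busy_vs_dest

def should_move_down(accr_region, accr_src, accr_dst,
                     src_avg_factor_down=0.5, dst_avg_factor_down=2.0,
                     dst_avg_min_down=0.1):
    if accr_region == 0:
        return True  # completely idle regions go straight to the lowest tier
    idle_vs_source = accr_region < src_avg_factor_down * accr_src
    idle_vs_dest = accr_region < max(dst_avg_factor_down * accr_dst,
                                     dst_avg_min_down * accr_src)
    return idle_vs_source and idle_vs_dest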
Design tradeoff: Granularity of data movement
The volume space to LD mapping has a granularity of either 128 MB (user and snapshot data) or 32 MB (admin metadata), and that is naturally the granularity at which data is moved between tiers. Is that the optimal granularity? On the one hand, fine-grain data movement is better because a small region of busy data can be moved to high-performance storage without bringing along idle data adjacent to it. On the other hand, a fine-grain mapping imposes larger overhead because HP 3PAR Adaptive Optimization must track the performance of more regions, maintain more mappings, and perform more data movement operations. Larger regions also take more advantage of spatial locality (the blocks near a busy block are more likely to be busy in the near future than a distant block). HP's results show that the chosen granularity is a good one.
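A quick back-of-the-envelope calculation illustrates the tracking overhead at stake, assuming a hypothetical array with 100 TB of mapped space:

# Number of regions to track at different granularities (hypothetical 100 TB).
TOTAL_MB = 100 * 1024 * 1024            # 100 TB expressed in MB
for region_mb in (16 / 1024, 32, 128):  # 16 KB page vs. admin vs. user region
    regions = TOTAL_MB / region_mb
    print(f"{region_mb:>10} MB regions -> {regions:>13,.0f} regions to track")
# 128 MB regions: ~0.8 million to track; 16 KB pages: ~6.7 billion, an
# 8192x increase in bookkeeping for the same capacity.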
Results
HP measured the access rate of all regions for a number of application CPGs, sorted the regions by access rate, and plotted the cumulative access rate versus the cumulative space, as shown in Figure 5. For all the applications, most of the accesses are concentrated in a small percentage of the regions. In several applications, this concentration is very pronounced (more than 95 percent of the accesses go to less than 3 percent of the data), but it is less so for others (more than 30 percent of the space is needed to capture 95 percent of the accesses). In total, just 4 percent of the data receives 80 percent of the accesses. This indicates that the choice of region size is reasonably good, at least for some applications.
Figure 5. Distribution of IO accesses among regions for various applications
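A curve like those in Figure 5 can be derived by sorting regions from busiest to idlest and accumulating their access rates. The short sketch below shows the idea with invented toy data.

# Cumulative fraction of accesses captured by the busiest regions.
def cumulative_access_curve(region_iops):
    total = sum(region_iops) or 1          # avoid dividing by zero
    points, running = [], 0.0
    for iops in sorted(region_iops, reverse=True):  # busiest regions first
        running += iops
        points.append(running / total)
    return points  # index i -> fraction captured by the (i + 1) busiest regions

curve = cumulative_access_curve([900, 50, 20, 10, 5, 5, 5, 3, 1, 1])
# With this toy data, the single busiest region (10 percent of the space)
# already captures 90 percent of all accesses.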

Because SSD space is still extremely expensive relative to HDD space (10x to 15x), a very pronounced concentration of IO accesses to a small number of regions is needed for SSDs to be cost-effective. For applications that show less pronounced access concentration, HP 3PAR Adaptive Optimization may still be useful between different HDD tiers. One of the simple but important ideas in the implementation is the separation of the analysis and movement by CPGs (or applications).
The example results in Figure 6 show region IO density after HP 3PAR Adaptive Optimization has run for a while. Both charts are histograms, with the x-axis showing the IO rate density buckets; the busiest regions are to the right and the most idle to the left. The chart on the left shows on the y-axis the capacity of all the regions in each bucket, while the chart
on the right shows on the y-axis the total IOPS/min for the regions in each bucket. As the charts show, the SSD tier (tier 0) occupies very little space but absorbs most of the IO accesses, whereas the Nearline tier (tier 2) occupies most of the space but absorbs almost no accesses at all. This is precisely what the user wants.
Figure 6. Two region IO density reports after adaptive optimization: the first with two tiers and the second with three tiers.


Customer case study
This section describes the real benefits that a customer derived from using HP 3PAR Adaptive Optimization. The customer had a system with 96 × 300 GB 15k rpm FC drives and 48 × 1 TB 7.2k rpm NL drives, with 52 physical servers connected and running VMware with more than 250 VMs. The workload was mixed (development and QA, databases, file servers), and the customer needed more space to accommodate many more VMs that were scheduled to be moved onto the array. However, they faced a performance issue: they had difficulty managing their two tiers (FC and NL) in a way that kept the busier workloads on the FC disks. Even though the NL disks had substantially less performance capability (because there were fewer NL disks and they were much slower), they had larger overall capacity. As a result, more workloads were allocated to them, and they tended to be busier while incurring long latencies. The customer considered two options: purchase an additional 96 FC drives, or purchase an additional 48 NL drives plus 16 SSD drives and use HP 3PAR Adaptive Optimization to migrate busy regions onto the SSDs. They chose the latter and were pleased with the results (illustrated in Figure 7).
Figure 7. Improved performance after adaptive optimization

Before HP 3PAR Adaptive Optimization, as the charts on the left show, the NL drives, even though there are fewer of them, incur a greater aggregate IOPS load than the FC drives and consequently have very poor latency (~40 ms) compared with the FC drives (~10 ms). After HP 3PAR Adaptive Optimization has executed for a little while, as shown in the charts on the right, the IOPS load on the NL drives has dropped substantially, having been transferred mostly to the SSD drives. HP 3PAR Adaptive Optimization moved ~33 percent of the IOPS workload to the SSD drives even though that involved moving only 1 percent of the space. Performance improved in two ways: the 33 percent of the IOPS serviced by the SSD drives got very good latencies (~2 ms), and the latencies of the NL drives also improved (from ~40 ms to ~15 ms). Moreover, the investment in the 16 SSD drives permits the customer to add even more NL drives in the future, because the SSD drives have both space and performance headroom remaining.


Summary
HP 3PAR Adaptive Optimization is a powerful tool for identifying how to configure multiple tiers of storage devices for
maximum performance. Its management features can deliver results with minimal effort. As in all matters concerning
performance, your results may vary, but proper focus and use of HP 3PAR Adaptive Optimization can deliver significant
improvements in device utilization and total throughput.
Learn more at
hp.com/go/3PARStoreServ

© Copyright 2012–2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

4AA4-0867ENW, March 2013, Rev. 1
