
Mastering HUAWEI OceanStor

In Only 4 Hours!

This document is intended for Huawei technical and product personnel and
is for internal use only.
For promotion data and policies, refer to the latest released promotion data
and sales guide.
Do not use this document as commitments to customers.

HUAWEI TECHNOLOGIES CO., LTD.

Copyright Huawei Technologies Co., Ltd. 2014. All rights reserved.


No part of this document may be reproduced or transmitted in any form or by any means without
prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions


and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.

Notice
The purchased products, services and features are stipulated by the commercial contract made between
Huawei Symantec and the customer. All or partial products, services and features described in this
document may not be within the purchased scope or the usage scope. Unless otherwise agreed by the
contract, all statements, information, and recommendations in this document are provided AS
IS without warranties, guarantees or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.


Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen 518129, People's Republic of China

Website: http://www.huawei.com

Email: support@huawei.com

Issue 01 (2009-04-10)

Huawei Proprietary and Confidential


Huawei Technologies Co., Ltd.

Preface
What can we do in 4 hours? Hmm... We can take a flight, enjoy a big feast, or watch a
movie. Now we have another option: to become a storage expert!
Of course, this shortcut to expertise only suits the gifted ones who can meet all of the
following conditions:

Condition 1: You are a Huawei employee.

Condition 2: You have some background knowledge of storage, for example, you know
terms such as "disk".

Condition 3: You are not yet a storage expert.

Condition 4: You have 4 hours of free time.

If you have met all the conditions, let's start the journey to explore the storage world. After 4
hours, you will find out storage is such an easy thing to learn!


About This Document


Intended Audience and Document Overview
This document is intended for Huawei frontline marketing personnel.
This document has a base version and an advanced version.
After reading the base version, you will be able to understand:
1. Basic storage knowledge
2. Storage market and its characteristics
3. Major storage manufacturers

After reading the advanced version, you will be able to understand:
1. Huawei storage products and their characteristics (especially their differentiated
   competitiveness)
2. Development tendency of Huawei storage
3. Sales strategies of Huawei storage

This document aims to help the audience attain basic storage knowledge and understand the
unique values of Huawei storage.
This document gives only a brief overview of the technical features of Huawei storage. For
more information, see the related technical white papers on the 3MS website. Reference links
are provided at the end of the document.
This document is for internal use only and cannot be used as commitments to customers.
For promotion data and policies, refer to the latest released promotion data and sales guide.


Contents
1 General Storage Knowledge
  1.1 Basics
    1.1.1 What is storage? Where can we use storage?
    1.1.2 What are the categories of storage?
    1.1.3 What are the components in a storage system?
    1.1.4 What are the key indexes to evaluate a storage system?
  1.2 Storage Market
    1.2.1 What is the storage market size?
    1.2.2 What are the development trends of the storage market?
    1.2.3 Why are there so many models of storage products?
  1.3 Major Storage Manufacturers
    1.3.1 Who are the major storage manufacturers?
    1.3.2 What are the products provided by these major manufacturers?
  1.4 Key Storage Technologies
    1.4.1 What is RAID?
    1.4.2 What are the differences among RAID levels?
    1.4.3 What is reconstruction? Why is the reconstruction speed so important?
    1.4.4 How to shorten the reconstruction process to improve storage reliability?
    1.4.5 What is cache? Why is it important in improving storage efficiency?
    1.4.6 What are backup and disaster recovery?
    1.4.7 What are RTO and RPO?
    1.4.8 What are the common data backup solutions?
    1.4.9 How to improve the overall reliability of a storage system?
2 Acronyms and Abbreviations


1 General Storage Knowledge

1.1 Basics
1.1.1 What is storage? Where can we use storage?
Answer: Storage refers to the equipment that stores data. Based on application scenarios,
storage is categorized into consumer storage and enterprise storage.
1. Consumer storage is the storage equipment used by individuals, such as laptops, tablets,
   and mobile storage media. Such equipment has small capacity, low reliability, poor
   performance, and low cost.

2. Enterprise storage is the storage equipment used by enterprises. Its characteristics
   are the opposite of those of consumer storage.

The easiest kind of storage to understand is the disk, but the reliability of a single disk is
only two 9s (99%). Data centers require storage to achieve five-9s (99.999%) or even six-9s
(99.9999%) reliability. In addition, the servers and applications in a data center must share
data, so disks are installed outside servers. Multiple storage engines use a redundancy
algorithm and architecture to manage these disks simultaneously and to access disk and cache
resources, achieving shared storage with high reliability, performance, and scalability. This
is what Huawei enterprise storage does.
As the core equipment for storing data, enterprise storage is widely used in fields including
government, finance, telecommunications, energy, manufacturing, health care, and education.
The market size has reached $100 billion and is growing every year.
Storage, computing, network, and security are the four fundamental elements in the IT
infrastructure of enterprise data centers, and they cooperate to support the operation of
upper-layer applications. The various combinations of these four elements produce a wide
range of products and solutions, which are not detailed in this document.

1.1.2 What are the categories of storage?


Answer: Storage can be categorized into SAN (storage area network) and NAS (network attached
storage) based on how it is used.
On a SAN, dedicated storage equipment is used to house disks and provide storage services.
Its interfaces and usage are similar to those of traditional disks. Compared with disks in
servers, a SAN delivers higher performance, reliability, and scalability, and is better suited
to databases.


If users want to store, share, and access unstructured data such as videos and files on a SAN,
they have to set up file servers and configure local file systems on them. NetApp simplified
this practice by building the file system and file sharing functions into the storage equipment
itself, and this is how NAS came into being. Compared with the SAN + file server solution,
professional NAS equipment has lower cost, higher reliability (with a redundancy
architecture), higher performance, and more functions. As the functions of server file systems
have developed slowly in recent years and customers impose lower demands on NAS than on SAN,
NAS is gradually replacing SAN and has become an arena where Huawei enterprise storage can
accomplish a great deal.
A popular trend in the storage industry is to combine SAN and NAS in the same equipment so
that the equipment supports both database and file sharing applications. This practice greatly
reduces network complexity, equipment purchase cost, and equipment maintenance cost.

1.1.3 What are the components in a storage system?


Answer: Storage can be regarded as a computer with a huge disk, so it has a computing unit
(controllers) and a storage unit (disk enclosures).
1. The computing unit is a high-reliability and high-performance computer that runs a
   dedicated storage operating system. The enclosure where the computing unit resides is a
   controller enclosure. The front end of the controller enclosure connects to application
   servers through Fibre Channel or iSCSI links, and handles storage I/O requests. The back
   end of the controller enclosure connects to disk enclosures, and forwards the I/O requests
   to relevant disks for data reads and writes.

2. The storage unit is the area where data is stored. We can compare the computing unit to
   the human brain and the storage unit to the human body. The storage unit consists of
   disks (HDDs and SSDs) to store data, and the enclosures that house these disks are called
   disk enclosures.

The following shows the exteriors of a controller enclosure and a disk enclosure.

Controller enclosure

Disk enclosure
Notes:
1. To ensure high storage reliability:
   - A controller enclosure usually houses two controllers that are mutually mirrored. This is
     called a dual-controller architecture.
   - A controller enclosure must have redundant batteries that supply power to controllers
     during external power failures and help controllers write cache data to coffer disks.

2. Multi-controllers: Similar to a multi-engine plane, a storage system that has multiple
   controllers can deliver high performance and reliability, but such a design is technically
   difficult to implement. In the past, low-end and mid-range storage systems usually
   supported two controllers, and only high-end storage systems supported more than two
   controllers. Nowadays, as technologies develop, many storage manufacturers, including
   Huawei, add multiple controllers to their low-end and mid-range storage systems.

3. Disk and controller integration: To improve system integration and reduce costs, the
   controller enclosure in a low-end or mid-range storage system contains disks. With such
   a design, no extra disk enclosures are required in scenarios that need only small
   capacities. This design is called "disk and controller integration", and HUAWEI
   OceanStor S2600T and S5000T adopt this design.

4. Disk and controller separation: In scenarios that require large capacities and high
   performance, controller enclosures do not have disks and disk enclosures are responsible
   for data storage. This design is called "disk and controller separation". HUAWEI
   OceanStor S5600T, S5800T, S6800T, and enterprise storage systems adopt this design.

1.1.4 What are the key indexes to evaluate a storage system?


Answer: There are two groups of key storage indexes. The first group is hard indexes that
make a storage system robust. These indexes assess the system's performance, capacity,
hardware processing capability, and interface capability. The second group is soft indexes that
make a storage system smart. These indexes analyze the system's software functions in
resource utilization, service reliability, and user experience. The following table lists the
typical key indexes:
ID | Index        | Description
---|--------------|------------------------------------------------------------------------
1  | Capacity     | Maximum data volume that can be stored in a storage system.
2  | IOPS         | A performance index that counts the processed I/Os per second. It reflects the service volume that a system can process during a specified period of time.
3  | Latency      | The time it takes for the original data to go through a series of processing steps.
4  | Failure rate | The number of failures that may occur during a specified period of time.
5  | Availability | The capability of an IT service and its elements providing required functions during a specified period of time. It is measured by several 9s.
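
To make the "several 9s" measure of availability concrete, the following minimal sketch converts a number of 9s into the downtime it allows per year. The function name and the 365-day year are assumptions made for illustration only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years (assumption)


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime permitted by an availability of `nines` 9s.

    2 nines means 99%, 5 nines means 99.999%, and so on.
    """
    unavailability = 10 ** (-nines)  # fraction of the year the service may be down
    return MINUTES_PER_YEAR * unavailability


for nines in (2, 5, 6):
    minutes = allowed_downtime_minutes(nines)
    print(f"{nines} nines: {minutes:.3f} minutes of downtime per year")
```

Under these assumptions, two 9s allow about 3.65 days of downtime per year, five 9s about 5.3 minutes, and six 9s about 32 seconds.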


1.2 Storage Market


1.2.1 What is the storage market size?
Answer: According to Gartner, the storage market size in 2013 was $22.5 billion (counting
the revenue from sales of storage equipment but excluding that of storage software and
services), and it is growing by about 4.4% per year. The market size of storage equipment is
expected to reach $28 billion in 2018. The revenue from sales of storage hardware and services
will reach $40 billion, and the total revenue will reach $100 billion if the sales of software
and consulting services are counted in.
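
As a quick check of the equipment figures above, the 2018 estimate can be reproduced by compounding the 2013 market size at 4.4% per year. This is only a sketch of the arithmetic; the function name is an assumption, and the real forecast may rest on a more detailed model.

```python
def project_market_size(base_billion: float, annual_growth: float, years: int) -> float:
    """Compound a starting market size at a fixed yearly growth rate."""
    return base_billion * (1 + annual_growth) ** years


# 2013 equipment market of $22.5 billion, growing 4.4% per year until 2018.
size_2018 = project_market_size(22.5, 0.044, 2018 - 2013)
print(f"Projected 2018 equipment market: ${size_2018:.1f} billion")
```

The result is about $27.9 billion, which is consistent with the $28 billion quoted for 2018.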

1.2.2 What are the development trends of the storage market?


Answer: The major trends include the popularity of SSDs, the cloud infrastructure, and the
software-defined storage (SDS).
1. Popularity of SSDs: The development of big data and mobile social networks causes fast
   data growth and places demanding requirements on storage performance and capacity. SSDs,
   with their proven reliability, superb performance, and affordable cost, are becoming more
   and more popular in the storage market.

2. Cloud infrastructure: A traditional service system usually consists of a server and a
   storage system, and is dedicated to one type of service only. This results in many
   information islands in customers' data centers, prevents server and storage resources
   from being shared, and wastes construction and maintenance costs. As these service
   systems develop fast, data center administrators have to constantly adjust the system
   configurations to address capacity and performance bottlenecks, which results in an
   increasing maintenance cost. The cloud infrastructure combines server and storage
   resources and adds all service systems into a resource pool. In this way, capacity and
   performance can be shared. The infrastructure also employs smart data storage
   management technologies to help administrators resolve capacity and performance issues.
   The unified hardware platform, software platform, and management platform in the
   cloud infrastructure reduce system cost, provide services for customers on demand, and
   improve the user experience.

3. SDS: SDS achieves loose coupling of software and hardware. With it, storage software can
   run on general-purpose servers and virtual machines rather than dedicated hardware, so
   cost is reduced. In addition, using storage software, storage systems can attain higher
   performance, higher scalability, and easier maintenance, so the overall system efficiency
   is improved.

1.2.3 Why are there so many models of storage products?


Answer: Newcomers to the storage arena often ask the same question: Why are there so many
models of storage products in the same series? Using the Huawei OceanStor series as an
example, its high-end models include 18800 and 18500, and its low-end and mid-range
models include S6800T, S5800T, S5600T, S5500T, S2600T, and S2200T. The same situation
applies to the EMC VNX series and the NetApp FAS series.
Similar to the BMW 1/3/5 series, different storage products have different configurations,
such as CPUs, memory, number of ports, and number of disks. Therefore, storage products are
divided into many BANDs. The well-known consulting firm Gartner defines 9 BANDs for
storage products according to their prices. Customers can choose to buy products
in different BANDs based on their service requirements and budgets.
The following table lists the 9 BANDs and their prices:


BAND  | Price
------|---------------
BAND9 | 1000K$+
BAND8 | 500K$~999.9K$
BAND7 | 300K$~499.9K$
BAND6 | 200K$~299.9K$
BAND5 | 100K$~199.9K$
BAND4 | 50K$~99.9K$
BAND3 | 25K$~49.9K$
BAND2 | 5K$~24.9K$
BAND1 | 0~4.9K$

The product prices increase from BAND1 to BAND9, where:

BAND1 consists of basic storage arrays such as JBOD.

BAND2 consists of entry-level storage arrays.

BAND3 to BAND6 consist of mid-range storage arrays.

BAND7 to BAND9 consist of high-end storage arrays.
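
The BAND boundaries and tiers listed above can be expressed as a small lookup. The sketch below is illustrative only: the function names are hypothetical, and it assumes that each BAND starts exactly at its listed lower bound (so a price of 4,950 dollars, which falls between the listed ranges, is treated as BAND1).

```python
from bisect import bisect_right

# Lower bounds (in US dollars) of BAND2 through BAND9, taken from the table.
BAND_LOWER_BOUNDS_USD = [5_000, 25_000, 50_000, 100_000,
                         200_000, 300_000, 500_000, 1_000_000]


def band_for_price(price_usd: float) -> int:
    """Return the BAND number (1 to 9) that a product price falls into."""
    if price_usd < 0:
        raise ValueError("price must not be negative")
    # Count how many lower bounds the price has reached, then shift to 1-based.
    return bisect_right(BAND_LOWER_BOUNDS_USD, price_usd) + 1


def tier_for_band(band: int) -> str:
    """Map a BAND number to the tier named in the text."""
    if band == 1:
        return "basic"
    if band == 2:
        return "entry-level"
    if band <= 6:
        return "mid-range"
    return "high-end"


price = 120_000  # hypothetical product price
band = band_for_price(price)
print(f"${price:,} -> BAND{band} ({tier_for_band(band)})")
```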

1.3 Major Storage Manufacturers


1.3.1 Who are the major storage manufacturers?
Answer: In the global market, major storage manufacturers include Huawei, EMC, NetApp,
IBM, HP, DELL, HDS, Fujitsu, and Oracle. In the China market, the manufacturers include
MacroSAN, Tongyou, Sugon, Inspur, and UIT.

Through well-targeted acquisitions, EMC has built a complete storage product family.
However, these acquired products have different architectures and are hard to integrate
with one another.

IBM's storage products are sold together with its servers and consulting services.
However, its product models are limited and their market positioning is imprecise. One
product model usually covers multiple BANDs. The market share of IBM's storage
products keeps shrinking, so IBM may cut down its investment in the storage arena.

NetApp seized the opportunity in unstructured data storage and built a unified Data
ONTAP platform. Its storage products offer abundant software functions and features,
and have various differentiated highlights. However, the unified storage products
provided by NetApp are a simple combination of NAS and SAN, which cannot
maximize NAS and SAN performance at the same time. What's more, NetApp does
not have a generally acknowledged high-end storage product.

Huawei has more than 10 years of experience in the storage arena. With the
combination of industry-leading technologies and its own innovation capability, Huawei
has achieved perfect integration of SAN and NAS, low-end/mid-range/high-end products,
SSDs and HDDs, primary storage and backup storage, and heterogeneous products.
Huawei now provides a complete range of storage products and solutions with high
security, proven reliability, on-demand configuration, easy operation, and high efficiency.
According to Gartner's statistics for 2013 Q4, Huawei has surpassed Oracle and
Fujitsu in sales revenue, number of units sold, and capacity sold in the global
storage market. In addition, Huawei ranks No. 1 in these three aspects in the China
market.

1.3.2 What are the products provided by these major manufacturers?
Answer: Competitive storage products nowadays include HUAWEI OceanStor series, EMC
VNX and VMAX series, and NetApp FAS series. The following table lists the major
manufacturers and their flagship products:

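Category                 | Huawei                                       | EMC      | NetApp                          | IBM                 | HP                              | HDS
-------------------------|----------------------------------------------|----------|---------------------------------|---------------------|---------------------------------|-----------------------
High-end storage         | OceanStor 18500/18800                        | VMAX     | -                               | DS8870, XIV G3      | 3PAR 10800/10400                | VSP
Mid-range storage        | OceanStor S2600T/S5500T/S5600T/S5800T/S6800T | VNX2/VNX | FAS8000/FAS6000/FAS3000/FAS2000 | V7000U/V5000/V3000  | StoreServ 7000, EVA P6000/P2000 | HUS VM, HUS150/130/110
Solid-state storage      | Dorado                                       | XtremIO  | EF540                           | FlashSystem 720/820 | StoreServ 7450                  | HUS VM
Big data storage and NAS | OceanStor 9000, OceanStor N8500              | Isilon   | FAS series                      | SONAS               | StoreAll 9720/9320              | HNAS3000 series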

1.4 Key Storage Technologies


1.4.1 What is RAID?
Answer: RAID is short for redundant array of independent disks. It is a data storage scheme
that allows data to be stored and replicated in a hardware disk group consisting of multiple
physical disks.
In the early stage, storage manufacturers were actually disk manufacturers. However, the
capacity, performance, reliability, and data sharing capability of single disks are limited and
cannot meet the requirements of enterprise businesses. This problem was not solved until
1987, when a paper discussing the RAID technology was published by the University of
California, Berkeley. After that, the storage industry began to boom and giant storage
manufacturers such as EMC and IBM rose to prominence in it. The following figure shows a
photocopy of the paper.

The essence of the RAID technology is to combine multiple disks and to achieve
high-reliability and high-performance reads and writes using a dedicated algorithm. The
detailed implementation is as follows:
1. The group of disks is logically divided into data disks and parity disks.

2. Data writes: The data to be written to the disks is split into multiple segments, and
   parity bits are calculated from them using an algorithm. The data segments and parity
   bits are written to the data disks and parity disks in parallel.

3. Data reads: When data needs to be read, the algorithm sends parallel read requests
   to the disk group, and the data segments are then combined and returned to the
   application.

4. Exception handling: If a disk fails and a data segment cannot be accessed, the algorithm
   uses the parity bits to rebuild the lost data, so data integrity is ensured. In this way,
   the failure of a single disk will not hamper the stable operation of the whole storage
   system.
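
The write, read, and exception-handling steps above can be illustrated with XOR parity, the scheme commonly used for single-parity RAID. This is a minimal sketch under stated assumptions: disks are modelled as byte strings, the function names are hypothetical, and real arrays work on fixed-size stripes in firmware or a storage operating system rather than in Python.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data: bytes, data_disks: int):
    """Split data into equal segments (zero-padded) and append one parity segment."""
    seg_len = -(-len(data) // data_disks)  # ceiling division
    padded = data.ljust(seg_len * data_disks, b"\0")
    segments = [padded[i * seg_len:(i + 1) * seg_len] for i in range(data_disks)]
    return segments + [xor_blocks(segments)]  # last element plays the parity disk


def recover(stripe, failed_index: int) -> bytes:
    """Rebuild the segment of a failed disk by XOR-ing all surviving segments."""
    survivors = [seg for i, seg in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


stripe = write_stripe(b"enterprise storage", data_disks=3)
lost_segment = stripe[1]                      # pretend the second disk failed
assert recover(stripe, failed_index=1) == lost_segment
```

Because XOR is its own inverse, the same `recover` call also rebuilds the parity segment if the parity disk is the one that fails.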

To achieve a balance between reliability and performance, a traditional RAID group usually
consists of 10 disks. A storage array can contain multiple RAID groups, so thousands of disks
can work simultaneously to provide PB-level capacity. Compared with a single disk, a disk
array has the following advantages:
1. Enhanced reliability: Depending on the RAID level, the failure of one or even two disks
   will not affect the operation of the whole RAID group.

2. Improved performance: The read/write performance of mechanical disks is always a
   bottleneck. Using the RAID technology, the read/write requests are evenly distributed to
   multiple disks for processing, so the system performance is improved.


1.4.2 What are the differences among RAID levels?


Answer: Different RAID levels can provide different levels of reliability, performance, and
space utilization. You can select a desired RAID level to meet your specific requirements on
the storage system reliability, performance, and capacity.
RAID    | Description | Reliability | Redundancy | Available Space | Performance
--------|-------------|-------------|------------|-----------------|------------
RAID 0  | It segments data into stripes and then writes these stripes to multiple disks. It does not support redundancy. | Lowest, cannot tolerate any disk failure | None | 100% | Highest
RAID 1  | It divides an even number of disks into two groups that hold identical data. | High, tolerates the failure of a single disk | Mirror redundancy | 50% | Lowest
RAID 5  | It segments data and parity bits into stripes and then writes these stripes to multiple disks. RAID 5 is one of the most commonly used RAID levels. | High, tolerates the failure of a single disk | Parity redundancy | (N-1)/N | High
RAID 6  | It is similar to RAID 5, but it stores two independent sets of parity bits, so data can still be recovered after two disks fail. | Highest, tolerates the failure of two disks | Parity redundancy | (N-2)/N | High
RAID 10 | It incorporates the features of RAID 1 and RAID 0, that is, data striping and mirroring are adopted for data reading and writing. RAID 10 is also a widely used RAID level. | High, tolerates the failure of a single disk | Mirror redundancy | 50% | High
RAID 50 | It incorporates the features of RAID 5 and RAID 0, that is, data is striped across several RAID 5 subgroups, each of which uses data striping and parity bit striping. | High, tolerates the failure of a single disk | Parity redundancy | (N-1)/N | High

N is the number of disks in a RAID group (for RAID 50, in each RAID 5 subgroup).
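
The available-space figures for each RAID level translate directly into usable capacity. The sketch below is a hedged illustration: the function names and the 10 x 4 TB example group are assumptions, and for RAID 50 it assumes that N counts the disks of one RAID 5 subgroup.

```python
def usable_fraction(level: str, n_disks: int) -> float:
    """Fraction of raw capacity left for data in one RAID group of n_disks disks."""
    fractions = {
        "RAID 0": 1.0,                        # no redundancy
        "RAID 1": 0.5,                        # full mirror
        "RAID 5": (n_disks - 1) / n_disks,    # one disk's worth of parity
        "RAID 6": (n_disks - 2) / n_disks,    # two disks' worth of parity
        "RAID 10": 0.5,                       # striping over mirrored pairs
        "RAID 50": (n_disks - 1) / n_disks,   # per RAID 5 subgroup (assumption)
    }
    return fractions[level]


def usable_capacity_tb(level: str, n_disks: int, disk_tb: float) -> float:
    """Usable capacity in TB of a group built from identical disks."""
    return n_disks * disk_tb * usable_fraction(level, n_disks)


# Hypothetical group: 10 disks of 4 TB each.
for level in ("RAID 0", "RAID 5", "RAID 6", "RAID 10"):
    print(f"{level}: {usable_capacity_tb(level, 10, 4.0):.0f} TB usable")
```

For this hypothetical group, RAID 5 leaves 36 TB, RAID 6 leaves 32 TB, and RAID 10 leaves 20 TB of the 40 TB raw capacity.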

The typical configurations for common applications are as follows:


1. High-speed databases: RAID 10 for the balance of performance and reliability

2. Ordinary applications: RAID 5 that tolerates the failure of a single disk. If two disks fail
   at the same time, the RAID group becomes invalid.

3. High-reliability applications: RAID 6 that tolerates the failure of two disks. However, its
   disk utilization and performance are lower than those of RAID 5.

1.4.3 What is reconstruction? Why is the reconstruction speed so important?
Answer: A storage system usually has several backup disks, which are called hot spare disks.
If a data disk in a RAID group is damaged, the storage system uses a RAID algorithm to
retrieve all the data on that damaged disk and then writes the data to an available hot spare
disk for uninterrupted reading and writing. The whole process is called reconstruction. The
reconstruction process is transparent to external applications.
For example, if a data disk in a RAID 5 group is damaged, the storage system automatically
starts reconstruction. However, if another disk fails during the reconstruction process,
the whole RAID 5 group becomes invalid. This is called a dual-disk failure, and it is a
critical risk for a storage system.
Two factors determine the reconstruction time: the amount of data to be reconstructed and
the write speed of the hot spare disk. Reconstructing 1 TB of data usually takes about
10 hours. If another disk becomes faulty during this period, none of the data in the RAID
group can be used any more, which is a disaster for the storage applications.
Worse, the disk failure rate usually rises during the reconstruction process. Our testing
and analysis records point to two reasons:
1. Disks of the same batch are likely to fail at around the same time: The member disks in
   a RAID group are usually installed at the same time, and they share the same workloads
   during system operation. After a period of time, if one disk becomes faulty, the
   probability that other disks become faulty rises. Therefore, some manufacturers prefer
   to use disks manufactured in different batches to reduce the failure probability.

2. Disks fail more easily under heavy workloads: During the reconstruction process, disks
   are still processing I/O requests. Therefore, the workloads on disks increase and the
   probability of disk failures rises. If another disk fails at this time, data will be lost.

Therefore, how to shorten the reconstruction process and avoid the simultaneous failures of
multiple disks is a key issue to address.
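
To see why the length of the reconstruction window matters, the following rough sketch estimates the window from the "10 hours per 1 TB" figure above and then the chance that another disk fails inside it. The failure model is an assumption made for illustration: disks fail independently at a constant rate, and the 2% annual failure rate is a hypothetical value. As the two reasons above explain, real disks in the same group are not independent, so this estimate is optimistic.

```python
import math

HOURS_PER_YEAR = 365 * 24


def rebuild_hours(capacity_tb: float, hours_per_tb: float = 10.0) -> float:
    """Reconstruction time, using the text's figure of about 10 hours per TB."""
    return capacity_tb * hours_per_tb


def second_failure_probability(surviving_disks: int,
                               annual_failure_rate: float,
                               window_hours: float) -> float:
    """Probability that at least one surviving disk fails within the window.

    Assumes independent disks with a constant failure rate (exponential model).
    """
    rate_per_hour = annual_failure_rate / HOURS_PER_YEAR
    return 1.0 - math.exp(-surviving_disks * rate_per_hour * window_hours)


window = rebuild_hours(capacity_tb=4.0)             # hypothetical 4 TB disk
risk = second_failure_probability(surviving_disks=9,
                                  annual_failure_rate=0.02,  # hypothetical value
                                  window_hours=window)
print(f"Rebuild window: {window:.0f} h, second-failure risk: {risk:.4%}")
```

In this model the risk grows almost linearly with the window length, so halving the reconstruction time roughly halves the exposure to a dual-disk failure.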

1.4.4 How to shorten the reconstruction process to improve storage reliability?
Answer: There are several methods:
1. Using small-capacity and high-speed disks: During a traditional RAID reconstruction
   process, multiple disks cooperate to restore data onto a hot spare disk. Therefore, the
   capacity of these disks becomes an important factor. If we use small-capacity and
   high-speed disks, the data volume to be reconstructed is small, and the reconstruction
   process is shortened.

2. Reducing the disk failure rate: The failure of a disk is usually caused by the failure of a
   few tracks. Therefore, we can use a specific algorithm to isolate the failed tracks and
   avoid treating the entire disk as failed.

3. Using a new RAID algorithm: Huawei has developed an innovative RAID algorithm,
   RAID2.0+. This algorithm virtualizes physical disks and distributes the reconstruction
   loads (write operations) to tens of disks (or even hundreds of disks). In this way, the
   reconstruction speed is improved by 20 times and the probability of a dual-disk failure
   is minimized. RAID2.0+ is now well accepted by customers. As disk capacity keeps
   growing, RAID2.0+ is expected to become more popular in the future.
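
The idea of spreading reconstruction writes over many disks can be illustrated with a toy model. This is not Huawei's RAID2.0+ implementation: the chunk size, the round-robin placement, and the function names are assumptions, and the model ignores read bandwidth, controller limits, and host I/O.

```python
from collections import Counter


def plan_rebuild(lost_chunks: int, target_disks: int) -> Counter:
    """Assign each lost chunk to a target disk round-robin; return chunks per disk."""
    return Counter(chunk % target_disks for chunk in range(lost_chunks))


def rebuild_seconds(lost_chunks: int, target_disks: int, seconds_per_chunk: float) -> float:
    """Targets write in parallel, so the busiest target disk sets the rebuild time."""
    plan = plan_rebuild(lost_chunks, target_disks)
    return max(plan.values(), default=0) * seconds_per_chunk


# Hypothetical numbers: 1000 chunks at 36 s each is 10 hours on one hot spare disk.
single = rebuild_seconds(1000, target_disks=1, seconds_per_chunk=36)
spread = rebuild_seconds(1000, target_disks=20, seconds_per_chunk=36)
print(f"One hot spare: {single / 3600:.1f} h, 20 target disks: {spread / 3600:.1f} h")
```

In this simplified model, writing to 20 disks in parallel shortens the rebuild from 10 hours to half an hour, which matches the order of magnitude of the speed-up described above.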

1.4.5 What is cache? Why is it important in improving storage efficiency?
Answer: An optimal storage configuration is in essence the best balance among
performance, reliability, and cost. All storage technologies are developed based on these
three factors.
The disk I/O speed is a bottleneck of the whole storage system. If every read or write
operation is processed by disks, latency can be severe. The easiest way to improve
performance is to add more disks to the system, but this definitely increases the cost.
The cache technology was developed to resolve this dilemma. It uses high-performance storage
media (such as memory) as a temporary buffer to store frequently accessed data (hotspot data).
With such storage media, reads and writes of hotspot data are processed directly by the
cache, and performance is improved by tens of times.
However, the downside of cache is that it has a small capacity and is expensive, so it is
only used to store hotspot data. A hotspot data identification and scheduling algorithm is
therefore needed. In addition, the data temporarily buffered in the cache is also permanently
stored on disks, so we also need an algorithm to ensure the consistency of these two copies
of data.
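
One simple way to picture hotspot identification and cache-to-disk consistency is a least-recently-used (LRU) cache with write-through. This is only a sketch under stated assumptions: the class name is hypothetical, a Python dict stands in for the disks, and LRU with write-through is a generic textbook choice, not necessarily what any particular storage system uses. Real arrays often acknowledge writes from cache before flushing them, which is why cache protection matters.

```python
from collections import OrderedDict


class HotspotCache:
    """LRU read cache in front of a slow backing store, with write-through."""

    def __init__(self, backing_store: dict, capacity: int):
        self.disk = backing_store          # stands in for the disks
        self.capacity = capacity           # number of entries the cache can hold
        self.cache = OrderedDict()         # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)    # mark as recently used
            return self.cache[key]
        self.misses += 1
        value = self.disk[key]             # slow path: go to disk
        self._remember(key, value)
        return value

    def write(self, key, value):
        self.disk[key] = value             # write-through keeps both copies consistent
        self._remember(key, value)

    def _remember(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry


disk = {"a": 1, "b": 2, "c": 3}
cache = HotspotCache(disk, capacity=2)
for key in ("a", "b", "a", "c"):
    cache.read(key)
print(f"hits={cache.hits}, misses={cache.misses}, cached={list(cache.cache)}")
```

Data that is read repeatedly (here "a") stays in the cache, while data that has not been touched recently (here "b") is evicted when space runs out.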
This raises another question: Since the cache is so important in improving the system speed,
how do we ensure its own reliability? For example, a write operation has been processed in
the cache and a write success message has been returned to upper-layer applications, but the
new data has not yet been flushed to disks. Then an unexpected power failure occurs. Will
this cause all the data in the cache to be lost? If the controllers in the storage system do
not adopt any protection measures, this power failure will cause data loss, which is
unacceptable in core applications such as financial applications. Therefore, we usually use
two methods to protect the controllers against power failures:
1.

Configuring backup battery units (BBUs) for the controllers: Once a power failure
occurs, these BBUs can supply power to the controllers and write the cached data to the
backup SSDs. When the power supply resumes, data on the SSDs can be restored to the
cache.

2.

Globally caching data to multiple controllers: If one controller fails, the other controllers
can store the cached data.

The core of a storage system is the absolute reliability of its data, so we must eliminate the
loss of even a bit of data.
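
As a rough illustration of the second protection method, the sketch below models a write cache mirrored across controllers. It is a toy model under stated assumptions: the Controller class and both function names are hypothetical, and real systems mirror over dedicated channels and combine mirroring with BBUs.

```python
class Controller:
    """Hypothetical controller holding dirty (not yet flushed) blocks in memory."""

    def __init__(self, name):
        self.name = name
        self.cache = {}      # block id -> data waiting to be flushed to disk
        self.alive = True


def mirrored_write(controllers, block_id, data):
    """Acknowledge a write only after every live controller has cached it."""
    live = [c for c in controllers if c.alive]
    if not live:
        raise RuntimeError("no controller available")
    for c in live:
        c.cache[block_id] = data
    return True  # stands for the "write success" returned to the application


def flush_survivor(controllers, disk):
    """After a failure, let the first surviving controller flush the dirty data."""
    for c in controllers:
        if c.alive:
            disk.update(c.cache)
            c.cache.clear()
            return c.name
    raise RuntimeError("cached data lost: all controllers failed")
```

If controller A fails after the write was acknowledged, its copy of the cache is gone, but controller B can still flush the same data to disk.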

1.4.6 What are backup and disaster recovery?


Answer: A storage system is only a standalone system. If customers want higher data
reliability, they can use backup and disaster recovery to achieve data redundancy.

Backup: One or more duplicates can be created for a piece of data. Once the production
system becomes unavailable, the backup duplicates can be used to restore the system
data. The traditional backup period is one day, that is, data can only be restored to the
state of the most recent daily backup, and up to one day of data may be lost. What's more,
data recovery takes a long time and services are interrupted during the recovery. These two
limitations are unacceptable to many mission-critical applications.

Disaster recovery: Two storage systems are deployed in one city or different cities, and
these systems synchronize data with each other in real time or near-real time. Once the
primary storage system is down, the other storage system can take over the
services within a short time.

Backup and disaster recovery can coexist. There is only one disaster recovery data copy,
but there can be multiple backup data copies generated at different times. If the
original data is damaged by a virus or human error, an appropriate backup data
copy can be used to restore the lost data.
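
Choosing an "appropriate" backup copy can be sketched as follows. The sketch assumes the time at which the data was damaged is known and that every backup taken before that time is clean. The function name pick_restore_copy is hypothetical.

```python
from datetime import datetime


def pick_restore_copy(backup_times, damage_time):
    """Return the newest backup taken strictly before the damage occurred.

    Returns None when no backup predates the damage.
    """
    candidates = [t for t in backup_times if t < damage_time]
    if not candidates:
        return None
    return max(candidates)
```

A disaster recovery copy that is synchronized in real time would already contain the damaged data, which is one reason why backup copies taken at different times remain useful alongside disaster recovery.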

1.4.7 What are RTO and RPO?


Answer: RTO and RPO are two important indexes used to measure the reliability of a disaster
recovery system, where:

Recovery time objective (RTO): The length of time it takes to recover services from an
outage to an operational state. This index is used to measure the service recovery
capability of a disaster recovery system.

Recovery point objective (RPO): The amount of data that may be lost in a disaster,
usually expressed as the length of time between the last recoverable data copy and the
moment when the disaster occurs. This index is used to measure the data redundancy
capability of a disaster recovery system.

A smaller RTO and RPO translate into a higher service protection capability. Therefore, to
minimize the impact of a disaster on storage services and to achieve a short RTO and RPO, we
need to build a highly reliable disaster recovery solution if the budget permits.
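
A small worked example may help with the arithmetic. It assumes the data-loss window is measured as the time between the last recoverable copy and the disaster (the usual way RPO is expressed), and the downtime as the time between the disaster and service recovery. The function names and timestamps are illustrative only.

```python
from datetime import datetime, timedelta


def measure_outage(last_copy_time, disaster_time, service_restored_time):
    """Return (data_loss_window, downtime) for one incident as timedeltas."""
    data_loss_window = disaster_time - last_copy_time
    downtime = service_restored_time - disaster_time
    return data_loss_window, downtime


def meets_objectives(data_loss_window, downtime, rpo, rto):
    """True when the incident stayed within both the RPO and the RTO."""
    return data_loss_window <= rpo and downtime <= rto
```

For instance, with a last copy taken at 02:00, a disaster at 14:30, and services restored at 18:30 on the same day, the data-loss window is 12 hours 30 minutes and the downtime is 4 hours. That incident meets an RPO of 24 hours and an RTO of 4 hours, but not an RPO of 1 hour.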

1.4.8 What are the common data backup solutions?


Answer: Data backup can be implemented on three layers:

Application-layer backup: Data is backed up across two or more sites by using
host-side applications, databases (such as Oracle and DB2), operating systems (such as
UNIX and volume management), and virtualization.

Network-layer backup: Data is synchronized and backed up by capturing and
forwarding I/O operations on the channels between hosts and storage systems.
Typical solutions in this category include the Huawei VIS, EMC VPLEX,
and IBM SVC solutions.

Data-layer backup: Data is backed up on storage systems. The backup technologies of
this category include synchronous replication and asynchronous replication.
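
The difference between synchronous and asynchronous replication can be sketched with a toy model. It assumes two in-memory dicts stand in for the primary and secondary storage systems. The class ReplicatedVolume and its methods are hypothetical and do not describe any product's interface.

```python
class ReplicatedVolume:
    """Toy model of a volume replicated from a primary to a secondary system."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = {}
        self.secondary = {}
        self.pending = []   # writes not yet shipped to the secondary (async only)

    def write(self, block_id, data):
        self.primary[block_id] = data
        if self.synchronous:
            # Synchronous: the secondary is updated before the write is acknowledged.
            self.secondary[block_id] = data
        else:
            # Asynchronous: acknowledge now and ship the write later.
            self.pending.append((block_id, data))
        return True

    def replicate(self):
        """Ship queued writes; an asynchronous system would do this periodically."""
        for block_id, data in self.pending:
            self.secondary[block_id] = data
        shipped = len(self.pending)
        self.pending.clear()
        return shipped

    def writes_lost_if_primary_fails(self):
        # Writes acknowledged to the host but missing on the secondary.
        return len(self.pending)
```

In this model, synchronous replication never has unshipped writes, at the cost of waiting for the secondary on every write. Asynchronous replication acknowledges writes sooner but can lose whatever is still queued when the primary fails.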

1.4.9 How to improve the overall reliability of a storage system?


Answer: Reliability is crucial to a storage system and it can be improved from three aspects:
1. Component-level reliability

The major measure is to select optimal components and strictly control quality. For example,
storage equipment manufacturers usually cooperate with disk manufacturers to strictly control
disk quality from the start of the production phase. In addition, storage manufacturers perform
strict tests on every batch of disks. For example, Huawei selects 500 to 1000 disks out of each
batch to perform drop, vibration, and temperature tests, to make sure that no batch-wide
issue exists.
2. Product-level reliability

Eliminate single points of failure in design. Configure redundancy for key components
such as controllers.

Install shockproof brackets and connectors for disks, reducing the vibration and
resonance caused by disk operations and increasing the disk service life.

Perform strict tests on the storage system, such as the temperature cycle test.

3. Solution-level reliability

Implement multi-site disaster recovery for key data. For example, in the September 11
attacks, many enterprises located in the World Trade Center were hit, but their services were
not affected. The reason was that their data and services were backed up for disaster
recovery. Data disaster recovery uses technologies such as snapshot and clone to
protect data. If a disaster occurs, the backup data can be used to restore the production data
so that data is not lost. Service disaster recovery adds service switchover on the basis of data
disaster recovery. If a disaster occurs, the backup site manually or automatically takes over
the services of the production site. After the production site recovers, the services are
switched back to the production site, with minimal impact on services during this process.

Acronyms and Abbreviations

Table 2-1 Acronyms and abbreviations

Acronym    Full Spelling
RAID       Redundant Array of Independent Disks
DAS        Direct Attached Storage
SAN        Storage Area Network
NAS        Network Attached Storage
SAS        Serial Attached SCSI
NL-SAS     Nearline Serial Attached SCSI
SSD        Solid State Disk
OLTP       On-Line Transaction Processing
OLAP       On-Line Analytical Processing
ERP        Enterprise Resource Planning
