
White Paper

Abstract
This white paper provides a technical overview of
EMC Avamar backup and recovery software and systems
with integrated global, client-side data deduplication. It includes an
in-depth look at the Avamar architecture, data deduplication
technology, key applications, and deployment options.

September 2013



EFFICIENT BACKUP AND RECOVERY WITH
EMC AVAMAR DEDUPLICATION SOFTWARE
AND SYSTEMS

Copyright 2013 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as
of its publication date. The information is subject to change
without notice.

The information in this publication is provided "as is." EMC
Corporation makes no representations or warranties of any kind
with respect to the information in this publication, and
specifically disclaims implied warranties of merchantability or
fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in
this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC
Corporation Trademarks on EMC.com.

All other trademarks used herein are the property of their
respective owners.

Part Number H2681.9



Table of Contents
Executive summary
Introduction and audience
EMC Avamar technology
  Data deduplication
    Global elimination of redundant data at the client
    Variable vs. fixed-length data segments
    Logical segment determination
    Not all data deduplication is created equal
    Benefits of Avamar's efficiency
  General architecture
    Avamar servers
    Avamar Administrator
    Avamar Enterprise Manager
    Avamar client software
    Avamar replication
  Integration with platforms and applications
    Operating systems:
    Applications:
    NAS systems
    Virtual environments
    Export to tape
    Enterprise event management solutions
    Enterprise reporting capabilities
  Grid server architecture
    Solving the index challenge
    Scalability and performance: Automated load balancing
    Reliability and availability: RAIN, replication, and checkpoints
    Recoverability verified daily
  Encryption
  Secure data shredding
  Flexible deployment options
    Desktop/laptop backup and end-user recovery
    Integration with EMC Data Domain
Conclusion

Executive summary
Protecting critical data is a challenge for organizations of all sizes. Traditional backup
solutions store data repeatedly, expanding total storage under management by five to
10 times. Customers need solutions to help manage the information explosion. And
shipping tapes offsite can put data at risk, especially if the tapes are lost or stolen
and the data is unencrypted.
Traditional backup solutions require a rotational schedule of full and incremental
backups, which move a significant amount of redundant data week after week. Due to
the unnecessary data movement, enterprises are often faced with backup jobs that
roll into production hours, network contention, and tedious management. The impact
is especially severe when dealing with virtual environments, NAS systems, remote
offices, and desktop/laptop systems.
In virtual environments, each virtual machine (VM) represents an individual backup
job, often with overlapping backup windows, and includes redundant operating
system, application, and file data. Consequently, backups for VMs often overrun
backup windows and tax shared resources, leaving data unprotected and creating
issues for backup administrators.
Protecting NAS systems can also pose significant challenges, since recurring full
backups (level-0) often fail to complete within the allotted time frame, which can
impact employee productivity, limit the amount of data on each NAS system, and
ultimately leave data unprotected.
In remote offices, limited network bandwidth makes centralized, automated WAN-based
backup nearly impossible. As a result, backup tasks must be handled by local
non-IT staff who often rely on faulty tape-based hardware and unreliable ad hoc
manual processes.
The number of desktop and laptop users continues to grow rapidly as
expanding wireless networks and infrastructure enable employees to work outside
of traditional offices. Congested networks, limited remote support, and multiple time
zones can make data protection a daunting task for IT administrators.
Developed to solve the challenges associated with traditional backup, EMC Avamar
backup software and systems, equipped with integrated global, client-side
data deduplication technology, provide fast, daily full backups for virtual
environments, NAS systems, desktops/laptops, remote offices, and business-critical
applications. Avamar reduces the size of backup data at the client, before it is
transferred across the network and stored. Unlike traditional backup, Avamar delivers
fast, daily full backups via existing IP networks.
Avamar also deduplicates backup data globally across servers, desktops, laptops,
and offices worldwide to reduce the total required backup storage by up to 50x. As a
result, Avamar provides the benefits of efficient long-term retention on disk while
dramatically lowering capital and operating expenses including floor space, power,
and cooling.

Avamar backups can be quickly recovered in just one step, eliminating the hassle of
restoring the last good full and subsequent incremental backups to reach the desired
recovery point. Avamar's intuitive interface allows desktop and laptop users to
quickly recover their own data, reducing the burden on IT staff. In addition, Avamar's
centralized web-based management and at-a-glance dashboard view make it easy for
administrators to protect hundreds of offices from a single location via existing
networks. And data can be encrypted in flight and at rest for added security.
Avamar's grid architecture provides online scalability, and patented redundant array
of independent nodes (RAIN) technology provides high availability. While EMC
Avamar backs up data to disk, it can also export data to tape. And Avamar is
integrated with EMC NetWorker and EMC Data Domain systems for efficient
backup of specific data types, simplifying management and maximizing existing IT
investments.
Introduction and audience
This paper provides a technical overview of Avamar backup and recovery software
and systems. It includes an in-depth look at the Avamar architecture, data
deduplication technology, key applications, and deployment options. It is intended
for backup administrators or technical staff seeking a more in-depth look at Avamar.
EMC Avamar technology
Data deduplication
Enterprise data is highly redundant, with identical files or data stored within and
across systems (for example, OS files or documents sent to multiple recipients).
Edited files also have tremendous redundancy with previous versions. Traditional
backup methods magnify this by storing all of this redundant data over and over
again. Avamar utilizes patented global, client-side data deduplication technology to
eliminate redundancy at both the file and the subfile data segment level.
Global elimination of redundant data at the client
Avamar solves the challenge of redundancy in backup data at the client, before
transfer across the LAN or WAN during a backup operation. Avamar backup agents are
deployed on the systems to be protected (for example, servers, desktops, laptops) to
identify and filter repeated data segments stored in files within a single system and
across multiple systems over time. This ensures that each unique data segment is
backed up only once across the enterprise. As a result, copied or edited files, shared
applications, embedded attachments, and even daily changing databases generate
only a small amount of new backup data.
By moving only new, unique, variable-length subfile data segments, Avamar
significantly reduces the required daily network bandwidth and storage. By storing
just a single instance of each subfile data segment globally, Avamar also reduces
total back-end storage by up to 50x for cost-effective, long-term, disk-based recovery.
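The client-side filtering described above can be pictured with a minimal sketch. It is not Avamar's implementation: the `BackupServer` class, its `has_segment` and `put_segment` methods, and the `backup` and `restore` helpers are hypothetical names used only for illustration, and the sample "segments" are tiny byte strings rather than real subfile segments.

```python
import hashlib


class BackupServer:
    """Toy stand-in for a deduplicating backup store (hypothetical, not an Avamar API)."""

    def __init__(self):
        self.segments = {}       # segment ID -> segment bytes
        self.bytes_received = 0  # how much data actually crossed the "network"

    def has_segment(self, segment_id):
        return segment_id in self.segments

    def put_segment(self, segment_id, data):
        self.segments[segment_id] = data
        self.bytes_received += len(data)


def backup(server, segments):
    """Send only segments the server has never stored; return the list of IDs."""
    recipe = []
    for data in segments:
        segment_id = hashlib.sha1(data).digest()
        if not server.has_segment(segment_id):  # the filter runs before any transfer
            server.put_segment(segment_id, data)
        recipe.append(segment_id)
    return recipe


def restore(server, recipe):
    """Rebuild the original data from its list of segment IDs."""
    return b"".join(server.segments[segment_id] for segment_id in recipe)


server = BackupServer()
laptop = [b"os-files", b"report-v1", b"shared-app"]
desktop = [b"os-files", b"shared-app", b"report-v2"]

recipe_laptop = backup(server, laptop)
sent_for_laptop = server.bytes_received
recipe_desktop = backup(server, desktop)
sent_for_desktop = server.bytes_received - sent_for_laptop
```

In this sketch the second system transfers only the one segment the server has not already stored, yet both systems can still be restored in full from their ID lists.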
Variable vs. fixed-length data segments
A key factor for eliminating redundant data at a segment (or subfile) level is the
method for determining segment size. Fixed-block or fixed-length segments are
commonly utilized by snapshot and some deduplication technologies. Unfortunately,
even small changes to a dataset (for example, inserting data into the beginning of a
file) can change all fixed-length segments in a dataset, despite the fact that very little
of the dataset has actually changed. Avamar uses an intelligent variable-length
method for determining segment size that examines the data to determine logical
boundary points, eliminating this inefficiency.
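The weakness of fixed-length segments can be shown with a short, hedged example. The 512-byte block size and the pseudo-random sample data are arbitrary choices for illustration and are not taken from any product.

```python
import hashlib


def fixed_segments(data: bytes, size: int) -> list[bytes]:
    """Cut data into consecutive blocks of `size` bytes (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def ids(segments: list[bytes]) -> set[str]:
    """Fingerprint each segment so segments can be compared by ID."""
    return {hashlib.sha1(s).hexdigest() for s in segments}


# Deterministic pseudo-random sample data (4,096 bytes); purely illustrative.
original = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(128))
edited = b"X" + original  # a single byte inserted at the beginning

unchanged = ids(fixed_segments(original, 512)) & ids(fixed_segments(edited, 512))
print(f"fixed-length segments still matching after the edit: {len(unchanged)}")
```

Because the inserted byte shifts every later block boundary, none of the original fixed-length blocks reappears in the edited data, even though only one byte changed.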
Logical segment determination
Avamar's patented method for segment size determination is designed to yield
optimal efficiency across all systems in an enterprise. Avamar's algorithm analyzes
the binary structure of a dataset (the 0s and 1s that make up a dataset) in order to
determine segment boundaries that are context-dependent, so that Avamar's client
agents will be able to identify the exact same segments for any dataset, no matter
where that dataset is stored in the enterprise. Avamar's variable-length segments
average 24 KB in size and are then compressed to an average of just 12 KB.
By analyzing the binary structure, Avamar's method works for all file types and sizes,
including databases. For instance, if a paragraph is added to the beginning and the
middle of a text file, Avamar's algorithm will identify and back up only the new,
modified segments, dramatically reducing the amount of backup data that needs to
be sent and stored.
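Avamar's patented algorithm is not described in this paper, so the following is only a generic sketch of content-defined segmentation using a Gear-style rolling hash, a publicly known technique swapped in for illustration. The `GEAR` table, the `variable_segments` function, and the small average segment size (about 64 bytes instead of 24 KB) are all assumptions. The sketch also omits the minimum and maximum segment sizes a practical implementation would enforce, and it shows only an insertion at the beginning of the data.

```python
import hashlib

# One pseudo-random 32-bit value per byte value, derived deterministically.
GEAR = [int.from_bytes(hashlib.sha256(bytes([b])).digest()[:4], "big") for b in range(256)]


def variable_segments(data: bytes, bits: int = 6) -> list[bytes]:
    """Cut wherever the top `bits` bits of a rolling hash are zero.

    Boundaries depend on the content itself, not on byte offsets, and the
    average segment length is roughly 2**bits bytes.
    """
    segments, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Each shift pushes older bytes out, so h depends only on the last 32 bytes.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if h >> (32 - bits) == 0:
            segments.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        segments.append(data[start:])
    return segments


# Deterministic pseudo-random sample "file" (8,192 bytes); purely illustrative.
original = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(256))
edited = b"A new opening paragraph. " + original

old = variable_segments(original)
new = variable_segments(edited)
reused = set(old) & set(new)
print(f"{len(reused)} of {len(set(old))} original segments reappear after the edit")
```

Because each boundary is decided by the 32 bytes that precede it, the segmentation resynchronizes shortly after the inserted text, and the segments that follow are identical to those of the original data.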
[Figure 1 diagram labels: physical or virtual system, Avamar agent, network, Avamar Data Store; data segments A, B, C, D]


Figure 1. Avamar software identifies the unique, variable-length subfile data
segments that comprise the data. Only a single instance of each data segment is
transferred during daily full backups and stored globally, across sites and servers.

Not all data deduplication is created equal
For each subfile data segment, Avamar generates a unique 20-byte ID, using the
SHA-1 algorithm. This unique ID is like a fingerprint for that segment. Avamar's
software then uses the unique ID to determine whether a data segment has been
previously stored. Files, directories, entire file systems, and even databases can be
quickly and efficiently stored with a hierarchical map of these unique IDs.
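As a hedged illustration of a hierarchical map of IDs, the sketch below fingerprints segments with SHA-1 (which always yields 20 bytes) and then derives a file ID from its segment IDs and a directory ID from its file IDs. The layout of the map, and the names `segment_id`, `file_id`, and `directory_id`, are assumptions for illustration; the paper does not specify Avamar's actual structure.

```python
import hashlib


def segment_id(data: bytes) -> bytes:
    """20-byte SHA-1 fingerprint of a data segment."""
    return hashlib.sha1(data).digest()


def file_id(segments: list[bytes]) -> bytes:
    """Assumed layout: a file is identified by the hash of its ordered segment IDs."""
    return segment_id(b"".join(segment_id(s) for s in segments))


def directory_id(file_ids: list[bytes]) -> bytes:
    """Assumed layout: a directory is identified by the hash of its sorted file IDs."""
    return segment_id(b"".join(sorted(file_ids)))


fid_a = file_id([b"alpha", b"beta"])
fid_b = file_id([b"alpha", b"gamma"])  # shares the first segment with fid_a
root = directory_id([fid_a, fid_b])
```

With a structure of this kind, two files that share segments reference the same segment IDs, and an unchanged file or directory keeps the same ID from one backup to the next.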
Data Type                                                    Amount of Primary     Amount of Data
                                                             Data Backed Up        Moved Daily
Windows file systems                                         3,573 GB              6.1 GB
Mix of Windows, Linux, and UNIX file systems                 5,097 GB              11.7 GB
Engineering files on NAS (NDMP backups)                      3,265 GB              24.2 GB
Mix of 20% databases, 80% file systems (Windows and UNIX)    9,583 GB              80.0 GB
Mix of Linux file systems and databases                      7,831 GB              104.2 GB
Source: EMC

Figure 2. While results will vary by data type and mix, Avamar can dramatically
improve backup performance and efficiency
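To put the Figure 2 numbers in proportion, the short calculation below expresses the data moved daily as a percentage of the primary data backed up. The figures are copied from Figure 2; the `daily_percent` helper is just an illustrative name.

```python
# (primary data backed up in GB, data moved daily in GB), copied from Figure 2
workloads = {
    "Windows file systems": (3573, 6.1),
    "Mix of Windows, Linux, and UNIX file systems": (5097, 11.7),
    "Engineering files on NAS (NDMP backups)": (3265, 24.2),
    "Mix of 20% databases, 80% file systems": (9583, 80.0),
    "Mix of Linux file systems and databases": (7831, 104.2),
}


def daily_percent(primary_gb: float, daily_gb: float) -> float:
    """Data moved per day as a percentage of the primary data protected."""
    return daily_gb / primary_gb * 100


for name, (primary_gb, daily_gb) in workloads.items():
    print(f"{name}: {daily_percent(primary_gb, daily_gb):.2f}% moved per day")
```

For every workload listed, the daily transfer works out to less than 2 percent of the primary data.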
Benefits of Avamar's efficiency
Avamar's superior backup efficiency translates into many important customer
benefits, including:
• Provides fast, daily full backups across the entire enterprise.
• Utilizes existing LAN/WAN IP bandwidth and virtual infrastructure resources.
• Reduces backup storage by up to 50x.
• Reduces total client CPU utilization by up to 85 percent. Avamar agents run in
  low-priority ("nice") mode, so they do not contend with other applications
  vying for CPU resources on client systems. While Avamar agents typically use
  15 percent more CPU than traditional backup agents during backup
  operations, Avamar backups complete up to 10x faster. As a result, Avamar
  reduces total client CPU utilization by up to 85 percent over a seven-day period
  compared to traditional backup methods.
• Provides immediate, single-step recovery. Avamar stores all backups as virtual full
  images, which can be immediately recovered in a single step to any system
  running the Avamar agent. There is no need to restore from the last good full
  and subsequent incremental backups to reach the desired recovery point.
General architecture
The Avamar solution consists of a number of components, including the Avamar
server, Avamar Administrator, Avamar Enterprise Manager, and Avamar client agent
software. Avamar servers can be deployed in either single-node or scalable multinode
configurations, depending upon the amount of performance and capacity required.

Avamar servers
EMC offers flexible Avamar server deployment options including Avamar Data Store,
Avamar Business Edition, and the Avamar Virtual Edition virtual appliance. The EMC
Avamar Data Store is a scalable, all-in-one packaged solution consisting of Avamar
software preinstalled and preconfigured on EMC-certified hardware to simplify
purchasing, deployment, and service while minimizing onsite setup. For virtual
environments, Avamar Virtual Edition for VMware enables an Avamar server to be
quickly deployed as a virtual appliance, leveraging an existing ESX implementation
and its attached disk storage.
Avamar servers store client backups and enable management of policies for
scheduling, determining datasets, and retention periods. An Avamar backup provides
a point-in-time full copy of data that can be restored on demand from an Avamar
server. Multinode Avamar servers leverage multiple hardware servers for scalability,
performance, and redundancy.
There are two primary node types for a multinode Avamar server:
• Storage Node: Stores the deduplicated backup data. Multiple Storage Nodes
  are configured with multinode Avamar servers based upon performance and
  capacity requirements. Storage Nodes can be added to an Avamar server over
  time to expand performance with no downtime required. Avamar clients
  connect directly with Avamar Storage Nodes; client connections and data are
  load balanced across Storage Nodes automatically without any downtime.
• Utility Node: Node dedicated to scheduling and managing background
  Avamar server jobs. One Utility Node is configured per multinode Avamar
  server. Data on the Utility Node is protected by the Avamar server. Note: Utility
  Nodes are not single points of access for an Avamar server; backups and
  restores can still complete to connected clients, even without a Utility Node.
Other optional nodes include:
• NDMP Accelerator Node: Specialized node that provides fast, daily full
  backups via NDMP for NAS systems (EMC Isilon, VNX family, Celerra, and
  NetApp).
• Media Access Node: Specialized node that enables export of Avamar backup
  data to tape in non-deduplicated, native tape format for extended retention.
Avamar Administrator
The Avamar Administrator is a graphical management application that includes the
following:
Management of backup policies, including datasets, schedules, and retention
Management of users and clients
Centralized on-demand backups and restores
Detailed monitoring and reporting

The Avamar Administrator can be launched directly from the Web-based Enterprise
Manager user interface without any software installation or can be installed locally on
any Windows or Linux system.
Avamar Enterprise Manager
Avamar Enterprise Manager is a Web-based management interface that provides the
ability to monitor and manage a distributed Avamar deployment. It delivers an at-a-
glance dashboard that allows backup administrators to quickly assess the status of
backups across all Avamar systems in a distributed environment. The dashboard also
enables administrators to quickly drill down into a particular Avamar server to review
granular system and backup status or to perform detailed server management.
As shown in Figure 3, the Web-based management interface provides an at-a-glance
view of the Avamar backup environment, including success and warning indicators.

Figure 3. The Avamar Enterprise Manager provides intuitive, centralized management
Avamar client software
Avamar supports the automated protection of industry-leading operating systems and
mission-critical applications. Avamar clients filter out redundant data before sending
backup data over networks, to protect systems even across slow or congested
LAN/WAN links.
Avamar replication
Avamar also enables efficient, encrypted, asynchronous replication of data stored in
an Avamar server to another Avamar server deployed in a remote location without the
need to ship tapes. Avamar replication is a scheduled process between two
independent Avamar servers, and can be used for disaster recovery objectives. It can
be scheduled to run at off-peak hours to optimize network bandwidth.
Integration with platforms and applications
Avamar client software supports the following industry-leading operating systems,
applications, and platforms. Please refer to the latest EMC Avamar Compatibility and
Interoperability Matrix for more details on supported clients, operating systems,
platforms, and applications.
Operating systems:
• Windows 2012, 2008 R2, 2008, 2003; Windows 8, 7, Vista, and XP
  Avamar supports automated protection of all FAT, NTFS, or ReFS data, and also
  provides the ability to back up and restore Windows system state.
  Windows Bare Metal Recovery to the same or similar hardware or to virtual
  machines (physical to virtual systems) is supported.
• UNIX
  A variety of UNIX platforms, including Solaris, HP-UX, IBM AIX, and SCO.
• Linux
  Red Hat, SUSE, Oracle Linux, Ubuntu, Debian, FreeBSD, and CentOS Linux.
• Mac OS X
  Avamar supports Mac OS X on Intel platforms.
• VMware
  Avamar supports the VMware ESX/ESXi operating systems, and is integrated
  with vSphere, vStorage APIs for Data Protection (VADP), and VMware vCenter.
• Microsoft Hyper-V
  Avamar supports the Microsoft Hyper-V virtual environment and can
  offload backup to proxy servers for impact-free backups.
Applications:
• Microsoft SQL Server databases
  Avamar enables centralized, hot backups of SQL Server for an entire server or
  individual databases. It also supports federated backups for SQL Server 2012
  AlwaysOn Availability Groups, allowing SQL DBAs to control which
  servers are used for backup via SQL Server Management Studio.
• Microsoft Exchange servers
  Avamar enables federated online backups and granular recovery of
  Exchange at a server, database, mailbox, or message level.
• Microsoft SharePoint
  Avamar supports disaster and granular recoveries for Microsoft SharePoint
  environments from a single farm-level backup.
• Lotus Domino
  Avamar provides protection for Lotus Domino on most versions of Windows,
  Linux, and AIX, for stand-alone and clustered environments.
• SAP with Oracle
  Avamar provides robust protection, including whole/granular/logs backup and
  recovery, CLI or GUI interfaces, single-/multi-stream backup and recovery to or
  from Avamar and Data Domain, and recovery to any point in time independent of
  the backup time. Current versions of SAP BR Tools are also supported for
  stand-alone and clustered environments.
• Sybase ASE
  Avamar supports whole/granular/logs backup and recovery, CLI or GUI interfaces,
  and single-/multi-stream backup and recovery to or from Avamar and Data Domain.
  Support is provided for current versions of Sybase ASE for stand-alone and
  clustered environments.
• Oracle
  Avamar utilizes Oracle Recovery Manager (RMAN) for protection of Oracle
  databases on most platforms for stand-alone, clustered, and RAC
  environments.
• IBM DB2
  Avamar provides protection for DB2 databases on many Windows, Linux, and
  AIX versions, for stand-alone and DPF DB2 environments.
NAS systems
Avamar supports NDMP backups via the Avamar NDMP Accelerator Node to provide
reliable, high-performance backup and recovery for NAS systems (EMC Isilon, VNX
family, EMC Celerra, and NetApp systems). With an Avamar NDMP Accelerator node, a
level-0 backup is performed only once, during the initial full backup. Subsequent
daily full backups are achieved by requesting only level-1 incremental dumps,
enabling Avamar to dramatically reduce backup times and the impact on NAS
resources and networks. Avamar eliminates backup bottlenecks and provides the
freedom to consolidate storage and optimize NAS systems without limiting the
number and size of files or volumes due to potential backup performance limitations.
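The idea of taking a level-0 backup once and producing every later daily full from an incremental dump can be pictured with a small sketch. It assumes, purely for illustration, that each incremental lists the files changed or deleted since the previous backup. The `synthesize_full` function and the sample paths are hypothetical and do not represent Avamar or NDMP interfaces.

```python
def synthesize_full(previous_full: dict, changed: dict, deleted: list) -> dict:
    """Build a new full view from the previous full plus one incremental."""
    full = dict(previous_full)  # copy, so the earlier recovery point stays intact
    full.update(changed)        # new and modified files
    for path in deleted:
        full.pop(path, None)    # files removed since the previous backup
    return full


# Day 0: the only level-0 (full) dump.
day0 = {"/vol1/a.txt": "v1", "/vol1/b.txt": "v1", "/vol1/c.txt": "v1"}

# Later days: only incremental data is read from the NAS system.
day1 = synthesize_full(day0, changed={"/vol1/a.txt": "v2"}, deleted=["/vol1/c.txt"])
day2 = synthesize_full(day1, changed={"/vol1/d.txt": "v1"}, deleted=[])
```

Each day's result is a complete point-in-time view that could be restored in one step, even though only the changes were read from the NAS system after day 0.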
[Figure 4 diagram labels: NAS system, LAN, LAN/WAN, Avamar NDMP Accelerator, Avamar Data Store (datacenter), WAN, Avamar Data Store (DR site)]

Figure 4. Avamar delivers fast, daily full backups for NAS systems

Avamar also supports Windows Storage Server (WSS) NAS systems. For WSS 2008 R2
and 2008, Avamar is single-instance storage (SIS) aware. As a result, Avamar can
efficiently back up the WSS single-instance storage links and the data associated with
them for fast, daily full backups. Because the Avamar agent is deployed directly on the
Windows Storage Servers, an Avamar NDMP Accelerator is not required.
Avamar also supports the cost-efficient EMC Iomega PX Series NAS systems, which are
suited for small-to-midsize remote and branch offices. In this case, the Avamar agent
is embedded in the Iomega LifeLine OS and an Avamar NDMP Accelerator is not
required. When used with Iomega systems, Avamar provides file data
backup only (application/database backup requires Avamar agents on the application
servers). In all cases, Avamar provides fast, daily full backups with integrated data
deduplication, and data can be efficiently replicated offsite for disaster recovery.
Virtual environments
Avamar software quickly and efficiently protects VMware and Microsoft Hyper-V
virtual environments by reducing the size of backup data within and across virtual
machines. This eliminates traditional backup bottlenecks caused by the large amount
of data that must pass through the same set of shared resources: the physical
server's CPU, NIC, memory, and disk storage. Avamar reduces the traditional backup
load (up to 200 percent weekly) to as little as 2 percent over the same seven-day
period, dramatically reducing backup times and resource utilization.
Avamar can quickly protect virtual environments by installing agents on the virtual
machine guests, by leveraging the vStorage API for Data Protection (VADP) or by
leveraging Hyper-V VSS. In all cases, Avamar's powerful deduplication technology
provides fast, daily full guest and/or image level backups while reducing required
network/infrastructure bandwidth and storage (Figure 5). In addition, unlike
traditional backup solutions, Avamar can deduplicate the virtual machines at the VM
image backup proxy, significantly reducing storage utilization and minimizing
network traffic. Tight integration with the latest VMware vStorage APIs for Data
Protection (VADP) and VMware vCenter offers advanced features including changed
block tracking for backup and recovery, flexible image level restores, and proxy server
load balancing. Furthermore, Avamar offers file-level recovery from image-based
backups, resulting in a single-pass backup workflow.



Figure 5. Avamar deduplicates backup data within and across virtual machines, at the
client, and globally, providing fast and reliable daily full backups
The EMC Avamar Virtual Edition for VMware is the industry's leading virtual appliance
for backup, recovery, and disaster recovery. Avamar Virtual Edition enables users to
deploy Avamar's deduplication technology easily, effectively, and in a repeatable
fashion on VMware ESX Server hosts. Each virtual appliance supports up to 4 TB of
deduplicated backup capacity (which under a typical traditional backup schedule
would require approximately 140 TB of tape or disk storage), and can leverage the
existing VMware shared server and storage infrastructure to lower costs and simplify
management. Avamar Virtual Edition supports vMotion for deployment flexibility, and
support for up to two Avamar Virtual Edition virtual appliances per ESX server provides
scalability (Figure 6). Replication between Avamar virtual appliances or from Avamar
virtual appliances to physical Avamar servers eliminates reliance on offsite tape
shipments and the risk of losing unencrypted data.



Figure 6. Avamar Virtual Edition for VMware is the industry's leading virtual appliance
for backup, recovery, and disaster recovery
Export to tape
There are several options available when export to tape is required. Avamar Extended
Retention provides the ability to export monthly, quarterly, or yearly backup data from
Avamar (including Avamar Data Store, Avamar Virtual Edition and Data Domain
systems) to physical or virtual tape. Data is exported in non-deduplicated, native
format. As shown in Figure 7, an Avamar Media Access Node is
utilized during export and import operations. To simplify the process, administrators
can use Avamar's intuitive menus and wizards together with the Avamar
Administrator.


Figure 7. Avamar Extended Retention enables backup data to be exported to physical
or virtual tape in non-deduplicated native format.
Avamar Data Transport is a complementary solution for exporting all daily backups
(on a monthly basis) in a deduplicated format for cost-effective, long-term retention.
It utilizes an intuitive interface with policy-driven processes for export and recovery.
Avamar Data Transport can reduce the required number of tapes by up to 50
times. Avamar is also integrated with EMC NetWorker, and utilizes the NetWorker
management interface, which includes the ability to back up to tape.

Figure 8. Avamar Data Transport exports deduplicated backup data to tape for cost-
effective, long-term retention

Enterprise event management solutions
Avamar integrates with popular enterprise management solutions that can pull SNMP
information from an Avamar SNMP server, or accept SNMP traps from the Avamar
server. Avamar system activities and operational status are events that can be
integrated into enterprise management solutions. Examples of various Avamar events
include client registration and activation, successful and failed backups, hard disk
status, and so on. Avamar provides the ability to monitor events via:
• E-mail: Events can be configured on an event-by-event basis to send an e-mail
  message to a designated list of recipients. E-mail notifications can be sent
  immediately or in batches at regularly scheduled times.
• Syslog: Events can be configured on an event-by-event basis to log
  information to local or remote syslog files based on filtering rules configured
  for the syslog daemon receiving the events. Third-party monitoring tools and
  utilities capable of examining log entries can access the syslog files and
  process them in order to integrate Avamar events into larger site activity and
  status reports.
• SNMP: Avamar supports two SNMP methods to access events and activity
  completion status. Avamar provides a mechanism for SNMP management
  applications to pull information from Avamar's SNMP server. Also, SNMP
  traps provide a mechanism for the Avamar server to push information to
  SNMP management applications whenever designated Avamar events occur.
Enterprise reporting capabilities
Avamar provides a number of standardized reports, and integrates with EMC Data
Protection Advisor for more extensive reporting. For organizations that prefer
customized reporting capabilities, Avamar integrates with standard enterprise
reporting solutions such as Crystal Reports or Actuate, or with specialized backup
reporting applications such as EMC Data Protection Advisor. To facilitate integration
of Avamar with external reporting tools, Avamar uses a PostgreSQL database to store
information about backup activities (for example, successful and failed backups) as
well as backup policies. Information in the database is accessible through any
PostgreSQL-compliant Open DataBase Connectivity (ODBC) interface. Avamar also
publishes comprehensive information about all available database views that
customers can leverage to create custom reports.
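As an illustration of the kind of custom report described above, the following minimal sketch counts successful and failed backups per client. It is not Avamar's schema: an in-memory SQLite table stands in for the PostgreSQL database reached over ODBC, and the table name backup_activity and its columns are hypothetical.

```python
import sqlite3

# Stand-in for the reporting database: an in-memory SQLite table.
# The table name and columns are hypothetical, not Avamar's published views.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE backup_activity (client TEXT, started TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO backup_activity VALUES (?, ?, ?)",
    [
        ("fileserver01", "2013-09-01", "success"),
        ("fileserver01", "2013-09-02", "failed"),
        ("laptop17", "2013-09-01", "success"),
        ("laptop17", "2013-09-02", "success"),
    ],
)


def backup_summary(connection):
    """Return {client: (successes, failures)} from the activity table."""
    rows = connection.execute(
        """
        SELECT client,
               SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END),
               SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END)
        FROM backup_activity
        GROUP BY client
        ORDER BY client
        """
    ).fetchall()
    return {client: (ok, failed) for client, ok, failed in rows}


summary = backup_summary(conn)
for client, (ok, failed) in summary.items():
    print(f"{client}: {ok} successful, {failed} failed")
```

Against a real deployment, the same query shape would be pointed at the published database views through an ODBC connection rather than at this stand-in table.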
Grid server architecture
Avamar multi-node servers operate in a grid architecture, which provides significant
benefits including availability, scalability, performance and flexibility in deployment.
Avamar's grid server architecture, shown in Figure 9, provides scalable performance
and capacity. Every Avamar client can connect to every Storage Node for both backup
and recovery, which eliminates potential performance bottlenecks.

Figure 9. Avamar's grid architecture provides scalable performance and capacity
(diagram labels: Avamar Server; utility and spare node; parity across storage nodes;
virtualization)
Solving the index challenge
As data volumes increase, a centralized index becomes increasingly complex and
difficult to manage, often introducing a bottleneck to backup operations. In addition,
corruption of the centralized index may leave an organization unable to
recover data, since the index can no longer be used to identify which tapes contain
the particular set of data that must be recovered. Avamar uses an elegant, distributed
indexing architecture to eliminate the indexing challenge. Avamar uses segment IDs
in a manner similar to a phone number for land lines. In a phone number, the area
code provides the first general area where a call needs to be routed, and the number
itself determines the exact location where the call is targeted. Avamar uses a portion
of each unique ID (like an area code) to determine which Storage Node will store a
specific segment of data. Avamar uses another portion of the unique ID (like a phone
number) to determine where that segment of data will be stored inside that Storage
Node. As a result, just by looking at the unique ID of a segment, Avamar can
determine exactly where to store or retrieve that segment, even across a large
number of Storage Nodes.
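The area-code analogy can be sketched in a few lines. This is a minimal illustration, not Avamar's actual layout: the node count, the slot count, and the choice of which ID bytes play the "area code" and "phone number" roles are all assumptions. SHA-1 is used because the paper names it as the algorithm behind the unique IDs.

```python
import hashlib
from typing import Tuple

NUM_STORAGE_NODES = 4     # hypothetical grid size
SLOTS_PER_NODE = 1 << 16  # hypothetical number of index slots inside one node


def segment_id(segment: bytes) -> bytes:
    """Derive a 20-byte (160-bit) ID from the segment content."""
    return hashlib.sha1(segment).digest()


def locate(seg_id: bytes) -> Tuple[int, int]:
    """Map an ID to (storage node, slot inside that node)."""
    # Assumed split: the leading bytes act as the "area code" ...
    area_code = int.from_bytes(seg_id[:4], "big")
    # ... and the following bytes act as the "phone number".
    local_number = int.from_bytes(seg_id[4:12], "big")
    return area_code % NUM_STORAGE_NODES, local_number % SLOTS_PER_NODE


node, slot = locate(segment_id(b"example segment"))
print(f"store on node {node}, slot {slot}")
```

Because the location is computed from the ID alone, a store and a later retrieve of the same segment arrive at the same node and slot without consulting any central lookup table.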
Avamar's distributed indexing architecture streamlines access to data (Figure 10).
Automatic load balancing distributes data across all available Storage Nodes and
enables linear performance increases by simply adding nodes when needed.


Figure 10. Avamar indexing architecture streamlines access to data
Furthermore, Avamar takes groups of unique IDs (for instance, all the IDs for a set of
segments that make up a file) and generates a new unique ID for that group. A
request for that group ID will cascade into requests for all the segments that make up
that group. This process continues hierarchically, so Avamar can quickly store and
retrieve files, directories, entire file systems, and even databases, without the need
for any centralized index or database that can become a bottleneck to performance or
scalability. It is important to note that the Avamar Utility Node is not used as a
database or index for storage or access of unique segments. Unlike some other
deduplication architectures, Avamar does not use access nodes or metadata nodes,
which can become bottlenecks to performance. Every Avamar client can connect to
every Storage Node in an Avamar Server for both backup and restore. Avamar's
elegant index structure eliminates redundancy even for indices and metadata,
ensuring that the indexing component of an Avamar Server remains approximately
two percent of total data storage.
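The hierarchical grouping of IDs can be illustrated with a small content-addressed store. This is a sketch under assumptions, not Avamar's on-disk format: the one-byte "S"/"G" tags, the in-memory dictionary, and the function names are hypothetical.

```python
import hashlib

store = {}  # hypothetical content-addressed store: ID -> stored bytes


def put(data: bytes) -> bytes:
    """Store data under the SHA-1 of its content and return that ID."""
    key = hashlib.sha1(data).digest()
    store[key] = data
    return key


def put_segment(segment: bytes) -> bytes:
    """Store a leaf data segment (tagged 'S')."""
    return put(b"S" + segment)


def put_group(child_ids) -> bytes:
    """Store a list of child IDs as one object (tagged 'G')."""
    return put(b"G" + b"".join(child_ids))


def fetch(object_id: bytes) -> bytes:
    """Resolve an ID; a group ID cascades into requests for its children."""
    data = store[object_id]
    if data[:1] == b"S":
        return data[1:]
    ids = data[1:]
    return b"".join(fetch(ids[i:i + 20]) for i in range(0, len(ids), 20))


file_a = put_group([put_segment(b"hello "), put_segment(b"world")])
file_b = put_group([put_segment(b"hello "), put_segment(b"again")])
directory = put_group([file_a, file_b])

print(fetch(directory))
print(len(store), "objects stored")
```

The segment shared by both files is stored only once, and a single ID for the directory is enough to retrieve everything beneath it.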
Scalability and performance: Automated load balancing
Avamar uses the SHA-1 algorithm to determine unique IDs. This algorithm provides a
flat distribution of unique IDs across the full potential range of outputs. As a result,
Avamar automatically load balances data across all available Storage Nodes for
optimized scalability and performance for backup and restore.
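A quick experiment shows why a flat hash distribution balances load without any coordinator. The node count and the rule of taking the first four digest bytes modulo the node count are assumptions made for illustration only.

```python
import hashlib
from collections import Counter

NUM_STORAGE_NODES = 4  # hypothetical grid size


def node_for(segment: bytes) -> int:
    """Pick a node from the leading bytes of the segment's SHA-1 digest."""
    digest = hashlib.sha1(segment).digest()
    return int.from_bytes(digest[:4], "big") % NUM_STORAGE_NODES


# Synthetic segments; real backup data would be hashed the same way.
counts = Counter(node_for(f"segment-{i}".encode()) for i in range(10_000))
for node in range(NUM_STORAGE_NODES):
    print(f"node {node}: {counts[node]} segments")
```

Each node should receive close to a quarter of the segments, even though no component ever decides placement explicitly.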
New Storage Nodes can be added to an Avamar server with no need for scheduled
downtime, allowing an Avamar server to accommodate data growth or increased
retention periods. Since Avamar can be easily scaled by the addition of Storage
Nodes, customers can purchase only the capacity they need (just-in-time purchasing)
and add Storage Nodes as they grow, reducing IT costs. In addition, when new
Storage Nodes are added to an existing Avamar server, data is actively load balanced
across the newly added nodes. Many disk-based data protection solutions available
today are deployed as a single, large server attached to a large pool of external disk
storage. As the amount of primary data that must be protected grows, these single,
large servers quickly become bottlenecks to performance and scalability.
Avamar's load balancing enables linear performance increases by simply adding
Storage Nodes. Each incremental node increases CPU, memory, I/O and disk capacity
for the entire grid. This scalable performance is critical in disaster recovery situations
when multiple servers must be restored simultaneously to meet recovery-time
objectives.
Reliability and availability: RAIN, replication, and checkpoints
When traditional backup solutions fail, enterprises are exposed to windows of
potential data loss. Avamar employs patented redundant array of independent nodes
(RAIN) technology in order to provide high availability across the nodes in an Avamar
server grid. Avamar can continue to provide reliable data protection and access, even
if a server node fails or becomes unavailable, since data stored on any node can be
reconstructed from the other nodes.
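The paper does not describe how RAIN encodes its redundancy. As a generic illustration of reconstructing one node's data from the others, the sketch below uses simple XOR parity, which is an assumption and not a statement about Avamar's implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical contents of three storage nodes, plus one parity block.
node_data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(node_data)

# Simulate losing node 1: rebuild its block from the survivors and the parity.
survivors = [node_data[0], node_data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)
```

XOR parity tolerates the loss of any single block in the group, because XOR-ing the surviving blocks with the parity cancels everything except the missing block.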
In addition to protecting enterprise systems, Avamar protects itself with internal
checkpoints taken twice daily: consistent snapshots of the entire Avamar system that
can be verified for integrity. If an integrity check fails due to an inconsistency, and the
inconsistency cannot be fixed, the system can be quickly rolled back to a prior
checkpoint. RAID protects data stored on disk. For further flexibility in system
protection, checkpoints can also be stored on a Data Domain system in
implementations that use this integrated solution.
Figure 11. Avamar RAIN, RAID, and daily checkpoints for high availability
(diagram labels: Avamar Server; utility and spare node; parity across storage nodes;
virtualization; verified checkpoint)

Avamar replication provides the ability to logically replicate the data stored within an
Avamar system to another Avamar system deployed in an off-site location. In the
event of a disaster where a complete Avamar system becomes unavailable, data can
be recovered directly from the replication target, providing a high level of availability.
Recoverability verified daily
Avamar runs daily, scheduled integrity checks of all data stores. The integrity
checking application uses a 160-bit checksum to verify that all data backups can be
restored to their original state, an exacting process that would be very difficult and
expensive to accomplish with removable-media solutions. As a result, Avamar
eliminates the risk of silent data loss or corruption, which is common with tape
archives. If integrity is compromised in an Avamar server, Avamar can recover the
corrupted data using RAIN, internal checkpoints, or a replicated Avamar server.
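A checksum-based integrity pass can be sketched as follows. The paper specifies only a 160-bit checksum; treating it as the SHA-1 digest of each stored segment is an assumption, and the dictionary store and function names are hypothetical.

```python
import hashlib


def store_segment(store: dict, segment: bytes) -> str:
    """Store a segment under its 160-bit SHA-1 digest and return the digest."""
    digest = hashlib.sha1(segment).hexdigest()
    store[digest] = segment
    return digest


def find_corrupted(store: dict) -> list:
    """Recompute every checksum and report IDs whose data no longer matches."""
    return [
        digest
        for digest, data in store.items()
        if hashlib.sha1(data).hexdigest() != digest
    ]


store = {}
good = store_segment(store, b"payroll records")
bad = store_segment(store, b"mail archive")
store[bad] = b"mail archivE"  # simulate silent corruption of one segment

print(find_corrupted(store))
```

Any ID reported by such a pass marks a segment that would then need to be recovered from redundant copies, such as the RAIN, checkpoint, or replica sources the paper lists.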
Encryption
Avamar provides comprehensive encryption capabilities, including the ability to
encrypt backup data while in transit and at rest. For enhanced security during
client/server data transfers, Avamar supports SSL encryption. SSL encryption utilizes
the 128-bit or 256-bit Advanced Encryption Standard (AES) algorithm and should be
used for any external network communications where security is a significant
concern. The choice of encryption method can be made on a client-by-client basis or
for an entire group of clients. Avamar also supports the option to enable encryption of
data at rest using 128-bit AES encryption. By encrypting data at rest, organizations
are further protected from backup data theft or unauthorized access.
Avamar also supports the Federal Information Processing Standard (FIPS) publication
140-2. The Avamar server utilizes the FIPS 140-2 validated encryption library for all
communications. In addition, Avamar supports key rotation for encryption of data at
rest. These new encryption and key rotation options further enhance Avamar security.
Secure data shredding
Data shredding support helps facilitate the secure deletion of data from an Avamar
server when classified data is accidentally backed up to a non-classified server.
Typically, this scenario is applicable to Department of Defense installations, but some
enterprise companies may wish to perform the same actions. Avamar's data
shredding implementation is in accordance with the DoD 5220.22-M standard.
Flexible deployment options
Agent-only option - For smaller remote offices, Avamar software agents can be
installed directly on the client systems to be protected, without the need for
additional local hardware. This enables data to be backed up directly over existing IP
WAN connections to a central Avamar server at the data center.
To protect larger remote offices and data centers, data can be backed up to a local
Avamar server for faster recovery, and then replicated to another Avamar server
located at the data center or remote disaster recovery site. When an Avamar server is
required, there are several deployment options available:
EMC Avamar Virtual Edition for VMware - The industry's leading deduplication virtual
appliance for backup, recovery, and disaster recovery. Each virtual appliance can
store up to 4 TB of deduplicated data, which makes it an ideal deployment option for
a fully virtualized remote office with strict SLAs. It also enables Avamar to be
deployed easily, effectively, and in a repeatable fashion, leveraging the existing
server's CPU and disk storage.
EMC Avamar Data Store - An all-in-one packaged solution consisting of EMC Avamar
software running on preconfigured EMC-certified hardware. The Avamar Data Store is
available in two models: a scalable multi-node model and a single-node model. This
approach simplifies purchasing, deployment, and service.
The multi-node Avamar Data Store is designed for the data center where backup data
is being consolidated from multiple remote locations or to protect virtual
environments and LAN/NAS servers. It can efficiently retain the equivalent of up to
several petabytes of traditional cumulative daily full backups.
The single-node Avamar Data Store and the Avamar Business Edition are ideal for
deployment at remote offices that require faster local recovery. A variety of
single-node capacities are available; under a typical traditional backup schedule, the
same protected data could require hundreds of terabytes of disk or tape storage,
depending on the backup method and retention period. In addition, both multi- and
single-node models
support replication, either from the remote office to the data center for consolidation,
or between data centers for disaster recovery.
Desktop/laptop backup and end-user recovery
Avamar's intuitive interface allows desktop and laptop end users to quickly recover
their own data, reducing the burden on IT staff. End users can restore files and folders
directly to the original location with no loss of ACLs. In addition, cross-platform
recovery is supported, from Windows to Mac for example, for simplified support.


Figure 12. Avamar's intuitive, easy-to-use interface enables desktop and laptop end
users to quickly recover their own data, reducing the burden on IT staff.
In addition, Avamar 7 enables users to browse and access Avamar backup data from
popular tablet devices. While Avamar does not currently back up tablets, tablet
support provides additional flexibility (data cannot be overwritten when viewed
from a tablet).
Integration with EMC Data Domain
A Data Domain system is typically implemented to complete critical high-speed
backups of large, high-change rate databases. Avamar is typically implemented to
complete fast, daily full backups of file systems, NAS systems, virtual servers, low-
change-rate databases, laptops and remote data sets over the LAN/WAN. Avamar
provides the ability to direct backups to an Avamar Data Store and to a Data Domain
system simultaneously, eliminating the need for separately managed environments.
Using the Avamar interfaces and workflows, administrators can quickly and easily
direct enterprise-wide backups to both EMC deduplication platforms while simplifying
the backup management infrastructure. Avamar integrates with Data Domain via EMC
Data Domain Boost (DD Boost) software. DD Boost significantly increases
performance by distributing parts of the deduplication process to the Avamar client.
With the DD Boost Library integrated in Avamar clients, the Avamar client can send
unique data segments directly to the Data Domain system. Data types supported in
this way include Microsoft SQL Server, SharePoint, Exchange, Oracle, SAP, Sybase,
Hyper-V, and DB2; Windows, Linux, and major UNIX file systems; NDMP; and VMware
image data. All
other data types are still sent to the Avamar Data Store. This enables users to deploy
specific approaches for different data types and manage the entire infrastructure from
a single Avamar interface. This "best of both worlds" approach to deduplication
consolidates management and reduces backup times, network traffic, and
backup storage.
For some use cases, the integration with Data Domain adds to Avamar's feature
capability. For example, for VMware Backup and Recovery, Data Domain enables the
Avamar solution to provide further recovery flexibility by offering the VM Instant
Access feature. This feature allows a Backup Administrator or VMware Administrator
to actually run a Virtual Machine natively in the vCenter environment directly from the
Backup Image residing on Data Domain. This works to further reduce RTO and enable
mission critical applications running on VMs to be online as soon as possible.
Figure 13. Avamar's integration with Data Domain leverages existing investments,
simplifies management, and optimizes performance (diagram labels: VMware Images,
Hyper-V Images, Exchange, SharePoint, SQL Server, Oracle, SAP, Sybase, IBM DB2,
File Systems, NDMP, Lotus Notes; DD Boost; Remote Office, Desktop/Laptop;
Avamar Management; Avamar Agents; VM; DB; SharePoint; NAS/NDMP)
When an Avamar Data Store is specified as the backup destination, the Avamar client
installed on each host works as a typical Avamar client, performing client-side
deduplication and sending the unique sub-file segments, along with any required
metadata, over the WAN/LAN to the Avamar Server. This data flow is the same
well-known implementation used in previous versions of Avamar.
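The client-side flow can be sketched as a client that hashes its segments, asks the server which IDs it lacks, and uploads only those. This is a simplified illustration: fixed-size chunks stand in for Avamar's variable-length segments, and the Server class, its methods, and the chunk size are hypothetical.

```python
import hashlib

CHUNK_SIZE = 8  # hypothetical; fixed-size chunks replace variable-length segments


class Server:
    """Toy backup server holding segments keyed by their SHA-1 ID."""

    def __init__(self):
        self.segments = {}

    def missing(self, ids):
        """Return the IDs the server does not hold yet."""
        return [i for i in ids if i not in self.segments]

    def upload(self, seg_id, data):
        self.segments[seg_id] = data


def backup(data: bytes, server: Server):
    """Send only segments the server lacks; return (recipe, bytes_sent)."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    recipe = [hashlib.sha1(c).hexdigest() for c in chunks]
    by_id = dict(zip(recipe, chunks))
    sent = 0
    for seg_id in server.missing(list(by_id)):
        server.upload(seg_id, by_id[seg_id])
        sent += len(by_id[seg_id])
    return recipe, sent


server = Server()
day1 = b"AAAAAAAABBBBBBBBCCCCCCCC"
day2 = b"AAAAAAAABBBBBBBBDDDDDDDD"  # only the last chunk changed

_, sent1 = backup(day1, server)
recipe2, sent2 = backup(day2, server)
print(sent1, sent2)

# A restore reassembles the data from the recipe of segment IDs.
restored = b"".join(server.segments[i] for i in recipe2)
```

The second backup is still a full backup from the client's point of view, yet only the changed chunk crosses the network.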
When a Data Domain system is specified as the backup target of a particular dataset
from within the Avamar Administrator interface, the same Avamar client leverages
Data Domain Boost (DD Boost) software to redirect this data directly to Data Domain.

The recovery process, whether from an Avamar Data Store or Data Domain system, is
completely transparent to the backup administrator. The administrator utilizes the
same Avamar recovery processes that are native to current Avamar implementations.
Avamar automatically retrieves datasets that are stored on Data Domain
systems. No special retrieval mechanisms or processes are required.
Replication between primary and replica Data Domain systems is also integrated into
the Avamar management feature set. This is controlled via the Avamar replication
policies applied to each dataset. All typical Avamar replication scenarios are
supported for datasets targeted to Data Domain.
Conclusion
Avamar deduplication software and systems solve the challenges associated with
traditional backup by providing fast, daily full backups for virtual environments, NAS
systems, desktop/laptop systems, remote offices, and business critical applications.
Avamar utilizes integrated data deduplication technology to identify redundant data
segments at the client, significantly reducing daily backup data before it is
transferred across the network and stored. This allows companies to utilize existing
physical and virtual infrastructure for fast, reliable backup, recovery, and disaster
recovery. Data can be encrypted for added security, and centralized management
makes protecting hundreds of remote sites easy and efficient.
By storing just a single instance of each sub-file, variable-length data segment globally,
Avamar reduces total back-end storage by up to 50x for cost-effective, long-term,
disk-based recovery. For extended retention on tape, Avamar backup data can also be
exported to physical or virtual tape in deduplicated or non-deduplicated format.
Avamar's intuitive interface allows desktop and laptop end users to quickly recover
their own data, reducing the burden on IT staff. Avamar is tightly integrated with
NetWorker, utilizing the existing NetWorker management interface and policies. And
Avamar can also integrate with Data Domain systems for deduplicated backup of
specific data types.
Avamar offers flexible deployment options for physical and virtual environments. Built
on a scalable grid architecture, Avamar provides just-in-time capacity provisioning
and high performance. High availability is delivered via patented RAIN technology to
minimize single points of failure, along with daily server and data recoverability
checks, and offsite replication options. With Avamar, enterprise organizations can
quickly back up more data, dramatically reduce storage and network costs, and enjoy
fast single-step recovery.
