IP Storage Networking:
IBM NAS and iSCSI Solutions
Application scenarios
Rowell Hernandez
Keith Carmichael
Cher Kion Chai
Geoff Cole
ibm.com/redbooks
International Technical Support Organization
IP Storage Networking:
IBM NAS and iSCSI Solutions
Second Edition
February 2002
SG24-6240-01
Take Note! Before using this information and the product it supports, be sure to read the
general information in “Special notices” on page 285.
This edition applies to the IBM TotalStorage Network Attached Storage 200, 300, and 300G with
microcode Release 2.0, the IBM TotalStorage IPStorage 200i with microcode Release 1.2, the Cisco
SN 5420 Storage Router, and initiator clients running on Red Hat Linux 7.1, Windows 2000, and
Windows NT.
When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the
information in any way it believes appropriate without incurring any obligation to you.
© Copyright International Business Machines Corporation 2001, 2002. All rights reserved.
Note to U.S. Government Users – Documentation related to restricted rights – Use, duplication or disclosure is subject to
restrictions set forth in GSA ADP Schedule Contract with IBM Corp.
Summary of changes
This section describes the technical changes made in this edition of the book and
in previous editions. This edition may also include minor corrections and editorial
changes that are not identified.
New information
Added information on the IBM TotalStorage Network Attached Storage 200
Added information on the IBM TotalStorage Network Attached Storage 300
Added information on the Cisco SN 5420 Storage Router
Changed information
Removed all references to IBM eServer xSeries 150
Updated to include information on IPStorage 200i new models and microcode
v1.2
Updated to include information on NAS new models and preloaded software
v2.0
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
The team that wrote this redbook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Special notice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
IBM trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Comments welcome . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
3.2 IBM TotalStorage Network Attached Storage 300 . . . . . . . . . . . . . . . 135
3.2.1 IBM NAS 300 hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
3.2.2 IBM NAS 300 technical specifications. . . . . . . . . . . . . . . . . . . . . . . 140
3.2.3 IBM NAS 300 features and benefits . . . . . . . . . . . . . . . . . . . . . . . . 140
3.2.4 IBM NAS 300 optional features . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
3.2.5 IBM NAS 300 preloaded software . . . . . . . . . . . . . . . . . . . . . . . . . . 141
3.3 IBM NAS 200 and 300 comparison . . . . . . . . . . . . . . . . . . . . . . . . . . 144
3.4 IBM TotalStorage Network Attached Storage 300G . . . . . . . . . . . . . 145
3.4.1 IBM NAS 300G hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
3.4.2 IBM NAS 300G technical specifications . . . . . . . . . . . . . . . . . . . . . 151
3.4.3 IBM NAS 300G features and benefits . . . . . . . . . . . . . . . . . . . . . . . 152
3.4.4 IBM NAS 300G preloaded software . . . . . . . . . . . . . . . . . . . . . . . . 153
3.4.5 IBM NAS 300G connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
3.5 IBM TotalStorage IP Storage 200i Series . . . . . . . . . . . . . . . . . . . . . 158
3.5.1 IBM TotalStorage IP Storage 200i Configurations . . . . . . . . . . . . . 160
3.5.2 IBM TotalStorage IP Storage 200i Technical Specifications . . . . . . 161
3.5.3 IBM TotalStorage IP Storage 200i Microcode . . . . . . . . . . . . . . . . . 162
3.5.4 IBM TotalStorage IP Storage 200i features and profiles . . . . . . . . . 162
3.5.5 IBM IP Storage high availability and serviceability . . . . . . . . . . . . . 163
3.5.6 IBM IP Storage expandability and growth . . . . . . . . . . . . . . . . . . . . 164
3.5.7 IBM IP Storage 200i 4125-EXP Expansion Unit . . . . . . . . . . . . . . . 164
3.5.8 IBM IP Storage 200i Optional Features . . . . . . . . . . . . . . . . . . . . . 165
3.6 The Cisco SN 5420 Storage Router . . . . . . . . . . . . . . . . . . . . . . . . . 166
3.6.1 Cisco SN 5420 hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
3.6.2 Cisco SN 5420 technical specifications . . . . . . . . . . . . . . . . . . . . . 169
3.6.3 Cisco SN5420 clustering and high availability . . . . . . . . . . . . . . . . 170
3.6.4 Cisco SN5420 SCSI Routing Services . . . . . . . . . . . . . . . . . . . . . . 170
3.6.5 Cisco SN5420 features and benefits. . . . . . . . . . . . . . . . . . . . . . . . 171
Chapter 6. Application examples for IBM NAS and iSCSI solutions . . . 221
6.1 NAS Storage consolidation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
6.2 NAS LAN file server consolidation . . . . . . . . . . . . . . . . . . . . . . . . . . 224
6.3 SANergy high speed file sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
6.4 SANergy with Tivoli Storage Manager (TSM) . . . . . . . . . . . . . . . . . . 227
6.4.1 Using TSM with SANergy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
6.4.2 TSM backup/restore using SANergy: Scenario 1 . . . . . . . . . . . . . . 228
6.4.3 TSM backup/restore using SANergy: Scenario 2 . . . . . . . . . . . . . . 228
6.5 NAS Web hosting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
6.6 IP Storage 200i solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
6.6.1 Database solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
6.6.2 Transaction-oriented applications . . . . . . . . . . . . . . . . . . . . . . . . . . 233
6.7 Positioning storage networking solutions . . . . . . . . . . . . . . . . . . . . . 234
6.8 Typical applications for NAS and for iSCSI? . . . . . . . . . . . . . . . . . . . 235
7.10 The bottom line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
We hope you will read this redbook from cover to cover, but in case you are in a
hurry, here is a guide to its organization:
For beginners without any knowledge about storage, we suggest you first read
Chapter 1, “Introduction to storage networking” on page 1. This chapter will guide
you through the different storage technologies, pros and cons, description,
terminologies, and so on—just the basics.
For more details, we suggest that you read Chapter 2, “IP storage networking
technical details” on page 63. This chapter discusses the different protocols
involved in storage networking and tells you what goes on under-the-covers.
In Chapter 3, “IBM NAS and iSCSI storage products” on page 121, we write
about the new IBM NAS and iSCSI products. You will get a comprehensive
overview of the different IBM TotalStorage Network Attached Storage and iSCSI
products.
And finally, what other developments are going on with regard to storage
networking? In Chapter 7, “Other storage networking technologies” on page 237,
we describe some of the key developments which are under way within the
industry, including work which is in progress to develop new industry standards in
important areas.
For those who are primarily interested in iSCSI topics, the following sections
cover various aspects of this new technology and the IBM iSCSI products:
1.9, “A new direction: SCSI over IP networks” on page 48
2.4, “iSCSI basics” on page 79
2.10, “Tracing the I/O path for Internet SCSI (iSCSI)” on page 106
3.5, “IBM TotalStorage IP Storage 200i Series” on page 158
Geoff Cole is a Senior Advisor and Sales Support Manager in the IBM Storage
Networking Solutions Advisory Group. He provides sales support for the IBM
Storage Systems Group in Europe, Middle East, and Africa (EMEA). Geoff is
based in London. He has been with IBM for 30 years, and has 17 years'
experience in IBM’s storage business. He has held a number of sales and
marketing roles in the United Kingdom, the United States, and Germany. Geoff
holds a Master of Arts degree in Politics, Philosophy, and Economics from Oxford
University. He is a regular speaker on storage networking-related topics at IBM
customer groups and external conferences in Europe. Geoff can be reached at
coleg@uk.ibm.com.
Thanks to the following people for their valuable contributions to this project:
International Technical Support Organization
Jon Tate, Emma Jacobs, Yvonne Lyon, Deanna Polm, Will Carney, Alison
Chandler
IBM Raleigh
Jay Knott, Eric Dunlap, Robert Owens, Chuck Collins, David Heath, Thomas
Daniels, Jeff Ottman, Joao Molina, Rebecca Witherspoon, Ken Quarles, Sandra
Kipp, Christopher Snell, Megan Kirkpatrick, Holly Tallon, Garry Rawlins
IBM Rochester
Steve Miedema
IBM Chicago
David Sacks
IBM Austria
Wolfgang Singer
Special notice
This publication is intended to help IBMers, business partners and customers to
understand the different storage networking solutions. The information in this
publication is not intended as the specification of any programming interfaces
that are provided by IBM TotalStorage NAS 200, 300, 300G, IPStorage 200i and
Cisco SN 5420. See the PUBLICATIONS section of the IBM Programming
Announcement for IBM TotalStorage NAS 200, 300, 300G, IPStorage 200i and
Cisco SN 5420 for more information about what publications are considered to
be product documentation.
Comments welcome
Your comments are important to us!
Chapter 1. Introduction to storage networking
Many volumes have already been written describing the explosion in data
storage, and the need for storage networks. We do not intend to repeat much of
what you have probably already read. We think that Information Technology (IT)
professionals who are involved in storage acquisition decisions understand very
well that we have reached a time when traditional approaches to data storage no
longer meet the needs of many applications and users. If you are a storage
veteran you may wish to turn straight to section 1.2, “Growth in networked
storage” on page 4.
If your data is doubling every year, then in ten years it will have grown more than
one thousand fold. We all know that if we do nothing, we will drown in data. It will
become impossible to control, and our business effectiveness will suffer. We
have to become more efficient in the way we store and manage data. IDC
estimates that storage managers must increase efficiency more than 60% per
year.
Throughout the 1990s, more than 70% of all disk storage was directly attached to
an individual server. This was primarily due to the rapid growth in the capacity of
hard disk drive technology in individual PCs, as well as client and server
platforms, rising from tens of megabytes to tens of gigabytes. It is now generally
recognized that connectivity of storage devices must enable substantially higher
scalability, flexibility, availability, and manageability than is possible with directly
attached devices.
Links between NAS and SAN, by means of intelligent NAS appliances, were
announced in early 2001 by IBM. These enable LAN-attached clients to access
and share SAN-attached storage systems. Now a third type of network storage
solution is emerging, known as iSCSI. This utilizes features of both SAN and
NAS using SCSI storage protocols on LAN IP network infrastructures. IBM was
first to market with iSCSI solutions with its TotalStorage IP Storage 200i devices,
announced in February 2001.
Since iSCSI is, in effect, SAN over IP, predictions regarding its growth are
included in the SAN projections. One projection is that iSCSI could represent
some 15% of the SAN market within three years. Although industry analysts
anticipated delivery of such solutions after the beginning of 2002, IBM leadership
in storage networking allowed an earlier introduction.
Since the advent of SAN solutions there has been a tendency to view NAS and
SAN as competing technologies within the market. This is partly due to some
confusion on how to apply each technology. After all, both terms include the
words storage and network. The problem to be solved is how to connect lots of
storage to lots of servers. The best technology to use to resolve the problem is a
network. However, the implementations are very different. NAS exploits the
existing intermediate speed messaging network, whereas the SAN solution uses
a specially designed high-speed networked channel technology.
[Figure: NAS and SAN storage as a percentage of disk storage revenue, projected for 1997 through 2003 (source: Gartner ITxpo, 10/2000)]
We also refer to some recent IBM IP network storage solutions where applicable,
and show what benefits they can provide.
[Figure: A consolidated storage pool shared by servers A, B, and C, with free space available for dynamic allocation]
SCSI is a “block-level” protocol, called block I/O, since SCSI I/O commands
define specific block addresses (sectors) on the surface of a particular disk drive.
So with SCSI protocols (block I/O), the physical disk volumes are visible to the
servers that attach to them. Throughout this book we assume the use of SCSI
protocols when we refer to directly attached storage.
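To make the block-level view concrete, the following sketch (our own illustration, not taken from any product documentation) reads one 512-byte sector at a given block number from a disk device on Linux. The device path and block number are hypothetical; the point is that block I/O addresses numbered sectors on the device, with no notion of a file.

import os

DEVICE = "/dev/sdb"       # hypothetical disk device visible to this server
BLOCK_SIZE = 512          # one sector
BLOCK_NUMBER = 2048       # which sector to read

fd = os.open(DEVICE, os.O_RDONLY)
try:
    os.lseek(fd, BLOCK_NUMBER * BLOCK_SIZE, os.SEEK_SET)   # position at the sector
    data = os.read(fd, BLOCK_SIZE)                          # read one block
finally:
    os.close(fd)

print(len(data), "bytes read from block", BLOCK_NUMBER)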
The distance limitations of parallel SCSI have been addressed with the
development of serial SCSI-3 protocols. These allow SCSI commands to be
issued over different types of loop and network media, including Fibre Channel,
SSA, and more recently IP Networks. Instead of being sent as a group of bits in
parallel, on separate strands of wire within a cable, serial SCSI transports carry
the signal as a stream of bits, one after the other, along a single strand of media.
Fibre Channel
Fibre Channel is an open technical standard for networking. It combines many of
the data characteristics of an I/O bus, with the added benefits of the flexible
connectivity and distance characteristics of a network. Fibre Channel uses
serialized data transmission over either copper (for short distances up to 25
meters) or fiber optic media (for distances up to 10 kilometers). IBM devices only
support the use of fiber optic media.
In the case of I/Os to disks using SCSI protocols, the application may use
generalized file system services. These manage the organization of the data
onto the storage device via the device driver software. In the UNIX world, this
file-level I/O is called cooked I/O. However, many databases and certain
specialized I/O processes generate record-oriented I/O direct to the disk via the
device driver. UNIX fans call this raw I/O.
These blocks are moved on the I/O bus to the disk device, where they are
mapped via a block table to the correct sector on the media (in mainframe
parlance, this is called channel I/O). Block I/O is illustrated in Figure 1-5. For
technical details of how block I/Os are generated, refer to 2.7, “Tracing the I/O
path for local storage” on page 98.
[Figure 1-5: An application server on the IP network issuing block I/O to directly attached storage (DAS) using the SCSI protocol]
To achieve data exchange and sharing across networks, LANs require the use of
appropriate interconnection topologies and protocols. A LAN has a single logical
topology (access scheme), and will usually use a common network operating
system and common connecting cable.
A logical topology is the method used for transporting data around the network. It
is comparable to an access method (Media Access Control (MAC) in the OSI
Data Link layer). The access scheme handles the communication of data
packets, and places them in frames for transmission across the network.
Several different types of network access schemes were developed for LANs in
the 1980s. These include the following:
Fiber Distributed Data Interface (FDDI), a token passing scheme based on
concentric rings of fiber optic cable
Token Ring (developed by IBM)
ARCnet (developed by Datapoint)
Ethernet (originally designed by Xerox Corporation), which uses a
collision-detect access method
Today the predominant logical topology for LANs is Ethernet. IDC estimates that
more than 85% of all installed network connections worldwide are Ethernet,
which is so popular because it offers the best combination of price, simplicity,
scalability, and management ease of use. For this reason, we assume the
Ethernet protocol whenever we refer to LANs in this book.
1.5.1 Ethernet
Ethernet is an open industry standard for local area networks. It includes
definitions of protocols for addressing, formatting, and sequencing of data
transmissions across the network. The term Ethernet also describes the physical
media (cables) used for the network.
Ethernet uses a media access protocol, known as Carrier Sense Multiple Access
with Collision Detection (CSMA/CD). The CSMA/CD protocol moves packets on
the network. In effect, every node monitors the network to see if the network is
already transmitting a packet. A node waits until the network is free before
transmitting its packet. Since the nodes are spread in different locations, it is
possible for more than one node to begin transmitting concurrently. This results
in a collision of the packets on the network. If a collision is detected, all nodes
then go into a wait mode. On a random basis, they attempt to re-transmit the
packets until they are successful.
More nodes tend to mean more data packets transferred, and therefore more
collisions. The more collisions there are, the slower the network runs. This
problem is alleviated by the division of Ethernet LANs into multiple smaller
“subnets” or collision zones, by means of routers. Implementation of switched
networks, which create collision-free environments, has overcome the potential
limitations of the CSMA/CD protocol. CSMA/CD is described in more detail in
2.3.3, “The CSMA/CD protocol” on page 73.
There are several different types of Ethernet networks, based on the physical
cable implementations of the network. There are a number of media segments,
or cable types, defined in the Ethernet standards. Each one exhibits different
speed and distance characteristics. They fall into four main categories: thick
coaxial (thicknet), thin coaxial cable (thinnet), unshielded twisted pair (UTP), and
fiber optic cable. These are described in 2.3.6, “Ethernet media systems” on
page 77, for those readers who want more technical details.
Today, most sites use high quality twisted-pair cable, or fiber optic cables. Short
wave fiber optics can use multi-mode 62.5 micron or 50 micron fiber optic cables,
and single mode 9 micron fiber optic cable is used for long wave lasers. These
cables can all carry either 10 Mbps, 100 Mbps or 1 Gigabit signals, thus allowing
easy infrastructure upgrades as required.
Today, the de facto standard for client/server communications in the LAN, and
across the Internet, is TCP/IP. This is because it is an entirely open protocol, not
tied to any vendor. Millions of clients and servers, using TCP/IP protocols, are
interconnected into IP network infrastructures by way of routers and switches.
For this reason, we assume the TCP/IP protocol whenever we refer to LANs in
this book.
The Internet
Today the Internet is known to all since it is so pervasively used to interconnect
autonomous networks around the world. The Internet has acquired its own
administration body to oversee issues and to carry out ongoing research and
development. This board is called the Internet Activities Board (IAB). It has a
number of subsidiary groups, the best known of which is the Internet Engineering
Task Force (IETF), which deals with tactical implementation and engineering
problems of the Internet. For information on the IAB and IETF, see the following
Web sites:
http://www.iab.org/iab/
http://www.ietf.org
TCP: The protocol which manages the OSI Transport level of exchanges is
Transmission Control Protocol (TCP). (Note: the OSI network layers are
described in 2.1, “Open Systems Interconnection (OSI) model” on page 64). TCP
adds a destination port and other information about the outgoing data, and puts it
into what is known as a TCP segment.
IP: The standard peer-to-peer networking protocol used by Ethernet (and the
Internet) to route message exchanges between network nodes is the Internet
Protocol (IP). As a result, these networks are generically known as IP networks.
IP is operating in the OSI Network layer. It takes the TCP segment and adds
specific network routing information. The resulting packet is known as an IP
datagram. The datagram passes to the network driver software, which adds
further heading information. The datagram is now a packet, or frame, ready for
transmission across the network.
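As a rough sketch of this layering (using deliberately simplified, toy header formats rather than the real TCP, IP, and Ethernet layouts), each layer can be thought of as prepending its own header to whatever it receives from the layer above:

import struct

application_data = b"GET /index.html"

# Transport layer: prepend a toy TCP-like header (source port, destination port)
tcp_segment = struct.pack("!HH", 49152, 80) + application_data

# Network layer: prepend a toy IP-like header (source and destination addresses)
ip_datagram = struct.pack("!4s4s",
                          bytes([192, 168, 1, 10]),
                          bytes([192, 168, 1, 20])) + tcp_segment

# Network driver: add a subnet header and trailer to form the frame
frame = b"\x00" * 14 + ip_datagram + b"\x00" * 4   # placeholder MAC header and trailer

print(len(application_data), len(tcp_segment), len(ip_datagram), len(frame))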
TCP/IP also includes a number of other protocols, which are known as the
TCP/IP Suite or stack. This describes a suite of protocols designed to handle
program to program transactions, electronic mail, security, file transfers, remote
logon facilities, and network discovery mechanisms over local and wide area
networks. We describe the TCP/IP protocol stack, and how it interrelates with IP
networks in 2.2, “TCP/IP technical overview” on page 66.
The manner in which files are stored, accessed, and protected differs among
different types of platforms. Therefore, FTP works with some basic properties
which are common to files on most systems to enable users to manipulate files.
An FTP communication begins when the FTP client establishes a session with
the FTP server. The client can then initiate multiple file transfers to or from the
FTP server. An example of FTP file copying is illustrated in Figure 1-6. At
completion of the process, both systems have a copy of file “x”, and both can
work on it independently.
[Figure 1-6: FTP file copy across an IP network; after the transfer, Computer A and Computer B each hold a copy of file x]
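The same exchange can be driven programmatically. The short sketch below uses the FTP client module in the Python standard library to copy file x from an FTP server; the host name, credentials, and file name are placeholders.

from ftplib import FTP

with FTP("ftp.example.com") as ftp:                  # establish a session with the FTP server
    ftp.login("user", "password")                    # authenticate the client
    with open("x", "wb") as local_copy:              # local copy of file "x"
        ftp.retrbinary("RETR x", local_copy.write)   # transfer the file from the server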
File sharing
Another early requirement was to share files. In other words, rather than ship
files between computers, why not allow multiple clients to access a single copy of
a file which is stored on a central server? Network file protocols and network
operating systems (NOS) were developed in the 1980s to enable users to do
this. These include Network File System (NFS), Common Internet File System
(CIFS), and Novell Netware.
NetWare
NetWare is a popular PC-based specialized network operating system (NOS)
rather than a protocol. Developed by Novell, the NetWare operating system is
optimized as a multi-platform network file server. It supports numerous client
platforms by means of its name space service. In addition to supporting CIFS for
Windows systems, UNIX clients can store data on NetWare servers using NFS,
and Apple Macintosh users can do so via the Apple file protocol.
By making storage systems LAN addressable, the storage is freed from its direct
attachment to a specific server, and any-to-any connectivity is facilitated using
the LAN fabric. In principle, any user running any operating system can access
files on the remote storage device. This is done by means of a common network
access protocol—for example, NFS for UNIX servers and CIFS for Windows
servers. In addition, a task such as backup to tape can be performed across the
LAN using software like Tivoli Storage Manager (TSM), enabling sharing of
expensive hardware resources (for example, automated tape libraries) between
multiple servers.
A storage device cannot just attach to a LAN. It needs intelligence to manage the
transfer and the organization of data on the device. The intelligence is provided
by a dedicated server to which the common storage is attached. It is important to
understand this concept. NAS comprises a server, an operating system, and
storage which is shared across the network by many other servers and clients.
So a NAS is a specialized server or appliance, rather than a network
infrastructure, and shared storage is attached to the NAS server.
The NAS system “exports” its file system to clients, which access the NAS
storage resources over the LAN.
The file server has to manage I/O requests accurately, queuing as necessary,
fulfilling the request, and returning the information to the correct client. The NAS
server handles all aspects of security and lock management. If one user has the
file open for updating, no one else can update the file until it is released. The file
server keeps track of connected clients by means of their network IDs,
addresses, and so on.
A file I/O specifies the file. It also indicates an offset into the file. For instance, the
I/O may specify “Go to byte ‘1000’ in the file (as if the file were a set of
contiguous bytes), and read the next 256 bytes beginning at that position.” Unlike
block I/O, there is no awareness of a disk volume or disk sectors in a file I/O
request. Inside the NAS appliance, the operating system keeps track of where
files are located on disk. It is the NAS OS which issues a block I/O request to the
disks to fulfill the client file I/O read and write requests it receives.
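Expressed as ordinary file I/O, the request described above looks like the following sketch (the file name is a placeholder). The client works only with a byte offset within the file; it is the NAS operating system that later maps this to block I/O against its disks.

with open("shared_file.dat", "rb") as f:
    f.seek(1000)          # go to byte 1000 in the file
    data = f.read(256)    # read the next 256 bytes beginning at that position

print(len(data), "bytes read at offset 1000")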
In summary, network access methods like NFS and CIFS can only handle file I/O
requests to the remote file system. This is located in the operating system of the
NAS device. I/O requests are packaged by the initiator into TCP/IP protocols to
move across the IP network. The remote NAS file system converts the request to
block I/O and reads or writes the data to the NAS disk storage. To return data to
the requesting client application, the NAS appliance software repackages the
data in TCP/IP protocols to move it back across the network. This is illustrated in
Figure 1-7 on page 24.
Because of its channel, or bus-like, qualities, hosts and applications see storage
devices attached to the SAN as if they are locally attached storage. With its
network characteristics, it can support multiple protocols and a broad range of
devices, and it can be managed as a network.
Measured effective data rates of Fibre Channel have been demonstrated in the
range of 60 to 80 MBps over the 1 Gbps implementation. This compares to less
than 30 MBps measured over Gigabit Ethernet. The packet size of Fibre Channel
is 2,112 bytes (rather larger than some other network protocols). For instance, a
standard Ethernet frame is at most 1,518 bytes, although normally IP transfers are
much smaller. But for Fibre Channel a sequence of up to 64 K frames can be
defined, allowing transfers of up to 128 MB without incurring additional overhead
due to processor interrupts. Thus, today Fibre Channel is unsurpassed for
efficiency and high performance in moving large amounts of data.
[Figure: A Storage Area Network provides server-to-server, server-to-storage, and storage-to-storage connections]
It is this high degree of flexibility, availability, and scalability, over long distances,
and the broad acceptance of the Fibre Channel standards by vendors throughout
the IT industry, which make the Fibre Channel architecture attractive as the basis
for new enterprise storage infrastructures.
[Figure: Block I/O over a Fibre Channel (FCP) SAN: the application server, attached to both the IP network and the Fibre Channel network, either makes a file I/O request to the file system in the server, which then initiates block I/O to the SAN-attached disk, or initiates raw block I/O to the disk directly]
Current IBM disk storage systems, including the Enterprise Storage Server
(ESS), Modular Storage Server (MSS), FAStT200, FAStT500, and tape
subsystems like the IBM 3590, 3494, and LTO models, are also FC ready. In
addition, IBM offers a broad range of FC hubs, switches, directors and gateways
to build SANs which scale from small workgroups to enterprise-wide solutions.
Furthermore, IBM Global Services supports SAN implementation with
comprehensive design and consultancy services. It is not our intention in this
book to examine these IBM solutions. There are a number of other IBM
Redbooks which address SAN concepts and solutions in considerable detail; we
recommend the following for more information:
Introduction to Storage Area Networks, SG24-5470
Designing an IBM SAN, SG24-5788
Planning and Implementing an IBM SAN, SG24-6116
Using Tivoli Storage Manager in a SAN environment, SG24-6132
Storage Area Networks: Tape Future in Fabrics, SG24-5474
Storage consolidation in SAN environments, SG24-5987
Implementing Fibre Channel Attachment on the ESS, SG24-6113
SAN Survival Guide, SG24-6143
For details about IBM SAN solutions, visit the IBM storage Web site at:
http://www.storage.ibm.com/ibmsan
IBM has introduced a family of data and SAN resource management tools,
namely the IBM StorWatch family of tools, Tivoli Storage Manager, and Tivoli
Storage Network Manager. In addition, IBM has indicated its strategic direction to
develop storage network virtualization solutions, known as the Storage Tank
project, which will allow enterprise-wide, policy-driven, open systems
management of storage. Refer to 2.13, “Data and network management” on
page 111 for more details on:
Tivoli Storage Manager (TSM)
Tivoli Storage Network Manager (TSNM)
Storage virtualization
Tivoli SANergy File Sharing is unique SAN software that allows sharing of access
to application files and data between a variety of heterogeneous servers and
workstations connected to a SAN. In addition, Tivoli SANergy File Sharing
software uses only industry-standard file systems like NFS and CIFS, enabling
multiple computers simultaneous access to shared files through the SAN (shown
in Figure 1-12 on page 39). This allows users to leverage existing technical
resources instead of learning new tools or migrating data to a new file system
infrastructure. This software allows SAN-connected computers to have the
high-bandwidth disk connection of a SAN while keeping the security, maturity,
and inherent file sharing abilities of a LAN.
[Figure: Heterogeneous hosts sharing an NTFS-formatted disk storage subsystem over the SAN with Tivoli SANergy]
In addition to the SAN, Tivoli SANergy also uses a standard LAN for all the
metadata associated with file transfers. Because Tivoli SANergy is NT File
System (NTFS) based, even if the SAN should fail, access to data via the LAN is
still possible. Since each system has direct access to the Tivoli SAN-based
storage, Tivoli SANergy can eliminate the file server as a single point of failure
for mission-critical enterprise applications. Tivoli SANergy can also easily
manage all data backup traffic over the storage network, while the users enjoy
unimpeded LAN access to the existing file servers.
Data “about” data is referred to as metadata. Examples include file names, file
sizes, and access control lists. The Tivoli SANergy File Sharing architecture lets
metadata transactions take place over conventional LAN networking. The actual
content of files moves on the high-speed direct SAN connection, as illustrated in
Figure 1-13 on page 40.
[Figure 1-13: The SANergy Metadata Controller: metadata moves over the LAN (1), while file content moves over the direct SAN connection (2)]
SANergy works with Ethernet, ATM, or anything else that carries networking
protocols. The network protocol can be CIFS (Windows NT), AppleTalk, NFS
(UNIX), or a combination. Similarly, SANergy supports any
available disk-attached storage fabric. This includes Fibre Channel, SSA, SCSI,
and any other disk-level connection. It is also possible for installations to use one
set of physical wiring to carry both the LAN and storage traffic.
When you use SANergy, one computer in the workgroup is designated as the
Meta Data Controller (MDC) for a particular volume. You can have a single
computer as the MDC for all volumes, or MDC function can be spread around so
that multiple computers each control certain volumes. The other computers are
SANergy clients. They use conventional networking to “mount” that volume, and
SANergy on those clients separates the metadata from the raw data
automatically.
Two different types of configurations are available for the NAS 300G: the
single-node G01 and the dual-node G26. The dual-node Model G26 provides
clustering and failover protection for top performance and availability. The G01
and G26 models are illustrated in Figure 1-14.
The NAS 300G accepts a file I/O request (for example, using the NFS or CIFS
protocols) and translates that to a SCSI block I/O request to access the external
attached disk storage. The 300G interconnections are illustrated in Figure 1-15
on page 44 and Figure 1-16 on page 45.
[Figure: Application servers on the IP network accessing SAN-attached storage through NAS 300G appliances]
The 300G also offers additional advantages of SAN scalability and performance
on the IP network:
Increased choice of disk types: By separating the disk subsystem
selection from the NAS appliance selection, the buyer has greater flexibility to
choose the most cost-effective storage to meet business requirements. The
best of breed storage systems can be selected to attach to the SAN, and the
300G appliance can exploit the benefits of their superior performance,
availability, and advanced functions.
[Figure: The IBM Network Attached Storage family: NAS 200, NAS 300, and NAS 300G, with the 300G attached to external disk systems such as the 7133, FAStT200, FAStT500, or ESS (Shark)]
The question arises whether we can use TCP/IP, the networking technology of
Ethernet LANs and the Internet, for storage. This could enable the possibility of
having a single network for everything, and could include storage, data sharing,
Web access, device management using SNMP, e-mail, voice and video
transmission, and all other uses.
IP SANs could leverage the prevailing technology of the Internet to scale from
the limits of a LAN to wide area networks, thus enabling new classes of storage
applications. SCSI over IP could enable general purpose storage applications to
run over TCP/IP. Moreover, an IP SAN would also automatically benefit from new
networking developments on the Internet, such as Quality of Service (QoS) and
security. It is also widely anticipated that the total cost of ownership of IP SANs
would be lower. This is due to larger volumes of existing IP networks and the
wider skilled manpower base familiar with them.
At the IBM research centers at Almaden and Haifa, efforts are under way to
resolve these issues. The goal is to make the promise of IP SANs a reality.
Efforts are concentrated along two different directions: the primary effort is to
bridge the difference in performance between Fibre Channel and IP SANs. In
parallel, there is an effort to define a standard mapping of SCSI over TCP/IP. The
result is Internet SCSI (iSCSI), sometimes called SCSI over IP.
The iSCSI proposal was made to the Internet Engineering Task Force (IETF)
standards body jointly by IBM and Cisco. Details of some of the objectives and
considerations of the IETF standards proposals for iSCSI are described in 2.4,
“iSCSI basics” on page 79. In February 2001 IBM announced the IBM
TotalStorage IP Storage 200i, which became generally available in June 2001.
This was followed in April by the announcement by Cisco of the Cisco SN 5420
Storage Router, a gateway product linking iSCSI clients and servers to Fibre
Channel SAN-attached storage.
IBM has taken a leadership role in the development and implementation of open
standards for iSCSI. As it is a new technology, you can expect additional
developments as iSCSI matures. Since IBM’s iSCSI announcement in February
2001, a large number of other companies in the storage networking industry
have stated their intentions to participate in iSCSI developments, and to bring
products to market in due course. Things already are moving very rapidly. In July
2001 IBM participated in a SNIA-sponsored iSCSI interoperability demonstration.
The IBM TotalStorage IP Storage 200i is a network appliance that uses the new iSCSI technology. The IP Storage
200i appliance solution includes client initiators. These comprise client software
device drivers for Windows NT, Windows 2000, and Linux clients. These device
drivers coexist with existing SCSI devices without disruption. They initiate the
iSCSI I/O request over the IP network to the target IP Storage 200i. IBM plans to
add additional clients in response to customer feedback and market demands.
IBM is committed to support and deliver open industry standard implementations
of iSCSI as the IP storage standards in the industry are agreed upon.
The IBM IP Storage 200i is a low cost, easy to use, native IP-based storage
appliance. It integrates existing SCSI storage protocols directly with the IP
protocol. This allows the storage and the networking to be merged in a seamless
manner. iSCSI-connected disk volumes are visible to IP network-attached
processors, and as such are directly addressable by database and other
performance oriented applications. The native IP-based 200i allows data to be
stored and accessed wherever the network reaches—LAN, MAN or WAN
distances.
The IBM TotalStorage IP Storage 200i comprises the 4125 Model 110 tower system,
and the 4125 Model 210 rack-mounted system. These are high-performance
storage products that deliver the advantages of pooled storage, which FC SANs
provide. At the same time, they take advantage of the familiar and less complex
IP network fabric.
The IBM TotalStorage IP Storage 200i products are “appliance-like.” All required
microcode comes pre-loaded, minimizing time required to set up, configure, and
make operational the IP Storage 200i. There are only two types of connections to
make: connecting the power cord(s) and attaching the Ethernet connection(s) to
the network.
Microcode for the 200i is Linux-based. Since the microcode is pre-loaded, the
initial installation time (after unpacking, physical location, and external cabling)
should take about 15 minutes. After the first IPL boot, succeeding IPL boots
should take about 5 minutes. The code for an iSCSI initiator should take less
than 5 minutes to install, as it is a seamless device driver addition.
[Figure: Two industry approaches to iSCSI: iSCSI appliances with embedded storage, and iSCSI gateways (IP/FC bridges) that connect clients on the IP network to SAN-attached storage; in both cases the iSCSI client software carries the SCSI protocol (block I/O) over IP]
The benefits of the IBM IP Storage 200i appliance include the following:
Connectivity: iSCSI can be used for DAS or SAN connections.
iSCSI-capable devices could be placed on an existing LAN (shared with other
applications) in a similar way to NAS devices. Also, iSCSI-capable devices
could be attached to a LAN which is dedicated to storage I/O (in other words,
an IP SAN), or even to a LAN connected to only one processor (like a DAS).
These options are shown in Figure 1-21.
Extended distance: IP networks offer the capability to extend easily beyond
the confines of a LAN, to include Metropolitan and Wide Area Networks
(MANs and WANs). This gives greater flexibility, and at far less cost and
complexity, compared to the interconnection of Fibre Channel SANs over
wide areas.
[Figure 1-21: iSCSI attachment options: an iSCSI appliance providing pooled storage on a shared IP LAN, on a dedicated IP SAN, or attached directly to a single client; in each case the client software issues the SCSI protocol over IP]
Media and network attachments: iSCSI and NAS devices both attach to IP
networks. This is attractive compared to Fibre Channel because of the
widespread use of IP networks. IP networks are already in place in most organizations.
With these applications in mind, the IBM TotalStorage IP Storage 200i will be well
suited for departments and workgroups within large enterprises, mid-size
companies, service providers (such as Internet service providers), and
e-business organizations.
However, the good news is that the IBM offerings are truly complementary with
each other. They are designed to work together to deliver the broadest range of
cooperating storage network solutions. In making a decision for one solution
today, you are not ruling out the ability to select and benefit from another network
choice tomorrow.
In reality, most larger organizations are likely, in our view, to implement several of
the network options, in order to provide an optimal balance of performance,
flexibility, and cost for differing application and departmental needs. As we show
in Figure 1-22, all the IBM storage network systems can be interlinked.
[Figure 1-22: All the IBM storage network systems interlinked: clients and IP Storage 200i iSCSI appliances on the IP network, the Cisco iSCSI gateway, NAS 200 and 300 appliances, and NAS 300G gateways connecting IP networks and LANs to FC servers and SAN-attached storage]
In addition, IBM and other major vendors in the industry have invested heavily in
interoperability laboratories. The IBM laboratories in Gaithersburg (Maryland,
USA), Mainz (Germany), and Tokyo (Japan) are actively testing equipment from
IBM and many other vendors, to facilitate the early confirmation of compatibility
between multiple vendors' servers, storage, and network hardware and software
components. Many IBM Business Partners have also created interoperability test
facilities to support their customers.
The SNIA is accepted as the primary organization for the development of SAN
and NAS standards, with over 150 companies and individuals as its members,
including all the major server, storage, and fabric component vendors. The SNIA
is committed to delivering architectures, education, and services that will propel
storage networking solutions into a broader market. IBM is one of the founding
members of SNIA, and has senior representatives participating on the board and
in technical groups. For additional information on the various activities of SNIA,
see its Web site at:
http://www.snia.org
The SNIA mission is to promote the use of storage network systems across the
IT community. The SNIA has become the central point of contact for the industry.
It aims to accelerate the development and evolution of standards, to promote
their acceptance among vendors and IT professionals, and to deliver education
and information. This is achieved by means of SNIA technical work areas and
work groups. A number of work groups have been formed to focus on specific
areas of storage networking, and some of these are described in 7.9.1, “SNIA
work groups” on page 255.
One of the relevant work groups pertaining to topics in this book is the IETF IP Storage
(ips) Work Group, which is addressing the significant interest in using IP-based
networks to transport block I/O storage traffic. The work of this group is outlined
in 7.9.2, “IETF work groups” on page 259.
For more information on the IETF and its work groups, refer to:
http://www.ietf.org
Fibre Channel SAN concepts have been extensively covered in other IBM
Redbooks, so we do not address Fibre Channel here.
Since the focus of this book is on IP networks, it is useful to begin with a brief
description of the standard model for open systems networks. This is known as
the Open Systems Interconnection (OSI) model.
We then discuss in detail the specific products and technologies that make up
open systems networks.
[Figure: The layers of the OSI model: Application, Presentation, Session, Transport, Network, Data Link, and Physical]
The terms protocol and service are often confused. A protocol defines the
exchange that takes place between identical layers of two hosts. For example, in
the TCP/IP stack, the transport layer of one host talks to the transport layer of
another host using the TCP protocol. A service, on the other hand, is the set of
functions that a layer delivers to the layer above it. For example, the TCP layer
provides a reliable byte-stream service to the application layer above it.
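A minimal sketch of that byte-stream service, using the socket interface found in most operating systems (shown here in Python, over an arbitrary loopback connection), is as follows: the application simply writes and reads bytes, and TCP delivers them reliably and in order.

import socket
import threading

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))       # let the operating system pick a free port
server.listen(1)
port = server.getsockname()[1]

def echo_once():
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))    # echo whatever bytes arrive

threading.Thread(target=echo_once, daemon=True).start()

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
    client.connect(("127.0.0.1", port))
    client.sendall(b"hello over TCP")    # the application just writes bytes
    print(client.recv(1024))             # and reads them back, in order
server.close()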
[Figure: The TCP/IP stack mapped to the OSI layers: Application (covering the OSI Application, Presentation, and Session layers), Transport (TCP or UDP), Network (Internet Protocol), and the subnet (Data Link and Physical layers). The accompanying encapsulation diagram shows the application datagram wrapped in a TCP header to form a TCP segment, in an IP header to form an IP packet, and finally in a subnet header and trailer to form the subnetwork frame]
The job of the IP layer is to route these packets to the target destination. IP
packets consist of an IP header, together with the higher level TCP protocol and
the application datagram. IP knows nothing about the TCP and datagram
contents. Prior to transmitting data, the network layer might further subdivide it
into smaller packets for ease of transmission. When all the pieces reach the
destination, they are reassembled by the network layer into the original
datagram.
The IP Packet
All IP packets or datagrams consist of a header section and a data section
(payload). The payload may be traditional computer data or, as is common today,
it may be digitized voice or video traffic. Using the postal service analogy again,
the “header” of the IP packet can be compared with the envelope and the
“payload” with the letter inside it. Just as the envelope holds the address and
information necessary to direct the letter to the desired destination, the header
helps in the routing of IP packets.
An IP packet has a maximum size of 65,535 bytes, including the header. Besides
application data, the payload can also carry error and control messages, such as
those of the Internet Control Message Protocol (ICMP).
To illustrate control protocols, suppose that the postal service fails to find the
destination of your letter. It would be necessary to send you a message
indicating that the recipient's address was incorrect. This message would reach
you through the same postal system that tried to deliver your letter. ICMP works
the same way: it packs control and error messages inside IP packets.
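To illustrate the header and payload split, the sketch below builds and then unpacks the fixed 20-byte IPv4 header; the addresses and field values are made up for the example.

import socket
import struct

# version/IHL, type of service, total length, identification, flags/fragment
# offset, time to live, protocol, header checksum, source and destination address
sample_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 1, 0, 64, 6, 0,
    socket.inet_aton("192.168.1.10"),
    socket.inet_aton("192.168.1.20"),
)

fields = struct.unpack("!BBHHHBBH4s4s", sample_header)
version = fields[0] >> 4                 # 4 for IPv4
header_length = (fields[0] & 0x0F) * 4   # 20 bytes when no options are present
protocol = fields[6]                     # 6 = TCP, 17 = UDP, 1 = ICMP
source = socket.inet_ntoa(fields[8])
destination = socket.inet_ntoa(fields[9])
print(version, header_length, protocol, source, destination)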
IP addressing
An IP packet contains a source and a destination address. The source address
designates the originating node's interface to the network, and the destination
address specifies the interface for an intended recipient or multiple recipients (for
broadcasting).
The network part of the address is common for all machines on a local network. It
is similar to a postal code, or zip code, that is used by a post office to route letters
to a general area. The rest of the address on the letter (i.e., the street and house
number) is relevant only within that area. It is used only by the local post office
to deliver the letter to its final destination.
The host part of the IP address performs a similar function. The host part of an IP
address can further be split into a subnetwork address and a host address.
IP network addressing is a large and intricate subject. It is not within the scope of
this book to describe it in any further detail.
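For readers who want to see the split in practice, the Python standard library can separate the network and host parts of an address; the address and prefix length below are examples only.

import ipaddress

interface = ipaddress.ip_interface("192.168.10.37/24")
print(interface.network)            # 192.168.10.0/24: the "postal code" part
print(interface.network.netmask)    # 255.255.255.0
host_part = int(interface.ip) & ~int(interface.network.netmask) & 0xFFFFFFFF
print(host_part)                    # 37: meaningful only within the local network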
The application data has no meaning to the transport layer. On the source node,
the transport layer receives data from the application layer and splits it into
chunks. The chunks are then passed to the network layer. At the destination
node, the transport layer receives these data packets and reassembles them
before passing them to the appropriate process or application. Further details
about how data travels through the protocol stack follow.
The transport layer is the first end-to-end layer of the TCP/IP stack. This
characteristic means that the transport layer of the source host can communicate
directly with its peer on the destination host, without concern about how data is
moved between them. These matters are handled by the network layer. The
layers below the transport layer understand and carry information required for
moving data across links and subnetworks.
Figure 2-4 shows how the client side and the server side TCP/IP stack
implementation adds increasing overhead to the transmission of data through the
network.
[Figure 2-4: Data encapsulation: as data passes down the client-side and server-side TCP/IP stacks, each layer (Application, Transport, and below) adds its own header, increasing the overhead]
Application layer
The application layer is the layer with which end users normally interact. This
layer is responsible for formatting the data so that its peers can understand it.
Whereas the lower three layers are usually implemented as a part of the OS, the
application layer is a user process. Some application-level protocols included
in most TCP/IP implementations are the following:
Telnet for remote login
FTP for file transfer
SMTP for mail transfer
The CSMA/CD protocol moves packets on the network. The term Multiple
Access describes the concept that every node “hears” every message. In effect,
every node “listens” to the network segment to see if the network is transmitting a
frame. A node which wishes to transmit a frame waits until the network is free
before transmitting its data. Carrier Sense refers to this technique.
Since the nodes are spread in different locations, it is possible for more than one
node to begin transmitting concurrently. This results in a collision of the frames
on the network. If a collision is detected, the sending nodes transmit a signal to
prevent other nodes from sending more packets. All nodes then go into a wait
mode. On a random basis they go back to monitoring and transmitting.
We can liken this to a group of people sitting around a dinner table. I may wish to
say something, but someone else is already speaking. Rather than rudely
interrupting the speaker, I will wait politely until the other person has finished
speaking. When there is a pause, then I will say my piece. However, someone
else may also have been waiting to say something. At the pause in the
conversation we may both begin to speak, more or less at the same time. In
Ethernet terminology, a collision has occurred. We will both hear the other
person begin to speak, so we both politely stop, in order to allow the other one to
finish speaking. One of us will sense that it is OK to carry on, and will begin the
conversation again.
Packets which collided are re-sent. Since collisions are normal, and expected,
the only concern is to ensure a degree of fairness in achieving a timely
transmission. This is achieved by a simple random algorithm, which will enable a
node to “win” a collision battle after a number of attempts.
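The following sketch shows one simple form such a random algorithm can take, truncated binary exponential backoff, using the classic 10 Mbps slot time purely for illustration; it is not an exact transcription of the IEEE 802.3 rules.

import random

SLOT_TIME_US = 51.2    # one slot time at 10 Mbps, in microseconds
MAX_EXPONENT = 10      # the random range stops doubling after 10 collisions
MAX_ATTEMPTS = 16      # after 16 collisions the frame is discarded

def backoff_delay(attempt):
    """Return how long to wait (in microseconds) after the given collision."""
    if attempt > MAX_ATTEMPTS:
        raise RuntimeError("excessive collisions: transmission aborted")
    k = min(attempt, MAX_EXPONENT)
    slots = random.randint(0, 2 ** k - 1)   # pick a random number of slot times
    return slots * SLOT_TIME_US

for attempt in range(1, 5):
    print("collision", attempt, ": wait", backoff_delay(attempt), "microseconds")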
When Fast Ethernet, at 100 Mbps, was introduced in the mid 1990s, an
auto-negotiation procedure was also introduced. This dealt with the difference
between the original 10 Mbps CSMA/CD half duplex operation, and the new 100
Mbps full duplex implementation. With half duplex, only one end node on a
copper link (not fiber) may transmit at a time. With full duplex, both end nodes
may transmit concurrently, without generating a collision. With full duplex
operation, many of the CSMA/CD protocol functions become redundant, and the
propensity for frames to collide is almost eliminated.
Segments
As we have seen, early designs communicated with devices attached to a single
cable (segment) shared by all the devices on the network. A single segment is
also known as a collision domain because no two nodes on the segment can
transmit at the same time without causing a collision.
Figure 2-5 Ethernet spanning tree topology with subnet collision domains: segments joined by repeaters (R) and a bridge, each group of segments forming its own collision domain (subnet)
Switched fabric
Today, Ethernet has evolved to switched fabric topologies. It also normally uses
twisted pair wiring or fiber optic cable to connect nodes in a radial pattern. Early
implementations of Ethernet used half-duplex transmission (that is to say, data
transferred in one direction at a time).
[Figure: A switched Ethernet topology: an Ethernet switch interconnecting several Ethernet hubs, with a router providing the connection to other networks]
Once again using our dinner party theme, now we have a series of major
banquets in different rooms, even in different buildings and cities. But each
person wants to be able to talk to anyone else at any of the tables in any of the
locations. The organizers have thoughtfully provided each diner with a
telephone. Each diner can now call any of the other participants directly, have a
person-to-person conversation, and later speak to other people individually,
wherever they are seated. Everyone can speak at the same time, without
interrupting the other diners.
There are a number of other definitions of cabling, with suffixes such as TX, FX,
CX and SX, describing different types of twisted pair cables, or multi-mode and
single-mode fiber optic cable. To keep things simple we have not described
these, but the media variations are summarized in Figure 2-7 on page 79.
Ethernet really took off commercially when it became possible to use UTP cable,
and when the use of hubs greatly simplified the logistics of installing the cabling.
A hub acted as a kind of concentrator for linking many machines to a central
wiring point. Today most sites use high quality twisted-pair cable or fiber optic
cables. These are much easier to install than coaxial cable because of their
flexibility. Short wave fiber optics can use multi-mode 62.5 micron or 50 micron
fiber optic cables; and single mode 9 micron cable is for long wave. These cables
can all carry either 10-Mbps, 100-Mbps or 1 Gigabit signals, thus allowing easy
infrastructure upgrades as required.
[Figure: 100 Mbps Ethernet media systems: 100Base-T4 (voice-grade twisted pair), 100Base-TX (data-grade twisted pair), and 100Base-FX (fiber optic)]
In keeping with similar protocols, the initiator and target divide their
communications into messages. The term iSCSI protocol data unit (iSCSI
PDU) describes these messages.
The iSCSI transfer direction is defined with regard to the initiator. Outbound or
outgoing transfers are transfers from initiator to target, while inbound or incoming
transfers are from target to initiator.
iSCSI operations
iSCSI is a connection-oriented command/response protocol. An iSCSI session
begins with an iSCSI initiator connecting to an iSCSI target (typically, using TCP)
and performing an iSCSI login. This login creates a persistent state between
initiator and target, which may include initiator and target authentication, session
security certificates, and session option parameters.
Once this login has been successfully completed, the iSCSI session continues in
full feature phase. The iSCSI initiator may issue SCSI commands encapsulated
by the iSCSI protocol over its TCP connection, which are executed by the iSCSI
target. The iSCSI target must return a status response for each command over
the same TCP connection, consisting of both the completion status of the actual
SCSI target device and its own iSCSI session status.
Data transferred from the iSCSI initiator to iSCSI target can be either unsolicited
or solicited. Unsolicited data may be sent either as part of an iSCSI command
message, or as separate data messages (up to an agreed-upon limit negotiated
between initiator and target at login). Solicited data is sent only in response to a
target-initiated Ready to Transfer message.
Each iSCSI command, Data, and Ready to Transfer message carries a tag,
which is used to associate a SCSI operation with its associated data transfer
messages.
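To give a feel for what such a message looks like on the wire, the sketch below packs a simplified 48-byte iSCSI command header carrying an initiator task tag and a SCSI READ(10) CDB. The field layout is modeled loosely on the iSCSI specification and is illustrative only; it is not a complete or exact rendition of the draft described in this book.

import struct

OPCODE_SCSI_COMMAND = 0x01

def build_scsi_command_pdu(task_tag, lun, cdb, expected_length, cmd_sn):
    flags = 0x80 | 0x40                       # final PDU, read data expected
    return struct.pack(
        "!BB2xB3s8sIIII16s",
        OPCODE_SCSI_COMMAND,                  # opcode: SCSI command
        flags,
        0,                                    # no additional header segments
        (0).to_bytes(3, "big"),               # no immediate data segment
        struct.pack("!Q", lun << 48),         # logical unit number field
        task_tag,                             # ties later Data and Ready to Transfer messages to this command
        expected_length,                      # expected data transfer length
        cmd_sn,                               # command sequence number (ordering)
        0,                                    # expected status sequence number
        cdb.ljust(16, b"\x00"),               # SCSI CDB, padded to 16 bytes
    )

# A READ(10) of 8 blocks starting at logical block 2048 (illustrative values)
read10 = struct.pack("!BBIBHB", 0x28, 0, 2048, 0, 8, 0)
pdu = build_scsi_command_pdu(task_tag=0x1234, lun=0, cdb=read10,
                             expected_length=8 * 512, cmd_sn=1)
print(len(pdu), "byte header ready to send over the TCP connection")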
The iSCSI target layer must deliver the commands to the SCSI target layer in the
specified order.
iSCSI login
The purpose of the iSCSI login is to enable a TCP connection for iSCSI use,
authenticate the parties, negotiate the session's parameters, open a security
association protocol, and mark the connection as belonging to an iSCSI session.
As part of the login process, the initiator and target may wish to authenticate
each other and set a security association protocol for the session. This can occur
in many different ways.
The initiator must present both its initiator WWUI and the target WWUI to which it
wishes to connect during the login phase.
The login phase is implemented via login and text commands and responses
only. The login command is sent from the initiator to the target in order to start the
login phase. The login response is sent from the target to the initiator to conclude
the login phase. Text messages are used to implement negotiation, establish
security, and set operational parameters. The whole login phase is considered as
a single task and has a single Initiator Task Tag (similar to the linked SCSI
commands).
The login phase starts with a login request via a login command from the initiator
to the target. A target may use the Initiator WWUI as part of its access control
mechanism; therefore, the Initiator WWUI must be sent before the target is
required to disclose its LUs.
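The negotiation carried by those text messages takes the form of key=value pairs. The sketch below shows typical parameters an initiator might offer at login; the key names and iqn-style identifiers are examples drawn from later iSCSI practice and may differ in detail from the draft that was current when this book was written.

login_parameters = {
    "InitiatorName": "iqn.2001-07.com.example:host01",       # the initiator's identifier
    "TargetName": "iqn.2001-07.com.example:storage.disk1",   # the target it wants to reach
    "AuthMethod": "CHAP,None",                                # offered authentication methods
    "HeaderDigest": "None",
    "DataDigest": "None",
}

# Each parameter travels as a null-terminated "key=value" string in the
# data segment of the login and text messages.
data_segment = b"".join(
    (key + "=" + value).encode("ascii") + b"\x00"
    for key, value in login_parameters.items()
)
print(data_segment)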
Security considerations
Historically, native storage systems have not had to consider security because
their environments offered minimal security risks. That is, these environments
consisted of storage devices either directly attached to hosts, or connected via a
subnet distinctly separate from the communications network. The use of storage
protocols, such as SCSI, over IP networks requires that security concerns be
addressed. iSCSI implementations must provide means of protection against
active attacks (posing as another identity, message insertion, deletion, and
modification) and may provide means of protection against passive attacks
(eavesdropping, gaining advantage by analyzing the data sent over the line).
No security: This mode neither authenticates the parties nor encrypts data. It
should be used only in environments where the security risk is minimal and
configuration errors are improbable.
Every compliant iSCSI initiator and target must be able to provide initiator-target
authentication and data integrity and authentication. This quality of protection
may be achieved on every connection through properly configured IPSec
involving only administrative (indirect) interaction with iSCSI implementations.
For full details of the latest iSCSI Internet Draft you may wish to refer to the IETF
Web site at:
http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-05.txt
The device driver controls the operation of the attached storage device, and the
transfer of data to and from the device through the HBA. The device driver
software is part of the system operating system; it is described briefly in “Device
and network drivers” on page 92.
Network connections
Storage networks solve, among other things, the distance limitations of the SCSI
bus. The storage I/O bus is replaced by a cable attachment into the network. The
attachment may utilize devices to facilitate ease of implementation, such as hubs
and switches. The physical topologies of these attachments may vary according
to the network size, costs, and performance requirements.
SAN topologies
The following physical topologies for Fibre Channel SAN are supported:
– Loop: A Fibre Channel loop cable is a shared attachment resource.
Arbitration determines which device can send its transmission. Loops are
typically implemented in a star fashion. A hub provides a simple, low cost,
loop topology within its own hardware. Each loop node is connected via
cable to the hub. The bandwidth of the loop is shared by all attached loop
nodes.
– Switched fabric: Switched fabric topologies use centralized, high speed
switches to deliver multiple, dedicated, concurrent data transmission paths
across the network. There is no arbitration required. The bandwidth of the
network automatically scales as paths are added to the topology.
Intelligence in the fabric components, such as switches, can determine if a
path is broken or busy, and can select the best alternative route through
the network to the target node.
– Point-to-point: A point-to-point connection may be made, depending on
the storage device attached. This provides a connection similar to direct
attachment, although it uses HBAs and Fibre Channel protocols.
LAN topologies
In the case of LAN topologies, Ethernet supports bus-like daisy chain
(segment), spanning tree, and switched fabric topologies. These are
described in 1.5.1, “Ethernet” on page 14. For the sake of brevity we will not
repeat the information here.
Other network technologies, such as ARCNET, 1000BaseT, and ATM, are also used.
Application software
Applications which need access to data generate an I/O. The I/O request may
come from an interactive user-driven application, a batch process, a database
operation, or a system management process. The application has no idea about
the physical structure and organization of the storage device where the data is
located.
File systems
A file system (FS) is the physical structure an operating system uses to store and
organize files on a storage device. At the basic input/output system (BIOS) level, a disk
partition contains sectors, each with a number (0,1,2 and so on). Each partition
could be viewed as one large dataset, but this would result in inefficient use of
disk space and would not meet application requirements effectively. To manage
how data is laid out on the disk, an operating system adds a hierarchical
directory structure. Each directory contains files, or further directories, known as
sub-directories. The directory structure and the methods for organizing disk
partitions are together called a file system.
File systems manage storage space for data created and used by the
applications. The primary purpose of an FS is to improve management of data by
allowing different types of information to be organized and managed separately.
The FS is implemented through a set of operating system commands that allow
creation, management, and deletion of files. A set of subroutines allows lower
level access, such as open, read, write, and close to files in the file system. The
FS defines file attributes (read only, system file, archive, and so on), and
allocates names to files according to a naming convention specific to the file
system. The FS also defines maximum size of a file and manages available free
space to create new files.
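As a minimal example of this subroutine-level access, the following Python fragment exercises the open, write, read, and close operations and queries the file attributes that the file system maintains; the file name and contents are arbitrary.

import os

path = "example.txt"

with open(path, "w") as f:          # the file system allocates space and a directory entry
    f.write("sample data\n")

with open(path, "r") as f:          # read back through the same file-level interface
    contents = f.read()

info = os.stat(path)                # the file system returns the file's attributes
print(path, info.st_size, "bytes, mode", oct(info.st_mode))

os.remove(path)                     # the file system frees the space for new files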
Many different file systems have been developed to operate with different
operating systems. They reflect different OS requirements and performance
assumptions. Some file systems work well on small computers; others are
designed to exploit large, powerful servers. An early PC file system is the File
Allocation Table (FAT) FS used by the MS-DOS operating system. Other file
systems include the High Performance FS (HPFS), initially developed for IBM
OS/2, Windows NT File System (NTFS), Journal File System (JFS) developed
for the IBM AIX OS, and General Parallel File System (GPFS), also developed
by IBM for AIX. There are many others.
A disk drive may have partitions with file systems belonging to several different
operating systems. Generally an operating system will ignore those partitions
whose ID represents an unknown file system.
The file system is usually tightly integrated with the OS. However, in storage
networks it may be separated from the OS and distributed to multiple remote
platforms. This is to allow a remote file system (or part of a file system) to be
accessed as if it were part of a local file system. Later we will see how this
happens with Network File System (NFS) and Common Internet File System
(CIFS).
Database systems
A database can access and store data by making I/O requests via a file system.
Alternatively, it can manage its own block I/O operations by reading and writing
directly to “raw partitions” on the disk device. In this case the database allocates
and manipulates the storage for its own table spaces without requesting services
from the file system. This can result in much faster performance.
The roles of these components are described in more detail in 2.7, “Tracing the
I/O path for local storage” on page 98.
Volume manager
The volume manager may be an integral part of the OS, or it may be a separate
software module, such as Veritas Logical Volume Manager developed for Sun
Solaris OS. The volume manager is concerned with disk device operations,
creating and configuring disk drive partitions into logical drives. The File System
uses these logical views to place the data. For instance, the volume manager
can mirror I/O requests to duplicate partitions, to provide redundancy and
improve performance. In this case, it takes a single I/O request from the file
system and creates two I/O requests for two different disk devices. Also, it can
stripe data across multiple drives to achieve higher performance; and it may
implement RAID algorithms to create fault-tolerant arrays of disk volumes.
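The following sketch illustrates the mirroring and striping behaviors just described, using hypothetical disk objects; it is a conceptual illustration only, not the interface of any real volume manager.

BLOCK_SIZE = 512

class Disk:
    # Stand-in for a physical disk partition (hypothetical, for illustration only).
    def __init__(self, name):
        self.name, self.blocks = name, {}
    def write_block(self, number, data):
        self.blocks[number] = data

def mirrored_write(devices, block_number, data):
    # Mirroring: one file-system write becomes one write per underlying device.
    for dev in devices:
        dev.write_block(block_number, data)

def striped_location(devices, logical_block):
    # Striping: consecutive logical blocks rotate across the devices.
    return devices[logical_block % len(devices)], logical_block // len(devices)

disks = [Disk("disk0"), Disk("disk1")]
mirrored_write(disks, 7, b"\x00" * BLOCK_SIZE)        # the same block lands on both disks
device, physical_block = striped_location(disks, 5)   # logical block 5 -> disk1, block 2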
Device driver
For DAS and SCSI block I/O on SAN and iSCSI networks, the device driver
software (or firmware) receives the I/O request from the volume manager
function. It formats the data and generates the appropriate signal for the targeted
storage device. It is the last software in the server to handle the data before it
leaves the hardware, and the first to handle it when it returns from the storage
device.
Network driver
In the case of network-attached devices, I/O must pass through the network
interface card (NIC) attachment to the network. The NIC contains a network
protocol driver in firmware. This describes the operations exchanged over the
underlying network protocol (such as TCP/IP). There are often several
protocol layers implemented here as a series of “device drivers.”
One of the layers is the file protocol driver software, which varies according to
the operating system environment. For instance, with Windows operating
systems the file protocol is CIFS; with UNIX it is NFS. Or it may be File Transfer
Protocol (FTP). These network file system protocol drivers interface to the
TCP/IP stack. CIFS and NFS are described in 2.6, “Network file system
protocols” on page 93.
In NFS environments, the Network Lock Manager (NLM) provides support for
file locking when it is used.
Key features
The NFS provides the following key features:
Improved interoperability with other system platforms, increasing overall
network utilization and user productivity
Easy access to files for the end-user of the NFS client system
Uses industry standard TCP/IP protocols
With NFS, all file operations are synchronous. This means that the file operation
call returns only when the server has completed all work for the operation. In the
case of a write request, the server will physically write the data to disk. If
necessary, it will update any directory structure before returning a response to
the client. This ensures file integrity.
NFS is a stateless service. That means it is not aware of the activities of its
clients. As a result, a server does not need to maintain any extra information
about any of its clients in order to function correctly. In the case of server failure,
clients only have to retry a request until the server responds, without having to
reiterate a mount operation.
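The retry behavior that statelessness makes possible can be pictured with a small sketch; send_nfs_request below is a hypothetical placeholder for the RPC call to the server.

import time

def retried_call(send_nfs_request, request, retry_interval=5.0):
    # Keep retrying an idempotent NFS request until the server answers. Because
    # the server holds no per-client state, the client does not need to repeat a
    # mount after a server restart; it simply resends the same request.
    while True:
        try:
            return send_nfs_request(request)
        except TimeoutError:
            time.sleep(retry_interval)   # server down or unreachable; try again later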
File locking and access control synchronization services are provided by two
cooperating processes: the Network Lock Manager (NLM) and the Network
Status Monitor (NSM). The NLM and NSM are RPC-based servers, which
normally execute as autonomous daemon servers on NFS client and server
systems. They work together to provide file locking and access control capability
over NFS.
CIFS defines a standard remote file system access protocol for use over the
Internet. This enables groups of users to work together and share documents
across the Internet, or within their corporate intranets. CIFS is an open,
cross-platform technology based on the native file-sharing protocols built into
Microsoft Windows and other popular PC operating systems. It is supported on
dozens of other platforms, including UNIX.
With CIFS, millions of computer users can open and share remote files on the
Internet without having to install new software or change the way they work.
With CIFS, existing applications and applications for the World Wide Web can
easily share data over the Internet or intranet, regardless of computer or
operating system platform. CIFS is an enhanced version of Microsoft's open,
cross-platform Server Message Block (SMB) protocol. This is the native
file-sharing protocol in the Microsoft Windows 95, Windows NT, and OS/2
operating systems. It is the standard way that millions of PC users share files
across corporate intranets. CIFS is also widely available on UNIX, VMS™,
Macintosh, and other platforms.
CIFS technology is open, published, and widely available for all computer users.
Microsoft has submitted the CIFS 1.0 protocol specification to the Internet
Engineering Task Force (IETF) as an Internet-Draft document. Microsoft is also
working with interested parties for CIFS to be published as an Informational RFC.
CIFS (SMB) has been an Open Group (formerly X/Open) standard for PC and
UNIX interoperability since 1992 (X/Open CAE Specification C209).
CIFS is not intended to replace HTTP or other standards for the World Wide
Web. CIFS complements HTTP while providing more sophisticated file sharing
and file transfer than older protocols such as FTP. CIFS is designed to enable all
applications, not just Web browsers, to open and share files securely across the
Internet.
CIFS benefits
Following are some benefits of using CIFS:
Integrity and concurrency - CIFS allows multiple clients to access and
update the same file, while preventing conflicts with sophisticated file-sharing
and locking semantics. These mechanisms also permit aggressive caching,
and read-ahead/write-behind, without loss of integrity.
Fault tolerance - CIFS supports fault tolerance in the face of network and
server failures. CIFS clients can automatically restore connections, and
reopen files, that were open prior to interruption.
Although the File System does not deal directly with the physical device, it does
have a map of where data is located on the disk drives. This map is used to
allocate space for the data, and to convert the file I/O request into storage I/O
protocols. The I/O must go to the device in a format which is understandable to
the device; in other words, in some number of “block-level” operations. The File
System therefore creates metadata (data describing the data) for the I/O, and
adds information to the I/O request that defines the location of the data on
the device.
The File System deals with a logical view of the physical disk drives. It maps data
on to logical devices as evenly as possible in an attempt to deliver consistent
performance. It passes the I/O request via a volume manager function, which
processes the request based on the configuration of the disk subsystem it is
managing. Then the volume manager passes the transformed I/O to the device
driver in the operating system.
The device driver reads or writes the data in blocks. It sizes them to the specific
data structure of the storage media on a physical device, such as a SCSI disk
drive. SCSI commands contain block information mapped to specific sectors on
the surface of the physical disk. This block information is used to read and write
data to and from the block table located on the disk device.
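A much simplified sketch of this file-to-block conversion follows; the 512-byte block size and the assumption that the file occupies contiguous blocks are for illustration only.

BLOCK_SIZE = 512

def file_range_to_blocks(first_block_of_file, offset, length):
    # Map a byte range within a file to the disk blocks that hold it.
    first = first_block_of_file + offset // BLOCK_SIZE
    last = first_block_of_file + (offset + length - 1) // BLOCK_SIZE
    return list(range(first, last + 1))

# A 1500-byte read starting 700 bytes into a file stored from block 2048:
print(file_range_to_blocks(2048, 700, 1500))     # [2049, 2050, 2051, 2052]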
However, database applications are generally not oriented to file structures, but
instead are “record” oriented, using a great deal of indexing to database tables.
Different databases may have very specific I/O requirements, depending on the
applications they support. For instance, a data mining database system may
have very long streaming I/Os, whereas a transaction oriented database is likely
to generate many short bursts of small I/Os.
In this case, the database application provides its own mechanism for creating
an I/O request. It reads and writes blocks of data directly to a raw partition, and
provides its own volume management functions. The database assumes control
over a range of blocks (or sectors) on the disk. This range of blocks is called the
“raw partition.” It then directly manages the system software component of the
I/O process itself. In effect the raw partition takes the role of the File System for
the database I/O operations.
The database provides its own complete method of handling the I/O requests.
This includes maintenance of a tailored table, or index, which knows the location
of records on the disk devices. When it recognizes that an I/O operation is
required it uses this table, and directs the record-level I/O through the raw
partition to the device driver, which reads or writes the data in blocks to the disk.
The database application also handles security locking at the record level, to
prevent multiple users updating the same record concurrently. Some other
applications, especially those which stream large amounts of data to and from
disk, also generate “raw I/O”.
Figure: the database raw partition I/O path within the server — database application, raw partition, volume manager, and device driver, sending data in blocks (block I/O, data, storage location) to sector locations on the disk.
On receipt of the I/O request to a file that is located in the remote NAS appliance,
the following occurs:
The I/O redirector performs what is called a “mapped drive” in the Windows
world, or a “remote mount” in UNIX.
The I/O request is directed away from the local I/O path to an alternative path
over the network, which accesses the remote file server.
Since the client system has no awareness of the device characteristics on which
the data is stored on the remote server, all redirected I/Os must be done at the
file (byte range) level. This is termed a “file I/O.”
The client is attached to the LAN by a Network Interface Card (NIC). Since the
NIC uses a network protocol, such as the TCP/IP stack, the I/O operation must
be transferred using a network protocol. Now one of the network file protocols
(such as NFS or CIFS) comes into play as a kind of network device driver. In
effect, the network file protocol lies on top of the lower level communications
protocol stack, such as TCP/IP. It is the TCP/IP protocol that carries the
redirected I/O through the NIC onto the network. On a LAN the media access
control layer used is typically the Ethernet CSMA/CD protocol. (See 2.3.3, “The
CSMA/CD protocol” on page 73.)
The receiving NAS device must keep track of the initiating client’s details so that
the response can be directed back to the correct network address. The route for
the returning I/O follows more or less the reverse path outlined above.
Figure: the redirected file I/O path — the network file protocol (NFS/CIFS) over the TCP/IP stack carries the request through the network interface card to the NAS appliance, where it passes through the appliance's TCP/IP stack, device driver, and host bus adapter onto the storage I/O bus.
The SANergy client software lies in a protocol layer beside the I/O Redirector. It
is first to see the I/O request from the application. It passes the initial file mount
(file I/O) request to the network, via the I/O Redirector and Network File Protocol
(NFS or CIFS). The I/O passes through the TCP/IP stack for encapsulation, and
out through the Network Interface Card.
Now the client can access the file directly via the SAN. SANergy knows all the
required details of the file on the device, and, in effect, “sees” the device itself.
Since “ownership” of the file has temporarily been ceded by the MDC to the
SANergy client, it can proceed with all further I/Os as block I/Os to the disk.
The client application continues to issue file I/Os, as it appears to be working with
a remote file system. The SANergy client code effectively blocks this view, and
intercepts the I/Os. These are redirected via the client’s own file system and
volume manager to the device driver. I/Os are converted to serial SCSI block
I/Os for transmission through the Fibre Channel SAN to the disk device. This is
illustrated in Figure 2-10.
Figure 2-10 SANergy data flow — (1) the client sends a file I/O request to the SANergy MDC server via the network file protocol (NFS/CIFS), (2) the server returns file access locks and disk metadata, and (3) the SANergy client redirects all further I/O through its file system and volume manager over the SAN as block I/O to disk.
The iSCSI client (the initiator) has a special SCSI mini-port driver layer of
software associated with the SCSI device driver. We call it the iSCSI device
driver layer. This is used to interface to TCP/IP, and to encapsulate the SCSI
commands into the TCP/IP stack. TCP/IP accesses the network device driver
firmware of the Network Interface Card (NIC), and transmits the I/O in SCSI
blocks over the network to the iSCSI storage appliance.
On arrival at the NIC of the target iSCSI appliance, the I/O is passed through the
receiving network device driver to the TCP/IP stack in the target. The iSCSI
device driver layer de-encapsulates the I/O from TCP, and passes it to the SCSI
device driver. From there it is handed on to the storage system bus adapter (the
ServeRAID adapter on the 200i), and then to the device.
The return journey is the reverse of the outbound route. Like the network file I/O,
you can see that today there is a software stack processing overhead associated
with an iSCSI I/O request. This has performance implications, but in general they
are less than for a file I/O. See Figure 2-11 on page 107.
Figure 2-11 iSCSI block I/O path — on the initiator, file or raw partition I/O passes through the volume manager and SCSI device driver, is encapsulated into the TCP/IP stack, and leaves through the network interface card onto the IP network; at the target it arrives through the network interface card and TCP/IP stack and is de-encapsulated back into SCSI block I/O.
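The encapsulation and de-encapsulation steps can be sketched as follows. The header layout here is invented for illustration and is not the real iSCSI PDU format; the point is simply that a SCSI CDB, its LUN, and its task tag are wrapped in a header and handed to the TCP/IP stack, then unwrapped at the other end.

import socket
import struct

def send_scsi_over_tcp(conn, lun, cdb, task_tag):
    # Wrap a SCSI CDB in a small header and hand it to the TCP/IP stack.
    header = struct.pack("!IHB", task_tag, lun, len(cdb))   # tag, LUN, CDB length
    conn.sendall(header + cdb)                              # NIC transmits it to the target

def receive_scsi_over_tcp(conn):
    # De-encapsulate on the target side and recover the SCSI command.
    header = conn.recv(7)                                   # partial reads ignored for brevity
    task_tag, lun, cdb_len = struct.unpack("!IHB", header)
    cdb = conn.recv(cdb_len)
    return task_tag, lun, cdb    # handed on to the SCSI device driver and adapter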
The collection of two or more server engines into a single unified cluster makes it
possible to share a computing load without users or administrators needing to
know that more than one server is involved. For example, if any resource in the
server cluster fails, the cluster as a whole can continue to offer service to users
using a resource on one of the other servers in the cluster, regardless of whether
the failed component is a hardware or software resource.
The design of network storage systems, like IBM’s TotalStorage NAS devices,
offers high availability configurations to meet these demands. Today they use
concepts derived from clustered server implementations, such as Microsoft
Cluster Services.
There are three possible levels of clustering availability, which use the common
industry terminology of shared null, shared nothing, and shared everything.
However, if a component which is a single point of failure (SPoF) fails, then the
node has no fault tolerance. It cannot failover to an associated node to provide
continued access to data. This would apply, for instance, in the case of two
single-node Network Attached Storage 300G Model G01s. If the appliance or its
attached disk fails, the other node has no access to the data. This is illustrated in
the top section of Figure 2-13 on page 110.
Figure 2-13 Clustering availability models — shared null (no failover, no load balancing), shared nothing (failover, no load balancing), and shared everything (failover and load balancing), each showing nodes A and B attached to the LAN/WAN.
Shared Nothing implies that storage and other resources are shared only after a
failure. Otherwise the two nodes do not share resources.
When a node in the cluster needs to access data owned by another cluster
member, it must ask the owner. The owner performs the request and passes the
result back to the requesting node. If a node fails, the data it owns is assigned, in
the case of a two node cluster, to the other node in the cluster (or to another
node, in the case of more than two nodes in the cluster).
Shared Nothing is illustrated in the second layer of Figure 2-13. The Shared
Nothing model makes it easier to manage disk devices and standard
applications.
At the time of writing, no NAS appliances on the market are implemented with a
“shared everything” architecture.
Tivoli NetView is a well-established management tool for networks. IBM has
introduced a family of data and SAN resource management tools, namely Tivoli
Storage Manager and Tivoli Storage Network Manager. These cooperate with
device-specific management tools, such as IBM’s StorWatch family of software.
Tivoli NetView provides the scalability and flexibility to manage large scale
mission-critical network environments.
Tivoli NetView SmartSets allow you to group network resources that should be
managed similarly, and apply policies to these groups. As a result, you can
manage a set of resources as though it were a single device. SmartSets let you
dynamically group resources by type, location, vendor, services offered, or other
common characteristics.
With its highly scalable design, the Tivoli NetView Web console allows you to
observe network activity from anywhere. Using the Web console, you can view
events, node status, and SmartSets, as well as perform network diagnostics.
Today's TCP/IP networks are more complex than ever. Tivoli NetView accurately
manages and represents complex topologies and provides accurate status
information. Additionally, networks often comprise a wide variety of devices such
as hubs, routers, bridges, switches, workstations, PCs, laptops, and printers.
With Tivoli NetView, you can decide which of these devices to manage. You can
then focus on your most important devices, as well as the most important
information about those devices.
With Tivoli NetView you can distribute management functions to remote locations
that cannot support full-scale management. This minimizes administrative
overhead, and eliminates the need for dedicated management systems
throughout the network. Local management is enabled to handle most problems,
while staff members in the network operations center monitor critical systems.
Tivoli Decision Support Network Guides provide insight and the ability to perform
thoughtful data analysis. These guides enable you to proactively manage your
network by presenting trend data and quickly answering questions. The following
are three Tivoli Decision Support Guides for Tivoli NetView:
Network Element Status: Provides a detailed view of the overall health and
behavior of your network's individual elements, such as routers, servers, end
systems, SNMP data, and MIB expressions collected from MIB II agents.
Network Event Analysis: Provides an overall view of network and NetView
event flow and event traffic. It analyzes events over time, distinguishing
device class and event severity.
Network Segment Performance: Provides a view of network segment
behavior primarily determined by using RMON characteristics on the network.
By providing a means to gather key network information and identify and solve
problems, Tivoli NetView allows network administrators to centralize the
management of network hardware devices and servers. Tivoli NetView is a
smarter way to isolate, evaluate, and resolve network issues. It is an ideal
solution for identifying and resolving short- and long-term network problems.
Tivoli NetView is not bundled with any of the NAS products.
TSM supports eight different server platforms: Microsoft Windows NT, AIX, Sun
Solaris, HP-UX, VM, OS/390, OS/2, and OS/400. It also protects more than 35 of
the most popular platforms as clients, including Apple, Digital, HP, IBM,
Microsoft, NCR, SCO, Silicon Graphics, Sun Microsystems, and more. TSM
integrates fully with hundreds of storage devices, as well as LAN, WAN, and
emerging SAN infrastructures. It provides online backups of all major groupware,
ERP applications, and database products. The objective is to keep information
available and accessible to anyone, anywhere.
TSM's progressive backup methodology has earned high marks from users. An
initial full backup is routinely supplemented with incremental backups that require
minimal network bandwidth. An intelligent relational database tracks all backups.
It builds, offline, the complete up-to-date picture. TSM keeps track of where files
are located. Incremental backups are performed in the background, so you can
continue to perform business as usual.
For a mobile workforce, TSM features patented byte- and block-level technology
to help you more effectively manage the rising volume of information stored on
laptop computers. Since TSM typically transmits only changed data, backups
occur in a fraction of the time required to back up entire files.
For Storage Area Networks, TSM provides integrated tools which exploit SAN
functionality, such as LAN-free backup to reduce the traffic on your IP network.
Tape libraries can be dynamically shared between multiple TSM servers. All
backups are managed intelligently, so recovery is a single, fast process. And
TSM can be configured to rebuild revenue-generating applications and customer
touchpoints first.
Tivoli provides tools that enable online backups and restores, and manages
database transaction logs. Support is provided for most of today's popular
systems, including Lotus Notes, Lotus Domino, Informix, SAP R/3, Oracle,
Microsoft SQL Server, and Microsoft Exchange Server.
TSM can be integrated with other Tivoli software such as the Tivoli Enterprise
solution. It delivers a complete view of operations and monitors and manages the
entire business process, including: networks, systems, storage information, and
business applications. The Tivoli Storage Manager client comes pre-installed
as part of the IBM TotalStorage NAS products.
The TSNM Server is supported on Windows 2000 Advanced Server Edition. The
managed host platforms are supported on Windows NT, Windows 2000, IBM
AIX, and Sun Solaris.
TSNM integrates with Tivoli NetView. This allows you to monitor and control your
SAN infrastructure and devices from the same interface you use to manage your
LAN and WAN. These customer networks can now be viewed from a single
console.
Tivoli Storage Network Manager allows you to securely allocate the discovered
storage resources to the appropriate host systems. You can easily assign disk
storage resources or Logical Unit Numbers (LUNs) from the SAN storage
subsystems to any computers connected to the SAN. TSNM effectively allows
multiple computers to share the same SAN resources, and the same storage
subsystems, even though they may be using different file systems. TSNM
ensures that the right host is looking at the right source.
Events and data from the SAN are continuously captured, providing information,
alerts, and notification to administrators for problem resolution. SAN-related
events are forwarded to SNMP (Simple Network Management Protocol)
management tools such as Tivoli Event Console (TEC).
A methodology to translate between the logical view and the physical view is
required in order to implement storage virtualization. The question arises “Where
and how should this be done?”
Storage Tank will ultimately deliver the promise of heterogeneous storage
networking. It will provide a universal storage system capable of sharing data
across any storage hardware, platform, or operating system. Storage Tank is a
software management technology that unleashes the flow of information across
a storage area network, providing universal access to storage devices in a
seamless, transparent, and dynamic manner.
The illustration in Figure 2-14 shows that Storage Tank clients communicate with
Storage Tank servers over an enterprise's existing IP network using the Storage
Tank protocol. It also shows that Storage Tank clients, servers, and storage
devices are all connected to a Storage Area Network (SAN) on a high-speed,
Fibre Channel network.
Figure 2-14 Storage Tank architecture — heterogeneous clients (NT, AIX, Linux, and Solaris workstations or servers), shared storage devices holding active data, backups, and migrated data, device-to-device copy for backup and migration, and metadata servers providing authentication, access control, locking, data placement, and file-level outboard services.
An enterprise can use one Storage Tank server, a cluster of Storage Tank
servers, or multiple clusters of Storage Tank servers. Clustered servers provide
load balancing, fail-over processing, and increased scalability. The servers in
a cluster are interconnected on their own high-speed network or on
the same IP network they use to communicate with Storage Tank clients. The
private server storage that contains the metadata managed by Storage Tank
servers can be attached to a private network connected only to the cluster of
servers, or it can be attached to the Storage Tank SAN.
For more details on storage network virtualization, refer to the IBM Redbook
Storage Networking Virtualization - What’s it all about?, SG24-6211-00.
NAS appliances like the IBM TotalStorage Network Attached Storage 200 and
300 are fully integrated and dedicated storage solutions that can be quickly and
easily attached to an IP network. Their storage will then become immediately and
transparently available as a network file serving resource to all clients. These
specialized appliances are also independent of their client platforms and
operating systems, so that they appear to the client application as just another
server.
Note: As of the time of writing, these are the available products IBM has to
offer. The latest information on IBM Storage Networking products is always
available at this website:
http://www.storage.ibm.com/snetwork/index.html
Two models have been developed for use in a variety of workgroup and
departmental environments. They support file serving requirements across NT
and UNIX clients, e-business, and similar applications. In addition, these devices
support Ethernet LAN environments with large or shared end user workspace
storage, remote running of executables, remote user data access, and personal
data migration.
Both models have been designed for installation in a minimum amount of time,
and feature an easy-to-use Web browser interface that simplifies setup and
ongoing system management. Hot-swappable hard disk drives mean that you do
not have to take the system offline to add or replace drives, and redundant
components add to overall system reliability and uptime.
To help ensure quick and easy installation, both NAS models have tightly
integrated preloaded software suites.
The NAS 200 models scale from 108 GB to over 3.52 TB total storage. Their
rapid, non-disruptive deployment capabilities mean you can easily add storage
on demand. Capitalizing on IBM experience with RAID technology, system
design and firmware, together with the Windows Powered operating system (a
derivative of Windows 2000 Advanced Server software) and multi-file system
support, the NAS 200 delivers high throughput to support rapid data delivery.
Dedicated
As a fully-integrated, optimized storage solution, the NAS 200 allows your
general-purpose servers to focus on other applications. Pre-configured and
tuned for storage-specific tasks, this solution is designed to reduce setup time
and improve performance and reliability.
Open
The open-system design enables easy integration into your existing network
and provides a smooth migration path as your storage needs grow.
Scalable
Scalability allows you to increase storage capacity, performance, or both, as
your needs grow. NAS 200 storage capacities ranging from 108 GB to
440.4 GB (Model 201), and from 218 GB to 3.52 TB (Model 226) are
provided, while NAS 300 can be scaled from 360 GB to 6.61 TB (Model 326).
Flexible
Multiple file protocol support (CIFS, NFS, HTTP, FTP, AppleTalk, and Novell
NetWare) means that clients and servers can easily share information from
different platforms.
Reliable
Hot-swappable disk drives, redundant components, and IBM Systems
Management are designed to keep these systems up and running.
Pre-loaded software
The NAS 200 is preloaded with Windows Powered OS and other software
designed specifically to enable network clients to access large amounts of
data storage on the NAS server using multiple file protocols. Pre-loaded
software is described in 3.1.7, “IBM NAS 200 preloaded software” on
page 129.
Figure: the IBM NAS 200 Model 201 — engine with ServeRAID-4Lx RAID adapter and internal drives.
Figure 3-2 shows a picture of the IBM 5194-201 NAS 201 tower model.
Figure: the IBM NAS 200 Model 226 — engine with ServeRAID-4H adapter, appliance options, and up to three EXP 300 storage expansion units.
Features and benefits:
One 1.133 GHz Pentium III processor (Model 5194-201) — powerful processor for optimal performance
Two 1.133 GHz Pentium III processors (Model 5194-226) — increased processing power for more storage-intensive environments
Note: Use of tower-to-rack conversion kit does not transform a Model 201 into
a Model 226. It is simply a means of converting a Model 201 from tower into a
rack configuration.
Network administrators not currently running DHCP servers will find the
advanced appliance configuration utility particularly useful for automatically
configuring network settings for newly added IBM NAS 200 appliances. Even
administrators with networks using DHCP servers can benefit from the advanced
appliance configuration utility, by permanently assigning IP addresses and host
names automatically and launching Web-based management.
The IBM Storage Unit Models EXU and EXX contain two hot-swappable,
redundant power supply/fan assemblies. Potential failure-causing conditions
are reported to the controller via Predictive Failure Analysis (PFA).
Here are the key features of the IBM 5194 NAS Storage Unit Model EXU and
EXX:
Supports data transfer speeds of up to 160 MB per second
Note: A maximum of three IBM 5194 Storage Unit Model EXU and EXX can
be attached to the IBM NAS 200 Model 226.
The NAS 300 appliance provides an affordable but robust solution for the storage
and file serving needs for a large department or a small enterprise. It provides
the same features and benefits as the IBM NAS 200 series products. In addition,
with its second engine, it provides an increase in reliability and availability
through the use of clustering software built into the appliance.
The NAS 300 also provides scalability, fault tolerance, and performance for
demanding and mission-critical applications. The NAS 300 consists of a dual-engine
chassis with failover features. It has dual Fibre Channel hubs and a Fibre
Channel RAID controller. The 300 is preloaded with a task-optimized Windows
Powered Operating System. With its fault-tolerant, dual engine design, the 300
provides a significant performance boost over the 200 series.
The NAS 300 system will scale easily from 364 GB to 6.55 TB, making future
expansion simple and cost-effective. It comes ready to install, and becomes a
part of a productive environment with minimal time and effort.
The preloaded operating system and application code is tuned for the network
storage server function, and designed to provide 24 X 7 uptime. With multi-level
persistent image capability, file and volume recovery is quickly managed to
ensure highest availability and reliability.
Figure: IBM TotalStorage NAS 300 base configuration — Fibre Channel node with engine 1 and engine 2 (each with Ethernet connections), FC hub 1 and FC hub 2, a Fibre Channel RAID controller, and JBOD storage units.
Figure 3-8 represents the IBM TotalStorage NAS 300 maximum configuration. It shows engine 1 and engine 2 with their Ethernet connections, FC hub 1 and FC hub 2, and four storage units.
Specifications (5195-326):
Number and type of processors (std./max): 1/2 Pentium III 1.133 GHz
L2 cache: 512 KB
Features and benefits:
Redundancy — hot-swap HDDs, hot-spare HDDs, and hot-swap redundant power supplies
Two different types of configurations are available for this product: the
single-node G01 and the dual-node G26. The dual node Model G26 also
provides clustering and failover protection for top performance and availability.
The IBM TotalStorage NAS 300G, 5196 models are specialized NAS appliances
acting as a high-bandwidth conduit. They connect LAN-attached clients and
servers to the SAN through high-speed Fibre Channel paths.
Figure: the NAS 300G as a shared-storage conduit — Ethernet-attached clients using file I/O protocols on one side, and shared SAN storage accessed with block I/O protocols over Fibre Channel on the other.
The main characteristics of the IBM TotalStorage NAS 300G are the following:
Easy to use and install
No keyboard, mouse, or display required to configure and maintain
Supports CIFS, NFS, Novell NetWare, FTP, AppleTalk, and HTTP
Persistent image file server backup, a point-in-time backup accessible by
users without administrator intervention
Web-based GUI administration tools
Windows Terminal Services for remote administration and configuration
Uses external storage
Netfinity Director agent
Tivoli Storage Manager client
SANergy
Figure 3-10 IBM NAS 300G, 5196 G01 single node diagram — Node 1 attached to the LAN via Ethernet and to the SAN through a customer-provided Fibre Channel switch.
Figure 3-11 The IBM TotalStorage NAS 300G G01 single node model
The IBM TotalStorage NAS 300G Dual Node Model is made up of 2 individual
rack-mounted Single Node units and includes the following hardware
components in each unit:
2 x 1.13 GHz Pentium III Processors
512KB L2 Cache
1 GB SDRAM
3 x 36.4 GB Hard Drive
Up to 4 Ethernet Adapters (at most, 2 can be gigabit)
Fibre Channel Adapter (QLogic) in each chassis
Figure 3-13 IBM NAS 300G, 5196 G26 dual node diagram — Node 1 and Node 2 attached to the LAN via Ethernet, interconnected by Ethernet, and attached to the SAN.
Nodes: 1 (Model G01), 2 (Model G26)
Number of processors per engine: dual 1.13 GHz Pentium III (both models)
Clustering/failover: no (G01), yes (G26)
Memory (std./max): 1 GB / 4 GB (both models)
Features and benefits:
Dual node configuration (G26 only) — clustered failover support for increased availability and performance
Provides remote LAN users access to SAN storage — access to pooled storage on the SAN without requiring individual Fibre Channel connections
In addition to the operating system and application code, the code load contains
configuration and administration tools which simplify remote configuration and
administrator tasks. Network management agents are included that provide
options by which the NAS Models G01 and G26 can be managed.
The software listed in Table 3-8 is included in the IBM NAS 300G.
Table 3-8 IBM NAS 300G software
Tivoli SANergy
Tivoli SANergy software is pre-installed and ready to license on the TotalStorage
Network Attached Storage 300G. It can provide all of the benefits of a NAS
device with the higher performance and scalability of a SAN.
Any computer connected to the 300G can increase its bandwidth access to SAN
storage. Bandwidth-hungry computers can now receive data from the 300G at up
to 100 MB per second using SANergy. Tivoli SANergy will dynamically route data
to either the LAN or SAN to provide optimum network utilization and
performance.
The use of SANergy will not only increase disk-to-computer bandwidth for
individual computers; it will also greatly reduce CPU utilization on those
computers while accessing SAN storage. It will also reduce data copy and
transfer times between any computer connected this same way and will greatly
reduce traffic over the LAN. For more information on Tivoli SANergy, contact
Tivoli, or refer to this Web site:
http://www.tivoli.com/sanergy/nas
The following sections show various connectivity configurations using the IBM
TotalStorage NAS 300G.
Figure: IBM NAS 300G connecting IP network clients (client a, client b) and an application server over Ethernet to SAN-attached storage (ESS, MSS, or FAStT200 and 500) through Fibre Channel.
Figure 3-15 IBM NAS 300G with IBM 7133 SSA subsystem
The IBM IP Storage 200i is a low cost, easy to use, native IP-based storage
appliance. The 200i is designed for workgroups, departments, general/medium
businesses, and solution providers that have storage area network requirements
across heterogeneous clients. It integrates existing SCSI storage protocols
directly with the IP protocol. This allows the storage and the networking to be
merged in a seamless manner. iSCSI-connected disk volumes are visible to IP
network attached processors, and as such are directly addressable by database
and other performance-oriented applications. The native IP-based 200i allows
data to be stored and accessed wherever the network reaches, LAN, MAN, or
WAN distances.
IBM TotalStorage IP Storage 200i family consists of the 4125 Model 110 and
4125 Model 210 tower systems, and the 4125 Model EXP rack-mounted system.
All required microcode comes preloaded, minimizing the time required to set up
and configure the IP Storage 200i and make it operational. There are only two types of
connections to make: attaching the power cord(s) and establishing the Ethernet
connection(s) to the network. High speed, 133 MHz SDRAM is optimized for
133 MHz processor-to-memory subsystem performance. IBM IP Storage 4125
Model 110 and IP Storage 4125 Model 210 use the ServerWorks ServerSet III LE
(CNB3.OLE) chipset to maximize throughput from processors to memory, and to
the 64-bit and 32-bit Peripheral Component Interconnect (PCI) buses.
After power on, the initial IP address configuration is a straightforward task which
would be completed by the system administrator. The IBM TotalStorage IP
Storage 200i provides a browser-based interface with which the system
administrator can configure the network easily. RAID provides enhanced disk
performance while minimizing storage failure. Adding disks and administering
operations can occur while the system is online, providing excellent operational
availability.
IBM provides iSCSI initiator drivers for Linux, Windows NT, and Windows 2000.
These drivers are available for download from the following website:
http://www.storage.ibm.com
IBM provides a user ID and password to authorized customers and users. The
download package extracts all files, including a README, which explains how to
build the initiator for particular hardware types and Linux versions. The Windows
NT and 2000 install packages run under Install Shield, which will install drivers
and update the registry. Information provided explains how to configure the IP
address of the iSCSI target. Once installed and configured (assuming the system
administrator assigns access to storage for the initiator machine), the iSCSI
initiator driver will open a connection to the iSCSI target on bootstrap and will
treat the assigned storage just like a locally attached disk. This is an important
concept and has implications which are discussed later in this chapter.
The departmental model, IP Storage 200i, 4125 Model 210, is rack mounted and
consists of the following components:
– Dual 1.13 GHz Pentium III Processors
– 1 GB of ECC 133 MHz System Memory
– 512 KB Level 2 cache per processor
– ServeRAID-4H - high function, four-channel RAID adapter
– 3/109 GB of HDD Storage, expandable up to 6/440 GB internal
The IBM TotalStorage IP Storage 200i 4125 Model EXP is a storage expansion
unit that provides additional storage capability for the rack-based 4125. It
provides up to 1.027 TB storage capacity per unit and up to three expansion
units can be attached to a single 4125 Model 210, providing a maximum of
3.52 TB of storage.
Number of processors (std./max): 1/2 1.13 GHz Pentium III (Model 110), 2/2 1.13 GHz Pentium III (Model 210)
Expansion slots: 5 (both models)
The ability to access storage residing on the IBM TotalStorage IP Storage 200i is
coordinated by Access Control logic in the Web-based User Interface (UI). iSCSI
clients use an assigned client ID and password to access assigned LUNs.
The system disk is partitioned for multiple system images for upgrade and
recovery. The system is booted from the primary partition. If the boot fails, the
system is automatically booted from the Recovery CD-ROM, which invokes
failure recovery procedures. Through the service interface, the user can apply
new system images from a local management station.
Network management is supported via SNMP and standard MIBs. SNMP agents
and subagents support internal functions. A specific iSCSI MIB is not supported
in this initial product release.
The RAID levels supported are RAID 0, 1, 1E, 5, and 5E. Disk partitioning and
management, as well as RAID arrays, are supported. Hot-spare disks can be
defined for automatic failed disk replacement (with the exception of RAID 0).
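The parity idea behind RAID 5 (and 5E) can be shown in a few lines: the parity strip is the XOR of the data strips, so any single lost strip can be rebuilt from the survivors. This is a sketch of the general technique, not of the ServeRAID implementation.

from functools import reduce

def parity(strips):
    # The parity strip is the byte-wise XOR of the data strips.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))

def rebuild(surviving_strips, parity_strip):
    # Reconstruct the missing strip from the surviving strips plus the parity.
    return parity(surviving_strips + [parity_strip])

data_strips = [b"AAAA", b"BBBB", b"CCCC"]          # three data strips on three disks
parity_strip = parity(data_strips)                  # parity strip on a fourth disk
assert rebuild([data_strips[0], data_strips[2]], parity_strip) == data_strips[1]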
These servers have the flexibility to handle applications for today and expansion
capacity for future growth.
Note: For more information on the IBM TotalStorage IP Storage 200i refer to
the redbook: Planning and Implementing Solutions using iSCSI, SG24-6291
The Cisco SN 5420 Storage Router provides access to SCSI storage over IP
networks. With the 5420 you can directly access storage anywhere on an IP
network just as easily as you can access storage locally. The SN5420 is shown
in Figure 3-17.
The SN 5420 uses the TCP/IP protocol suite for networking storage supporting
the level of interoperability inherent to IP networks. It leverages existing
management and configuration tools that are already well known and
understood. And it is based on industry standards, which maximizes your
investment by allowing you to leverage existing TCP/IP experience and
equipment.
The Cisco SN 5420 Storage Router is based on both IP and storage area
network (SAN) standards, providing interoperability with existing local area
network (LAN), wide-area network (WAN), optical and Storage Area Network
(SAN) equipment. The Cisco SN 5420 is a high performance router designed to
allow block-level access to storage regardless of your operating system or
location. The SN 5420 accomplishes this by enabling Small Computer Systems
Interface over IP (iSCSI). The SN 5420 connects to both the FC SAN network
and the IP network via Gigabit Ethernet. This allows the Cisco SN 5420 Storage
Router to perform gateway functions between environments and allows IP
routing intelligence to be leveraged with storage networking technologies.
Each server that requires IP access to storage via the Cisco SN 5420 Storage
Router needs to have the Cisco iSCSI driver installed. Cisco and Cisco partners
have developed, or are currently working on, iSCSI drivers that support the
following operating systems:
– Linux
– Sun Solaris
– Windows NT
– Windows 2000 (under development by Cisco)
– AIX (under development by IBM)
– HP UX (under development by HP)
– Netware (under development by Novell)
Using the iSCSI protocol, the iSCSI driver allows a server to transport SCSI
requests and responses over an IP network. From the perspective of a server
operating system, the iSCSI driver appears to be a SCSI or Fibre Channel driver
for a peripheral channel in the server. Figure 3-18 on page 168 shows a sample
storage router network. Servers with iSCSI drivers access the storage routers
through an IP network connected to the Gigabit Ethernet interface of each 5420
storage router. The storage routers access storage devices through a storage
network connected to the Fibre Channel interface of each storage router. For
high availability operation, the storage routers communicate with each other
over two networks: the HA network connected to the HA interface of each storage
router, and the management network connected to the management interface of
each storage router.
Cisco SN 5420 specifications:
AC power: output 70 W, frequency 50 to 60 Hz
Connector: duplex SC; core size (microns), modal bandwidth, maximum length: 62.5, 160, 722 ft. (220 m); 62.5, 200, 902 ft. (275 m); 50.0, 400, 1640 ft. (500 m); 50.0, 500, 1804 ft. (550 m)
Connector: duplex SC; core size (microns), modal bandwidth, maximum length: 62.5, 160, 984 ft. (300 m); 50.0, 400, 1640 ft. (500 m)
Once access is configured in the servers and once the storage mapping is
configured in a storage router, the storage router will forward SCSI requests and
responses between servers and the mapped storage devices.
Interoperability
The Cisco SN 5420 fits seamlessly into existing storage and data networks. The
Cisco SN 5420 uses the well-known TCP/IP protocol suite for network storage,
supporting the level of interoperability inherent to mature IP networking
protocols. The SN 5420 is based on current SAN standards, as well, and is
compatible with existing SAN deployments: point-to-point, switched, or arbitrated
loop.
Manageability
The Cisco SN 5420 Storage Router leverages existing management and
configuration tools that are already well known and understood. The SN 5420
provides full network management support through Simple Network
Management Protocol (SNMP), Web-based GUI, and command line interface
(CLI) access.
Investment protection
Total cost of ownership (TCO) is a growing concern for most system
administrators and management. The Cisco SN 5420 Storage Router helps
reduce the costs by leveraging your existing TCP/IP networking infrastructure
while maintaining your current and near-term investments in storage systems
and Fibre Channel infrastructure. The SN 5420 simplifies the cost of
management, deployment and support issues, given the fact that technical skills
in TCP/IP support are more widely available than SAN experience.
Note: For more information on the Cisco SN 5420, refer to the redbook:
Using iSCSI Solutions’ Planning and Implementation, SG24-6291
For the latest information about Cisco SN 5420, refer to the product page at:
http://www.cisco.com/warp/public/cc/pd/rt/5420/index.shtml
Moreover, these products use a balanced system design so that your system is
running at optimal performance levels for your environment. IBM also introduced
an innovative light-path service panel in conjunction with component-level LEDs
on certain failing components. This makes the identification and replacement of a
failing component extremely easy.
The light-path service panel directs you to the problem area, and the
component-level LEDs tell you which component is the problem. This helps you
minimize downtime and save spare parts for times you might need them.
Figure 4-1 shows a logical representation of the NAS 300 and 300G base drive
configuration.
Figure 4-1 Logical view of NAS 300 and 300G base drive configuration (Array A)
With all these powerful remote management functions, security is essential. The
ASM Processor includes security features such as these:
Password protection
User profiles (up to 12 profiles with the ability to define the level of access
rights)
A time stamp in the event log of last login
Dial-back configuration to protect the server from unauthorized access
In addition, the PCI adapter enables more flexible management through a Web
browser interface. It also allows you to download flash BIOS for the ASM
Processor, as well as for the server, over a LAN, modem or ASM Interconnect.
The adapter also supports the generation and forwarding of unique SNMP traps,
allowing it to be managed by Tivoli NetView or Netfinity Director.
Automated Server Restart and orderly operating system shutdown are supported
by the ASM processor. The ASM processor is hardware and software
independent for all other functions.
The ASM uses a DOS-based configuration utility. This provides additional
configuration functionality for both the ASM Processor and ASM PCI Adapter. In
addition, it also allows you to set up and configure all relevant parameters for the
ASM Processor and ASM PCI Adapter, independent of the operating system and
status of your server. This is done through a bootable DOS diskette.
Figure 4-2 View of interconnected NAS appliances using ASM PCI adapters
Figure 4-3 on page 180 shows the ServeRAID program found in the NAS
products.
If you do not plan to use the IBM Advanced Appliance Configuration Utility, then
you must install and use Windows Terminal Services to configure the appliance
server. Refer to the User’s Reference in the IBM TotalStorage NAS Appliance
product documentation for detailed instructions for installing and using the
configuration programs.
An example of the Terminal Services Client program found in the IBM NAS
products is shown in Figure 4-4.
Once the NAS appliance is detected by the IAACU console, you can use the
IAACU to set up and manage the appliance’s network configuration, including
assigning the IP address, default gateway, network mask, and DNS server to be
used by the appliance. You can also use the Advanced Appliance Configuration
Utility to start Universal Manageability Services on the appliance, enabling you to
perform more advanced systems management tasks.
Networks not currently running DHCP servers will find the IAACU particularly
useful for automatically configuring network settings for newly added appliance
servers. However, networks with DHCP servers will also benefit from using the
IAACU, as it enables the systems administrator to reserve and assign the
appliance IP address in an orderly, automated fashion. Even if the customer
decides to use DHCP and does not choose to reserve an IP address for the
appliance, the IAACU can still be used to discover appliances and to start UM
Services Web-based systems management.
Consider the following information when using the IBM Advanced Appliance
Configuration Utility:
1. The IAACU configures and reports the TCP/IP settings of the first adapter on
each appliance server only. The first adapter is typically the built-in Ethernet adapter.
2. Only one system running the IAACU console is allowed and supported in a
physical subnetwork.
Figure 4-5 shows the IBM Advanced Appliance Configuration Utility Console.
The Advanced Appliance Configuration Utility Console is divided into two panes:
The tree view pane
The information pane
In a Family
If the discovered appliance fits the requirements of a Family, it will automatically
appear as part of a Family. If a discovered appliance fits the requirements of
more than one Family, it is automatically added to the first appropriate Family
that is listed in the tree view, starting from the top of the tree. (For information on
how to move appliances between families, refer to “Using Families and Groups
in the tree view” on page 186.)
The Advanced Appliance Configuration Utility is not the only way to configure
network settings. For example, network settings can be configured using
Terminal Services for Windows, or by attaching a keyboard and mouse to the
appliance and using Windows Control Panel on the server. If the appliance
network settings have been configured by a method other than using the IAACU,
the appliance will be discovered by the Advanced Appliance Configuration Utility
and it will be added to an appropriate Family, if one exists. Appliances that have
been configured using a method other than the IAACU for which no appropriate
family exists will appear in the Orphaned Externally Configured Appliances
group.
If you are not using DHCP, the Advanced Appliance Configuration Utility
automatically assigns one IP address per appliance server, using available
addresses within the range defined in the Family rules. When a Family’s IP
address range has been exhausted, the Advanced Appliance Configuration
Utility automatically searches for other Families that have rules matching the
appliance server being configured. If a matching Family with an available
address is found, the server will automatically be assigned to the Family that has
available IP addresses. This enables you to define multiple Families, each of
which uses its own, non-contiguous IP address range.
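The assignment behavior described above can be sketched as follows; the Family rule function, address ranges, and appliance identifier are hypothetical examples.

import ipaddress

def assign_address(appliance, families, in_use):
    # Return (family name, address) from the first matching Family with a free address.
    for family in families:
        if not family["matches"](appliance):                # the Family's rules
            continue
        first = int(ipaddress.ip_address(family["first"]))
        last = int(ipaddress.ip_address(family["last"]))
        for n in range(first, last + 1):                    # walk this Family's range
            addr = str(ipaddress.ip_address(n))
            if addr not in in_use:
                in_use.add(addr)
                return family["name"], addr
    raise RuntimeError("no matching Family has a free address")

families = [
    {"name": "NAS200", "matches": lambda a: a == "NAS 200",
     "first": "192.168.1.10", "last": "192.168.1.12"},
    {"name": "Overflow", "matches": lambda a: True,
     "first": "192.168.1.20", "last": "192.168.1.29"},
]
used = {"192.168.1.10", "192.168.1.11", "192.168.1.12"}     # first Family is exhausted
print(assign_address("NAS 200", families, used))            # falls through to ('Overflow', '192.168.1.20')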
Orphaned appliances
Any discovered appliance servers that were configured using the Advanced
Appliance Configuration Utility, but that do not meet the rules for any existing
Family, are automatically added to the Orphaned Appliances group.
From here, clicking Administer this server appliance will lead you to the NAS
administration menu, as shown in Figure 4-7.
From this menu, you will be able to perform the following actions:
Network setup: Manage essential network properties
Services: Control essential services
Folders and Shares: Manage local folders, and create or modify file shares
Disks and Volumes: Configure disks, volumes, disk quotas, and persistent
images
Users and Groups: Manage local users and groups
Maintenance: Perform maintenance tasks
Help: View online help
The IBM NAS products use two types of backup implementation: point-in-time
image copies and archival backup.
Point-in-time backup
Point-in-time images provide a near instant virtual copy of an entire storage
volume. These point-in-time copies are referred to as persistent images and are
managed by the Persistent Storage Manager (PSM) software.
These virtual copies are created very quickly and are relatively small in size. As a
result, functions that would otherwise have been too slow, or too costly, are now
made possible. Use of these persistent images may now allow individual users to
restore their own files without any system administrator’s intervention. With the
pre-loaded code, the NAS administrator can set up the Persistent Storage
Manager automatically to schedule an instant virtual copy. This could be done
every night, for example, and users could be given access to their specific virtual
copies. If users accidentally delete or corrupt a file, they can drag-and-drop from
the virtual copy to their storage without any administrator involvement.
Archival backup
Archival backup is used to make full, incremental, or differential backup copies,
which are typically stored to tape. The NAS Persistent Storage Manager can
resolve the well-known “open file” problem of making backup copies in a 24x7
operation.
Read cache
The basic goal of a read cache is to get data into the processor as quickly as
possible. The read cache algorithms attempt to accomplish this goal by having
the most-often-used data kept in fast memory, such as RAM, rather than on disk.
Based on its algorithms, the cache makes a “best guess” of what data will be
needed next. These algorithms generally copy data into the faster read cache by
pre-fetching it from the slower storage, or by keeping a copy of the data from an
earlier write. Read caches are used heavily
because they can provide dramatic performance gains at a modest cost.
Write-back cache
The basic goal of a write-back cache is to “get rid of” data stored in the processor
as quickly as possible. In a write-back cache, the read cache is updated
immediately, but the change to the “real” (not read) cache location might be
slightly delayed as it uses a slower storage technology. During this time period,
the data waits in the write-back cache queue. Performance for the write-back
cache approach is very fast. The write operation completes as soon as the
(faster) read cache is updated, taking it “on faith” that the real location will also be
updated soon.
Write-through cache
In a write-through cache, a write operation is simultaneously updated in the
cache copy and in the “real” location, and a separate write-cache buffer is not
required. This approach is of course the simplest and “safest,” but unfortunately,
it is also slower than a write-back cache. A write-through cache is slower
because the write operation cannot complete until both copies are updated, as
the “real” (not the cache) copy is stored in a much slower technology (slower
RAM, or even a disk). Assuming that there is no battery backup, a write-through
cache approach is “safer” because both copies are always exactly the same,
even if the cache copy gets destroyed (for example, RAM cache during loss of
power), because the real copy (for example, the disk copy) has already been
updated.
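A minimal sketch of the difference between the two write policies follows, using plain Python dictionaries in place of RAM and disk. It only illustrates the ordering of updates; it is not the NAS code.

class Cache:
    # Toy cache in front of a slow backing store (a dict standing in for disk).
    def __init__(self, write_back=False):
        self.fast = {}        # the cache copy (RAM)
        self.slow = {}        # the "real" copy (disk)
        self.dirty = set()    # blocks written to fast but not yet to slow
        self.write_back = write_back

    def write(self, block, data):
        self.fast[block] = data
        if self.write_back:
            self.dirty.add(block)       # completes now; disk is updated later
        else:
            self.slow[block] = data     # write-through: disk updated before completion

    def flush(self):
        # Write-back only: push the delayed updates out to the slow copy.
        for block in self.dirty:
            self.slow[block] = self.fast[block]
        self.dirty.clear()

wb = Cache(write_back=True)
wb.write(0, b"data")
assert 0 not in wb.slow     # lost if power fails here and the cache has no battery
wb.flush()
assert wb.slow[0] == b"data"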
Microprocessor cache
Most current microprocessors, such as the Intel Pentium processors, use
multiple read and write cache mechanisms to improve performance. There are
instruction caches, data caches, and even register caches. Both write-through
and write-back schemes are often used. The most-widely known microprocessor
cache is the level 2 cache (L2). In earlier processors, such as the 486 processor,
the L2 cache was implemented with separate memory chips on the personal
computer motherboard. With today’s technology, this cache is included within the
microprocessor chip, module, or microprocessor carrier card.
A customer can purchase varying amounts of main-memory RAM for the engines
on IBM Network Attached Storage products. Some of this RAM is used for the
Windows Powered OS, but the vast majority is used as a large read cache for
the user data stored on the disk. When a NAS user requests a file, the NAS
engine can often satisfy the request from this read cache rather than reading it
from disk again.
If a NAS user writes data, this update must be made to both the disk copy and
the RAM copy (if any). In the IBM Network Attached Storage products, the main
RAM has error correction code (ECC) technology to protect against loss of data
due to a partial memory failure. However, this memory is not powered by
batteries, so all data in RAM is lost upon power down. Likewise, if there is a
power failure on the box, or should the operating system abend, the main RAM
contents will be lost. Furthermore, if write-back mode was being used when this
problem occurred, the data in the main RAM would never get written back to the
disk. To avoid this potential problem, the Windows Powered OS caches are
configured for write-through mode.
The ServeRAID-4LX adapter used in the IBM NAS 200 Model 201 workgroup
machine has 16 MB of internal RAM memory, most of which is for a disk-read
cache. This adapter does not have a battery-backed write cache. If this RAID
adapter is used in write-back mode, a failure at the wrong moment will result in
permanent lost data, even if the data is written to a redundant RAID configuration
(such as RAID 1 or RAID 5). For some operations this might be acceptable, but
for most cases, it would not. Therefore, this adapter should generally be run in
write-through mode, so that data integrity is not dependent on the cache
contents.
The ServeRAID-4H adapter used in the IBM NAS 200 Model 226 departmental
machine has 128 MB ECC battery-backed cache, 32 MB onboard processor
memory, and 1 MB L2 cache for complex RAID algorithms, which allows this
RAID controller to be safely configured for write-back operations. Should there be
a power failure or an abend in the NAS product before the write-to-disk
completes, the data to be written will still be contained in the battery-backed
RAM. When power is restored and the NAS product is rebooted, the RAID card
will be triggered to flush out all remaining information in the battery-backed RAM.
This data will be written to the disk, and all remaining write operations will be
completed automatically. For the best performance, this card can be safely run in
write-back mode.
In the IBM NAS 300, the RAID subsystem is not contained in the engine
enclosure itself but instead is contained in the first storage unit enclosure. Within
this storage unit enclosure are dual RAID controllers and dual power supplies to
provide a completely redundant solution with no single point of failure. Large
system configurations have a second identical RAID subsystem, which also has
dual RAID controllers. Each of the dual RAID controllers has 128 MB of internal
battery-backed ECC write RAM. The RAID subsystem can be safely
configured for write-back operations. Should power fail (or the NAS product
abend) before the write-to-disk completes, the data to be written is still contained
in the battery-backed RAM. When power is restored and the NAS device is
rebooted, the RAID will realize that there is information still in the battery-backed
RAM that must be written to the disk, and this write operation will complete
automatically. Additionally, as write-back data is stored in both of the dual RAID
controllers, this write-back will occur even if one of these RAID controllers fails.
For the best performance, this adapter can be run safely in write-back mode, as
all writes to the disks will eventually get written to the disk array.
The IBM NAS 300G does not have an integrated RAID controller, but instead
uses a disk subsystem that is SAN-attached. Based on the properties of that
SAN disk subsystem, the SAN administrator may choose to run that RAID
adapter in write-through or write-back mode, after considering performance and
potential data integrity tradeoffs, if any.
                                  Archival backup                      Point-in-time (persistent) image
Number of copies limitation       The number of tape cartridges or     Total disk storage space available
                                  availability of disk capacity to     (250 maximum images per volume)
                                  hold backup images
Used for                          NAS Operating System or NAS user     Mainly for NAS user (client) data
                                  (client) data backup                 backup
Stores volumes as an entity       No, but volumes are simply a         No, but volumes are simply a
                                  collection of files (dependent on    collection of files
                                  additional backup software)
Useful for disaster recovery      Yes (if written to tape)             No, as data is always stored on
where entire disk system is                                            disk within the same NAS
destroyed (for example, fire)                                          appliance. This approach is not
                                                                       useful if disk is destroyed
Useful for recovery where data    Yes, administrator recovery only     Yes, administrator or user
is accidentally erased or                                              recovery
modified
On the IBM NAS products, all of the following terms refer to the same
functionality:
Persistent image
True Image on Columbia Data Products
Point-in-time image
Instant virtual copy
Snapshot on NetApp or StorageTek
Usually, after a backup is made, the users will continue to update those files on
the disk. These backups will “turn stale” with time (that is, they will be outdated
after a while). However, it is very important that the data on the backup stays
exactly as it was when the backup was made.
Unfortunately, making a backup copy while the data is still changing is rather
difficult. Commonly encountered problems include:
While data is changing, multiple sectors are being written to disk
Write-back caches might not have completed writing to disk
An application that is changing two or more files “at the same time” will not
truly update both at the exact same instant
Therefore, for a good backup, these changes must not occur while the backup is
being made, so that all data written is consistent in all changed files.
Historically, this problem has been solved by disabling all users while the backup
occurs. However, this may take several hours. In today’s 24x7 environment,
having such a large backup window is simply not acceptable. In these NAS
systems, this problem is solved by making a very quick “instant virtual copy” of a
volume, a True Image copy.
Figure 5-1, Figure 5-2, and Figure 5-3 show the copy-on-write, normal read, and
reading of data from a persistent image, during the execution of the PSM.
Figure 5-1 Persistent Storage Manager — copy-on-write operation
Figure 5-2 Persistent Storage Manager — normal read (not from a persistent image)
Figure 5-3 Persistent Storage Manager — reading data from persistent image
In these examples, we assume that the disk originally contained only the
following phrase:
“Now is the time for all good men to come to the aid of their country.”
Table 5-2 shows the layout of how the disk would appear immediately after the
True Image copy is made. Note that nothing has really changed (while pointers
and control blocks have changed, for simplicity those details are not shown here).
Table 5-3 shows the layout of the PSM cache after “instant virtual copy” is made.
Notice that it contains empty cells.
Table 5-3 Layout of PSM cache after “instant virtual copy” is made
Table 5-4 shows the layout of how the disk would appear immediately after the
original file was erased. Note that a copy of the original file system (metadata,
and so on) is all that is saved.
Table 5-5 shows the layout of the PSM cache immediately after file is deleted.
Notice that the PSM cache contains a copy of the original file system data.
Table 5-6 shows the layout of how the disk would appear if the word “time” was
changed to “date”. For this example to be truly correct, we would further assume
the application program only wrote back the changed sectors (as explained later,
this is not typical). Table 5-6 illustrates how the sectors might appear.
Table 5-7 shows the layout in which the PSM cache would contain the original
sector contents for the word “time” and the file system’s metadata.
Table 5-8 shows the layout of how the disk would appear if the change requires
more space (for example, changing “men” to “women”). Since more space is
required, the data following the word “women” would also change. The original
contents of all changed sectors would have to be saved in the PSM cache. Note
that this example is not cumulative with examples B or C.
Table 5-9 shows the layout in which the PSM cache would contain all the
changed sectors, starting with the sector containing “men” and including the data
that slid to the right, together with the original file system’s metadata.
Individual sectors on a disk always have some ones and zeros stored in every
byte. Sectors are either “allocated” (in use) or “free space” (not in use or empty,
and the specific data bit pattern is considered as garbage). The disk file system
keeps track of which data is in what sector, and also which sectors are free
space.
Table 5-10 shows the layout of how the disk would appear following a “save”
operation after changing the word “time” to “date.” This assumes no free space
detection and no “update in place.” Note again that this example is not
cumulative with examples A through D.
Table 5-10 Layout of disk after changes without free space detection
After this “save” is complete, the new, saved information is written into free space
sectors #0015-#0028, and the original location sectors then turn into free space,
as indicated by #0001-#0014 in the preceding example.
Since the PSM cache works at the sector level and since this version of PSM
code is unaware of free space, PSM would copy the previous free-space sectors
to its cache as shown in Table 5-11.
Table 5-11 Layout of PSM cache after changes without free space detection
For the NAS code that shipped on 28 April 2001, PSM is enhanced and can
detect free space in the file system. Therefore, if data is written to the disk’s
free-space sectors, those free space sectors will not be copied to the PSM
cache.
Table 5-12 shows the layout of the disk in the event of a “save” operation after
changing the word “time” to “date,” with free space detection but not “update in
place.” Again, this example is not cumulative with previous examples.
Table 5-12 Layout of disk after changes with free space detection
Table 5-13 shows the layout of the PSM cache after saving the “time” to “date”
change. Here, since the PSM cache is aware that the new phrase is being stored
in free space, it does not copy the original free space contents into the cache,
and instead only updates the file system information containing pointers to the
data, and so on.
Table 5-13 Layout of PSM cache after changes with free space detection
Finally, note that in this situation, as the recycle bin is active on the NAS, these
save operations tend to “walk through disk storage” and write in free-space
sectors. Therefore, with free space detection (28 April 2001 code) the recycle bin
should be set to a higher number to minimize cache writes and minimize cache
size. For the 9 March 2001 code, the recycle bin should be set to a low number or
turned off, to minimize cache size.
Eventually, a save operation will need to use sectors that were not free space
when the original persistent image was made. Then the original contents are
copied into the PSM cache.
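The sector-level copy-on-write behavior that Tables 5-2 through 5-13 walk through can be sketched as follows. The class, the byte-per-sector granularity, and the sample phrase are illustrative only; free-space detection is omitted, and this is not the PSM implementation.

class PersistentImage:
    # Sketch of sector-level copy-on-write for one persistent image.
    def __init__(self, disk):
        self.disk = disk      # the live volume, as a list of sector values
        self.cache = {}       # sector number -> original contents at image time

    def write(self, sector, data):
        if sector not in self.cache:          # save the original exactly once
            self.cache[sector] = self.disk[sector]
        self.disk[sector] = data              # then let the live write proceed

    def read_image(self, sector):
        # Changed sectors come from the cache; unchanged ones from the disk.
        return self.cache.get(sector, self.disk[sector])

disk = list(b"Now is the time for all good men ...")
image = PersistentImage(disk)
for offset, ch in zip(range(11, 15), b"date"):      # change "time" to "date"
    image.write(offset, ch)
print(bytes(disk[11:15]))                                  # b'date' -- live volume
print(bytes(image.read_image(s) for s in range(11, 15)))   # b'time' -- persistent image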
Once a persistent image is created, the PSM cache must keep a copy of any and
all changes to the original file. Therefore, the cache for a specific True Image
copy could eventually grow to be as big as the original volume. The maximum
cache storage size is configurable by the administrator. If insufficient storage is
allocated, then not all the changes can be stored. The PSM cache would then be
made invalid as it would have some good and some missing information. For this
reason, if the PSM cache size is exceeded, the cache will be deleted, starting
with the oldest cache first. It is highly recommended that the NAS administrator
configure a warning threshold that will signal if the cache exceeds the warning
level. The administrator should choose the cache size wisely, as changing the
maximum size might require a NAS system to be rebooted.
PSM caches can neither be backed up nor restored from tape. Therefore, the
tape-archive backup program should not be configured to back up the PSM
caches.
The following examples will illustrate how files might appear. The name of the
special PSM folder is administrator-customizable, but in the following example,
the NAS administrator chose the name PSMCOPY.
First, let’s see how the directory looks without any persistent images. Say a user
has a D:\drive located as a “share” on network-attached storage, and that this
drive appears as follows:
D:\
   MY DOCUMENTS folder
      January Sales.doc
      February Sales.doc
      Sales Plan.doc
      Orders.123
   PROGRAM FILES folder
      Lotus Applications folder
         Notes
         123
         Freelance Graphics
   MULTIMEDIA FILES folder
   TEMP folder
   ZZZZ folder
The following example shows how the True Image copies within the PSMCOPY
folder would appear to the user. The PSMCOPY folder has been opened, and
persistent images were created at 10:00 a.m. on Monday, Tuesday, and
Wednesday.
D:\
   MY DOCUMENTS folder
   PROGRAM FILES folder
   MULTIMEDIA FILES folder
   PSMCOPY folder
      Mon_Mar_05_2001_10.00.00 folder
      Tue_Mar_06_2001_10.00.00 folder
         MY DOCUMENTS folder
            January Sales.doc
            February Sales.doc
            Sales Plan.doc
            Orders.123
         PROGRAM FILES folder
            Lotus Applications folder
               Notes
               123
               Freelance Graphics
         MULTIMEDIA FILES folder
         TEMP folder
         ZZZZ folder
      Wed_Mar_07_2001_10.00.00 folder
   TEMP folder
   ZZZZ folder
Stores volumes as an entity   No, but volumes are   Can only back up     Can only back up     No, but volumes are
                              simply a collection   volumes, not files   volumes, not files   simply a collection
                              of files                                                        of files
Space usage                   Changes only          Target volume size   Changes only         Changes only
                                                    = source volume
                                                    size
Note that these NAS products do not support making an archival copy of the
PSM cache itself. Therefore, when using the following recovery approaches, all
PSM True Image copies and PSM caches should be deleted.
Figure 5-4 shows the PSM schedule menu. This menu contains a list of
schedules for the PSM images to be captured.
First, most backup programs allow the administrator to select all files or a specific
subset of the files to be backed up. For these selected files, a full backup,
differential backup, or incremental backup can generally be requested. The
distinctions between the three types of backup are as follows:
When a full backup is taken, all selected files are backed up without any
exception.
When a differential backup is taken, all files changed since the previous full
backup are now backed up. Thus, no matter how many differential backups
are made, only one differential backup plus the original full backup are
needed for any restore operation. However, the administrator should
understand the particular backup software thoroughly because some backup
software will back up changed files—but not new files—during a differential
backup. When restoring from a differential backup, both the full backup and
the latest differential backup must be used.
An incremental backup is similar to a differential backup. When an
incremental backup is taken, all files changed since that previous incremental
backup are now backed up. When restoring from an incremental backup, the
full backup will be needed as well as all of the incremental backups.
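The selection logic behind the three backup types can be sketched as follows. The archive-bit handling mirrors the usual Windows convention (full and incremental backups clear the bit, a differential backup leaves it set); the data structure and function are illustrative only and are not taken from NTBackup or TSM.

def select_files(files, backup_type):
    # files: mapping of path -> {'archive_bit': bool}
    # Returns the paths to copy; clears the archive bit when the convention says to.
    if backup_type == "full":
        selected = list(files)                                          # everything
        clear_bit = True
    elif backup_type == "incremental":
        selected = [p for p, f in files.items() if f["archive_bit"]]    # changed since last full/incremental
        clear_bit = True
    elif backup_type == "differential":
        selected = [p for p, f in files.items() if f["archive_bit"]]    # changed since last full
        clear_bit = False        # bit stays set, so the next differential sees it too
    else:
        raise ValueError(backup_type)

    if clear_bit:
        for p in selected:
            files[p]["archive_bit"] = False
    return selected

files = {
    "Sales Plan.doc": {"archive_bit": True},
    "Orders.123": {"archive_bit": False},
}
print(select_files(files, "differential"))   # ['Sales Plan.doc'], bit left set
print(select_files(files, "incremental"))    # ['Sales Plan.doc'] again, bit now cleared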
The NAS administrator can decide to perform a backup using all of the files from
a specific True Image copy, or only some files from it. However, while the
administrator can take incremental or differential backups of the drive
represented by a virtual image, the administrator cannot back up the PSM
persistent image cache files themselves. Therefore, should you have a situation
where you have to restore user data from tape, the persistent images will be lost.
The following example illustrates how True Image copies can be used in the
backup and restoration process:
On Monday, a True Image copy is taken of drive E:\. A full tape backup of that
image is made. After this backup is completed, the True Image can be kept or
deleted. If the copy is kept, it can be subsequently used to restore Monday’s
files.
However, if the NAS system is later destroyed (for example, by an earthquake)
and the data is restored from that tape, the administrator cannot restore the
specific PSM cache files that otherwise would have been available if the disaster
had never occurred.
To assist the NAS administrator in making backups using either TSM or ISV
software with PSM persistent image technology, IBM has provided the IBMSNAP
utility. Using this utility requires knowledge of Windows batch files and a
command line backup utility. IBMSNAP.EXE is a command line utility that creates
a PSM persistent image virtual drive, launches backup batch files, and then sets
the archive bits accordingly on the drive being backed up. It can be used in
conjunction with any third-party backup utilities as long as the utility supports
command line backups. The IBMSNAP.EXE utility can be found in the
c:\nas\ibm\nasbackup directory of the NAS operating system. See the online
NAS help for further details.
NT backup
The IBM Network Attached Storage products are pre-loaded with Windows
NTBackup and the NAS Backup Assistant. This approach can be used to back
up operating system data or user data, either to disk or tape. The pre-loaded
Persistent Storage Manager function is the recommended method of resolving
the “open file” problem.
There are two ways to back up the files in the NAS appliance when you use the
NT backup method. You can either access it through the NAS administration
console or the Windows Terminal Services. The NAS administration console is
accessed via the Maintenance -> System Backup and Restore -> Backup
option. For this approach you should first create a Persistent Image before the
NT Backup is started. Use this method if you want to back up a selected folder
from one of the persistent images, or the system partition.
The other method is to use the NAS Backup Assistant tool. The NAS Backup
Assistant automatically creates a Persistent Image and starts the NT Backup
program. Use this method to back up the data in a volume or file level basis.
These are the steps to be executed:
1. Use Windows Terminal Services from any NAS client to access the NAS
appliance.
2. Select Start -> IBM NAS Admin.msc -> Backup and Restore.
3. This leads you to the IBM NAS Admin display.
4. Select Backup and Restore -> IBM NAS Backup Assistant from the left
pane.
5. In the right pane, the following options appear:
– Backup Operations: Select drive, schedules, backup types, backup
methods, destination type, file path or tape name.
– Schedule Jobs: List jobs scheduled for backups. You can also delete jobs
that have been scheduled but not yet executed.
– Backup Logs: Shows logs of all backups. You can view or delete logs here.
– Display Logs: Allows you to display the logs.
Note: You must ensure that a check mark appears on the directory or
individual files during the selection process. Otherwise, nothing will be backed
up or restored.
As in the NTBackup method, you will have to ensure that the persistent images
are created before activating this backup function. Automated scheduling to back
up these PSM images can then be configured in the TSM server.
The TSM client uses an option file to store its configuration. Once the setup is
completed, it creates an option file on the IBM NAS appliance in the following
directory and file name: C:\Program Files\Tivoli\TSM\baclient\dsm.opt
NODENAME IBM_NAS_TSM_CLIENT
PASSWORDACCESS GENERATE
DOMAIN "(\\ibm-23ttn07\share_e)"
DOMAIN "(\\ibm-23ttn07\share_g)"
DOMAIN ALL-LOCAL
TCPSERVERADDRESS 192.1.1.5
Figure 5-5 Sample output of TSM client’s dsm.opt file in the NAS Appliance
For the backup to work, the TSM Server must have its client’s nodename
registered in its configuration files. In this case, it will be the NAS Appliance’s
nodename.
To back up the files from the TSM Client, follow these steps:
1. Use Windows Terminal Services from any NAS client to access the NAS
appliance.
To restore, just follow the preceding steps, but select Restore in step 4 instead of
Backup.
Note: You must ensure that a check mark appears on the directory or
individual files during the selection process. Otherwise, nothing will be backed
up or restored.
However, a limited number of add-on applications have been tested with these
NAS products, and customers may add those specific software applications to
the system. Should a customer have problems with non-IBM software that they
have added to this appliance, the customer should contact the vendor directly, as
IBM does not provide on-site or remote telephone support for those non-IBM
products.
IBM will continue to support hardware and software that is shipped with the NAS
appliance. However, in certain circumstances, any non-IBM software may have to
be uninstalled for IBM service to provide problem determination on the IBM
hardware and software.
IBM has tested, and will continue to test, a variety of vendor software products.
Customers can go to the IBM Support Web site at
http://www.ibm.com/storage/nas to see the status and additional details of this
testing.
Note: For more information, read the IBM white paper by Jay Knott entitled
"NAS Cache Systems, Persistent Storage Manager and Backup" available at:
http://www.storage.ibm.com/snetwork/nas/whitepaper_nas_cache_systems.html
We want to emphasize that these are generalized examples, and our objective is
to bring together some of the ways in which you can benefit from IBM’s new
products. Inevitably we will not cover every possible use. Customers are often
very inventive, and think of new things they can do, which further enhance the
portfolio of solutions! However, the examples we include here are typical of the
way we believe users will begin to exploit the functions and capacity offered by
NAS and iSCSI storage.
In this scenario, multiple Windows NT file servers on the LAN are consolidated
onto a NAS 200 (up to 1.74 TB) or a NAS 300 (up to 3.24 TB) with integrated disk.
Needs: to simplify data management of file servers, and to simplify adding
storage to file servers. Benefits: consolidates file-server management, eases
storage management, and simplifies adding additional storage.
Figure 6-1 Implementation of storage consolidation with the NAS 200 and 300
A similar consolidation scenario uses the NAS 300G in front of a Fibre Channel
SAN. Needs: simplify data and storage resource management of file servers, and
simplify adding storage to file servers. Benefits: consolidates file-server
management, eases storage management, and simplifies adding additional storage.
The attractions of the NAS appliances are the ease of management, along with
the availability of advanced functions, such as RAID, instantaneous copy of files
for easier backup processes, using Persistent Storage Manager, and so on.
Figure 6-3 shows how you can still make use of your 7133 with 300G.
Scenario: 7133 database management. In the current environment, pSeries
(RS/6000) servers access 7133 disk through SSA adapters. In the solution, a NAS
300G on the LAN connects through a Fibre Channel SAN and SLIC adapters to the
existing 7133 disk. Needs: to simplify data management of file servers, and to
simplify adding storage to file servers. Benefits: consolidates data management
and helps protect your 7133 investment.
Other benefits also accrue. Storage scalability is enhanced and growth can take
place with minimum disruption. Storage space can be re-allocated as required,
based on changing user needs. Backup processes can be automated using
Persistent Storage Manager (see Chapter 5, “Backup for IBM Network Attached
Storage” on page 191).
File server consolidation with NAS 300G: in the current environment, clients reach
several Windows NT file servers over the LAN; in the proposed solution, a NAS
300G provides file sharing from a Fibre Channel SAN to the same clients. Needs:
simplify management of file servers, and simplify adding storage to file servers.
Benefits: reduces the number of file servers, provides heterogeneous file sharing
on the SAN, simplifies storage management, and simplifies adding additional
storage.
In this configuration, two NAS 300G appliances (one acting as the MDC) attach to
the LAN and to Fibre Channel SANs. Needs: provide end users with access to SAN
storage, provide heterogeneous file sharing, and reduce LAN traffic. Benefits:
provides heterogeneous file sharing, reduces traffic over the LAN, and gives file
access at Fibre Channel speed.
In the following sections, we illustrate two possible configurations.
In the first configuration, the disk volumes (Vol 1 to Vol 4) are backed up directly
over the SAN.
Note that in this case, no data is transferred through the LAN, not even metadata.
That is because there is no backup/restore action on your application server.
In the second configuration, the TSM server attaches to the LAN, and a machine
running the MDC, the TSM client, and the SANergy client attaches to both the LAN
and the SAN, with the disk volumes (Vol 1 to Vol 4) on the SAN.
When the Tivoli Storage Manager client begins to back up the data, it will need to
get the metadata from the MDC machine. For this purpose, the TCP/IP transport
over the LAN will be used. But the raw data still will be transmitted through the
SAN.
become unpredictable, and lead to inconsistent performance and response
times. In the world of e-business this is an unacceptable situation, because Web
users are potentially your customers. Poor service levels will drive them into the
arms of your competitors.
Each server has its own storage, but the data related to Web pages is exactly the
same. It is costly to continue to grow in this manner, multiplying the number of
data copies with the addition of each new Web server. The ideal solution is to
have consolidated storage which all Web servers can access concurrently. One
possible solution is to move to, or increase investment in, a Fibre Channel SAN.
However, the cost of building this new, high-speed storage infrastructure may be
high, especially for low-cost NT servers (which typically are the Web servers).
Also, the time required to implement a SAN solution is long. An alternative that is
much lower in cost, and easier to implement rapidly, is to install a NAS appliance
to handle Web serving.
In Figure 6-9 we show how a NAS 200 or 300 would provide an excellent
Web-serving, consolidated storage solution, at low cost, and with minimum time
to install. New investment in servers is minimized, and Web services can easily
be isolated from other mission-critical applications.
Web hosting: in the current environment, internal users, business clients, and Web
surfers and shoppers all reach the database, transaction, and mission-critical
servers over LANs and WANs, with storage on a Fibre Channel SAN. In the
solution, a NAS 200 is added as a dedicated Web server alongside those servers.
Needs: increase storage due to business growth, provide high-speed Web streaming
to clients, share storage among multiple Web servers, keep costs low, and reduce
CPU load on mission-critical servers. Benefits: a high-performance dedicated Web
server, minimized investment in additional servers, storage pooling,
heterogeneous Web file serving, use of the existing infrastructure, tools, and
processes, and isolation of Web clients.
Internet Data Centers (IDCs) also want plenty of cheap storage to offer their
clients. An IDC offers a physical location for storage, to support anything the
customer wants to put on the box. Corporate IT centers use IDCs, as do SSPs
and Web-hosting ISPs. The benefit of an IDC is that it is located adjacent to a
fiber optic line, so it eliminates the customer's need to run fiber optic cable to
their own premises, saving them thousands of dollars per month. Most IDC customers do
not require high availability.
Integrated NAS solutions are also used frequently for video streaming storage
service on the Web, and as a vehicle for providing a place to do backups for an
office workgroup or department.
Video streaming frequently runs with CIFS protocol, for which NAS 200 and 300
are well suited. Video streaming is also typically not an application where failover
is required. NAS 200 and 300 will also allow several users to view the file
simultaneously, whereas the IP 200i is good for a direct feed to a single client.
6.6.1 Database solutions
The example illustrated in Figure 6-10 shows the use of the IP Storage 200i to
enable a small- to medium-sized data center to exploit their existing IP network to
support a number of database or low volume transaction-oriented applications.
The 200i is an ideal, flexible solution for an organization that needs to keep
implementation simple and low cost, and to avoid the need to develop new skills,
as would be necessary with a Fibre Channel SAN.
In the current environment, high-performance database and transaction servers
attach directly to the data center IP infrastructure. In the solution, an IP Storage
200i provides a block I/O environment with pooled storage over that same IP
infrastructure. Needs: additional storage for database and transaction servers,
limited IT skills, pooled storage for availability, flexibility, and scalability, and a
low-to-moderate transaction volume. Benefits: pooled and centralized storage,
non-disruptive growth, centralized storage management, and use of existing
network and IP skills.
A related scenario adds pooled storage behind the LANs and WANs alongside the
existing NAS. Needs: add database applications to Web serving, share storage
among database servers, and reduce SAN implementation costs. Benefits: pooled
storage for the database applications, complements and coexists with the NAS
solution, uses the existing IP infrastructure, tools, and processes, and isolates
Web clients from the database applications.
6.7 Positioning storage networking solutions
Table 6-1 provides a brief summary of all the Storage Networking solutions we
have described thus far.
Table 6-1 Summary of storage networking solutions
SAN                        NAS                        iSCSI                    SANergy
Better with block I/O      Better with file I/O       Block I/O                File I/O plus block I/O
(database) applications    applications               IP based                 NAS file sharing with
FC storage sharing         IP based                   Storage sharing          SAN performance
                           File sharing
                           Slower database
                           performance than SAN
                           or iSCSI
Today, under normal circumstances, Assembler is no longer used to write
application programs since the cost of writing in Assembler is much higher
than the cost of the “wasted” storage and CPU power incurred with PL/1. This
is because the cost of hardware has fallen substantially over time. A similar
approach can be expected in the storage arena. It is likely that application
developers will leave the lower layer functionality to the operating systems,
especially as new storage technologies emerge.
2. All file I/Os ultimately result in block I/O commands at the lower layers. In other words,
iSCSI devices, like other storage systems which support storage protocols,
also support file I/O applications. In this case, it should be noted that the
“visibility” of the files is lost. The iSCSI device, like DAS and SAN attached
storage, knows nothing about the “files,” but only about “raw I/O” or blocks. It
is for this reason that NAS devices should be considered only for file I/O
applications, whereas iSCSI appliances are well suited to general purpose
storage applications, including file I/O applications.
The preceding chapters in this book covered IBM’s announced solutions using
storage over IP networks. In this chapter, we describe some of the other
technologies which are emerging, or are in the process of being introduced into
the storage market. In general, the developments come from groups of
co-operating companies, and they address varying connectivity and data
transmission issues arising from today’s diverse customer networking
environments. Many of these developments are complementary, and combine to
enhance your choices, and benefit the solutions you plan to implement today.
IBM is an active participant in many of these industry initiatives.
iFCP uses TCP to provide congestion control, error detection, and recovery.
iFCP's primary objective is to allow interconnection and networking of existing
Fibre Channel devices at wire speeds over an IP network. The protocol's method
of frame translation enables the transparent attachment
of Fibre Channel storage devices to an IP-based fabric by means of lightweight
gateways. The protocol achieves this transparency through an address
translation process. This allows normal frame traffic to pass through the gateway
directly, with provisions for intercepting and emulating the fabric services
required by an FCP device.
In its simplest form of iFCP implementation, the Fibre Channel devices are
directly connected to the iFCP fabric through F_PORTs, which are implemented
as part of the edge switch or gateway. At the N_PORT interface on the Fibre
Channel side of the gateway, the network appears as a Fibre Channel fabric.
Here, the gateway presents remote N_PORTs as directly attached devices.
Conversely, on the IP side, the gateway presents each locally connected
N_PORT as a logical iFCP device on the IP network.
For more information on this topic, visit the following Web site:
http://www.ietf.org
FCIP Protocol
The FCIP Protocol consists of the following:
FCIP Device: This term generally refers to any device that encapsulates FC
frames into TCP segments and reassembles TCP segments to regenerate
FC frames. It may be a stand-alone box, or integrated with an FC device such
as an FC backbone switch. It could also be integrated with any TCP/IP device,
such as an IP switch or an IP router. The FCIP device is a transparent
translation point. The IP network is not aware of the FC payload that it is
carrying. Similarly, the FC fabric and FC end nodes are not aware of the
IP-based transport.
Protocol: The FCIP protocol specifies the TCP/IP encapsulation, mapping
and routing of FC frames. It applies these mechanisms to an FC network
utilizing IP for its backbone (or more generally, between any two FC devices).
FCIP Header Format: This header consists of its version number, header
length, frame length, and its reserved bits.
The FCIP device always delivers entire FC frames to the FC ports to which it is
connected. The FC ports must remain unaware of the existence of the IP
network that provides, through the FCIP devices, the connection for these FC
ports. The FCIP device also treats all classes of FC frames the same, that is, as
datagrams.
For more information on this topic, visit the following Web site:
http://www.ietf.org
Scalability needs are addressed in two ways. First, the I/O fabric itself is
designed to scale without encountering the latencies that some shared bus I/O
architectures experience as workload increases. Second, the physical modularity
of InfiniBand Technology will avoid the need for customers to buy excess
capacity up-front in anticipation of future growth. Instead, they will be able to buy
what they need at the outset and “pay as they grow,” to add capacity without
impacting operations or installed systems.
For more information on this topic, visit the following Web site:
http://www.infinibandta.org
VI providers process the posted descriptors asynchronously, and mark them with
a status value when completed. VI consumers will remove these completed
descriptors from the work queues and reuse them for subsequent requests. Each
work queue has an associated doorbell that is used to notify the VI network
adapter whenever a new descriptor has been posted to a work queue. There is
no operating system intervention to operate the doorbell since this is
implemented directly by the adapter.
VI provider
The VI provider is the set of hardware and software components responsible for
initiating a virtual interface. The VI provider consists of a network interface
controller (NIC) and a kernel agent. The VI NIC implements the virtual interfaces
and completion queues and directly performs data transfer functions.
The kernel agent is a privileged part of the operating system. This is usually a
driver supplied by the VI NIC vendor; it provides setup and resource
management functions which are needed to maintain a virtual interface between
VI Consumers and VI NICs. These functions include the creation and destruction
of VIs, VI connection setup and teardown, interrupt management and/or
processing, management of system memory used by the VI NIC, and error
handling. Standard operating system mechanisms, such as system calls, are
used by the VI consumers to access the kernel agent. Kernel agents interact with
VI NICs through standard operating system device management mechanisms.
The operating system makes the system calls to the kernel agent to create a VI
on the local system and connect it to a VI on a remote system. Once a
connection is established, the operating system facility posts the application’s
send and receive requests directly to the local VI.
The operating system communication facility often loads a library that abstracts
the details of the underlying communication provider, in this case the VI and
kernel agent. This component is shown as the VI user agent in Figure 7-1. It is
supplied by the VI hardware vendor, and conforms to an interface defined by the
operating system communication facility.
Completion queues
Completed requests can be notified directly to a completion queue on a per-VI
work queue basis. This association is established when a VI is created. Once a
VI work queue is associated with a completion queue, all completion
synchronization must take place on that completion queue.
As with VI work queues, notification status can be placed into the completion
queue by the VI NIC without an interrupt, and a VI consumer can synchronize on
a completion without a kernel transition.
Figure 7-3 on page 248 shows the VI architecture completion queue model.
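The descriptor and completion-queue interaction can be pictured with a small software analogy. In the real VI architecture this work is done by the NIC without kernel transitions; none of the names below come from an actual VI provider API.

from collections import deque

class WorkQueue:
    # Toy model of a VI work queue with an associated completion queue.
    def __init__(self, completion_queue):
        self.pending = deque()
        self.cq = completion_queue

    def post(self, descriptor):
        descriptor["status"] = None        # the consumer posts the descriptor ...
        self.pending.append(descriptor)    # ... and "rings the doorbell"

    def nic_process_one(self):
        # The VI NIC processes descriptors asynchronously and marks a status value.
        d = self.pending.popleft()
        d["status"] = "complete"
        self.cq.append(d)                  # completion is notified on the completion queue

cq = deque()
send_queue = WorkQueue(cq)
send_queue.post({"op": "send", "buffer": b"hello"})
send_queue.nic_process_one()
done = cq.popleft()                        # the consumer synchronizes on the completion queue
print(done["status"])                      # complete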
For more information on the virtual architecture, visit the following Web site:
http://www.viarch.org
For the local or network file system, data is copied into a buffer cache. It is then
copied into the application’s private buffer. File access over network file systems
incurs additional data copies in the networking stack. Some operating systems
can bypass the buffer cache copy in certain cases, but all reads over a traditional
network file system require at least one data copy.
In this way, data can be transferred to and from a client application’s buffers
without any CPU overhead on the client side. To avoid extra data copies on write
requests, a traditional local or remote file system must lock down the
application’s I/O buffers before each request. A DAFS client allows an application
to register its buffers with the NIC once, which avoids the per-operation
registration overhead currently incurred.
NDMP uses the External Data Representation (XDR) and TCP/IP protocols as
foundations. The key goals of NDMP include interoperability, contemporary
functionality, and extensibility.
NDMP
Network Data Management Protocol. An open protocol for enterprise-wide
network-based backup.
NDMP client
The application that controls the NDMP server.
NDMP server
The virtual state machine on the NDMP host that is controlled using the NDMP
protocol. There is one of these for each connection to the NDMP host. This term
is used independent of implementation.
In the simplest configuration, an NDMP client will back up the data from the
NDMP host to a backup device connected to the NDMP host.
NDMP can be used to back up data to a backup device in a tape library that is
physically attached to the NDMP host. In this configuration, there is a separate
instance of the NDMP server to control the robotics within the tape library. This is
shown in Figure 7-7.
This architecture can also back up a host that supports NDMP, but which does
not have a locally attached backup device. This is achieved by sending the data
through a raw TCP/IP connection to another NDMP host. A logical view of this
configuration is shown in Figure 7-8.
Tape-to-tape copy function could be used to duplicate the backup tape for off-site
storage, while data-to-data copy is used to restore the entire data from one disk
to another disk.
Extending NAS protocols to share data over SANs effectively eliminates the
distinction between NAS and SANs, allowing them to be managed and
administered as one logical network that simply has varying means of physical
connectivity. In both cases, storage is attached to, and heterogeneously shared
via, some kind of network: typically, Ethernet for LAN-attached storage and Fibre
Channel for SAN-attached storage.
White papers from this Work Group will provide customers with an understanding of
discovery within the SAN, how it fits into the overall management scheme of the
SAN, and how SAN storage management and data management software will
use it.
The Backup Work Group maintains a prioritized list of topics and problems that
are viewed as current or important to the community of backup providers, backup
consumers and to SAN/NAS element providers with a stake in backup
technology. The Backup Work Group has an objective of promoting all draft
specifications to standards bodies whenever possible. Currently, this Work
Group is addressing a number of issues, including these Subcommittees:
Snapshot/Checkpoint/Quiesce Subcommittee
Currently a large number of application, database, or supporting software
companies produce or are planning to produce a snapshot capability. A large
number of software companies produce software that must either invoke a
snapshot (such as backup software) or use a snapshot (such as recovery
software). The large increase in connectivity afforded by storage network
technology amplifies the need for uniform interfaces for snapshot, quiesce and
checkpoint. The market values a general solution. Providing a general solution
requires that each snapshot-using software product handle all the different
snapshot types. The Snapshot/Checkpoint Subcommittee is defining a standard
API for creating snapshots and checkpoints. A standard API will reduce
complexity and encourage interoperability.
Details of these work groups can be found at the IETF Web site:
http://www.ietf.org
The Work Group cannot assume that any changes it desires will be made in
these standards, and hence will pursue approaches that do not depend on such
changes unless they are unavoidable. In that case, the Work Group will create a
document to be forwarded to the standards group responsible for the technology,
explaining the issue and requesting the desired changes be considered. The
Work Group will endeavor to ensure high quality communications with these
standards organizations. It will consider whether a layered architecture providing
common transport, security, and/or other functionality for its encapsulations is the
best technical approach.
Use of IP-based transports raises issues that do not occur in the existing
transports for the protocols to be encapsulated. The Work Group will address at
least the following:
Congestion control suitable for shared traffic network environments such as
the Internet.
Security measures, including authentication and privacy, sufficient to defend
against threats up to and including those that can be expected on a public
network.
Naming and discovery mechanisms for the encapsulated protocols on
IP-based networks, including both discovery of resources (for example,
storage) for access by the discovering entity, and discovery for management.
Management, including appropriate MIB definition(s).
The Work Group specifications will provide support for bridges and gateways that
connect to existing implementations of the encapsulated protocols. The Work
Group will preserve the approaches to discovery, multi-pathing, booting, and
similar issues taken by the protocols it encapsulates to the extent feasible.
It may be necessary for traffic utilizing the Work Group's encapsulations to pass
through Network Address Translators (NATs) and/or firewalls in some
circumstances; the Work Group will endeavor to design NAT- and firewall-friendly
protocols that do not dynamically select target ports or require Application Level
Gateways.
The standard Internet checksum is weaker than the checksums used by other
implementations of the protocols to be encapsulated. The Work Group will
consider what levels of data integrity assurance are required and how they
should be achieved.
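For reference, the standard Internet checksum referred to here is the 16-bit one's-complement sum defined in RFC 1071. The short sketch below shows both the calculation and one of its weaknesses: swapping two 16-bit words leaves the checksum unchanged.

def internet_checksum(data: bytes) -> int:
    # 16-bit one's-complement sum over 16-bit words (RFC 1071).
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF

# Reordering 16-bit words goes undetected -- one reason this checksum is
# considered weak for storage traffic.
print(hex(internet_checksum(b"\x12\x34\x56\x78")))
print(hex(internet_checksum(b"\x56\x78\x12\x34")))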
The Work Group will produce a framework document that provides an overview
of the environments in which its encapsulated protocols and related protocols are
expected to operate. The Work Group will produce requirements and
specification documents for each protocol encapsulation, and may produce
applicability statements. The requirements and specification documents will
consider both disk and tape devices, taking note of the variation in scale from
single drives to large disk arrays and tape libraries, although the requirements
and specifications need not encompass all such devices.
The wise IT professional will not get too carried away with promises for
tomorrow, because “tomorrow never comes.” It is smart to be aware of possible
advances, but you cannot make a sensible investment decision until the promise
is delivered. After all, many “wonder technologies” never really make it in the
market, and others arrive later than expected. Yet again, something previously
ignored may cause a much greater than anticipated impression in the market,
when its potential is fully understood.
Our recommendation is this: Focus on the solutions we can deliver now, or in the
near future. We hope we have shown you that these solutions offer cost
effectiveness and great flexibility; and IBM is committed to open standards, now
and for the future.
What is RAID
RAID is an architecture designed to improve data availability by using arrays of
disks in conjunction with data striping methodologies. The idea of an array—a
collection of disks the system sees as a single device—has been around for a
long time. In fact, IBM was doing initial development of disk arrays as early as the
1970s. In 1978, IBM was issued the patent for a disk array subsystem. At that
time, however, the cost of technology precluded the use of RAID in products.
The original Berkeley paper emphasized performance and cost. The authors
were trying to improve performance while lowering costs at the same time. In
their efforts to improve reliability, they designed the fault tolerance and logical
data redundancy which was the origin of RAID. The paper defined five RAID
architectures, RAID Levels 0 through 5. Each of these architectures has its own
strengths and weaknesses, and the levels do not necessarily indicate a ranking
of performance, cost, or availability. Other RAID levels and combinations have
been defined in subsequent years.
In the case of a six-drive array, the “logical” disk has six completely independent
head mechanisms for accessing data, so the potential for improved performance
is immediately apparent. In the optimal situation all six heads could be providing
data to the system without the need for the time-consuming head-seeks to
different areas of the disk that would be necessary were a single physical disk
being used. RAID can be implemented using specialized hardware, or in software,
most commonly within the operating system.
RAID-0
RAID-0, sometimes referred to as disk striping, is not really a RAID solution since
there is no redundancy in the array at all. The disk controller merely stripes the
data across the array so that a performance gain is achieved. This is illustrated in
Figure A-1 on page 265.
It is common for a striped disk array to map data in blocks with a stripe size that is
an integer multiple of real drive track capacity. For example, the IBM ServeRAID
controllers allow stripe sizes of 8 KB, 16 KB, 32 KB or 64 KB, selectable during
initialization of the array.
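The block-to-disk mapping of a striped array can be expressed in a few lines; the stripe size and disk count below are example values only.

STRIPE_SIZE_KB = 16          # for example, one of the ServeRAID-selectable sizes
DISKS = 4

def locate(logical_kb):
    # Map a logical offset (in KB) to (disk index, offset on that disk) for RAID-0.
    stripe_number = logical_kb // STRIPE_SIZE_KB
    disk = stripe_number % DISKS
    offset = (stripe_number // DISKS) * STRIPE_SIZE_KB + (logical_kb % STRIPE_SIZE_KB)
    return disk, offset

for kb in (0, 16, 32, 48, 64):
    print(kb, "->", locate(kb))
# 0 -> (0, 0), 16 -> (1, 0), 32 -> (2, 0), 48 -> (3, 0), 64 -> (0, 16)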
In Figure A-1, a logical disk holding blocks 0, 1, 2, 3, ... is mapped onto four
physical disks: blocks 0 to 3 form the first stripe across the four drives, blocks 4
to 7 the second, blocks 8 to 11 the third, and so on.
Certain operating systems, including Windows NT, provide direct support for disk
mirroring. There is a performance overhead, however, as the processor has to
issue duplicate write commands. Hardware solutions where the controller
handles the duplicate writes are preferred.
In the figure, the logical disk blocks 0, 1, 2, ... are each written to two of the
physical disks, so every data stripe exists on two drives.
As you can see, any one disk can be removed from the array without loss of
information, because each data stripe exists on two physical disks. The controller
detects a failed disk and redirects requests for data from the failed drive to the
drive containing the copy of the data. When a drive has failed, the replacement
drive can be rebuilt using the data from the remaining drives in the array.
When a disk fails, only one copy of the data that was on the failed disk is
available to the system. The system has lost its redundancy, and if another disk
fails, data loss is the result. When a failed disk is replaced, the controller rebuilds
the data that was on the failed disk from the remaining drives and writes it to the
new disk, restoring the redundancy.
To avoid having to manually replace a failed disk, the IBM ServeRAID controller
implements hot spare disks that are held idle until a failure occurs, at which point
the controller immediately starts to rebuild the lost data onto the hot spare,
minimizing the time when redundancy is lost. The controller provides data to the
system while the rebuild takes place. When you replace the failed drive, its
replacement becomes the array’s new hot spare.
RAID-3
RAID-3 stripes data sequentially across several disks. The data is written or
retrieved in one parallel movement of all of the access arms. RAID-3 uses a
single dedicated disk to store parity information, as shown in Figure A-3.
Because of the single parallel movement of all access arms, only one I/O can be
active in the array at any one time.
Because data is striped sequentially across the disks, the parallel arm movement
yields excellent transfer rates for large blocks of sequential data, but renders
RAID-3 impractical for transaction processing or other high throughput
applications needing random access to data. When random processing does
take place, the parity disk becomes a bottleneck for write operations.
RAID-3 can withstand a single disk failure without losing data or access to data.
It is well-suited for imaging applications.
In Figure A-3, a disk controller stripes data blocks 0 to 3, 4 to 7, and 8 to 11
across disks 1 through 4, while disk 5 is dedicated to the parity for each stripe.
RAID-5
RAID-5 is one of the most capable and efficient ways of building redundancy into
the disk subsystem. The principles behind RAID-5 are very simple and are
closely related to the parity methods sometimes used for computer memory
subsystems. In memory, the parity bit is formed by evaluating the number of 1
bits in a single byte. For RAID-5, if we take the example of a four-drive array,
three stripes of data are written to three of the drives and the bit-by-bit parity of
the three stripes is written to the fourth drive.
As an example, we can look at the first byte of each stripe and see what this
means for the parity stripe. Let us assume that the first byte of stripes 1, 2, and 3
are the letters A, B, and G respectively. The binary code for these characters is
01000001, 01000010 and 01000111 respectively.
We can now calculate the first byte of the parity block. Using the convention that
an odd number of 1s in the data generates a 1 in the parity, the first parity byte is
01000100 (see Table A-1). This is called Even Parity because there is always an
even number of 1s if we look at the data and the parity together. Odd Parity could
have been chosen; the choice is of no importance as long as it is consistent.
Table A-1 Generation of parity data for RAID 5

Stripe 1 (A)   Stripe 2 (B)   Stripe 3 (G)   Parity
0              0              0              0
1              1              1              1
0              0              0              0
0              0              0              0
0              0              0              0
0              0              1              1
0              1              1              0
1              0              1              0
Calculating the parity for the second byte is performed using the same method,
and so on. In this way, the entire parity stripe for the first three data stripes can be
calculated and stored on the fourth disk.
The presence of parity information allows any disk to fail without loss of data. In
the above example, if drive 2 fails (with B as its first byte) there is enough
information in the parity byte and the data on the remaining drives to reconstruct
the missing data. The controller has to look at the data on the remaining drives
and calculate what drive 2’s data must have been to maintain even parity.
Because of this, a RAID-5 array with a failed drive can continue to provide the
system with all the data from the failed drive.
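The parity arithmetic is simply a bitwise exclusive OR (XOR). The short Python sketch below (illustrative only) reproduces the A, B, G example from Table A-1 and then rebuilds the lost byte of the failed drive from the parity and the surviving data:

# RAID-5 parity sketch using the A, B, G bytes from Table A-1.
a, b, g = 0b01000001, 0b01000010, 0b01000111    # 'A', 'B', 'G'

parity = a ^ b ^ g                              # parity byte written to the fourth drive
print(format(parity, "08b"))                    # 01000100, as calculated in the text

# Simulate losing drive 2 (the 'B' byte) and reconstructing it:
rebuilt = parity ^ a ^ g                        # XOR of parity and the surviving data
assert rebuilt == b
print(chr(rebuilt))                             # 'B'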
Performance will suffer, however, because the controller has to look at the data
from all drives when a request is made to the failed one. A RAID-5 array with a
failed drive is said to be critical, since the loss of another drive will cause lost
data. For this reason, the use of hot spare drives in a RAID-5 array is as
important as in RAID-1.
The simplest implementation would always store the parity on disk 4 (in fact, this
is the case in RAID-4, which is hardly ever implemented for the reason about to
be explained). Disk reads are then serviced in much the same way as a level 0
array with three disks. However, writing to a RAID-5 array would then suffer from
a performance bottleneck. Each write requires that both real data and parity data
are updated. Therefore, the single parity disk would have to be written to every
time any of the other disks were modified. To avoid this, the parity data is also
striped, as shown in Figure A-4 on page 270, spreading the load across the
entire array.
The consequence of having to update the parity information means that for every
stripe written to the virtual disk, the controller has to read the old data from the
stripe being updated and the associated parity stripe. Then the necessary
changes to the parity stripe have to be calculated based on the old and the new
data. All of this complexity is hidden from the processor, but the effect on the
system is that writes are much slower than reads. This can be offset to a great
extent by the use of a cache on the RAID controller. The IBM ServeRAID
controllers have cache as standard, which is used to hold the new data while the
calculations are being performed.
Figure A-4 RAID-5: data and parity striped across all the physical disks in the array
Meanwhile, the processor can continue as though the write has taken place.
Battery backup options for the cache, available for some controllers, mean that
data loss is kept to a minimum even if the controller fails with data still in the
cache.
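The read-modify-write sequence described above can be sketched as follows (a simplification that assumes a small, single-stripe write; the names and values are illustrative only):

# RAID-5 small-write ("read-modify-write") sketch.
# Updating one data stripe also requires updating the parity stripe:
#   new_parity = old_parity XOR old_data XOR new_data
def raid5_small_write(old_data, old_parity, new_data):
    # 1. read the old data stripe       (disk read)
    # 2. read the old parity stripe     (disk read)
    new_parity = old_parity ^ old_data ^ new_data   # 3. recompute parity in controller cache
    # 4. write the new data stripe      (disk write)
    # 5. write the new parity stripe    (disk write)
    return new_data, new_parity

# Using the bytes from the earlier example: overwrite 'B' with 'C'.
data, parity = raid5_small_write(0b01000010, 0b01000100, 0b01000011)
print(format(parity, "08b"))    # 01000101, the parity of 'A', 'C', 'G'

The two extra disk reads and the extra write are the cost that the controller cache is there to hide.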
RAID-5E
RAID-5E (RAID-5 Enhanced) distributes the hot-spare capacity across all the drives in
the array instead of dedicating a separate idle disk. In the event of a physical drive
failing, its status will change to Defunct Disk Drive (DDD) and the ServeRAID controller
will start rearranging the data the disk contained into the spare space on the other
drives in the array, provided there is enough space.
During the migration of data, the logical drive will be in a critical, non-redundant
state. As soon as all the data is rearranged, the logical drive will be marked OKY
(Okay) and have full redundancy again. This is illustrated in Figure A-6 on
page 272.
A second physical disk failure, occurring before the previously failed disk has
been replaced, is illustrated in Figure A-7.
Figure A-7 RAID-5E array: data distributed throughout previous spare space
If such a second failure does occur, normal RAID-5 procedures are used to continue
providing data to the system through the parity calculations described earlier (see
Figure A-8).
Each of these new levels uses a disk drive organization referred to as a spanned array:
data is striped, using RAID-0 techniques, across a number of lower-level arrays rather
than across individual disks. These lower-level arrays are themselves RAID arrays.
In this section we explain the principles behind each of these spanned RAID
levels.
RAID-00
RAID-00 comprises RAID-0 striping across lower level RAID-0 arrays, as shown
in Figure A-9:
This RAID level does not provide any fault tolerance. However, as with a standard
RAID-0 array, you achieve improved performance, and also the opportunity to
group more disks into a single array, providing larger maximum logical disk size.
Figure A-9 Spanned RAID-00 array: RAID-0 striping across lower-level RAID-0 arrays
RAID-10
As we have seen, RAID-1 offers the potential for performance improvement as
well as redundancy. RAID-10 is a variant of RAID-1 that effectively creates a striped
volume across RAID-1 arrays: the disks are first mirrored in pairs, and the mirrored
pairs are then striped together as one volume.
Figure A-10 Spanned RAID-10 array: RAID-0 striping across RAID-1 mirrored pairs
This RAID level provides fault tolerance. Up to one disk of each sub-array may
fail without causing loss of data.
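A minimal mapping sketch follows (purely illustrative; the two-pair, four-disk geometry is an assumption) showing how each logical stripe is striped across the mirrored pairs and therefore exists on two physical disks:

# RAID-10 sketch: RAID-0 striping across two-disk RAID-1 mirrored pairs.
PAIRS = 2                        # each pair is a two-disk mirror (four disks in total)

def raid10_map(stripe):
    pair = stripe % PAIRS                 # RAID-0 striping across the mirrors
    stripe_in_pair = stripe // PAIRS
    disks = (2 * pair, 2 * pair + 1)      # the stripe is written to both members of the pair
    return disks, stripe_in_pair

# Stripe 0 lands on disks 0 and 1, stripe 1 on disks 2 and 3, and so on,
# which is why one disk of each pair can fail without losing data.
for s in range(4):
    print(s, raid10_map(s))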
RAID-1E0
RAID-1E0 comprises RAID-0 striping across lower level RAID-1E arrays, as
shown in Figure A-11.
This RAID level combines the performance characteristics of RAID-1E and RAID-0 in a
single array and provides high availability for your data. Up to one disk in each
sub-array may fail without causing data loss.
Figure A-11 Spanned RAID-1E0 array: RAID-0 striping across lower-level RAID-1E arrays
RAID-50
RAID-50 comprises RAID-0 striping across lower level RAID-5 arrays, as shown
in Figure A-12 on page 277.
Once again, the benefits of RAID-5 are gained, while the spanned RAID-0 allows
you to incorporate many more disks into a single logical drive. Up to one drive in
each sub-array may fail without loss of data.
Figure A-12 Spanned RAID-50 array: RAID-0 striping across lower-level RAID-5 arrays
RAID summary
RAID is an excellent and proven technology for protecting your data against the
possibility of hard disk failure. IBM's ServeRAID range of RAID controllers brings the
benefits of RAID technology to the IBM TotalStorage NAS solutions that hold your
critical business information.
Here is a brief summary of the different RAID levels we covered in this appendix:
RAID level            Description                                 When to use
Level 1 (Mirroring)   Duplicates all data from one drive          Where only two drives are available
                      to a second drive                           and data protection is needed
RAID 0 and RAID 00 would typically be used only when data on the array is not
subject to change and is easily replaced in the case of a failed disk.
Table A-4 illustrates the advantages and disadvantages of each RAID level.

Table A-4 Advantages and disadvantages of RAID levels

RAID level   Implementations   Description                            Advantages                       Disadvantages
1            Various           Mirroring. Each disk in a mirrored     Simplicity, reliability, and     High inherent cost.
                               array holds an identical image of      availability.
                               the data.
4            NetApp            User data is striped across            High performance for reads.      Requires extra cache for
                               multiple disks. Parity check data                                       writes. The single parity
                               is stored on a single disk.                                             disk can be a bottleneck.
5            Various           User data is striped across            High performance for reads.      Still requires caching or
                               multiple disks. Parity check data      A drive can fail in an array     parallel multiprocessors
                               is striped across multiple disks.      without loss of data.            for writes.
The publications listed in this section are considered particularly suitable for a
more detailed discussion of the topics covered in this redbook.
IBM Redbooks
For information on ordering these publications, see “How to get IBM Redbooks”
on page 283.
Introduction to Storage Area Network, SAN, SG24-5470
Designing an IBM Storage Area Network, SG24-5758
Implementing an Open IBM SAN, SG24-6116
Using Tivoli Storage Manager in a SAN Environment, SG24-6132
IBM Tape Solutions for Storage Area Networks and FICON, SG24-5474
Storage Area Networks: Tape Future in Fabrics, SG24-5474
Storage Consolidation in SAN Environments, SG24-5987
Implementing Fibre Channel Attachment on the ESS, SG24-6113
IBM SAN Survival Guide, SG24-6143
Storage Networking Virtualization: What’s it all about?, SG24-6210
A Practical Guide to Network Storage Manager, SG24-2242
Using iSCSI Solutions’ Planning and Implementation, SG24-6291
Other resources
These publications are also relevant as further information sources:
Building Storage Networks, ISBN 0072130725, Farley, Marc, McGraw-Hill
Professional, 2001
IP Fundamentals: What Everyone Needs to Know About Addressing and
Routing, ISBN 0139754830, Maufer, Thomas, Prentice Hall, 1999
Information in this book was developed in conjunction with use of the equipment
specified, and is limited in application to those specific hardware and software
products and levels.
IBM may have patents or pending patent applications covering subject matter in
this document. The furnishing of this document does not give you any license to
these patents. You can send license inquiries, in writing, to the IBM Director of
Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785.
Licensees of this program who wish to have information about it for the purpose
of enabling: (i) the exchange of information between independently created
programs and other programs (including this one) and (ii) the mutual use of the
information which has been exchanged, should contact IBM Corporation, Dept.
600A, Mail Drop 1329, Somers, NY 10589 USA.
The information contained in this document has not been submitted to any formal
IBM test and is distributed AS IS. The use of this information or the
implementation of any of these techniques is a customer responsibility and
depends on the customer's ability to evaluate and integrate them into the
customer's operational environment. While each item may have been reviewed
by IBM for accuracy in a specific situation, there is no guarantee that the same or
similar results will be obtained elsewhere. Customers attempting to adapt these
techniques to their own environments do so at their own risk.
Any pointers in this publication to external Web sites are provided for
convenience only and do not in any manner serve as an endorsement of these
Web sites.
Java and all Java-based trademarks and logos are trademarks or registered
trademarks of Sun Microsystems, Inc. in the United States and/or other
countries.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of
Microsoft Corporation in the United States and/or other countries.
UNIX is a registered trademark in the United States and other countries licensed
exclusively through The Open Group.
SET, SET Secure Electronic Transaction, and the SET Logo are trademarks
owned by SET Secure Electronic Transaction LLC.
ANSI American National Standards Institute - The primary organization for fostering the development of technology standards in the United States. The ANSI family of Fibre Channel documents provide the standards basis for the Fibre Channel architecture and technology. See FC-PH.

Arbitrated Loop A Fibre Channel interconnection technology that allows up to 126 participating node ports and one participating fabric port to communicate.

Backup (1) A copy of computer data that is used to recreate data that has been lost, mislaid, corrupted, or erased. (2) The act of creating a copy of computer data that can be used to recreate data that has been lost, mislaid, corrupted or erased.

Bandwidth Measure of the information capacity of a transmission channel.

BI Business Intelligence.

BIOS Basic Input/Output System - set of routines stored in read-only memory that enable a computer to start the operating system and to communicate with the various devices in the system, such as disk drives, keyboard, monitor, printer, and communications ports.

Bridge (1) A component used to attach more than one I/O unit to a port. (2) A data communications device that connects two or more networks and forwards packets between them. The bridge may use similar or dissimilar media and signaling systems. It operates at the data link level of the OSI model. Bridges read and filter data packets and frames.

Bridge/Router A device that can provide the functions of a bridge, router or both concurrently. A bridge/router can route one or more protocols, such as TCP/IP, and bridge all other traffic. See also: Bridge, Router.

Cache A small fast memory holding recently accessed data, designed to speed up subsequent access to the same data. Most often applied to processor-memory access but also used for a local copy of data accessible over a network, and so on.

Client A software program used to contact and obtain data from a server software program on another computer, often across a great distance. Each client program is designed to work specifically with one or more kinds of server programs and each server requires a specific kind of client program.

Client/Server The relationship between machines in a communications network. The client is the requesting machine, the server the supplying machine. Also used to describe the information management relationship between software components in a processing system.

Cluster A type of parallel or distributed system that consists of a collection of interconnected whole computers and is used as a single, unified computing resource.

Coaxial Cable A transmission media (cable) used for high speed transmission. It is called coaxial because it includes one physical channel that carries the signal surrounded (after a layer of insulation) by another concentric physical channel, both of which run along the same axis. The inner channel carries the signal, and the outer channel serves as a ground.
FCIP Fibre Channel over Internet Protocol.

FCP Fibre Channel Protocol - the mapping of SCSI-3 operations to Fibre Channel.

HBA Host Bus Adapter.

Heterogeneous Network Often used in the context of distributed systems that may be running different operating systems or network protocols (a heterogeneous network).

iSCSI Internet Small Computer System Interface.

ISDN Integrated Services Digital Network.

JBOD Just a bunch of disks.

LAN Local Area Network - A network covering a relatively small geographic area (usually not larger than a floor or small building). Transmissions within a Local Area Network are mostly digital, carrying data among stations at rates usually above one megabit/s.

Latency A measurement of the time it takes to send a frame between two locations.

LUN Logical Unit Number - A 3-bit identifier used on a SCSI bus to distinguish between up to eight devices (logical units) with the same SCSI ID.

MAN Metropolitan Area Network - A data network intended to serve an area the size of a large city.

MAC Media Access Control - The lower sublayer of the OSI data link layer. The interface between a node's Logical Link Control and the network's physical layer. The MAC differs for various physical media.

NFS Network File System - A distributed file system in UNIX developed by Sun Microsystems which allows a set of computers to cooperatively access each other's files in a transparent manner.

OSI Open Systems Interconnect - A model of network architecture and a suite of protocols (a protocol stack) to implement it, developed by ISO in 1978 as a framework for international standards in heterogeneous computer network architecture.

Packet A short block of data transmitted in a packet switching network.

PFA Predictive Failure Analysis.

POST Power-on self-test.

Protocol A data transmission convention encompassing timing, control, formatting and data representation.

QoS Quality of Service - A set of communications characteristics required by an application. Each QoS defines a specific transmission priority, level of route reliability, and security level.
RAID Redundant Array of Inexpensive or Independent Disks. A method of configuring multiple disk drives in a storage subsystem for high availability and high performance.

Raid-0 Level 0 RAID support - Striping, no redundancy.

Raid-1 Level 1 RAID support - mirroring, complete redundancy.

Raid-5 Level 5 RAID support - Striping with parity.

RDist A utility included in UNIX that is used to maintain identical copies of files over multiple hosts. It preserves the owner, group, mode, and timestamp of files if possible, and can update programs that are executing.

Redirector An operating system driver that sends data to and receives data from a remote device. A network redirector often provides mechanisms to locate, open, read, write, and delete files and submit print jobs.

RFC Request for Comment - One of a series, begun in 1969, of numbered Internet informational documents and standards widely followed by commercial software and freeware in the Internet and UNIX communities. Few RFCs are standards but all Internet standards are recorded in RFCs.

Router (1) A device that can decide which of several paths network traffic will follow based on some optimal metric. Routers forward packets from one network to another based on network-layer information. (2) A dedicated computer hardware and/or software package which manages the connection between two or more networks. See also: Bridge, Bridge/Router.

SAN A Storage Area Network (SAN) is a dedicated, centrally managed, secure information infrastructure, which enables any-to-any interconnection of servers and storage systems.

SCSI Small Computer System Interface - A set of evolving ANSI standard electronic interfaces that allow personal computers to communicate with peripheral hardware such as disk drives, tape drives, CD ROM drives, printers and scanners faster and more flexibly than previous interfaces.

SCSI-3 SCSI-3 consists of a set of primary commands and additional specialized command sets to meet the needs of specific device types. The SCSI-3 command sets are used not only for the SCSI-3 parallel interface, but also for additional parallel and serial protocols, including Fibre Channel, Serial Bus Protocol (used with IEEE 1394 Firewire physical protocol) and the Serial Storage Protocol (SSP).

SCSI-FCP The term used to refer to the ANSI Fibre Channel Protocol for SCSI document (X3.269-199x) that describes the FC-4 protocol mappings and the definition of how the SCSI protocol and command set are transported using a Fibre Channel interface.

SCSI initiator A device that begins a SCSI transaction by issuing a command to another device (the SCSI target), giving it a task to perform. Typically a SCSI host adapter is the initiator, but targets may also become initiators.

Server A computer which is dedicated to one task.

SNIA Storage Networking Industry Association. A non-profit organization comprised of more than 77 companies and individuals in the storage industry.

SNMP Simple Network Management Protocol - The Internet network management protocol which provides a means to monitor and set network configuration and run-time parameters.

SSA Serial Storage Architecture - A high speed serial loop-based interface developed as a high speed point-to-point connection for peripherals, particularly high speed storage arrays, RAID and CD-ROM storage by IBM.
StorWatch Expert These are StorWatch applications that employ a 3-tiered architecture that includes a management interface, a StorWatch manager and agents that run on the storage resource(s) being managed. Expert products employ a StorWatch data base that can be used for saving key management data (e.g. capacity or performance metrics). Expert products use the agents, as well as analysis of storage data saved in the database, to perform higher value functions, including: reporting of capacity, performance, etc. over time (trends), configuration of multiple devices based on policies, monitoring of capacity and performance, automated responses to events or conditions, and storage related data mining.

StorWatch Specialist A StorWatch interface for managing an individual fibre channel device or a limited number of like devices (that can be viewed as a single group). StorWatch specialists typically provide simple, point-in-time management functions such as configuration, reporting on asset and status information, simple device and event monitoring, and perhaps some service utilities.

Striping A method for achieving higher bandwidth using multiple N_Ports in parallel to transmit a single information unit across multiple levels.

Switch A component with multiple entry/exit points (ports) that provides dynamic connection between any two of these points.

Switch Topology An interconnection structure in which any entry point can be dynamically connected to any exit point. In a switch topology, the available bandwidth is scalable.

Tape Backup Making magnetic tape copies of hard disk and optical disc files for disaster recovery.

Tape Pooling A SAN solution in which tape resources are pooled and shared across multiple hosts rather than being dedicated to a specific host.

TCP/IP Transmission Control Protocol/Internet Protocol - a set of communications protocols that support peer-to-peer connectivity functions for both local and wide area networks.

Topology An interconnection scheme that allows multiple Fibre Channel ports to communicate. For example, point-to-point, Arbitrated Loop, and switched fabric are all Fibre Channel topologies.

Trivial File Transfer Protocol (TFTP) A simple file transfer protocol used for downloading boot code to diskless workstations. TFTP is defined in RFC 1350.

Twisted Pair A transmission media (cable) consisting of two insulated copper wires twisted around each other to reduce the induction (thus interference) from one wire to another. The twists, or lays, are varied in length to reduce the potential for signal interference between pairs. Several sets of twisted pair wires may be enclosed in a single cable. This is the most common type of transmission media.

UMS Universal Manageability Services.

UTP Unshielded Twisted Pair.

VI Virtual Interface.

VTS Virtual Tape Server.

WAN Wide area network - A network which encompasses inter-connectivity between devices over a wide geographic area. A wide area network may be privately owned or rented, but the term usually connotes the inclusion of public (shared) networks.

WfM Wired for Management (Intel).

XDR eXternal Data Representation - A standard for machine-independent data structures developed by Sun Microsystems for use in remote procedure call systems. It is defined in RFC 1014.
Index
Symbols
'routing' algorithms 68
'headless' environment 180

Numerics
200i 158

A
access scheme 13
Advanced System Management 150, 174, 175
Advanced System Management PCI Adapter 177
Advanced System Management Processor (ASMP) 131
Alert on LAN 143, 155
Alto Aloha Network 72
American National Standards Institute (ANSI) 30
AntiVirus 220
any-to-any 29
AppleTalk 40, 123, 141, 146
appliance 20, 53
appliance-like 52
appliances 22
Application layer 66, 71
Arbitrated loop 30
Archival backup 192
ARCnet 13
ARP (Address Resolution Protocol) 163
ASM planar processor 177
Asynchronous Transfer Mode (ATM) 14
ATM 67
Automated Server Restart 177

B
Basic Input/Output System (BIOS) 174, 179
battery-backed RAM 196
block I/O 9, 10, 30, 32, 49, 53, 105, 122, 231
block I/O applications 235
blocks 11
bridges 65, 75
Business Intelligence (BI) 3

C
cache 160
Carrier Sense 73
Carrier Sense Multiple Access with Collision Detection (CSMA/CD) 72, 73
channel I/O 11
circuit switched telephone 69
client/server 13
Clustered Failover 140
Clustering 108
coaxial 75
collision domain 74
collisions 76
Common Information Model (CIM) 182
Common Internet File System (CIFS) 19, 95, 256
connection 53
cooked I/O 10
copy-on-write 202
CSMA/CD 14
Customer Relationship Management (CRM) 3

D
DAS 11, 59
Data Link layer 65
Data Management Application (DMA) 251
data sharing 34
database I/O 32
datagram 17, 68
DECNet 16
Desktop Management Interface (DMI) 182
DHCP servers 130
Direct Access File System (DAFS) 238, 248
Direct Attach Storage (DAS) 1, 4
disaster recovery 142
discrete LAN 65
DNS server 182, 186
Domain Naming Service (DNS) 96
drag-and-drop 192

E
e-commerce 2
End-of-File (EOF) 241
Enterprise Resource Planning (ERP) 3
P
packet 17, 69
Peer-to-Peer Remote Copy (PPRC) 35, 226
Peripheral Component Interface (PCI) 86, 241
Peripheral Component Storage 159
persistent images 192
Persistent Storage Manager (PSM) 44, 130, 154, 191, 197, 213
Physical layer 64
physical medium 72
plug-and-play 22
point-in-time 192
   images 137
   persistent images 154
Point-to-point 30
   fabric 150
pooled SAN storage 41
pooled storage 51
Power-on self-test (POST) 131, 134, 176
Predictive Failure Analysis (PFA) 132, 174, 176
Presentation layer 65
primary gateway 186
Processor 160
Processors 161
protocol 6
protocol stack 66

Q
Quality of Service (QoS) 48

R
RAID 7, 123, 160
RAID-3 267
RAMAC Virtual Array (RVA) 118
random access memory (RAM) 193
raw data 40
Raw I/O 10, 100
read/write 41, 210
Redbooks Web site 283
   Contact us xv
Remote connectivity 177
remote copy (rcp) 18
remote file call 42
Remote power cycling 176
Remote Procedure Call (RPC) 94
Remote update 176
requestor 41
return on investment (ROI) 52
routers 65
Routing 69

S
sample connectivity 157
SAN 29
   attached disk 157
   benefits 33
   fabric 6
   over IP 4
SANergy 40, 59
SANergy benefits 41
SANergy Metadata Controller 105, 226
SBA 9
Scalable storage 140
SCSI 8, 80
SCSI bus adapter (SBA) 9
SCSI Select Utility 179
SCSI-3 32
SDRAM 159
segment 15, 72, 74
Serial Storage Architecture 10
Server Message Block (SMB) 96
Server to server 31
Server to storage 31
ServeRAID 160, 162, 163
ServeRAID Manager 130
ServerConfiguration.dat 183
server-less backup 227
ServerWorks ServerSet 159
Session layer 65
Shared Everything 111
Shared Nothing 110
Shared null 109
Shared serial port 176
short wave GBIC 141
Simple Network Management Protocol (SNMP) 182
SmartSets 112
SNIA 61
SNMP 117, 238
SNMP device listener 143
spanning tree 15
specialized server 20
SSA 8, 9, 10, 12
stack 17, 64
storage 4
Storage Area Network (SAN) 1, 3, 29, 59, 119
Back cover

IP Storage Networking: IBM NAS and iSCSI Solutions

All about the latest IBM Storage Network Products
Selection criteria for Storage Networking needs
Application scenarios

IP Storage Networking utilizes existing Ethernet infrastructure as a backbone for connecting storage devices. By using this network, the infrastructure investment may be leveraged to provide an even greater ROI. Where creation of a dedicated storage network is desirable, the use of familiar IP "fabric" means that existing support skills and resources can be leveraged, providing lower cost of ownership. IP Storage Networking devices simplify installation and management by providing a complete suite of pre-loaded software. They are readily capable of filling the need caused by the elimination of general purpose servers with direct attached storage.

This IBM Redbook is intended for IBMers, Business Partners, and customers who are tasked to help choose a storage network. This book will help you understand the different storage networking technologies available in the market. It discusses the circumstances under which you might want to use SAN, NAS, or iSCSI, showing where all of these technologies complement each other.

We introduce the different storage networking technologies, discuss in detail how Network Attached Storage and iSCSI work, and show how they differ from SAN. Various NAS and iSCSI products from IBM are covered, with their management tools, including on-disk data protection and data archiving. We also suggest some sample NAS and iSCSI applications.

INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION

BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE

IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, Customers and Partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.