
High-Availability Campus Design – Best Practices

Presented by
Dr. Peter J. Welcher, Chesapeake Netcraftsmen

Slide 1

About the Speaker


• Dr. Pete Welcher
– Cisco CCIE #1773, CCSI #94014, CCIP
– Network design & management consulting
• Stock quotation firm, 3000 routers, TCP/IP
• Another stock quotation firm, 2000 routers, UDP
broadcasts
• Hotel chain, 1000 routers, SNA
• Government agency, 1500 routers
– Taught many of the Cisco courses
• CiscoWorld / Enterprise Networking
Magazine articles
– http://www.netcraftsmen.net/welcher/papers

Slide 2

Copyright © 2003, Chesapeake Netcraftsmen Handout Page-1


Agenda
• Introduction and Motivation
• Campus Design Principles
• LAN Protocols – Further Comments
• Adding Wireless
• Managing the Campus Network
• High Availability
• Wrap-Up

Slide 3

What Are We Trying To Do?


• Cost-effective building / campus
switched network design
• Modern L2/L3 switches
• Redundancy, high availability
• Reasonably high level of security
• AVVID support
– Future IP Telephony
– IP Video Conferencing (IPVC)
– Other Video services
• Support for QoS

Slide 4



Assumption
• Cisco’s Networkers 2002 RST-271
presentation preceded this; see
http://www.cisco.com/networkers/nw02/post/presentations/docs/RST-271.pdf

Slide 5

Agenda
• Introduction and Motivation
• Campus Design Principles
• LAN Protocols – Further Comments
• Adding Wireless
• Managing the Campus Network
• High Availability
• Wrap-Up

Slide 6



Connect Distribution Layer?
• When do the 2 Distribution Layer switches
need a trunk or link between them?

[Figure: Access and Distribution layer switches]

Slide 7

Connect Distribution Layer?


• When do the 2 Distribution
Layer switches need a trunk
or L3 link between them?
– When there is more than one port
on the switch in a VLAN, need a
trunk
– When each distribution switch has
only one connection to the WAN or
Core router or switch
– Best to avoid sending inter-switch
traffic through IDF switches
• Consider having 2 such links!

Slide 8



Use of L3 Switching
• Use L3 switching rather than L2
– “Within reason”
• L3 in the IDF? (L2 or L3 switching?)
– Do you really want each IDF switch
doing routing?
– What does this buy you if every port
on the switch is on one or two
VLANs?
– Making uplinks L3 may lead to slower
failover!
– Can use L3 capabilities in IDF for
QoS but not routing

Slide 9

Control L3 adjacencies
• Not needed if IDF’s are doing L3 routing
• The point: do not need the two
distribution layer MSFC’s becoming
neighbors on each IDF VLAN
• Select 2-4 preferred VLAN’s
• Make the rest passive interfaces
• Preferable: use inter-switch links for
this, with perhaps 1 or 2 “preferred
VLAN’s” in case both links fail

Slide 10



HSRP
• Use HSRP, making the MSFC’s the HSRP
routers
– Set HSRP primary to match preferred STP root switch
– Tune VLAN interface costs (or delay with EIGRP) for
symmetric return traffic
• If have single links to Core
or WAN, can use tracking state
of that link with pre-empt
• Consider load distribution
– Two VLAN’s per IDF
– Odds / evens for root switch
[Figure labels: Root for odd VLAN’s; Blocking]
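The odds / evens idea can be sketched in a few lines. This is only an illustration, assuming two distribution switches with the hypothetical names dist-1 and dist-2: each VLAN gets one switch that is both STP root and HSRP primary, so the L2 and L3 paths line up.

```python
def plan_vlan_roles(vlan_ids, odd_switch="dist-1", even_switch="dist-2"):
    """Give each VLAN one switch that is both STP root and HSRP primary.

    Switch names are hypothetical placeholders, not from the slides.
    """
    plan = {}
    for vlan in sorted(vlan_ids):
        is_odd = vlan % 2 == 1
        primary = odd_switch if is_odd else even_switch
        standby = even_switch if is_odd else odd_switch
        plan[vlan] = {
            "stp_root": primary,       # preferred STP root for this VLAN
            "hsrp_primary": primary,   # HSRP primary matches the STP root
            "hsrp_standby": standby,
        }
    return plan


if __name__ == "__main__":
    for vlan, roles in plan_vlan_roles([10, 11, 20, 21]).items():
        print(vlan, roles)
```

With two VLAN’s per IDF, one odd and one even, each distribution switch ends up carrying roughly half the load.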

Slide 11

Don’t Forget the Servers


• Having redundant switching
doesn’t do much good if a single
switch failure takes out key
servers (DNS, DHCP, e-commerce, etc.)
• Use dual-attached servers
– One IP address or two?
– Depends on NIC support, application
needs
– Test failover!
– Re-test failover after making changes

Slide 12



Servers: 3 or 4 NIC’s?
• Can consider additional NIC for
servers
– If server NIC performance, queues, and drops
are of concern
– Possible issue here: massive backup, network
storage appliance, or SAN data flows across
the network
– 3rd NIC allows separate address, separate
subnet for this
– Routing, and route hiding can direct large
flows onto dedicated links or NIC’s if desired
– Can isolate client-server traffic and NIC from
the large flows (or management traffic)
– Alternative: QoS for network, shared NIC use
for server

Slide 13

Cat 6500 and RPR


• Choices for Supervisor redundancy
– RPR: stateless failover, takes time to resume
switching
– RPR+: stateful failover of Sup and L2
– CatOS may be much faster than Native IOS
• Choices for MSFC redundancy
– Dual router mode: more routers, complexity
– Single router mode (SRM redundancy): simpler
– Single router mode preferred with FlexWAN module
• Inactive MSFC doesn’t “see” the FlexWAN interface
• Easy to lose configuration of FlexWAN interface

Slide 14



Cat 4507 Versus Cat 6500
• Availability of NAM, IDS, firewall blades for
chassis
• FlexWAN support
• Degree and speed of failover (RPR vs.
RPR+)
• Port density
• Gig/10 Gig performance
• Possibly feature differences, would need
to check details…

Slide 15

Jumbo Frames
• K. Tolly has been agitating for non-standard
Jumbo frames for higher speed media
– They do provide better performance on boxes with
CPU or NIC processing limitations, but smarter NIC’s
and drivers alleviate the need
• Design / support issues
– Need to implement and configure jumbos consistently
– Need equipment that supports them
• Conclusions
– Jumbos may be limiting on hardware purchases
– A real factor in MTTR relating to SAN or high-speed
outages – not a good place to add complexity
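The performance argument for jumbos is mostly per-frame arithmetic. The sketch below is a rough illustration, not a benchmark: it assumes a 9000-byte jumbo payload (a common but non-standard size) and counts 38 bytes of per-frame Ethernet overhead for an untagged frame.

```python
# Per-frame overhead for an untagged Ethernet frame, in bytes:
# preamble + SFD (8) + header (14) + FCS (4) + inter-frame gap (12)
ETHERNET_OVERHEAD_BYTES = 38


def frames_per_second(link_bps, payload_bytes):
    """Frames a fully loaded link delivers per second at this payload size."""
    return link_bps / 8 / (payload_bytes + ETHERNET_OVERHEAD_BYTES)


def payload_efficiency(payload_bytes):
    """Fraction of wire time that carries payload."""
    return payload_bytes / (payload_bytes + ETHERNET_OVERHEAD_BYTES)


if __name__ == "__main__":
    for size in (1500, 9000):  # 9000 is an assumed jumbo size
        print(size,
              round(frames_per_second(1e9, size)),
              round(payload_efficiency(size), 4))
```

On a saturated Gigabit link this works out to roughly 81,000 frames per second at 1500 bytes versus roughly 14,000 at 9000 bytes. Wire efficiency barely changes; the saving is in per-frame work for the CPU or NIC, which is why smarter NIC’s reduce the need.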

Slide 16



Agenda
• Introduction and Motivation
• Campus Design Principles
• LAN Protocols – Further Comments
• Adding Wireless
• Managing the Campus Network
• High Availability
• Wrap-Up

Slide 17

Auto-Negotiate
• Auto-negotiate or lock down settings?
– Certain NIC cards do not interoperate with
auto-negotiation in Catalysts?
• This may affect other ports
– Can hard-code settings on key FastEthernet ports
• Servers, routers
• Do set both ends
– Generally, let client PC’s negotiate
• Use high port error counts to track down
situations where one end hard-coded, other
trying to negotiate
– Symptom: failed negotiations (one end hard-coded) end up
as 100 half-duplex
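The mismatch symptom can be modeled in a few lines. This is a simplified sketch, assuming a hard-coded port does not take part in negotiation, so an auto-negotiating partner can sense the speed but must fall back to half duplex.

```python
VALID_SETTINGS = {"auto", "full", "half"}


def resolve_duplex(end_a, end_b):
    """Return the duplex each end actually runs, as (duplex_a, duplex_b).

    Simplified model: two auto ends agree on full duplex; an auto end
    facing a hard-coded end falls back to half duplex.
    """
    if end_a not in VALID_SETTINGS or end_b not in VALID_SETTINGS:
        raise ValueError("settings must be 'auto', 'full', or 'half'")
    if end_a == "auto" and end_b == "auto":
        return ("full", "full")
    duplex_a = "half" if end_a == "auto" else end_a
    duplex_b = "half" if end_b == "auto" else end_b
    return (duplex_a, duplex_b)


def has_mismatch(end_a, end_b):
    """True when the two ends run different duplex settings."""
    duplex_a, duplex_b = resolve_duplex(end_a, end_b)
    return duplex_a != duplex_b


if __name__ == "__main__":
    print(resolve_duplex("full", "auto"))  # server hard-coded, switch left on auto
    print(has_mismatch("full", "auto"))
```

The hard-coded full / auto pairing is the one that shows up as high port error counts, which is the reason to set both ends.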

Slide 18



Overall L2 Wisdom
• Control L2 and Spanning Tree Protocol (STP)
– Known root switch, known preferred link, known blocking
port
• Don’t mess with the STP timers
– Can cause instability; our recommended design uses
UplinkFast for very fast failover anyway
• Do keep STP scale small: large domains breed “spanning tree
weirdness”
• Don’t disable STP: cabling and other accidents
WILL happen
• Don’t overdo L2 redundancy
– It can hurt stability and convergence time
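Why “known root switch” takes deliberate configuration: STP elects the bridge with the lowest bridge ID, which is the priority followed by the MAC address. A minimal sketch, with hypothetical switch names and MAC addresses:

```python
def bridge_id(priority, mac):
    """Bridge ID as a comparable tuple: priority first, then MAC as a number."""
    digits = mac.replace(":", "").replace(".", "").replace("-", "")
    return (priority, int(digits, 16))


def elect_root(bridges):
    """bridges maps switch name -> (priority, mac); lowest bridge ID wins."""
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))


if __name__ == "__main__":
    # All names and addresses below are made up for illustration.
    switches = {
        "dist-1": (32768, "00:d0:00:00:00:10"),
        "dist-2": (32768, "00:d0:00:00:00:20"),
        "old-idf": (32768, "00:10:00:00:00:01"),
    }
    print(elect_root(switches))           # defaults: lowest MAC wins
    switches["dist-1"] = (8192, "00:d0:00:00:00:10")
    print(elect_root(switches))           # lowered priority wins
```

With every switch left at the default priority, the lowest MAC address decides, and that can easily be an old closet switch. Lowering the priority on the intended root (and on its backup) removes the guesswork.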

Slide 19

Trunking Scheme
• Use 802.1q for trunking
– Interoperability, standard, more future support
– Set Cisco switches to “desirable”, at both ends
– This allows possible connectivity if one end won’t trunk for some reason
– Can make trunk native VLAN the management VLAN for same reason
• Trunking provides CoS bits for QoS to use
– Useful with lower-end switches, not needed for L3-capable switches
• Strongly consider breaking up VLAN 1
– Manually prune VLAN’s on trunks where not needed
– This controls impact of rogue switch, prevents large Spanning Tree
instability
– Use this to break up VLAN 1 size
– Use some other VLAN for management of switches

Slide 20



VTP
• VTP is useful for consistency, but can
magnify the consequences of errors
– Recommend using VTP transparent mode after
initial deployment and creation of VLAN’s
• Can still use CiscoWorks to manage
VLAN’s
– Use building.campus.corp.com as VTP domain?
• Putting all ports in switch in one VLAN (or
first half in one, rest in another) is much
simpler to maintain
– In general, doing anything port-by-port is painful!

Slide 21

EtherChannel
• EtherChannel is a great way to scale up
bandwidth
– We have seen odd low frequency failures, say 1 or 2 out of
100 channels, once every 3 months or so
– Possible alternative, slower failover: L3 routing
• Use PAgP desirable mode wherever possible
– Allows surviving ports to operate in non-channel modes in
worst case
– Avoids having STP loop or black hole during outage
– May need to force channeling on for servers or routers

Slide 22



EtherChannel – 2
• Steps to mitigate unwanted topology changes
on the Catalyst 6000:
– If a single port is used per module to form a channel, use
three or more modules (three ports or more total), so that
surviving ports stay channeled
– If the channel spans two modules, use two ports on each
module (four ports total)
– If a two-port channel is needed across two cards, use only
Supervisor ports
– Beware oversubscribing module backplane limits
• Use CatOS 6.3 or newer
– Handles module removal without STP recalculation for
channels split across modules.
– CatOS 6.4 is going to become GD, seems stable
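The module guidelines above can be turned into a simple layout check. This is a sketch only: it encodes just the three rules on this slide, and it assumes the Supervisors sit in slots 1 and 2, which depends on the chassis.

```python
from collections import Counter


def check_channel_layout(ports, supervisor_modules=frozenset({1, 2})):
    """Return warnings for a channel given as a list of (module, port) pairs.

    supervisor_modules is an assumption about where the Supervisors sit.
    """
    per_module = Counter(module for module, _port in ports)
    warnings = []
    if len(per_module) < 2:
        # Single-module channel: the cross-module guidelines do not apply.
        return warnings
    if len(ports) == 2:
        # Two-port channel across two cards: Supervisor ports only.
        if not set(per_module) <= set(supervisor_modules):
            warnings.append(
                "two-port channel across two cards should use only Supervisor ports")
    elif len(per_module) == 2:
        # Channel spanning two modules: two ports on each module.
        if min(per_module.values()) < 2:
            warnings.append(
                "channel spanning two modules should use two ports on each module")
    # Three or more modules with one port each already meets the guideline.
    return warnings


if __name__ == "__main__":
    print(check_channel_layout([(3, 1), (4, 1)]))
    print(check_channel_layout([(3, 1), (4, 1), (5, 1)]))
```

A check like this fits naturally into a configuration QA pass, since channel layout is easy to get wrong one port at a time.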

Slide 23

Flow Control
• 802.3x: Gigabit and 10 Gigabit Ethernet
flow control
– Not useful between switches, may cause drops
– Can use with servers, they should have enough
buffers

Slide 24



New L2 Features
• VRRP – Virtual Router Redundancy
Protocol, 12.2(13) T
– Standard HSRP replacement, should be
interoperable
– No interface tracking capability
• GLBP – Gateway Load Balancing
Protocol, 12.2(15) T
– Like HSRP or VRRP but all the routers can
forward packets!
– Also does allow interface tracking
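How “all the routers can forward packets” works out in practice: hosts share one gateway IP address, but ARP replies hand out different virtual MAC addresses, so different hosts end up using different routers. A toy sketch of a round-robin hand-out, with placeholder client and MAC names:

```python
from itertools import cycle


def assign_forwarders(clients, forwarder_macs):
    """Answer each client's ARP request with the next virtual MAC in turn.

    Client names and MAC labels are placeholders for illustration.
    """
    rotation = cycle(forwarder_macs)
    return {client: next(rotation) for client in clients}


if __name__ == "__main__":
    print(assign_forwarders(["pc1", "pc2", "pc3", "pc4"], ["vmac-1", "vmac-2"]))
```

With HSRP or VRRP every host would learn the same single virtual MAC, so only the active router forwards. The round-robin order here is a modeling choice; the point is only that the load is spread per host, not per packet.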

Slide 25

New Spanning Tree Features


• 802.1s = Multiple Spanning Tree (MST)
– MST can reduce number of STP instances,
load on switch CPU
– No longer much of an issue
• 802.1w = Rapid Spanning Tree (RST)
– Works with MST to speed STP convergence
– Uplink fast proven technology, simpler
– Larger STP domains are still a bad design
idea!
• Need for these versus risks?

Slide 26



Prevent Spanning Tree Surprises
• One-way failures are possible, can cause
spanning tree loops
• Far End Fault Indication (FEFI) protects
fiber and Gigabit ports in newer blades
– Catalyst 5000: WS-X5201R, WS-X5305, WS-X5236,
WS-X5237, WS-U5538, and WS-U5539
– Catalyst 6000 and 4000: All 100BaseFX modules
and GE modules

Slide 27

Uni-Directional Link Detection (UDLD)


• For maximum protection, Cisco
recommends enabling aggressive
UDLD
– On point-to-point FE/GE links between
Cisco switches
– Default message interval is 15 seconds
– This assumes default STP timers
– May also wish to enable loop guard?
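The dependence on default STP timers is simple arithmetic. A blocked port that stops hearing BPDU’s starts forwarding after max age plus two forward delays, so UDLD has to detect the one-way link sooner than that. The sketch below assumes detection takes about three message intervals; that multiplier is an assumption for illustration, not a figure from the slides.

```python
def stp_unblock_seconds(max_age=20, forward_delay=15):
    """Time for a blocked port to reach forwarding once BPDU's stop:
    max age, then listening and learning (one forward delay each)."""
    return max_age + 2 * forward_delay


def udld_detect_seconds(message_interval=15, missed_intervals=3):
    """Rough UDLD detection time; missed_intervals=3 is an assumption."""
    return message_interval * missed_intervals


def udld_protects(message_interval=15, max_age=20, forward_delay=15):
    """True if UDLD should act before STP opens the blocked port."""
    return udld_detect_seconds(message_interval) < stp_unblock_seconds(
        max_age, forward_delay)


if __name__ == "__main__":
    print(stp_unblock_seconds(), udld_detect_seconds(), udld_protects())
    # Shortened STP timers can leave UDLD too slow at its default interval.
    print(udld_protects(max_age=10, forward_delay=7))
```

Under these assumptions the default timers leave only a small margin, which is one more reason not to shorten the STP timers without revisiting the UDLD interval.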

Slide 28



User Ports
• Portfast has timing interaction with PAgP,
DTP
– Set port host disables PAgP, DTP
– This speeds up port becoming active
– Use it for fast user, server, router port activation
• BPDU guard
– Disables port when BPDU received
– Human intervention required to re-enable port unless
errdisable-timeout enabled
– Can be a bit drastic
– Does discourage moving cabling around or introducing
rogue switches

Slide 29

New: Root and Loop Guard


• Root guard
– Forces a port to become designated
– Switch on the other end then cannot become STP
root
– Can use on distribution layer switches
• Loop guard
– Blocks a root or alternate port (loop-inconsistent state) if BPDU’s stop arriving
– Prevents STP loops in some uni-directional link
situations
– May have bad interactions with channels, losing
track of BPDU state
– My take on this: not worth the risk

Slide 30



L2 and Security
• L2 and security vulnerabilities
– Make trunk native VLAN different from access VLAN’s
– Force non-trunk (user) ports to OFF as far as trunking
– Don’t use VLAN 1 for anything (neither user ports nor
device management)
• Port security
– Can prevent MAC flooding attacks and snooping
– ARP attacks: still have to watch for ARP spoofing,
6500 VACL’s can help
• 802.1x
– Consider user logins, can work with WEP, WPA
– Driver support issues?!

Slide 31

Security
• All-in-one boxes (firewall, router, switch) are attractive
for reducing device count
• But: crack that one box and you may be all the way
into the network!
– Separate OS & login with Cisco, but can reset module and try to
break in

[Figure: Traditional WAN Firewall Design versus 6500 with Firewall Blade]

Slide 32



QoS and the Campus
• Throwing bandwidth at the problem is
relatively easy in a building or campus
• This may not eliminate drops
• QoS is crucial to support AVVID
applications
• QoS is crucial where there are potential
large bandwidth applications on campus
– Video on demand
– Network backup
– SAN
• This is a separate seminar talk!

Slide 33

IP Multicast Support
• Make sure unicast routing is consistent
• Use sparse mode
– You don’t want to see Dense mode floods
• Use IGMP snooping
• Use AutoRP
– Can provide redundancy
– Good way to manage RP selection
• Know your application traffic
– We’ve been seeing some interesting behavior when there are
many multicast sources
– Consider bidirectional PIM and Source Specific Multicast (SSM)
where appropriate

Slide 34



Agenda
• Introduction and Motivation
• Campus Design Principles
• LAN Protocols – Further Comments
• Adding Wireless
• Managing the Campus Network
• High Availability
• Wrap-Up

Slide 35

Wireless Campus Design Issues


• Isolate WLAN’s outside a firewall as
untrusted networks
– Security!
• Subnetting and wireless mobility scheme
• Authentication (WLAN, network)
– Security!
• Data confidentiality across WLAN
– Security!

Slide 36



Isolating WLAN’s Outside a Firewall
• WLAN: anybody (in principle) can
access it
– Must therefore be untrusted, “outside”
network(s)
• Choices:
– Separate infrastructure cabling WAP’s to
firewall
• Costly!
– Separate isolation VLAN (building-sized?) with
all WAP’s in it, force traffic from VLAN through
firewall
• # users, scale of Spanning Tree?
– Multiple isolation VLAN’s
• Per-floor? 3-D signal propagation?
• Mobility support by vendor?

Slide 37

Subnetting and Mobility


• Mobility depends on WAP vendor
• No mobility support → all WAP’s
must be in same VLAN & subnet
– One VLAN spanning the building or
campus??
– Number of users on that one VLAN??
• Cisco proxy Mobile IP client in
WAP
– Standard approach to mobility
– Scales well
• Some WLAN “switches” provide
mobility for clients on the WAP’s
they control
– They act as crypto and mobility gateways

Slide 38



WLAN and Network Authentication
• Securing access to WAP
– WEP isn’t even close to secure, don’t rely on it
– PEAP and LEAP versions of 802.1x contending for standard status,
WPA alternatives, appear adequate for controlling WAP access
– DoS preventing association may still be possible with WPA
– Availability and interoperability issues possible in short term with
multi-vendor approaches
• Alternative/workaround: IPSec IKE as authentication
– Using IPSec/IKE as sole authentication means hackers can use
WAP, although not to access switches or network

Slide 39

WLAN Confidentiality
• Current choices:
– Eventually 802.11i will provide a standard using AES
• But don’t hold your breath! Crypto chip support needed
– WEP with frequent rekeying (Cisco TKIP) does work
– WPA is current Wi-Fi scheme, uses TKIP plus other measures to
render adequate confidentiality
– IPSec
• Advantage to IPSec
– IT team may already support it for mobile, remote workers
– One method of access for user, one type of client, no matter where
– Simplifies firewall rules connecting WLAN’s to rest of network
– But does leave WAP access somewhat open
– Hence need to “harden” switches on path from WAP to IPSec
concentrator

Slide 40



Other WLAN Security
• Each PC on WLAN is in principle exposed
as if connected directly to Internet
– 802.11 does allow for PC-PC communications
without a WAP, unless you disable this
• Need personal firewall software on PC!!!
– Need safeguards to ensure personal firewall in
use when user on WLAN
– Virus scanning?

Slide 41

Agenda
• Introduction and Motivation
• Campus Design Principles
• LAN Protocols – Further Comments
• Adding Wireless
• Managing the Campus Network
• High Availability
• Wrap-Up

Slide 42



Design for Manageability
• Keep VLAN’s small, known root switch
• Tie VLAN’s and addressing to location
• Use templates and cookie-cutter approach
for absolute configuration consistency
• Do configuration QA using another means
– show commands versus text comparison
• Do QA testing
– This often runs into deployment deadlines
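A cookie-cutter template plus a text comparison can be sketched with the Python standard library alone. The template lines, hostname, VLAN number, and addresses below are illustrative placeholders, not a recommended configuration.

```python
import difflib
from string import Template

# Hypothetical per-IDF template; real templates would be much longer.
IDF_TEMPLATE = Template(
    "hostname $name\n"
    "interface Vlan$vlan\n"
    " description $name management\n"
    " ip address $ip 255.255.255.0\n"
)


def render(name, vlan, ip):
    """Fill the template for one switch."""
    return IDF_TEMPLATE.substitute(name=name, vlan=vlan, ip=ip)


def config_drift(expected, actual):
    """Lines that differ between the rendered template and the device text."""
    diff = difflib.ndiff(expected.splitlines(), actual.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    expected = render("idf-3a", 30, "10.1.30.2")
    actual = expected.replace("255.255.255.0", "255.255.0.0")  # simulated typo
    for line in config_drift(expected, actual):
        print(line)
```

Text comparison only catches what was typed; checking the same settings through show commands gives the second, independent view the slide asks for.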

Slide 43

Configuring for Manageability


• See
http://www.netcraftsmen.net/welcher/papers/snmptemplate.html
• See
http://www.netcraftsmen.net/welcher/papers/index.htm for other articles about
manageability

Slide 44



Pay Attention to…
• Boot up testing
– Can turn on full testing
– Affects reboot and possibly failover time
• Syslog messages to CiscoWorks server
– But not to console
• Send SNMP traps to HP Openview or CW
DFM
• NTP
• CDP
• Manage the important ports!

Slide 45

AUX VLAN for IP Phones


• I see this as a management issue
• Separate VLAN and addressing for phones
can make some tasks easier
– Passing selected DHCP options
– Implementing QoS
– Troubleshooting

Slide 46



Network Management
• Cisco recommends:
– Do something simple, and do it well.
– Reduce staff overload due to excessive data polling,
collection, tools, and manual analysis.
• Find out what is working well on the network
and leave it alone.
• Concentrate on what is not working.

Slide 47

Network Management Toolset


• HP Openview as an NMS
– Primarily for Fault Management
– Real-time graphs
• CiscoWorks
– Resource Manager Essentials for Configuration, syslog,
inventory, and software manager
– Campus Manager for VLAN and user management
– Device Fault Manager for first-line performance management
• Low-Cost Performance Management
– Cricket (freeware: Linux/UNIX or NT*)
– RTG (Linux/UNIX freeware), plus web forms & CGI scripting
– SolarWinds Orion
– What’s Up Gold

Slide 48



Configuration Essentials
• Passwords (telnet, enable)
• Permit filter for telnet or SSH access
– Use SSH to prevent exposure of passwords?
• Use banner messages that do not give
info away unnecessarily
• Consider TACACS+ or Radius
– Who accessed the device, when
– Audit trail
– Privileges
– Can use Cisco ACS on Windows

Slide 49

Secure Management Traffic


• SNMP: how to secure it?
– SNMPv3??
– Separate VLAN for management
• One alternative: out-of-band management
– How cost-effective is this?
– Separate LAN to devices for SNMP and telnet
access: costly
– Modems on remote AUX ports and/or terminal server
(+ modem?) connected to consoles: good back door
if lose network connectivity

Slide 50



Agenda
• Introduction and Motivation
• Campus Design Principles
• LAN Protocols – Further Comments
• Adding Wireless
• Managing the Campus Network
• High Availability
• Wrap-Up

Slide 51

Keys to High Availability (HA)


• Redundancy
– Use first where needed most
– One bulletproof chassis versus two chassis?
• HA Costs
– Redundancy
– Spare parts onsite
– Tools & test gear
– Skills
– Training
– Staff head-count

Slide 52



Achieving High Availability
• MTBF & MTTR are both factors
• Increasing MTBF
– Redundancy
– Fast failover & protocols
• But test them
– Emergency Power Off button guard cover
– Good security practices!
• Lower MTTR
– Tools
– Techniques
– Staff skills, training
– Solid simple design
– Good current maps!
– Port descriptions!
– Spare parts
– Building, room access (key versus guard travel time)
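The way MTBF and MTTR combine is the standard steady-state formula, availability = MTBF / (MTBF + MTTR). A small worked sketch follows; the MTBF and MTTR figures are made up, and the redundant-pair formula assumes independent failures and perfect failover.

```python
HOURS_PER_YEAR = 8760


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_hours_per_year(avail):
    """Expected yearly downtime at a given availability."""
    return (1 - avail) * HOURS_PER_YEAR


def redundant_pair(avail):
    """Two units in parallel; assumes independent failures, perfect failover."""
    return 1 - (1 - avail) ** 2


if __name__ == "__main__":
    for mttr in (4, 1):  # illustrative repair times, in hours
        a = availability(10_000, mttr)
        print(mttr, round(a, 6), round(downtime_hours_per_year(a), 2))
    print(redundant_pair(0.99))
```

With these sample numbers, cutting MTTR from 4 hours to 1 hour cuts expected downtime from about 3.5 hours to under 1 hour a year, with no hardware change at all. That is the case for maps, port descriptions, and spare parts.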

Slide 53

Human Procedures
• Avoid being the cause of an outage
– Configuration change control
– Image change control and procedures
– Testing prior to changes
– Operations control (tight but not too inflexible)
• Noticing failures which do not cause outages
• Understanding what’s going on
– Training
– Time to observe traffic flows and protocols
• Application profiles
– Knowing what matters
– Knowing who to contact

Slide 54



Diversity
• There are many aspects to diversity:
– Facilities
• Cabling
• Power
• Heating/cooling
• Water (pipes, pumps, drainage…)
• Rack location (fire sprinkler heads?)
– Geographic diversity
– WAN
• Dual links
• Dual carriers (but is it really diverse?)
• Diverse technologies

Slide 55

Managed HA
• Improving Operations impact on HA isn’t as
easy as it looks
– High costs
– May require culture and job description changes
• Manage outage information
– Outage frequency, severity, MTTR
– Outage cause reports
• Net management
– Training and procedures
– Event management procedures, rulesets
– Performance and Capacity management
• Service Level management
– Target SLA’s
– Continual improvement
– The right incentives
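Managing outage information can start very small. The sketch below assumes outages are logged as simple (cause, minutes to repair) records; the record format and the sample entries are hypothetical.

```python
from collections import Counter


def outage_report(outages, period_days):
    """Summarize a list of (cause, minutes_to_repair) records."""
    count = len(outages)
    total_minutes = sum(minutes for _cause, minutes in outages)
    return {
        "outages_per_30_days": count * 30 / period_days,
        "mttr_minutes": total_minutes / count if count else 0.0,
        "by_cause": Counter(cause for cause, _minutes in outages).most_common(),
    }


if __name__ == "__main__":
    log = [  # made-up sample data
        ("config change", 30),
        ("power", 120),
        ("config change", 60),
    ]
    print(outage_report(log, period_days=90))
```

Even a crude cause breakdown like this shows where to spend effort first, for example on change control if configuration changes lead the list.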

Slide 56



Agenda
• Introduction and Motivation
• Campus Design Principles
• LAN Protocols – Further Comments
• Adding Wireless
• Managing the Campus Network
• High Availability
• Wrap-Up

Slide 57

See Also
• Gigabit Campus Network Design— Principles and
Architecture
http://www.cisco.com/en/US/customer/netsol/ns110/ns146/ns147/ns17/networking_solutions_white_paper09186a00800a3e16.shtml
• Gigabit Campus Design Configuration and Recovery
Analysis
http://www.cisco.com/en/US/customer/netsol/ns110/ns146/ns147/ns17/networking_solutions_white_paper09186a00800a3e0b.shtml
• Best Practices for Catalyst 4000, 5000, and 6000
Series Switch Configuration and Management
http://www.cisco.com/en/US/customer/products/hw/switches/ps663/products_tech_note09186a0080094713.shtml

Slide 58



Summary
• Keep Campus Designs simple and
redundant
• Use a cookie-cutter approach and
configuration templates
• Use L3 to keep STP domains very small and
limit the scope of any outages
• Management practices and procedures are
an important part of achieving High Availability

Disclaimer: this presentation touches on most of the high-level issues, but it
definitely does not cover all the details of campus design and IPT planning.

Slide 59

THANK YOU !

Slide 60



A Word From Us …

• We can provide
– Network design review: how to make what you have work better
– Periodic strategic advice: what’s the next step for your network or staff
– Network management tools & procedures advice: what’s right for you
– Implementation guidance (your staff does the details) or full
implementation
• We do
– Small- and Large-Scale Routing and Switching (design, health check,
etc.)
– IPsec VPN and V3PN (design and implementation)
– QoS (strategy, design and implementation)
– IP Telephony (preparedness survey, design, and implementation)
– Call Manager deployment
– Security
– Network Management (design, installation, tuning, tech transfer, etc.)

Slide 61

Cisco Certifications

Chesapeake Netcraftsmen
is certified by Cisco in:
• IP Telephony
• Network Management
• Wireless
• Security
• (Routing and Switching)

Slide 62
