
Towards Massively Scalable Ethernet:

Technologies and Standards


BRKSPG-2206

Samer Salam
Principal Engineer, Cisco

“Flat != Easy”

Norman Finn
Cisco Fellow and IEEE 802.1 Veteran
BRKSPG-2206 © 2014 Cisco and/or its affiliates. All rights reserved. Cisco Public 4
Scaling Layer-2 Networks to Millions of End-Points?

• Addressing
– Scalability, Mobility, Lookup
• Optimal Forwarding
– Routing vs. Bridging, Full Network Bandwidth Use
• Efficient Interconnection of Ethernet Networks

Addressing Aspects
“How to Avoid Keeping a Host-Route for Every Host on Every Network
Element While Maintaining Mobility and Ease of Use?”
Addressing Aspects

Requirement: Approaches
• Efficient Addressing
– Hierarchical addressing schemes
– Location-dependent addressing
– Control Plane Learning (where feasible)
• Host Mobility
– Location-independent addressing
• Reduce/Avoid Unicast Flooding
– Control Plane Learning (where feasible)
• Reduce/Avoid Broadcasts
– Broadcast “offload” using proxies/servers

Addresses – Location and Identity
• “Identity” addresses (“who”)
– MAC addresses typically represent an Identity
– The manufacturer’s MAC address issued to a physical Ethernet interface is a “who” address; it identifies a station regardless of what network, or where in that network, the station is attached. Virtual Machines (VMs) typically re-use the server-assigned MAC addresses

• “Location” addresses (“where”)
– IP addresses typically represent a Location (but also carry Identity)
– Several approaches (e.g. Cisco’s FabricPath, PortLand, MOOSE) assign to a device (or even a VM) a MAC address (or IP address) that carries some geographical information.
• Typically “locally administered MAC addresses”*
• 46 bits can carry a network ID, a subnetwork ID, a switch ID, a port ID, and/or a host ID on that port.
– Switches can use mask-and-match lookups instead of 48-bit “host-route” lookups to forward L2 packets.

* Locally administered MAC addresses: low-order two bits of the first byte are “10”; globally administered manufacturers’ addresses: those bits are “00”.
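
The bit test from the footnote and the mask-and-match lookup can be sketched in a few lines of Python. This is only an illustration: the helper names (`parse_mac`, `is_locally_administered`, `is_group_address`, `matches_prefix`) are invented for this example and do not come from any product or standard.

```python
def parse_mac(text: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' into a 48-bit integer."""
    octets = text.split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}")
    return int.from_bytes(bytes(int(o, 16) for o in octets), "big")


def is_locally_administered(mac: int) -> bool:
    """True if the U/L bit (second-lowest bit of the first byte) is set."""
    return bool((mac >> 40) & 0x02)


def is_group_address(mac: int) -> bool:
    """True if the I/G bit (lowest bit of the first byte) is set."""
    return bool((mac >> 40) & 0x01)


def matches_prefix(mac: int, prefix: int, prefix_len: int) -> bool:
    """Mask-and-match: compare only the top prefix_len bits of a 48-bit MAC."""
    if not 0 <= prefix_len <= 48:
        raise ValueError("prefix_len must be between 0 and 48")
    mask = ((1 << prefix_len) - 1) << (48 - prefix_len)
    return (mac & mask) == (prefix & mask)
```

With a 24-bit mask, one forwarding entry covers every host behind the same location prefix, which is the point of the mask-and-match lookup.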
MAC addresses: “Identity” vs. “Location”
• “Identity addresses”
– Switches need to learn destination hosts’ MAC addresses
– If a host moves, only the switches (rather than the hosts) need to update their forwarding tables
• “Location addresses”
– Reduce the size of the (L2) forwarding table
– Hosts change addresses when they move: requires notification of every host.
• Approaches to combine the two worlds (i.e. namespaces)
– “Map ‘n Encap”: [FabricPath], [TRILL], [OTV], [LISP], [802.1Qbp]
– Translate: [PortLand], [MOOSE]

Identity and Location Addresses:
“Map ‘n Encap” Approaches
• Identity Addresses: Kept to the Edge
– Endpoint Identifier (EID) [LISP]
– MAC-Address [OTV], [TRILL], [PBB-EVPN]

• Location Addresses: Topologically aggregatable; can change while Identity stays fixed
– Routing Locator (RLOC) [LISP]
– Overlay Interface Address [OTV]
– RBridge-ID [TRILL]
– B-MAC [PBB-EVPN], [802.1Qbp]

• Mapping Service
– Maps Identity Addresses to Location Addresses (distributed, e.g. routing-protocol based, or centralized/server-based; think “DNS”)

[Figure: Identity Addresses at the edge, Location Addresses in the core, and a Mapping Service that maps Identity Addresses to Location Addresses]

802.1Q Data Planes Recap…

[Figure: evolution of the 802.1 data plane with growing service instance scale, from “flat” connectivity to service instances, a service instance hierarchy and an address hierarchy]

Frame fields, outermost first:
• Ethernet: DA, SA, Ethertype, Payload
• 802.1Q VLAN (standard approved 1998): DA, SA, Q-TAG (C-VID), Ethertype, Payload
• 802.1ad Provider Bridges (standard approved 2005): DA, SA, S-TAG (S-VID), C-TAG (C-VID), Ethertype, Payload
• 802.1ah Provider Backbone Bridges (standard approved 2008): B-DA, B-SA, B-TAG (B-VID), I-TAG (I-SID), DA, SA, S-TAG (S-VID), C-TAG (C-VID), Ethertype, Payload
Combining “location” and “identity”
Example: Cisco FabricPath Frame Format

16-Byte MAC-in-MAC Header

• Switch ID – Unique number identifying each FabricPath switch
• Sub-Switch ID – Identifies devices/hosts connected via vPC+
• Port ID – Identifies the destination or source interface
• Ftag (Forwarding tag) – Unique number identifying topology and/or multi-destination distribution tree
• TTL – Decremented at each switch hop to prevent frames looping infinitely

Classical Ethernet Frame (original CE frame): DMAC | SMAC | 802.1Q | Etype | Payload | CRC
FabricPath Frame: Outer DA (48) | Outer SA (48) | FP Tag (32) | DMAC | SMAC | 802.1Q | Etype | Payload | CRC (new)

Outer DA / Outer SA (48 bits): Endnode ID (5:0), 6 bits | U/L, 1 bit | I/G, 1 bit | Endnode ID (7:6), 2 bits | RSVD, 1 bit | OOO/DL, 1 bit | Switch ID, 12 bits | Sub-Switch ID, 8 bits | Port ID, 16 bits
FP Tag (32 bits): Etype, 16 bits | Ftag, 10 bits | TTL, 6 bits
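
The bit widths of the outer address can be exercised with a small packing helper. This is a sketch that assumes the fields appear most-significant first in the order the slide lists them; the field names and the functions `pack_fp_address` and `unpack_fp_address` are hypothetical and not part of any Cisco API.

```python
# Field widths of the 48-bit outer address, most significant field first
# (assumed ordering, taken from the bit widths listed on the slide).
FP_ADDRESS_FIELDS = (
    ("endnode_id_low", 6),   # Endnode ID bits 5:0
    ("ul", 1),               # U/L bit
    ("ig", 1),               # I/G bit
    ("endnode_id_high", 2),  # Endnode ID bits 7:6
    ("rsvd", 1),
    ("ooo_dl", 1),
    ("switch_id", 12),
    ("sub_switch_id", 8),
    ("port_id", 16),
)


def pack_fp_address(**values: int) -> bytes:
    """Pack named fields into a 6-byte address; missing fields default to 0."""
    unknown = set(values) - {name for name, _ in FP_ADDRESS_FIELDS}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    word = 0
    for name, width in FP_ADDRESS_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(6, "big")


def unpack_fp_address(address: bytes) -> dict:
    """Split a 6-byte address back into its named fields."""
    if len(address) != 6:
        raise ValueError("address must be 6 bytes")
    word = int.from_bytes(address, "big")
    fields = {}
    shift = 48
    for name, width in FP_ADDRESS_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields
```

For example, `pack_fp_address(ul=1, switch_id=100, port_id=1)` yields a locally administered address whose 12-bit Switch ID field carries 100.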

Combining “location” and “identity”
Example: FabricPath
Network-based, L2-in-L2, C-MAC Data Plane Learning, Switch-ID Control Plane distribution (ISIS)

[Figure: Host MAC1 attaches to the ingress Edge Device S100, host MAC2 to the egress Edge Device S200. The frame MAC1 → MAC2 crosses the fabric encapsulated with outer addresses S100 → S200.]

Mapping Service at S100 (Host: Location)
• MAC1: Int Eth 1
• MAC2: S200

Mapping Service at S200 (Host: Location)
• MAC1: S100
• MAC2: Int Eth 2

1. Layer 2 lookup on the destination MAC. MAC2 is reachable through S200.
2. The Edge Device encapsulates the frame.
3. The transport delivers the packet to the Edge Device on the other site.
4. The Edge Device receives and de-capsulates the packet.
5. Layer 2 lookup on the original frame. MAC2 is a local MAC.
6. The frame is delivered to the destination.
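
The six steps can be traced with a toy model of two edge devices. This is a minimal sketch of generic map-and-encap behavior, not of the FabricPath implementation; the classes `Frame`, `Encapsulated` and `EdgeDevice` and the port names are assumptions made for illustration, and unknown destinations simply raise `KeyError` instead of being flooded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    src: str
    dst: str
    payload: bytes


@dataclass(frozen=True)
class Encapsulated:
    outer_src: str   # location address of the ingress edge device
    outer_dst: str   # location address of the egress edge device
    inner: Frame     # original frame, carried unchanged


class EdgeDevice:
    def __init__(self, location: str, mapping: dict):
        self.location = location
        # identity address -> local port or remote location address
        self.mapping = dict(mapping)

    def ingress(self, frame: Frame) -> Encapsulated:
        remote = self.mapping[frame.dst]                    # step 1: lookup
        return Encapsulated(self.location, remote, frame)   # step 2: encap

    def egress(self, packet: Encapsulated):
        if packet.outer_dst != self.location:
            raise ValueError("packet is not addressed to this edge device")
        frame = packet.inner                                # step 4: decap
        port = self.mapping[frame.dst]                      # step 5: lookup
        return port, frame                                  # step 6: deliver
```

Step 3, the transport between the two edge devices, is simply handing the `Encapsulated` object from `ingress` on one device to `egress` on the other.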

Combining “location” and “identity”
Example: PBB-EVPN
Network-based, L2-in-L2-over-MPLS, B-MAC Control Plane Distribution (BGP)

[Figure: Host MAC1 attaches to PE1 (B-MAC1), host MAC2 to PE2 (B-MAC2). Each PE performs EVPN and MAC-in-MAC. The frame MAC1 → MAC2 crosses the core encapsulated with B-MAC1 → B-MAC2 and MPLS labels.]

Mapping Service, distributed via BGP and shown identically at PE1 and PE2 (Host: Location)
• B-MAC1: IP-PE1
• B-MAC2: IP-PE2

Learned table for attachment circuit at PE1
• MAC1: Int Eth 1
• MAC2: B-MAC2

Learned table for attachment circuit at PE2
• MAC1: B-MAC1
• MAC2: Int Eth 2

Combining “location” and “identity”
Example: OTV
Network-based, L2-in-L3, Distributed (ISIS based) Control Plane Learning

[Figure: Host MAC1 attaches to the ingress Edge Device (11.0.0.1), host MAC2 to the egress Edge Device (12.0.0.2). The frame MAC1 → MAC2 crosses the transport encapsulated with outer IP addresses 11.0.0.1 → 12.0.0.2. The two mapping tables are exchanged via ISIS.]

Mapping Service at the ingress Edge Device (Host: Location)
• MAC1: Int Eth 1
• MAC2: 12.0.0.2

Mapping Service at the egress Edge Device (Host: Location)
• MAC1: 11.0.0.1
• MAC2: Int Eth 2

1. Layer 2 lookup on the destination MAC. MAC2 is reachable through IP 12.0.0.2.
2. The Edge Device encapsulates the frame.
3. The transport delivers the packet to the Edge Device on the other site.
4. The Edge Device receives and de-capsulates the packet.
5. Layer 2 lookup on the original frame. MAC2 is a local MAC.
6. The frame is delivered to the destination.

Combining “location” and “identity”
Example: LISP
Network-based, L3-in-L3, Distributed Control Plane Learning (MR/MS/ALT)

[Figure: An end system with EID 2.0.0.2 sits behind the Ingress Tunnel Router (ITR, locator 11.0.0.1); an end system with EID 3.0.0.3 sits behind the Egress Tunnel Router (ETR, locator 12.0.0.2). The packet 2.0.0.2 → 3.0.0.3 crosses the core encapsulated with outer addresses 11.0.0.1 → 12.0.0.2. The ITR keeps a Map Cache and resolves mappings through the Map Resolver (MR), the Alternate Topology (ALT) and the Map Server (MS). The addresses 2.0.0.1 and 3.0.0.1 are also labeled next to the routers.]

Mapping Service (EID-Prefix: Locator(s))
• 2.0.0.0/24: 11.0.0.1
• 3.0.0.0/24: 12.0.0.2
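
The EID-prefix table lends itself to a longest-prefix-match lookup, sketched here with Python's standard `ipaddress` module. The table contents come from the slide; the function name `lookup_rlocs` and the dictionary layout are assumptions for illustration and do not reflect how a LISP Map Cache is actually implemented.

```python
import ipaddress

# EID-prefix -> list of locators (RLOCs), as shown in the mapping table.
MAPPINGS = {
    ipaddress.ip_network("2.0.0.0/24"): ["11.0.0.1"],
    ipaddress.ip_network("3.0.0.0/24"): ["12.0.0.2"],
}


def lookup_rlocs(eid: str, mappings=MAPPINGS) -> list:
    """Return the locators of the longest EID-prefix covering eid, or []."""
    address = ipaddress.ip_address(eid)
    candidates = [prefix for prefix in mappings if address in prefix]
    if not candidates:
        return []  # a real ITR would query the mapping system here
    best = max(candidates, key=lambda prefix: prefix.prefixlen)
    return mappings[best]
```

Because locators are attached to prefixes rather than to hosts, the table grows with the number of sites, not with the number of end systems.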

Example: LISP/ OTV/ VXLAN
Observation
• LISP applies to Layer-2 and Layer-3
– For L2, End-System Identifier (EID) = MAC-Address
• Layer-2 LISP/ OTV/ VXLAN share the same encapsulation format

Layer 2 (L2) LISP Encapsulation Format
– Outer source address: Source-site Overlay Interface Address
– Outer destination address: Destination-site Overlay Interface (or Multicast) Address
– Flags | 0 0 0 0 |I|0 0 0|, followed by Not Used/Reserved bits
– Instance ID | Not Used
– For VXLAN, the Instance ID is the “VXLAN Network Identifier (VNI)”

• See also
– http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan
– http://tools.ietf.org/html/draft-smith-lisp-layer2
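
The shared 8-byte header can be sketched as follows, assuming the VXLAN layout: one flags byte with only the I bit set, 24 reserved bits, a 24-bit Instance ID/VNI and 8 more reserved bits. The function names are invented for this example.

```python
import struct

I_FLAG = 0x08  # "Instance ID / VNI present" bit in the flags byte


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte header: flags, 24 reserved bits, 24-bit VNI, 8 reserved bits."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", I_FLAG << 24, vni << 8)


def unpack_vxlan_header(header: bytes) -> int:
    """Return the VNI, or raise if the I flag is not set."""
    if len(header) != 8:
        raise ValueError("header must be 8 bytes")
    flags_word, vni_word = struct.unpack("!II", header)
    if not (flags_word >> 24) & I_FLAG:
        raise ValueError("I flag not set; no valid VNI present")
    return vni_word >> 8
```

A 24-bit identifier allows about 16 million segments, compared with the 4k limit of a 12-bit VLAN ID.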

Combining “Location” and “Identity”
In case you don’t like “Map ‘n Encap”… How about Translation?

• Approach: Assume a well-defined topology and dynamically allocate hierarchical MAC addresses representing the location
– Example: [switch ID].[port ID].[host ID]
– Route frames according to hierarchy
• Topology-defined routing [PortLand] or run a routing protocol [MOOSE]
• Perform source MAC address rewriting (“MAC-NAT”) on ingress
– Structured MAC-Addresses can be aggregated (“MAC-subnet” masks)
– Hosts will think everyone but themselves has a structured MAC address
– Typical NAT considerations apply (e.g. how to deal with MAC-addresses in payload etc.)

[Figure: two edge switches perform MAC-NAT on ingress. One covers “02:22:22/24” with hosts “02:22:22:00:00:01”, “02:22:22:00:00:02” and “02:22:22:00:00:03”; a host with the real address 00:1E:13:9B:8E:10 is rewritten to “02:22:22:00:00:04”. The other covers “02:33:33/24” with hosts “02:33:33:00:00:01”, “02:33:33:00:00:02” and “02:33:33:00:00:03”.]
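
The ingress rewrite can be modeled as a small translation table. This is a sketch only: the class `MacNat`, its sequential host-ID allocation and the 24-bit switch prefix are assumptions made for illustration and are not the allocation schemes defined by PortLand or MOOSE.

```python
class MacNat:
    """Toy MAC-NAT: maps real host MACs to structured MACs under a switch prefix."""

    def __init__(self, switch_prefix: str):
        self.prefix = switch_prefix.lower()   # e.g. "02:22:22" (24-bit prefix)
        self.real_to_structured = {}
        self.structured_to_real = {}

    def rewrite_source(self, real_mac: str) -> str:
        """Return the structured MAC for a host, allocating one on first use."""
        real_mac = real_mac.lower()
        if real_mac not in self.real_to_structured:
            host_id = len(self.real_to_structured) + 1
            if host_id >= (1 << 24):
                raise RuntimeError("host ID space under this prefix is exhausted")
            suffix = ":".join(f"{b:02x}" for b in host_id.to_bytes(3, "big"))
            structured = f"{self.prefix}:{suffix}"
            self.real_to_structured[real_mac] = structured
            self.structured_to_real[structured] = real_mac
        return self.real_to_structured[real_mac]

    def restore_destination(self, structured_mac: str) -> str:
        """Translate a structured destination MAC back to the host's real MAC."""
        return self.structured_to_real[structured_mac.lower()]
```

The rest of the network then only needs one aggregated entry per switch prefix, while the edge switch keeps the per-host state.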

A Note on MAC-NAT and Mobility
Location dependent addresses .. require notification of every switch when a host moves

• Concept
– If a host moves, it is allocated a new MAC address by its new switch
– Other hosts may have the old address in ARP caches
• Forward frames, IP Mobility style (new switch discovers host’s old location by querying other switches for its real MAC address)
• Gratuitous ARP, XenVM migration style
• Issue: Not all hosts accept Gratuitous ARP…

[Figure: host relocated to new switch; (1) data forwarded by care-of-switch; (2) Gratuitous ARP sent by new home switch]

A Note on MAC learning
If you believe you need MAC learning, consider learning only if you have to

• Many of the new approaches (e.g. [FabricPath], [TRILL], [802.1Qbp], ...) don’t do C-MAC learning in the core
• Several deployments still perform C-MAC learning at the edge switch
• Conversational Learning
– Each forwarding engine distinguishes between two types of MAC entry:
• Local MAC – MAC of host directly connected to forwarding engine
• Remote MAC – MAC of host connected to another forwarding engine or switch
– Forwarding engine learns remote MAC only if bidirectional conversation occurring
between local and remote MAC
– MAC learning not triggered by flood frames
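
The learning rule can be written down compactly. The sketch below assumes the rule stated later in the deck for FabricPath, namely that a remote source MAC is learned only when the destination MAC is already known as local; the class and method names are hypothetical.

```python
class ForwardingEngine:
    """Toy model of conversational MAC learning on one forwarding engine."""

    def __init__(self):
        self.local = {}    # MAC -> edge port (hosts directly connected)
        self.remote = {}   # MAC -> remote switch (learned conversationally)

    def receive_from_edge(self, src_mac: str, port: str) -> None:
        """Source learning on edge ports is unconditional."""
        self.local[src_mac] = port

    def receive_from_core(self, src_mac: str, dst_mac: str, remote_switch: str) -> bool:
        """Learn the remote source only if the destination is a known local MAC.

        Returns True if the source MAC was learned.
        """
        if dst_mac not in self.local:
            return False   # not part of a local conversation (e.g. broadcast)
        self.remote[src_mac] = remote_switch
        return True
```

A broadcast destination is never in the local table, so flooded broadcast frames do not populate the remote table.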

Conversational Learning
Learning only the MAC addresses required

Classical Ethernet (STP Domain)
• ALL MACs need to be learned on EVERY switch
• Large L2 domains and virtualization present challenges to MAC table scalability

Conversational Learning (L2 Fabric)
• Local MAC: Source-MAC learning only happens for traffic received on CE ports
• Remote MAC: Source-MACs of traffic received on core-facing ports are only learned if the Destination-MAC is already known as Local
• Example: FabricPath implementation

[Figure: in the STP domain every switch holds “xxx # of MACs”. In the L2 Fabric, with A and C in conversation, S11 holds A: 2/1 and C: S12; S12 holds C: 3/1 and A: S11; the switch attaching B holds only B: 2/1.]

Controlling Broadcast Traffic
Servers/Caches for Applications using Broadcast
• Example: ARP/ND
– Edge devices (e.g. with OTV) maintain an ARP cache, which is populated by snooping ARP replies.
– Initial ARP requests are broadcast to all sites, but subsequent ARP requests are suppressed at the Edge Device and answered locally.
• Broadcast Application (e.g. ARP) Servers proposed (e.g. VL2, PortLand, SEATTLE) for specific deployments (e.g. where IP-MAC mapping is administered globally)

[Figure: (1) First ARP request (IP A). (2) ARP reply. (3) Edge Device snoops and caches the ARP reply. (4) Subsequent ARP requests (IP A). (5) ARP reply on behalf of the remote server, sent from the Cache/Server. ARP Cache: MAC 1: IP A; MAC 2: IP B.]
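
The snoop-and-answer behavior amounts to a dictionary keyed by IP address. This is a minimal sketch with hypothetical names (`ArpProxyCache`, `snoop_reply`, `handle_request`); it ignores cache aging and ND, which a real edge device would have to handle.

```python
class ArpProxyCache:
    """Toy ARP cache at an edge device, populated by snooping ARP replies."""

    def __init__(self):
        self.cache = {}   # IP address -> MAC address

    def snoop_reply(self, ip: str, mac: str) -> None:
        """Record the IP-to-MAC binding seen in a passing ARP reply."""
        self.cache[ip] = mac

    def handle_request(self, target_ip: str):
        """Answer locally when possible, otherwise let the request be broadcast.

        Returns ("reply", mac) or ("flood", None).
        """
        mac = self.cache.get(target_ip)
        if mac is None:
            return ("flood", None)
        return ("reply", mac)
```

Only the first request for a given IP address crosses the overlay; later requests are answered from the cache.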

Optimal Forwarding
“How to Leverage the Entire Network Topology for Packet Forwarding
– and Approach Full Cross-Sectional Bandwidth?”
Towards a new Layer-2 Control Protocol
Why?

• Current Spanning Tree
– Non-optimal forwarding paths (see example)
– Parallel paths cannot be leveraged
– Operational challenges in complex topologies

• Let’s discuss 3 approaches
– IETF TRILL
– Cisco FabricPath
– IEEE Shortest Path Bridging P802.1aq

[Figure: example spanning tree topology with a Root bridge and numbered links, illustrating non-optimal forwarding paths]

Optimal Forwarding
IETF TRILL and Cisco FabricPath
TRILL
IETF Approach to Shortest Path Bridging
• TRILL
(TRansparent Interconnect of Lots of Links)
– http://www.ietf.org/html.charters/trill-charter.html
• Main areas addressed by TRILL:
– Provide Shortest Path and
Equal Cost Multi-Pathing for traffic
– Be Plug-n-Play

TRILL Basics

[Figure legend: IEEE Bridge, RBridge]

• A TRILL network consists of Routing Bridges (RBridges/RBs).
– Each RBridge is uniquely identified by a ‘nickname’ or rbridge-id (auto-created from ISIS system id)
– RBs can be connected by 802.1 LANs, or
– RBs can be connected by simple P2P links (incl. PPP – see RFC 6361)

• Architecturally, RBridges run “on top” of an 802.1 bridged network, similarly to Routers
– RBridges may be interconnected by classical 802.1Q bridges: allows for gradual migration of existing networks
– RBridges do not participate in xSTP, and drop BPDUs if they are received

TRILL
Principles of Operation
[Figure: a .1Q frame enters the TRILL network at RBridge A and leaves at RBridge E. Along the path A → C → D → E the RBridge header (A, E) stays the same while the outer MAC header is rewritten at every hop: (A, C), then (C, D), then (D, E). Legend: IEEE Bridge, RBridge.]

• Frames are encapsulated with the RBridge addresses and further encapsulated with the originating RBridge and next-hop RBridge MAC addresses
– Header fields differ from 802.1ah
– Headers are swapped hop by hop (similar to routing)
• RBridges learn what MAC addresses are on their edge ports using general dataplane learning and MAY advertise them to other RBridges
– Remote MAC-address-to-RBridge binding learning: hardware or control plane
• Unknown unicast/multicast/broadcast frames are flooded along pre-calculated distribution tree(s)
TRILL Forwarding
• RBridges use ISIS for discovery and to synchronize Link State Databases
• TRILL uses these Link State Databases to
– Compute pair-wise bidirectional paths for unicast (per node and/or per VLAN) between all RBridges
– For multicast, distribution trees are calculated rooted at (potentially) every RBridge; trees are given an rbridge-id/nickname as well
• TRILL adds to standard IS-IS
– Ships in the night with other protocols using ISIS
– TRILL Hellos
• Find out whether nodes are on a LAN or P2P link
• Designated Rbridge (DRB) Election
• Root-Bridge-IDs
– See also: RFC 6165 (Extensions to IS-IS for Layer-2 Systems)

TRILL—Ethernet Data Encapsulation
Outer Ethernet Header (link specific):
– Outer Destination MAC Address (RB2)
– Outer Source MAC Address (RB1)
– Ethertype = IEEE 802.1Q, Outer.VLAN Tag Information
• Outer-VLAN Tag Information: This is used only if two RBridges communicate across a standard 802.1Q network

TRILL Header:
– Ethertype = TRILL, V, R, M, Op-Length, Hop Count
– Egress (RB2) Nickname, Ingress (RB1) Nickname
• V: Version
• M: Multi-destination; indicates if the frame is to be delivered to a single or multiple end stations
• Op-Length: >0 if an Option field is present
• Hop Count: Similar to TTL
• RBridge Nickname: Not the MAC address of the RBridge, but a TRILL ID for the RBridge (Egress Nickname used differently if M = 1)

Inner Ethernet Header:
– Inner Destination MAC Address
– Inner Source MAC Address
– Ethertype = IEEE 0x8100, Inner.VLAN Tag Information
– Ethertype = IEEE 0x893B, Inner.VLAN second part
• Multicast tree pruning: Requires inspection of customer Destination MAC Address and customer VLAN
• In case of Fine Grain Labeling: Second VLAN tag (see draft-ietf-trill-fine-labeling)

See also: RFC 6325 and RFC 6327
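
As a worked example of the TRILL header fields, the sketch below packs the Ethertype and the following six bytes using the RFC 6325 layout: V (2 bits), R (2 bits), M (1 bit), Op-Length (5 bits), Hop Count (6 bits), then the 16-bit egress and ingress nicknames. The function names are invented for this example and options are not modeled.

```python
import struct

TRILL_ETHERTYPE = 0x22F3  # Ethertype assigned to TRILL


def pack_trill_header(egress: int, ingress: int, hop_count: int,
                      multi_dest: bool = False, version: int = 0,
                      op_length: int = 0) -> bytes:
    """Pack Ethertype + V/R/M/Op-Length/Hop Count + both nicknames (8 bytes)."""
    for name, value, width in (("egress", egress, 16), ("ingress", ingress, 16),
                               ("hop_count", hop_count, 6), ("version", version, 2),
                               ("op_length", op_length, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # Bit layout of the 16-bit word: VV RR M OOOOO HHHHHH (R bits left at zero)
    first = (version << 14) | (int(multi_dest) << 11) | (op_length << 6) | hop_count
    return struct.pack("!HHHH", TRILL_ETHERTYPE, first, egress, ingress)


def unpack_trill_header(header: bytes) -> dict:
    """Decode the 8 bytes produced by pack_trill_header."""
    ethertype, first, egress, ingress = struct.unpack("!HHHH", header)
    if ethertype != TRILL_ETHERTYPE:
        raise ValueError("not a TRILL header")
    return {
        "version": first >> 14,
        "multi_dest": bool((first >> 11) & 0x1),
        "op_length": (first >> 6) & 0x1F,
        "hop_count": first & 0x3F,
        "egress": egress,
        "ingress": ingress,
    }
```

For a multi-destination frame (M = 1) the egress nickname field identifies the distribution tree rather than a single egress RBridge, as the slide notes.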
Packet Flow — Known Unicast
At the ingress RBridge (RB1):
• Perform MAC lookup on M2 to determine the egress RBridge, RB3
• Encapsulate in TRILL header and forward to next-hop RBridge
At transit RBridges:
• Perform egress RBridge nickname lookup on RB3 to determine next-hop RBridge
At the egress RBridge (RB3):
• Decapsulate TRILL header
• Perform MAC lookup on M2 to determine egress port

[Figure: host M1 attaches to RB1 and host M2 to RB3. The TRILL network contains RB1 to RB9 (with MAC addresses such as MR1, MR3, MR5, MR6 and MR8) and two 802.1Q clouds of bridges B1 to B7.]

Outer header, changes hop-to-hop (MACs, VLAN, TTL):
– Outer MAC DA: MR8 on the first hop, MR3 on the last hop
– Outer MAC SA: MR1 on the first hop, MR5 on the last hop
– Etype = 802.1Q, Outer VLAN
– Etype = TRILL, V/M/R, TTL
TRILL nicknames and inner frame, unchanged from ingress to egress:
– Egress RB-ID: RB3; Ingress RB-ID: RB1
– Inner MAC DA: M2; Inner MAC SA: M1
– Etype = 802.1Q, Inner VLAN
– Payload

Packet Flow — Multicast/Broadcast/Unknown Unicast
At the ingress RBridge (RB1):
• Perform MAC lookup on G1
• Encapsulate in TRILL header, set M bit and tree id (egress rbridge-id), and forward to the All-RBridges multicast address
At transit RBridges:
• Perform lookup on egress rbridge-id to determine distribution tree
At egress RBridges:
• Decapsulate TRILL header
• Perform MAC lookup on G1 to determine egress ports

Note: All-RB = All-RBridges = 01-80-c2-00-00-40

[Figure: source M1 attaches to RB1; receivers of group G1 attach at several RBridges. The TRILL network contains RB1 to RB9 and two 802.1Q clouds.]

Outer header, changes hop-to-hop (MACs, VLAN, TTL):
– Outer MAC DA: All-RB-MCAST (or MR9) on the first hop, All-RB-MCAST on the last hop
– Outer MAC SA: MR1 on the first hop, MR5 on the last hop
– Etype = 802.1Q, Outer VLAN
– Etype = TRILL, V/M/R, TTL, M=1
TRILL nicknames and inner frame, unchanged from ingress to egress:
– Egress RB-ID: RB9; Ingress RB-ID: RB1
– Inner MAC DA: G1; Inner MAC SA: M1
– Etype = 802.1Q, Inner VLAN
– Payload

TRILL Benefits
• Shortest path delivery of unicast
• Layer 2 multi-pathing (ECMP) of unicast
• Optimal multicast delivery over shared trees
– Load-balancing over multiple trees.
– Per-VLAN/c-group pruning of trees via IGMP/PIM snooping.

• Fast convergence times, Minimal configuration


• Support for Shared Media and P2P links
• Loop Prevention and Mitigation (adds a TTL)
• Support for multi-homing (DRB election)
• Confines MAC Address learning to edge nodes, providing MAC address scalability similar
to IEEE 802.1ah (MAC-in-MAC)

Cisco FabricPath in a Nutshell
Similarities with TRILL
– “MAC-in-MAC”-like encapsulation, includes TTL
– ISIS based control plane (unicast and multicast)
• No MAC learning in the Fabric, forwarding based on “Switch IDs”
– ECMP for Multi-Path Load Balancing

FabricPath Additions
– Multiple Topologies
– Conversational Learning at the Edges
– Interworking with STP-based Ethernet Access Domains

[Figure: an Ethernet frame carried behind a FabricPath header across the FabricPath domain, with Ethernet on either side]

Data Plane Operation
Forwarding Decision Based on ‘FabricPath Routing Table’

• FabricPath header is imposed by ingress switch
• Only switch addresses are used to make “routing” decisions
• No MAC learning required inside the L2 Fabric
• Single MAC address lookup at the edge

[Figure: host A on Classical Ethernet port 1/1 of S11 sends to host B behind S42; spine switches S1 to S4. The frame A → B crosses the FabricPath domain with the header S11 → S42.]

Classical Ethernet MAC Address Table on S11 (MAC: IF)
• A: 1/1
• B: S42

FabricPath Routing Table on S11 (Switch: IF)
• S42: L1, L2, L3, L4

FabricPath Forwarding: Unknown Unicast

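[Figure: unknown unicast flooding across the L2 Fabric. Spine switches S1 to S4 connect to edge switches S11, S12 and S42 over links L1 to L12; hosts A, B and C attach to CE ports. The ingress edge switch encapsulates the frame; receiving edge switches decapsulate it and look up the destination MAC. A switch whose lookup is a HIT learns the source MAC; a switch whose lookup is a MISS does not learn it. MAC table entries shown: A: 1/1, C: 3/1, C: S42. Legend: FabricPath Port, CE Port.]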

FabricPath Forwarding: Known Unicast
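[Figure: known unicast forwarding across the L2 Fabric (spines S1 to S4, edge switches S11, S12 and S42, links L1 to L12; hosts A, B and C on CE ports).
• S11 MAC table: A: 1/1, C: S42. Lookup is a HIT, the frame is encapsulated.
• S11 FabricPath routing table: S42 via L1, L2, L3, L4.
• Spine FabricPath routing table: S42 via L12.
• S42 MAC table: C: 3/1, A: S11. The frame is decapsulated and the lookup is a HIT.
Legend: FabricPath Port, CE Port.]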

Multicast with FabricPath
Forwarding through distinct ‘Trees’
• Several ‘Trees’ are rooted in key locations inside the fabric
• All switches in the L2 Fabric share the same view for each ‘Tree’
• Multicast traffic is load-balanced across these ‘Trees’
• The ingress switch for FabricPath decides which “tree” is to be used and adds the tree number to the header

[Figure: L2 Fabric with four switches acting as Root for Tree #1, Tree #2, Tree #3 and Tree #4; hosts A and C at the edge]

Migration to FabricPath
vPC+ Helps Integrating CE Devices
• Allows inserting non-FabricPath capable devices in the network:
– With Active/Active redundancy
– Without relying on STP (port channels)
• Provides active/active FHRP

[Figure: a Layer 3 network on top; two switches doing L3 routing with active/active FHRP; the FabricPath L2 Fabric below them; non-FabricPath capable Classical Ethernet devices attached via vPC+]

Multiple Topologies

• Topology: A group of links in the Fabric. By default, all the links are part of topology 0.
– Other topologies can be created by assigning a subset of the links to them.
– A link can belong to several topologies
– A VLAN is mapped to a unique topology
• Topologies can be used for traffic engineering, security, etc.

[Figure: L2 Fabric with links L1 to L12, marked by membership in Topology 0, Topology 1 and Topology 2]

STP Interaction

• L2 Fabric is presented as a single bridge to all connected CE devices
• L2 Fabric should be the root for all connected STP domains.
– CE ports will be put into blocking state when a ‘better BPDU’ is received
• No BPDUs are forwarded across the fabric

[Figure: the FabricPath (L2 IS-IS) L2 Fabric above Classical Ethernet (STP); STP Domain 1 and STP Domain 2 attach to CE ports, and BPDUs are stopped (✖) from crossing the fabric. Legend: FabricPath Port, CE Port.]

Optimal Forwarding
IEEE Shortest Path Bridging
802.1aq — Shortest Path Bridging
Motivation

• 802.1 traditional bridging based on RSTP/MSTP
– Non-optimal forwarding
– Manual configuration needed for disjoint trees and mapping of VLANs to these trees
• Approach: 802.1aq Shortest Path Bridging
– Optimal unicast and multicast forwarding
– Automatic SPT management controlled by IS-IS
• Same motivations as TRILL

802.1aq—Shortest Path Tree per Bridge
Original concept
Each bridge is the “root” of a separate shortest path tree instance
Bridge G is the root of the green tree
Bridge E is the root of the blue tree
Both trees are active AND symmetric at all times
Needed in Ethernet to have congruent multicast and unicast

[Figure: the same seven-bridge topology (A to G) drawn three times, each time with a different bridge as Root and the resulting Blocked Ports marked]

IEEE 802.1aq Variants
• Shortest Path Bridging MAC (SPBM) targets PBB networks where all addresses
are managed
• Shortest Path Bridging VID (SPBV) is applicable in customer, enterprise or
storage area networks
[Figure: SPB splits into SPBV (MAC learning in data plane) and SPBM (MAC learning in control plane), positioned across three network types]
• Enterprise Network: Plug & Play; Easy to operate; Unknown addresses
• Access Network: Reliability; Bandwidth efficiency; Unknown or managed addresses
• Metro Core Network: Reliability; Auto-discovery; Load sharing; Managed addresses

P802.1aq Shortest Path Bridging
Provider Bridging and Provider Backbone Bridging

Shortest Path Bridging VID Mode (SPBV)
– VLAN is represented by a set of Shortest Path VIDs (SP-VID)
– SP-VID represents a unidirectional tree sourced at a given bridge
– Shared VLAN Learning between SP-VIDs *
– ISIS takes care of the SPVID assignments depending on VID and the SPT that the frame needs to be transmitted on
Shortest Path Bridging MAC Mode (SPBM)
– 802.1ah source B-MAC+B-VID for tree identification (assumes 802.1ah)
– Control Plane Learning

* “Private VLANs” leverage the same concept: one Service Instance (VLAN) leverages two VIDs (Upstream and Downstream)

IS-IS Control Plane
• Neighbor and Topology discovery
– Shortest Path Tree computation (unicast and multicast)
– Each bridge builds view of the physical topology of the SPT Region
– Control plane learning instead of dataplane learning
• Service discovery
– I-SID registrations are included into a new TLV
• Maintenance of SPTs and CIST
• SPTs can be set according to the discovered I-SID membership information
– IEEE 802.1ak (MRP) is not needed
• VID allocation to VLANs

Path Congruency and Symmetry

[Figure: the same six-node topology with link costs drawn twice, once highlighting the unicast path and once the multicast tree]

• Necessary if MAC learning is in the data plane
• Not necessary if MAC learning is in the control plane
• For both: SPB and SPBB
• Necessary for the proper operation of 802.1ag E-OAM and
beneficial for clock distribution
Loop Prevention and Mitigation
• With a link-state control protocol, inconsistent views of the network topology at different nodes may cause transient loops
• Loop prevention
– Tree Agreement Protocol (TAP) (see Draft 802.1aq, version 4.4, clause 13)
– Handshake mechanism between neighbors
– Extension to MSTP’s handshake
• Loop mitigation
– Ingress Checking (e.g. RPFC)
– Frames not arriving on the shortest path from the Source Bridge are discarded: No TTL needed
– Makes the tree directed: Good for loop prevention in most cases
– Transient loops may appear
– Severe problem for multicast/broadcast traffic
– A chance of network melt-down remains if one does not care
– Ingress filtering has to be modified

Neighbor Handshake Mechanism
(Draft 4.4 802.1aq, Clause 13) Proposal

[Figure: two neighboring bridges exchanging Agreement messages]

• Ensure that bridges with different views on the network topology do not exchange frames
– Local agreements: two-way agreement, three-way handshake
• When a Topology Change occurs, a bridge determines the set of multicast trees where the distance to the root has changed
– Remove the state for those trees and advertise a digest of the LSP database (CRC, cryptographic hash function) of the new topology database to the peers of the bridge
– On receiving a matching digest from a peer, a bridge can be sure that the neighbor has done the same and that the updated multicast state can be installed on the interface facing the peer

Optimal Forwarding
Multi-Path Forwarding
Multi-Path Load Balancing
Approaches

• “Classical” ECMP
– [OTV], [TRILL]
• Equal Cost Trees
– [802.1aq]
• Ethernet ECMP
– [802.1Qbp]

Classic Equal Cost Multi Path (ECMP)
• Pre-Requisite
– Link-Layer Routing Protocol which can compute two or more equal cost shortest paths
between two nodes
• ECMP distributes the traffic per hop among the equal-cost paths
– Packet-based (in round-robin fashion): Can cause out-of-order packets
– Flow-based using hashing e.g. source and destination addresses (and potentially
additional header fields):
Effectiveness depends on the number and distribution of flows (according to the hash
function)
• ECMP is leveraged by [TRILL], [FabricPath]
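
Flow-based ECMP can be illustrated with a per-hop hash over header fields. The sketch below uses CRC32 from Python's standard `zlib` module purely as a stand-in hash; real switches use their own, usually hardware-specific, hash functions, and the function name `pick_next_hop` is hypothetical.

```python
import zlib


def pick_next_hop(next_hops: list, src: bytes, dst: bytes, extra: bytes = b"") -> str:
    """Choose one of the equal-cost next hops based on a hash of the flow.

    All packets of one flow hash to the same value, so they follow the same
    path and stay in order; different flows may land on different paths.
    """
    if not next_hops:
        raise ValueError("no equal-cost next hops available")
    flow_key = src + dst + extra            # e.g. MACs plus further header fields
    return next_hops[zlib.crc32(flow_key) % len(next_hops)]
```

How evenly the links are used depends on the number of flows and on how well the hash spreads them, which is the caveat named in the bullet above.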

Equal Cost Trees (ECT)
Optimizations for 802.1aq Shortest Path Bridging

• 802.1aq SPB requires computing a Shortest Path Tree (SPT) per node
– “Fast SPF” to reduce the number of SPT
• Equal Cost Multipath
– The 802.1aq Equal Cost Tree (ECT) algorithm allows computing 16 different trees per node
– Deterministic masking operation by the source node (root of the SPF) to place traffic onto any of the 16 trees

802.1aq Equal Cost Trees (ECT)

[Figure: TRILL ECMP example compared with 802.1aq ECT example]

• ECT provides up to 16 symmetrical paths (by XORing the Node-ID with one of 16 predefined masks)
• Per-Hop (TRILL) vs. Global (ECT) Traffic Hashing
– Different from TRILL, ECT can only identify a maximum of 16 different paths between any source and destination pair
– Can lead to situations where certain links are not utilized at all (depending on the hash function used)

802.1Qbp ECMP
Motivation and Requirements
• Support of per-hop ECMP
• Support of TTL for loop mitigation
• Support of Flow-id
– To avoid deep packet inspection in the core
– To provide proactive service-level monitoring
• Flexible n-tuple hash algorithm for flow-identification
– Any edge node can choose any set of n-tuples and any hash algorithm to derive a flow id
• Support proactive service-level monitoring
– For a given flow-id, the path for that flow through the network must be deterministic

Recap:
Existing PBB/802.1ah Frame Format
Frame fields, outermost first:
– B-DA
– B-SA
– B-TAG: B-VID
– I-TAG: EtherType = I-Tag, PCP (3b), DE, UCA, rsv1, rsv2 (2b), I-SID part 1 (8b), I-SID part 2 (16b)
– DA
– SA
– S-TAG: S-VID
– C-TAG: C-VID
– Payload

I-TAG fields:
• PCP: Priority Code Point – 3 bits
• DE: Discard Eligible – 1 bit
• UCA: Use Customer Address – 1 bit
• Rsv1: Reserved1 – 1 bit
• Rsv2: Reserved2 – 2 bits
• I-SID: Service ID – 24 bits
Proposed 802.1Qbp/ECMP Frame Format
Frame fields, outermost first:
– B-DA
– B-SA
– F-TAG: EtherType = F-Tag, PCP (3b), DE, Rsv (6b), TTL (6b), Flow-ID (16b)
– I-TAG: I-SID
– DA
– SA
– S-TAG: S-VID
– C-TAG: C-VID
– Payload

• F-Tag Fields:
– PCP: Priority Code Point – 3 bits (copied from B-Tag)
– DE: Discard Eligible – 1 bit (copied from B-Tag)
– Rsv: Reserved – 6 bits
– TTL: Hop Count – 6 bits
– Flow-ID: 16 bits

• F-Tag could be combined with I-Tag to avoid addition of 2 bytes relative to 802.1ah frame format
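
The F-Tag field widths add up to 32 bits (3 + 1 + 6 + 6 + 16), which the following sketch packs into four bytes. It assumes the fields are laid out most-significant first in the order listed; the EtherType value is left out because the slide does not give it, and the function names are invented for this example.

```python
import struct


def pack_ftag_fields(pcp: int, de: int, ttl: int, flow_id: int, rsv: int = 0) -> bytes:
    """Pack PCP (3b), DE (1b), Rsv (6b), TTL (6b) and Flow-ID (16b) into 4 bytes."""
    for name, value, width in (("pcp", pcp, 3), ("de", de, 1), ("rsv", rsv, 6),
                               ("ttl", ttl, 6), ("flow_id", flow_id, 16)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    word = (pcp << 29) | (de << 28) | (rsv << 22) | (ttl << 16) | flow_id
    return struct.pack("!I", word)


def unpack_ftag_fields(data: bytes) -> dict:
    """Decode the 4 bytes produced by pack_ftag_fields."""
    (word,) = struct.unpack("!I", data)
    return {
        "pcp": word >> 29,
        "de": (word >> 28) & 0x1,
        "rsv": (word >> 22) & 0x3F,
        "ttl": (word >> 16) & 0x3F,
        "flow_id": word & 0xFFFF,
    }
```

A 6-bit TTL limits a frame to 63 hops, and the 16-bit Flow-ID lets core nodes hash on a single field instead of inspecting inner headers.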

Interconnecting Ethernet Domains
“How to Connect Ethernet Domains Across a WAN in an Efficient Way?”
Interconnecting Ethernet Domains
The “Legacy Approach”:
Virtual Private LAN Services (VPLS)
Virtual Private LAN Services (VPLS)
(Almost) Emulating a Bridge: Flooding, Forwarding
• Flooding (Broadcast, Multicast, Unknown Unicast)
• Dynamic learning of MAC addresses on PHY and VCs
• Forwarding
– Physical Port
– Virtual Circuit
• VPLS uses Split-Horizon and Full-Mesh of PWs for loop-avoidance in core
– SP does not run STP in the core
[Figure: Customer Equipment (CE) attaches via Ethernet UNI to U-PE B and to N-PE 1 through N-PE 4; each N-PE hosts a Virtual Forwarding Instance (VFI); the N-PEs are interconnected by PWs and apply Split-Horizon]
VPLS Defines an Architecture to Provide Connectivity Between Geographically Dispersed Customer Sites Across MANs and WANs, as If They Were Connected Using a LAN.
RFC 4761 (BGP-Based VPLS); RFC 4762 (LDP-Based VPLS)
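The bridge-emulation behaviour described above (learning, flooding, forwarding, split-horizon) can be sketched as a toy Virtual Forwarding Instance. The class and port names are hypothetical and the model ignores aging, VLANs and multicast groups; it only shows that a frame received on a PW is never sent to another PW.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class VFI:
    """Toy Virtual Forwarding Instance with attachment circuits (ACs) and pseudowires (PWs)."""

    def __init__(self, acs, pws):
        self.acs = set(acs)
        self.pws = set(pws)
        self.mac_table = {}  # learned MAC -> port (AC or PW)

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source, then return the sorted list of ports to send the frame on."""
        self.mac_table[src_mac] = in_port
        out_port = None if dst_mac == BROADCAST else self.mac_table.get(dst_mac)
        if out_port is not None:
            candidates = {out_port}             # known unicast
        else:
            candidates = self.acs | self.pws    # flood broadcast / unknown unicast
        candidates.discard(in_port)             # never send back where it came from
        if in_port in self.pws:
            candidates -= self.pws              # split-horizon: PW to PW is forbidden
        return sorted(candidates)


vfi = VFI(acs=["ac1", "ac2"], pws=["pw-npe2", "pw-npe3"])
flooded = vfi.receive("ac1", "M1", BROADCAST)
reply = vfi.receive("pw-npe2", "M2", "M1")
```

Split-horizon is only loop-free because every N-PE has a direct PW to every other N-PE, which is why the full mesh is required.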

Scaling VPLS: Hierarchical VPLS
Flavors
“H-VPLS” with Ethernet Access
• Access: 802.1ad Provider Bridges; VPLS Core: Full Mesh of PW
• IEEE 802.1ad Provider Bridges in the Access running 802.1s/w MSTP/RSTP, VPLS core (full-mesh of PW with split-horizon for loop-avoidance)
“H-VPLS” with MPLS to the Edge
• Access: H&S PW Overlay; VPLS Core: Full Mesh of PW
• MPLS edge and core
– Full-mesh of PW in core, split-horizon
– Hub and Spoke access PW for access; only one PW per U-PE (per service instance) active at a time
[Figure: U-PE1 connects to N-PE1 and N-PE2, through Ethernet Bridges in the Ethernet-access flavor and through an Active PW and a Standby PW in the MPLS flavor]

Scaling VPLS Further
Combining H-VPLS and PBB (802.1ah)*
[Figure: 802.1Q/802.1ad access networks on both sides, VPLS w/ 802.1ah at both edges, VPLS in the core]
• H-VPLS current challenges
– MAC-Address Scalability at the N-PE
– PW Scalability (Full mesh per VSI per customer; scales O(n^2))
– H-VPLS with Ethernet-Access is limited to 4k Service Instances due to 4k S-VLANs
• Approach: Use PBB/802.1ah with H-VPLS
– H-VPLS with 802.1ah access Network
– H-VPLS with MPLS access network (.1ah/PBB function at U-PE)
*RFC 7080
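The two scaling limits named above can be made concrete with a little arithmetic. The sketch assumes one bidirectional PW per PE pair per VSI and uses the standard field widths (12-bit S-VID, 24-bit I-SID); the function name is hypothetical.

```python
def full_mesh_pws(n_pe):
    """Number of PWs needed to fully mesh n_pe PEs for one VSI: n(n-1)/2."""
    return n_pe * (n_pe - 1) // 2


S_VID_BITS = 12   # S-VLAN ID width in 802.1ad
I_SID_BITS = 24   # service ID width in 802.1ah

svlan_space = 2 ** S_VID_BITS   # the "4k" limit of Ethernet access
isid_space = 2 ** I_SID_BITS    # service instances addressable with PBB

for n in (10, 100):
    print(f"{n} PEs -> {full_mesh_pws(n)} PWs per VSI")
print(f"S-VLAN space: {svlan_space}, I-SID space: {isid_space}")
```

Going from 10 to 100 PEs multiplies the PE count by 10 but the PW count per VSI by 110, which is the O(n^2) growth the slide refers to.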
VPLS Loadbalancing/ECMP in the Core
Why Flow Aware Transport of PW?
[Figure: 3 Traffic Flows (1, 2, 3) between VPNC sites enter the IP/MPLS core at PE1 with the same VC Label and follow a single path among P1, P3, P4 and P5 towards PE2]
• All PW flows transported over single link (same bottom label)
– ECMP requires Core routers to have additional capability to discriminate the flows
• Flow Aware Transport (FAT) of PW (RFC6391)
– New method to discriminate flows so ECMP applies

FAT PW Encapsulation
Original Pseudo Wire Encapsulation: MPLS Tunnel label(s) | PW Label | Optional Control Word | Payload
FAT Pseudo Wire Encapsulation: MPLS Tunnel label(s) | PW Label | Flow Label | Optional Control Word | Payload
• FAT PW architecture introduces “Flow Label”: an additional label to be interposed during PW packet encapsulation
– between the PW Label/VC Label & Control Word
– between the PW Label/VC Label & Payload (Control Word not present)
• Flow Label stimulates ECMP load balancing behavior
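A minimal sketch of the two encapsulations follows, using the standard 32-bit MPLS label stack entry (20-bit label, 3-bit TC, bottom-of-stack bit, 8-bit TTL). The CRC32-based flow label, the TTL values and the function names are assumptions made for illustration; only the position of the Flow Label in the stack comes from the slide.

```python
import struct
import zlib


def lse(label, tc=0, s=0, ttl=255):
    """Encode one MPLS label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)


def flow_label(flow_key):
    """Derive a 20-bit flow label from a flow key, staying clear of reserved labels 0-15."""
    value = zlib.crc32(flow_key) & 0xFFFFF
    return value if value > 15 else value + 16


def encapsulate(tunnel_label, pw_label, flow_key, payload, fat=True, control_word=b""):
    """Build tunnel label, PW label, optional flow label, optional control word, payload."""
    stack = lse(tunnel_label)
    if fat:
        # The flow label becomes the bottom of the stack, so the PW label has S=0.
        stack += lse(pw_label, s=0) + lse(flow_label(flow_key), s=1, ttl=1)
    else:
        stack += lse(pw_label, s=1)
    return stack + control_word + payload


plain = encapsulate(100, 21, b"flow-a", b"frame", fat=False)
fat_pkt = encapsulate(100, 21, b"flow-a", b"frame", fat=True)
```

The only difference on the wire is one extra 4-byte entry below the PW label, and which entry carries the bottom-of-stack bit.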

FAT PW Architecture
FAT PW Capability Exchange
[Figure: CE1 – AC – PE1 – PE2 – AC – CE2, with two Label Mapping Messages: “AC - VC-Label 21, FL Sub-TLV: F=1” (left) and “AC - VC-Label 22, FL Sub-TLV: F=1” (right)]
• Targeted-LDP: Build VPLS PWs between VPLS PEs
• Flow Label Sub-TLV is exchanged between a set of ingress and egress PEs in Label Mapping Message:
– Both PEs must set F=1.
– If F=0 from either PE, no capability exchange / Flow Label not generated
• Flow Label is allocated by a local PE dynamically as unique (based on a single or combination of L2, L3, L4 parameters) flows become available
• Flow Labels are not exchanged between peer PEs; only VC Labels are exchanged between peer PEs (per VPLS architecture)
FAT PW ECMP Load Balancing
Example
[Figure: Traffic Flows between VPNC sites are spread from PE1 to PE2 across core routers P1, P3, P4 and P5; each packet carries Flow Labels and Payload]
• Ingress PE performs Flow Label imposition, Egress PE discards the Flow Label without processing it
• Intermediary routers map flows to ECMP using the bottom of the stack label – in this case Flow Label
• Egress PEs use VC label to determine appropriate egress interface (AC) connected to relevant VPN site (nothing new here)
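The roles of the core and egress routers described above can be sketched as follows. The parsing follows the standard MPLS label stack entry layout; the modulo-based link choice, the assumption that the tunnel label may already be popped at the egress, and all names are hypothetical simplifications.

```python
import struct


def lse(label, s=0, ttl=255):
    """Encode one MPLS label stack entry (TC left at 0)."""
    return struct.pack("!I", (label << 12) | (s << 8) | ttl)


def parse_stack(packet):
    """Return ([labels top to bottom], payload), stopping at the bottom-of-stack bit."""
    labels, offset = [], 0
    while True:
        (word,) = struct.unpack_from("!I", packet, offset)
        offset += 4
        labels.append(word >> 12)
        if (word >> 8) & 1:
            return labels, packet[offset:]


def p_router_pick_link(packet, links):
    """Core router: spread flows over equal-cost links using the bottom label only."""
    labels, _ = parse_stack(packet)
    return links[labels[-1] % len(links)]


def egress_pe(packet, ac_by_vc_label):
    """Egress PE: discard the flow label unprocessed, select the AC from the VC label."""
    labels, payload = parse_stack(packet)
    vc_label = labels[-2]  # entry just above the flow label
    return ac_by_vc_label[vc_label], payload


pkt_a = lse(100) + lse(21) + lse(5000, s=1) + b"data"
pkt_b = lse(100) + lse(21) + lse(5001, s=1) + b"data"
```

Both packets share tunnel label 100 and VC label 21, so without the flow label a core router hashing on the bottom label would see them as one flow.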

Interconnecting Ethernet Domains
The “New” Approach: Ethernet VPN (EVPN)
Towards “EVPN”
Requirements

• All-Active Redundancy with Load balancing on L2/L3/L4 flows


• Flow-based multi-pathing
• Geo-redundant PE nodes & optimum unicast forwarding
• Flexible Redundancy grouping
• Multicast optimization w/ MP2MP

See also: draft-ietf-l2vpn-evpn-req

Towards “EVPN”
Solve Additional Challenges of Current VPLS for All-Active Redundancy
• Looping of Traffic Flooded from PE
• Duplicate Frames from Floods from the Core
• MAC Flip-Flopping over Pseudowire
– In case Port-Channel Load-Balancing does not produce a consistent hash-value for a frame with the same source MAC (e.g. non MAC based Hash-Schemes)
[Figure labels: MAC1, MAC2]

BGP MPLS Based Ethernet VPNs
Main Principles
• Leverage similarities with L3VPN as much as possible
• Determining Reachability to Unicast MAC Addresses
– Local learning:
• PE/MES continues to learn C-MAC addresses over AC
– Remote learning:
• Distribution of Customer MAC-Addresses using BGP
• When multiple PEs/MESs announce the same C-MAC, hash to pick one PE
• Multicast Traffic Distribution
– Options: MP2MP LSPs, P2MP LSP or P2P (w/ ingress replication)
• MP2P (like L3VPN) Trees for Unicast Distribution
• Full-Mesh of PW no longer required
[Figure: PEs exchanging routes via BGP; Ethernet Segment (ES) with Ethernet Segment Identifier (ESI)]
Further segmentation using Ethernet-TAGs, to e.g. identify VLAN(s). Use of Ethernet-TAGs is optional, but allows for more efficient forwarding at MES (no C-MAC lookup necessary).

Operational Principles
• Autodiscovery
– Autodiscovery similar to current VPLS
• Multicast/Broadcast
– Distribute the MAC-SA via BGP (if new) to all other PEs and send the frame over multicast tunnel
– Far-end PE forwards the frame over local ACs (no learning)
– If a PE receives a frame with unknown MAC DA, discard the frame (or optionally forward it)
• Known Unicast
– Forward using MP2P label associated with VPLS instance (VSI)
• Special Multicast Groups
– Leverage P2MP tunnels as needed (per customer requirement)

New BGP E-VPN NLRI
Route Types
Route | Usage
Ethernet A-D Route | MAC Mass-Withdraw; Aliasing; Advertising Split-Horizon Labels
MAC Advertisement Route | Advertise MAC Address Reachability; Advertise IP/MAC Bindings
Inclusive Multicast Route | Multicast Tunnel Endpoint Discovery
Ethernet Segment Route | Redundancy Group Discovery; DF Election

Operation
Loop prevention for BUM for multi-homed segments
• Designated Forwarder Election per draft-ietf-l2vpn-evpn (per ESI and Ethernet TAG)
– Active/Active support for ESI attachments via LAG
– Active/Passive for non-LAG ESI attachments
• Multi-homed PEs implement Split-Horizon procedures
– Multi-homed PEs include an ESI MPLS label, which identifies the “source ESI”
– A PE that receives a multicast/broadcast frame from the WAN does not forward that frame over an AC whose ESI matches the one in the received frame
[Figure: PEs advertising an Ethernet A-D Route per Ethernet Segment incl. ESI MPLS label]
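The two loop-prevention checks above can be combined in a small sketch. The DF rule used here (order the PEs of a segment by IP address and pick index Ethernet-Tag mod N) is modelled on the default service-carving procedure and should be read as an assumption; the data structures and names are hypothetical.

```python
import ipaddress


def elect_df(pe_addresses, ethernet_tag):
    """Pick the Designated Forwarder for one (ESI, Ethernet Tag) among the attached PEs."""
    ordered = sorted(pe_addresses, key=ipaddress.ip_address)
    return ordered[ethernet_tag % len(ordered)]


def egress_acs_for_bum(my_addr, local_acs, pes_by_esi, frame_esi, ethernet_tag):
    """Return the local ACs on which a BUM frame received from the WAN may be sent.

    local_acs maps AC name -> ESI; frame_esi is the source ESI signalled by the ESI label.
    """
    allowed = []
    for ac, esi in local_acs.items():
        if esi == frame_esi:
            continue  # split-horizon: the frame originated on this segment
        if elect_df(pes_by_esi[esi], ethernet_tag) != my_addr:
            continue  # only the DF sends BUM traffic towards the segment
        allowed.append(ac)
    return allowed


pes_by_esi = {1: ["10.0.0.1", "10.0.0.2"], 2: ["10.0.0.2"]}
pe2_acs = {"agg2-pe2": 1, "agg3-pe2": 2}
out = egress_acs_for_bum("10.0.0.2", pe2_acs, pes_by_esi, frame_esi=1, ethernet_tag=101)
```

In this run the second PE is the DF for ESI 1 on tag 101, yet the frame is still kept off that AC because it carries the same source ESI.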

EVPN Operations Example (1/4)
M1 communicates with M2 (e.g. ARP) - Broadcast
[Figure: hosts M1 (C-MAC1) and M2 (C-MAC2) attach through aggregation nodes AGG1 to AGG6 to PE1 to PE4; Ethernet segments ESI=1, ESI=2 and ESI=3; PEs peer via BGP]
iBGP L2-NLRI
• next-hop: MES1
• <C-MAC1, Label 100>
• Host M1 sends a message with MAC SA = M1 and MAC DA=bcast
• PE1 learns M1 over its Agg2-PE1 AC and distributes it via BGP to other PE devices
• All other PE devices learn that M1 sits behind PE1
EVPN Operations Example (2/4)
M1 communicates with M2 (e.g. ARP) - Broadcast
[Figure: hosts M1 (C-MAC1) and M2 (C-MAC2) attach through aggregation nodes AGG1 to AGG6 to PE1 to PE4; Ethernet segments ESI=1, ESI=2 and ESI=3]
• PE1 sends this message over all its local ACs that are not blocked (for mcast/bcast) & sends it over MP2MP LSP (of that EVI)
– Only a single AC per (multi-homed ID) ESI can be a designated forwarder (DF) to send (but not receive) mcast/bcast messages to the customer site
– Any AC in the group (per ESI) can receive mcast/bcast messages
• PE2 receives the message but it drops it at its AGG2-PE2 AC even though this AC is a DF for ESI=1 because ESI of the frame matches the ESI of the AC

EVPN Operations Example (3/4)
Reply from M2 to M1 (Unicast)
[Figure: hosts M1 and M2 attach through aggregation nodes AGG1 to AGG6 to PE1 to PE4; Ethernet segments ESI=1, ESI=2 and ESI=3]
iBGP L2-NLRI
• next-hop: MES4
• <C-MAC2, Label 100>
• Host M2 sends response with MAC SA = M2 and MAC DA = M1
• PE4 learns M2 over its Agg5-PE4 AC and distributes it via BGP to other PE devices
• All other PE devices learn that M2 sits behind PE4

EVPN Operations Example (4/4)
Reply from M2 to M1 (Unicast)
[Figure: hosts M1 and M2 attach through aggregation nodes AGG1 to AGG6 to PE1 to PE4; Ethernet segments ESI=1, ESI=2 and ESI=3]
• Since PE4 already knows that M1 sits behind PE1, it forwards the frame to PE1
– If PE4 has two BGP ECMP paths for M1 (e.g., both PE1 & PE2 advertised M1), then it uses a hash based on L2/L3/L4 header to decide which of the two PEs to forward the frame to
• Upon receiving the frame, PE1 does a MAC lookup and forwards the frame to Agg2-PE1 AC
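The remote-learning side of this example can be sketched as a table that a PE fills from BGP MAC advertisements and consults when forwarding known unicast. The class, the CRC32 hash over a flow key and the label values are illustrative assumptions, not taken from the EVPN draft.

```python
import zlib


class EvpnMacTable:
    """Remote C-MAC reachability as learned from BGP MAC Advertisement routes."""

    def __init__(self):
        self.routes = {}  # C-MAC -> list of (next-hop PE, MPLS label)

    def advertise(self, mac, next_hop, label):
        paths = self.routes.setdefault(mac, [])
        if (next_hop, label) not in paths:
            paths.append((next_hop, label))

    def withdraw(self, mac, next_hop):
        self.routes[mac] = [p for p in self.routes.get(mac, []) if p[0] != next_hop]

    def lookup(self, dst_mac, flow_key):
        """Return (PE, label) for a known unicast frame, hashing the flow over ECMP paths."""
        paths = self.routes.get(dst_mac)
        if not paths:
            return None  # unknown unicast is handled elsewhere
        return paths[zlib.crc32(flow_key) % len(paths)]


pe4 = EvpnMacTable()
pe4.advertise("C-MAC1", "PE1", 100)
single = pe4.lookup("C-MAC1", b"M2->M1 tcp 49152->443")
pe4.advertise("C-MAC1", "PE2", 200)   # hypothetical second advertisement from PE2
chosen = pe4.lookup("C-MAC1", b"M2->M1 tcp 49152->443")
```

Hashing on the flow rather than per frame keeps every packet of one conversation on the same PE, which avoids reordering.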

Towards PBB-EVPN
Design Goals
• MAC Advertisement Route Scalability
– To support millions of C-MAC addresses (millions of VMs)
• C-MAC Mobility with “location” MAC addresses
• C-MAC Address Conversational Learning
• Interworking with TRILL & 802.1aq/.1Qbp networks w/ C-MAC Transparency
– To avoid learning of C-MACs by DC WAN Edge PE
• Per Site/Segment Policy (rather than per network)
• Avoiding C-MAC flushing upon link, port, or node failure for multi-homed devices
• Avoid transient loop for known unicast when doing egress MAC lookup

PBB-EVPN Solution Overview
[Figure: Site 1 and Site 2 with hosts M1 and M2 and multi-homed node MHN1, attached to PE1 to PE4 across a PBB EVPN core; “BMAC -> Site-ID/MH-ID/ESI”]
C-MAC to B-MAC table (entry shown): M1 -> B-MAC1
B-MAC to NH table: B-MAC1 -> PE1; B-MAC1 -> PE2
• Location (a site, represented by a virtual B-MAC (=ESI)) / Identifier (C-MAC) approach
• B-MACs advertised as routable addresses using BGP (see addressing approaches)
• Set BGP NH of B-MAC to Router-ID (e.g. Loopback address) of the advertising PE
• Remote PEs use the received B-MAC NLRIs to form a BGP path list for a given B-MAC
• C-MACs are learned against B-MACs over the core and in turn resolve to BGP path list, i.e., C-MACs recursively resolve to a PE
• Upon an AC or PE failure, adjust the BGP path list accordingly for a given B-MAC, i.e., no C-MAC withdraw is needed
See also: draft-ietf-l2vpn-pbb-evpn
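The recursive resolution described above can be sketched with two small tables: one filled by BGP (B-MAC to path list) and one filled by data-plane learning (C-MAC to B-MAC). Class and method names are hypothetical; the sketch only shows why a PE failure touches the B-MAC path list and leaves the C-MAC entries alone.

```python
class PbbEvpnPe:
    """Toy remote PE holding the two lookup tables of PBB-EVPN."""

    def __init__(self):
        self.bmac_paths = {}    # B-MAC -> list of next-hop PEs (control plane, via BGP)
        self.cmac_to_bmac = {}  # C-MAC -> B-MAC (data plane, learned from PBB frames)

    def bgp_advertise(self, bmac, pe):
        paths = self.bmac_paths.setdefault(bmac, [])
        if pe not in paths:
            paths.append(pe)

    def bgp_withdraw(self, bmac, pe):
        self.bmac_paths[bmac] = [p for p in self.bmac_paths.get(bmac, []) if p != pe]

    def learn(self, cmac, bmac):
        self.cmac_to_bmac[cmac] = bmac

    def resolve(self, cmac):
        """C-MAC -> B-MAC -> BGP path list; an empty list means not reachable."""
        bmac = self.cmac_to_bmac.get(cmac)
        return list(self.bmac_paths.get(bmac, []))


pe3 = PbbEvpnPe()
pe3.bgp_advertise("B-MAC1", "PE1")
pe3.bgp_advertise("B-MAC1", "PE2")
pe3.learn("M1", "B-MAC1")
before = pe3.resolve("M1")
pe3.bgp_withdraw("B-MAC1", "PE1")   # PE1 or its AC fails
after = pe3.resolve("M1")
```

However many C-MACs sit behind B-MAC1, the failure is handled by one path-list update instead of one withdraw per C-MAC.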

PBB-EVPN
Review and Solution Context
[Figure: TRILL/FabricPath domains on both sides, interconnected by TRILL/FP-EVPN]
• MAC Advertisement Route Scalability
– Single virtual B-MAC represents a (multi-homed) site (active/active case)
• B-MACs can be location dependent/sub-netted, whereas C-MACs stay unchanged
– Per site (i.e. per virtual B-MAC) BGP policy support
– Simple VM-Mobility support
– Avoid C-MAC flushing requirements
• Interworking with TRILL and C-MAC Transparency
Summary
Summary: Flat != Easy
Technologies, Solutions, and Standards Are Converging

Addressing
Scalability, Mobility, Lookup
• “Map ‘n Encap”: Location and Identity addresses
• Avoid Flooding, control plane learning, broadcast “proxies”

Optimal Forwarding *
Multi-Pathing, Optimal Network Topology Use
• ISIS Control Plane: Unicast & Multicast
• ECMP and enhancements
• Multiple Topologies

Efficient Interconnection of Ethernet Networks
• ISIS/BGP Control Plane
• Control Plane learning
• Native IP and MPLS transport
• Active-Active Multi-Homing
• Efficient Multicast

*: 3x3x3 Cube World Record Holder: Feliks Zemdegs, Melbourne Winter Open 2011: 5.66s:
http://www.youtube.com/watch?v=3v_Km6cv6DU
References
• [FabricPath]: FabricPath http://www.cisco.com/en/US/prod/switches/ps9441/fabric_path_promo.html

• [LISP]: Locator/ID Separation Protocol https://datatracker.ietf.org/wg/lisp/charter/

• [802.1Qbp] ECMP http://www.ieee802.org/1/files/public/docs2011/new-ashwood-sajassi-ecmp-par-0111-v04.pdf

• [EVPN]: BGP MPLS Based Ethernet VPN http://tools.ietf.org/html/draft-raggarwa-sajassi-l2vpn-evpn-04

• [TRILL]: Transparent Interconnection of Lots of Links https://datatracker.ietf.org/wg/trill/charter/ ; http://tools.ietf.org/wg/trill/draft-ietf-trill-rbridge-protocol/

• [VL2]: VL2: A Scalable and Flexible Data Center Network http://ccr.sigcomm.org/online/?q=node/502

• [MOOSE]: Addressing the Scalability of Ethernet with MOOSE http://www.cl.cam.ac.uk/~mas90/MOOSE/MOOSE.pdf

• [PORTLAND]: PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric http://ccr.sigcomm.org/online/?q=node/503

• [SEATTLE]: Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises http://www.cs.princeton.edu/~chkim/Research/SEATTLE/seattle.pdf

• [MONSOON]: Towards a Next Generation Data Center Architecture: Scalability and Commoditization
http://research.microsoft.com/apps/pubs/default.aspx?id=79348

• [VLB]: Valiant Load Balancing in Backbone Networks http://www.stanford.edu/~ashishg/network-algorithms/rui.pdf
