
EECS 589 - Advanced Computer Networks - Summary 8

- Ajit Aluri
Delayed Internet Routing Convergence
In this paper, C. Labovitz, A. Ahuja, A. Bose, and F. Jahanian present a two-year study of Internet routing convergence. Challenging the prevailing notion that intermittent delays during inter-domain failures are mainly due to queuing and router processing latencies, the authors show that much of the delay in restoring paths is caused by delayed BGP (Border Gateway Protocol) convergence. They also present theoretical upper and lower bounds on BGP convergence and conclude by proposing changes to vendor BGP implementations that aim to achieve constant convergence time complexity in complete graph topologies.
The authors first define failover as the implicit withdrawal and replacement of a route with another path, and a steady-state network as one in which no BGP peer sends an update for 30 minutes. They observe from the collected data that many vendor implementations ultimately choose the best BGP route by shortest AS path, and they therefore assume this default behavior of BGP throughout the study.
The authors then explain how they collected the data. They injected faults consisting of BGP update messages, including route announcements and withdrawals, and logged all the BGP updates received from the peers. They used passive probes from RouteMachines to monitor the faults on core Internet routers and active measurements to monitor end-to-end performance. They define four categories of injected events: Tup, a previously unavailable route is announced as available; Tdown, a previously available route is withdrawn; Tshort, an active route is replaced with a shorter route; and Tlong, an active route is implicitly replaced by a longer route.
The authors observed that about 70% of Tup and Tshort events converged within 90 seconds, while 5% of Tdown/Tlong events converged within the same period. They also observed that about 20% of the Tlong or Tdown events took more than 2 minutes to converge.
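
The four event categories defined above can be illustrated with a small sketch. This is not code from the paper: the function name `classify_event` and the use of AS-path length as the only measure of "shorter" or "longer" are assumptions made for illustration, and `None` is used here to mean "no route".

```python
from typing import Optional, Sequence

ASPath = Sequence[int]  # hypothetical representation: a list of AS numbers


def classify_event(old_path: Optional[ASPath], new_path: Optional[ASPath]) -> str:
    """Label a route change as Tup, Tdown, Tshort, or Tlong.

    Assumption of this sketch: route preference is decided by path
    length alone, and None stands for "no route available".
    """
    if old_path is None and new_path is None:
        raise ValueError("no route before or after: nothing changed")
    if old_path is None:
        return "Tup"      # previously unavailable route is announced
    if new_path is None:
        return "Tdown"    # previously available route is withdrawn
    if len(new_path) < len(old_path):
        return "Tshort"   # active route replaced by a shorter one
    if len(new_path) > len(old_path):
        return "Tlong"    # active route replaced by a longer one
    raise ValueError("equal-length replacement is outside the four categories")
```

For example, `classify_event([1, 2], [1, 3, 2])` returns `"Tlong"`, the failover case in which a longer backup path implicitly replaces the active route.
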
Another important observation was that Tdown and Tlong events triggered, on average, more than twice as many update messages as Tshort and Tup events, and all the categories generated fewer than 3.5 messages on average before converging. In end-to-end performance, the authors observed close to 20% packet loss for Tlong and 15% for Tshort during the convergence period, and latencies rose by up to 60%.
Using a four-node complete graph as an example, the authors then present a theoretical upper bound of O((n-1)!) on BGP convergence. In this case, the nodes send updates as soon as they detect a change in the routing table, and the authors say that this increases the number of update messages and delays convergence. The authors also point out that the nodes prefer the shortest paths, so each update stage can be viewed as a state in which the nodes explore paths of a fixed length before moving to the state in which they consider paths one hop longer.
They then show that, using the MinRouteAdver timer of BGP, convergence can be achieved in O(n), where n is the number of nodes in a complete graph topology. With MinRouteAdver, routers announce their new state only after a period of 30 seconds, which gives them time to process all the updates received from other nodes. The authors state that advertising with a 30-second delay makes convergence faster because there are fewer updates to process. They go on to suggest implementing this feature in vendor BGP implementations and show that it achieves constant-order convergence for Tup and O(n) convergence for Tdown.
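
The stage-by-stage path exploration described above can be sketched with a toy simulation. This is a simplified model for illustration only, not the paper's analysis: it assumes every node updates in lock-step rounds (each round standing in for one 30-second MinRouteAdver interval), that node 0 originates the route and withdraws it at time zero, and that ties between equal-length paths go to the lowest neighbour id. The function name `tdown_rounds` is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Path = Tuple[int, ...]


def tdown_rounds(n: int) -> List[Dict[int, Optional[Path]]]:
    """Simulate a withdrawal (Tdown) in a complete graph of n nodes.

    Node 0 withdraws its route. In every round each remaining node reads
    what its neighbours advertised in the previous round, discards paths
    that already contain itself (loop detection), and adopts the shortest
    survivor. Returns the per-node state after each round; the simulation
    stops once no node has a route left.
    """
    if n < 2:
        raise ValueError("need the origin plus at least one other node")
    others = range(1, n)
    # Before the failure every node uses its direct link to the origin.
    state: Dict[int, Optional[Path]] = {i: (i, 0) for i in others}
    history: List[Dict[int, Optional[Path]]] = []
    while any(p is not None for p in state.values()):
        new_state: Dict[int, Optional[Path]] = {}
        for i in others:
            candidates = [
                state[j] for j in others
                if j != i and state[j] is not None and i not in state[j]
            ]
            # Shortest path wins; ties go to the lowest neighbour id.
            best = min(candidates, key=lambda p: (len(p), p[0]), default=None)
            new_state[i] = (i,) + best if best is not None else None
        state = new_state
        history.append(state)
    return history


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, "nodes ->", len(tdown_rounds(n)), "rounds")
```

In this model every surviving path in round r has exactly r + 2 entries, so the nodes work through one path length per round and run out of loop-free paths after at most n - 1 rounds. That linear growth in rounds is the intuition behind the O(n) behaviour with the timer; the exact constants in the paper's bounds are not reproduced here.
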
