
Optical Burst Switching: Towards Feasibility

Craig Warrington Cameron


B.Eng.(Hons.) (Melb.), M.S. (Caltech)

A thesis submitted in total fulfilment of the requirements for the degree of

Doctor of Philosophy

Department of Electrical and Electronic Engineering

April 2005

Many generous friends and family members helped in the creation of this thesis, offering valuable advice and support.

In particular, my supervisor, Moshe Zukerman, was an unlimited source of inspiration and wisdom.

Thank you.

Abstract

Recent Wavelength Division Multiplexing (WDM) experimental results show successful transmission of data over a single optical fiber at an aggregate speed of 10 Tb/s, spread over more than 256 independent wavelengths. As the number of wavelengths per fiber increases, converting data between the optical and electronic domains becomes a critical bottleneck in terms of cost, size, processing speed and power consumption. In order to realize potential fiber bandwidth and WDM gains fully, the number of such conversions must be minimized. Optical Burst Switching (OBS) has recently been proposed as a future high-speed switching technology for Internet Protocol (IP) networks that may be able to efficiently utilize extremely high capacity links without the need for data buffering or optical-electronic conversions at intermediate nodes. Unlike classical circuit switching, however, contention between bursts may cause loss within the network. This dissertation suggests several strategies to minimize loss probability while maximizing throughput in OBS networks. Proposals to date for OBS have yielded very high loss rates even for moderate network loads. This work measures and significantly reduces this loss, and in doing so, positions OBS as a feasible option for future optical networks. The new algorithms introduced are Shortest Path Random Deflection Routing, which both lowers the loss rate for moderate network loads and remains stable even for very high network loads, and Off Timer Burst Assembly, which removes harmful synchronization while bounding delay and improving TCP performance. In addition, a method to theoretically calculate the steady state sending rate of TCP over OBS networks is derived and used to show that as the number of wavelengths increases, the utilization of TCP over OBS also increases, suggesting that OBS systems require large numbers of wavelengths to be feasible. Finally, theoretical models to measure the queuing performance of Gaussian traffic are introduced and used to dimension buffers at the edge of OBS networks in order to achieve desired levels of queuing loss.

Declaration
This is to certify that: (i) the thesis comprises only my original work towards the PhD except where indicated in the Preface; (ii) due acknowledgement has been made in the text to all other material used; and (iii) the thesis is less than 100,000 words in length, exclusive of tables, maps, bibliographies and appendices.

CRAIG WARRINGTON CAMERON, 1st April 2005

Preface
The following work presented in this thesis was developed in collaboration with other researchers and is not solely my original work. The mathematical formulae presented in Section 6.3 were originally developed by Professor Ron Addie; as a consequence, the derivation of these formulae is not included in this thesis. The simulation results of Section 3.3.2 were produced by Mr. JungYul Choi during his visit to the Department of Electrical and Electronic Engineering in early 2004.

CRAIG WARRINGTON CAMERON, 1st April 2005

Contents

Contents
List of Figures
List of Tables
Acronyms

1 Introduction
  1.1 Looking Forward
  1.2 Focus of this Thesis
  1.3 Subdivision of Thesis by Chapter
  1.4 Contributions of this Thesis
  1.5 Publications by the Author Related to this Thesis
  1.6 Other publications by the Author

2 All-Optical Networks
  2.1 Introduction
  2.2 Current High Speed Networks
      2.2.1 Shortcomings of SONET
  2.3 Future Optical Networks
      2.3.1 Enabling Technologies
  2.4 Optical Circuit Switching
      2.4.1 Lightpaths and Wavelength Routing
      2.4.2 Peer and Overlay Models
      2.4.3 Out of Band Control
  2.5 Optical Packet Switching
      2.5.1 Optical Buffering
      2.5.2 Packet Header Processing
      2.5.3 Switching Speed Limitations
  2.6 Optical Burst Switching
      2.6.1 Burst Assembly
      2.6.2 Path Reservation
      2.6.3 Contention Resolution
  2.7 OBS Loss Calculation
      2.7.1 On-Off Source Model
      2.7.2 Engset Model
      2.7.3 2D Markov Chain Framework
      2.7.4 Computational Methods
      2.7.5 A Loss vs. Load Example
  2.8 Conclusion

3 Burst Assembly
  3.1 Introduction
  3.2 Burst Assembly Algorithms
      3.2.1 Threshold-based Burst Assembly
      3.2.2 Timer-based Burst Assembly
      3.2.3 Hybrid Burst Assembly
  3.3 Simulation-based Study
      3.3.1 Simulation Environment
      3.3.2 Simulation Results
      3.3.3 Timer-based Synchronization
  3.4 Off-Timer Burst Assembly
      3.4.1 Timer-based Synchronization
  3.5 OTBA Properties
      3.5.1 Simulation Environment
      3.5.2 Steady State Timer Value
      3.5.3 Breaking Synchronicity
  3.6 Conclusion

4 TCP Over OBS
  4.1 Introduction
  4.2 History of TCP
  4.3 TCP Today
      4.3.1 Slow start
      4.3.2 Congestion Avoidance
      4.3.3 Fast Retransmit
      4.3.4 Fast Recovery
      4.3.5 New Fast Retransmit/Fast Recovery Algorithms
      4.3.6 Naming TCP Variants
      4.3.7 Selective Acknowledgements - SACK
  4.4 TCP Models
      4.4.1 TCP Friendliness
      4.4.2 Modelling SACK
      4.4.3 Parallel Sources
      4.4.4 Bounds on TCP loss
  4.5 TCP over OBS
      4.5.1 TCP source classifications in OBS
      4.5.2 TCP and Loss - Simulation Methods
  4.6 Linking TCP and OBS Models
      4.6.1 OBS Network Topology
      4.6.2 OBS On-Off Source Traffic Model
      4.6.3 OBS Loss Model
      4.6.4 TCP Model
      4.6.5 Finding the Fixed Point
  4.7 Numerical Results
      4.7.1 Including Time Outs
      4.7.2 Large Number of Wavelengths
  4.8 Burst Assembly and TCP
      4.8.1 Fast, Slow or Medium TCP Sources?
      4.8.2 Delay Penalty
      4.8.3 Delayed First Loss (DFL) Gain
      4.8.4 Correlation Benefit
      4.8.5 Burstification Factor
      4.8.6 Dynamic Burst Sizing
      4.8.7 Off-Timer Burst Assembly and TCP
  4.9 TCP alternatives
  4.10 Conclusion

5 Contention Resolution
  5.1 Introduction
  5.2 Deflection Routing
      5.2.1 Deflection Types
      5.2.2 Fixed Alternate Deflection
      5.2.3 Dynamic Traffic-Aware Deflection
      5.2.4 Out Of Order Packets
  5.3 Shortest Path Prioritized Deflection
      5.3.1 Node Architecture
      5.3.2 Routing Notation
      5.3.3 Shortest Path Random Deflection Routing
      5.3.4 Shortest Path Prioritized Random Deflection Routing (SP-PRDR)
      5.3.5 Managing Preemption
      5.3.6 Offset Time
      5.3.7 Fibre Delay Lines
      5.3.8 Routing loops
  5.4 SP-PRDR Simulation
  5.5 Results
  5.6 Conclusion

6 Gaussian Queuing in OBS
  6.1 Introduction
  6.2 Long Range Dependence
      6.2.1 Internet Traffic - SRD or LRD?
      6.2.2 Multi-scaling behaviour
  6.3 Gaussian Queuing
      6.3.1 Research Overview
      6.3.2 The Model
      6.3.3 Simulation Model
      6.3.4 An Approximation for c
      6.3.5 The Output Variance
  6.4 Networks of Queues: Simulations
      6.4.1 Ingress Node Buffer Size
      6.4.2 Effect of Service Rate
  6.5 Conclusion

7 Conclusion
  7.1 Burst Assembly
  7.2 TCP
  7.3 Deflection Routing
  7.4 Ingress Buffer Dimensioning
  7.5 Further Work

Bibliography

List of Figures

2.1 Example Path (2-7-9-12) in OBS Network
2.2 Example Timeline of an OBS Burst Transmission
2.3 On-off model mean values
2.4 Simplified Single-link Topology
2.5 Sample OBS Loss vs. Load Graph, M = 128
3.1 Sample path for Threshold Burst Assembly Model
3.2 Sample path for Timer Burst Assembly Model
3.3 Simplified Single-link Topology
3.4 Average Delay: Threshold-based
3.5 Average Burst Size: Timer-based
3.6 Blocking Probability: Threshold-based
3.7 Blocking Probability: Timer-based
3.8 Sample path for source-destination pair using Off-Timer Burst Assembly Model
3.9 Sample Path for To = 7ms, = 0.25
3.10 Sample Path for To = 4ms, = 0.1
3.11 Sample Variance of Burst Departure Times for 1000 parallel burst assembly processes. To = 7ms, = 0.25
4.1 Graphical Representation of Network Throughput [105]
4.2 Calculating Number of TCP Packets Sent/Cycle
4.3 Induced Error in TCP Model Ignoring Time Outs - Large p
4.4 Induced Error in TCP Model Ignoring Time Outs - Small p
4.5 Simplified single-link topology
4.6 Calculating fixed point of TCP input load and OBS loss
4.7 Graphical method to find fixed point loss: M=16, K=10, RTT=0.1s
4.8 Input load per source vs. number of output links. M=16, no wavelength conversion
4.9 Loss vs. number of output links. M=16, no wavelength conversion
4.10 Total input load vs. number of output wavelengths. M=16, no wavelength conversion
4.11 Input load per source vs. number of output links. M=16, no wavelength conversion, timeouts included
4.12 Loss vs. number of output links. M=16, no wavelength conversion, timeouts included
4.13 Input load per source vs. number of output wavelengths
4.14 Loss vs. number of output wavelengths
4.15 Total input load vs. number of output wavelengths
4.16 Correlation Benefit for Fast TCP Sources
5.1 Sample OBS Network - NSFNET T3 comprising 13 nodes and 32 directed links
5.2 Sample network with weighted links
5.3 Number of flows per link (RM1)
5.4 Number of flows per link (RM2)
5.5 Average O-D Pair Burst Loss Probability, Routing Matrix 1 (unbalanced)
5.6 Average O-D Pair Burst Loss Probability, Routing Matrix 2 (balanced)
5.7 Average Number of Hops for successful transmissions, Routing Matrix 1
5.8 Average Number of Hops for successful transmissions, Routing Matrix 2
5.9 Utilization averaged over all links, Routing Matrix 1
5.10 Utilization averaged over all links, Routing Matrix 2
5.11 Maximum Link Utilization, Routing Matrix 2 (5-8)
6.1 Simulation Topology. Routers with infinite sized buffers feeding finite OBS ingress buffer
6.2 Loss vs. Buffer Size: For each upstream router, m = 2.5, 2 = 10, v = 100, = 5. For the ingress node, = 50
6.3 Loss vs. Service Rate: For each upstream router, m = 4, 2 = 10, v = 100, = 10. For the ingress node, K = 200

List of Tables

5.1 Global Routing table for simulated Origin-Destination pairs. All traffic flows are bi-directional.
5.2 Number of bidirectional O-D pair flows for each link in network.
6.1 Analytic vs. Simulation Results for P_loss
6.2 Analytic vs. Simulation Results for the Output Variance

Acronyms

3D        Three Dimensional
ACK       Acknowledgement
ADM       Add/Drop Multiplexer
AR        Auto-Regressive
ASE       Amplified Spontaneous Emission
ASON      Automatically Switched Optical Network
ATM       Asynchronous Transfer Mode
AVR       Asymptotic Variance Rate
BHC       Burst Header Cell
BHP       Burst Header Packet
cwnd      Congestion Window Size
DCC       Digital Cross Connect
DFL       Delayed First Loss
EDFA      Erbium-Doped Fiber Amplifier
FDL       Fibre Delay Line
FIFO      First-In-First-Out
FTP       File Transfer Protocol
HTTP      Hypertext Transfer Protocol
IETF      Internet Engineering Task Force
IP        Internet Protocol
ITU       International Telecommunications Union
JET       Just-Enough-Time
NACK      Negative Acknowledgment
NI        Network Interface
OBS       Optical Burst Switching
OCS       Optical Circuit Switching
O-D       Origin-Destination
OEO       Optical-Electronic-Optical
OPS       Optical Packet Switching
OTBA      Off Timer Burst Assembly
QoS       Quality of Service
SP-PRDR   Shortest Path Prioritized Random Deflection Routing
RAM       Random Access Memory
RED       Random Early Detection
RTT       Round Trip Time
RWA       Routing and Wavelength Assignment
rwnd      Receiver Window Size
SACK      Selective Acknowledgement
SDH       Synchronous Digital Hierarchy
SMP       Semi Markov Process
SOA       Semiconductor Optical Amplifier
SONET     Synchronous Optical Network
SPE       Synchronous Payload Envelope
SRD       Short Range Dependent
SSQ       Single Server Queue
ssthresh  Slow Start Threshold Size
TaW       Tell-and-Wait
TaG       Tell-and-Go
TCP       Transmission Control Protocol
TD        Triple Duplicate
TDM       Time Division Multiplexing
TBS       Terabit Burst Switching
TO        TCP Time Out
UDP       User Datagram Protocol
VT        Virtual Tributary
WDM       Wavelength Division Multiplexing
XGM       Cross Gain Modulation
XPM       Cross Phase Modulation


Chapter 1 Introduction
1.1 Looking Forward

In the last twenty years, the Internet has changed from a university-based research network to a ubiquitous communication medium that enables a diverse range of useful applications, including email and the World Wide Web. Within the USA, the amount of Internet data traffic surpassed that of voice traffic several years ago and continues to grow rapidly, approximately doubling every year since 1997 [137]. Assuming this growth continues, it is expected that current electronic network architectures and protocols will soon be unable to carry such large amounts of traffic. This expectation has fuelled interest in all-optical technologies that do not require optical signals to be converted to electronic form at each hop within the network. Given that advances in Wavelength Division Multiplexing (WDM) have enabled hundreds of wavelengths to be sent down a single fiber, conversion of all wavelengths at each hop is not feasible due to cost, power and space constraints. It is therefore expected that all-optical networks will replace current networks at some point in the future.

The exact form that this future optical network will take is tightly coupled to technological progression in the field of optical devices. Currently, the slow switching speeds of commercially available all-optical switches necessitate end-to-end circuit switching approaches. In the medium to long-term future, it is expected that most of the current technological limitations of today's optical devices will be solved, enabling more sophisticated and more efficient packet-based switching architectures. It is also implicitly assumed that the cost of converting optical signals into the electrical domain and vice-versa will remain high with respect to competing all-optical solutions. Already, research projects such as the National LambdaRail and the Mid-Atlantic Terascale Partnership have established and are utilizing high bandwidth networks to transport scientific data over 40 simultaneous wavelengths, each transmitting at 10Gb/s. Experimental results from NEC show that the potential operational capacity per fiber may be much higher, with successful transmission of 10 Tb/s over more than 256 independent wavelengths^1. Looking forward, these experimental networks will soon become the norm and within a few years, will be replaced by even faster all-optical networks.

^1 http://www.nec.co.jp/press/en/0103/2201.html

1.2 Focus of this Thesis

Feasible protocols and architectures that can efficiently utilize the huge bandwidth of optical fiber are essential building blocks in the development of future optical networks. In this thesis, we concentrate on Optical Burst Switching (OBS) and solve several problems that are currently seen as limiting its feasibility. OBS has been proposed as a future high-speed switching technology for all-optical networks that may be able to efficiently utilize extremely high capacity links without the need for data buffering or optical-electronic conversions at intermediate nodes. Packets arriving at an OBS ingress node that are destined for the same egress OBS node and belong to the same Quality of Service (QoS) class are aggregated and sent in bursts. At intermediate nodes, the data within the optical signal is transparently switched to the next node according to forwarding information contained within a control packet preceding the burst. At the egress node, the burst is subsequently de-aggregated and forwarded electronically.

Unlike classical circuit switching, contention between bursts may cause loss within the network. The main problem of OBS is that this loss is quite high, even for moderate input loads. In this thesis, we develop two main algorithms for reducing this loss: Off Timer Burst Assembly (OTBA) and Shortest Path Prioritized Random Deflection Routing (SP-PRDR). We also introduce a modeling framework for determining steady state loss and input rates in the presence of Transmission Control Protocol (TCP), a ubiquitous protocol in today's Internet that uses closed loop feedback of loss signals to modify input rates. Furthermore, we introduce and validate through simulation dimensioning techniques for OBS ingress buffers that enable finer-grained control of network resources and provide useful lower bounds on buffer sizes, given loss and rate constraints. By measuring, managing and reducing loss, the novel techniques introduced in this thesis overcome serious hurdles in current Optical Burst Switching proposals, helping to bring OBS towards feasibility and eventual deployment.

1.3 Subdivision of Thesis by Chapter

Chapter 2: All-Optical Networks

In this Chapter, we present an overview of currently installed optical networks and show that the associated protocols and network infrastructures are unable to scale up to fully utilize the bandwidth of optical fiber. We then give an historical review of the development of competing all-optical paradigms over the last 10 years, highlighting key enabling optical technologies, and define three proposals for high speed all-optical networks: Optical Circuit Switching (OCS), Optical Packet Switching (OPS) and Optical Burst Switching (OBS). We chose to concentrate on Optical Burst Switching (OBS) as a compromise between OCS, which is simple, yet inefficient given unpredictable and bursty demand, and OPS, which is currently constrained by rudimentary optical technology, especially slow optical switches and the lack of random access memory equivalents in the optical domain. We explore the key protocols and terminology of OBS, describe a theoretical loss model for OBS, highlight the key problems with OBS and refer the reader to the relevant sections of this dissertation that solve these problems.

Chapter 3: Burst Assembly

In this Chapter, we examine and simulate two previously proposed OBS burst assembly algorithms: threshold-based and timer-based burst assembly. We show that the latter algorithm may induce extremely high and deterministic levels of burst loss due to timing synchronicity. We present a new dynamic algorithm, Off Timer Burst Assembly (OTBA), that links burst size and burst injection times to input traffic intensity, introducing the required randomness into the network to greatly reduce burst loss due to timing synchronicity.

Chapter 4: TCP over OBS

A study covering 1998-2003 [70] shows that traffic from TCP applications dominates current Internet usage. In this Chapter, we examine current versions of TCP to find appropriate analytical models that can be used in OBS networks. Several source rate TCP models are combined with an OBS loss model, previously introduced in Chapter 2, to find fixed-point input loads and loss rates for TCP over OBS where individual TCP sources have at most one packet in each burst. If an individual TCP source can have multiple packets in a single burst, analysis is much more complicated. The general consensus is that larger bursts give higher throughput up to a point, after which the added delay causes the throughput to decrease. Although the sources themselves are beyond the direct control of a network provider, tuning the burst assembly process can directly affect the number of packets from each TCP source that are aggregated in each burst. We review current proposals for dynamic burst sizing algorithms which show that increasing burst size after a long burst and, conversely, sending bursts at a higher rate after short bursts significantly improves TCP performance in OBS. We then revisit Off Timer Burst Assembly (OTBA), presented in Chapter 3, and show that this algorithm introduces the necessary correlation for improved TCP performance, yet is very simple to implement, unlike the reviewed proposals that are complex and require considerable amounts of processing and networking state information.
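To make the fixed-point idea concrete, the sketch below couples a deliberately simplified TCP model (the well-known square-root throughput formula) with an Erlang B blocking approximation and iterates until the offered load and the loss probability are mutually consistent. This is an illustration only: the thesis itself uses an Engset-type OBS loss model and more detailed TCP models, and all parameter values below (number of sources, wavelength count and rate, RTT) are assumptions chosen purely for the example.

```python
import math

def erlang_b(a, m):
    """Erlang B blocking probability for offered load a (Erlangs) on m wavelengths."""
    b = 1.0
    for k in range(1, m + 1):
        b = a * b / (k + a * b)
    return b

def tcp_goodput(p, mss_bits=12000, rtt=0.1):
    """Simplified 'square-root' TCP sending rate (bit/s) for packet loss probability p."""
    return mss_bits / (rtt * math.sqrt(2.0 * p / 3.0))

def obs_tcp_fixed_point(n_sources=1000, wavelengths=16, wl_rate=10e9, iters=200):
    """Damped iteration between aggregate TCP offered load and link blocking."""
    p = 0.01                                                # initial guess for burst loss
    for _ in range(iters):
        load = n_sources * tcp_goodput(p) / wl_rate         # offered load in Erlangs
        p = 0.5 * p + 0.5 * erlang_b(load, wavelengths)     # damped fixed-point update
    return load, p

load, p = obs_tcp_fixed_point()
print(f"fixed point: offered load ~ {load:.2f} Erlangs, loss ~ {p:.2e}")
```

The damping on the loss update is only there to keep this toy iteration from oscillating; Chapter 4 describes the models and the method actually used to locate the fixed point.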

Chapter 5: Contention Resolution

In OBS, the reservation of capacity for bursts is a dynamic and unacknowledged process. If no reservation can be made because the desired output wavelength and fiber are already fully reserved, contention is said to occur. In this Chapter, we describe methods for resolving contention such that burst loss is minimized. We focus on deflection routing, a technique that has often been overlooked due to perceived complexity and instability at high loads, often yielding higher burst loss probabilities than if bursts were not deflected but simply dropped. To solve this problem, we develop a novel algorithm, Shortest Path Prioritized Random Deflection Routing (SP-PRDR), that combines wavelength conversion, deflection routing and preemption contention resolution schemes to significantly reduce burst loss rates in OBS networks, yet maintains stability, even at high loads.

Chapter 6: Gaussian Queuing in OBS

In traditional packet switched networks, packets enter a common queue, independent of their destination address. During the aggregation process of OBS described in Chapter 3, packets are separated and buffered only with other packets sharing the same destination and Quality of Service (QoS) class within the OBS network. Viewed from a network provider's perspective, limiting buffer access to the output links is equivalent to establishing a virtual circuit from that buffer to its common destination, enabling fine-grained control over network usage. In this Chapter, we examine burst size distributions introduced in previous research and show that due to high levels of multiplexing, the number of packets arriving at an ingress buffer over a fixed period of time tends to a Gaussian distribution. With this in mind, we develop methods to approximate packet loss given Gaussian traffic and restricted buffer service rates. These approximations can then be used to dimension appropriately sized buffers in OBS ingress nodes. We also present formulae to estimate the output variance of Gaussian traffic, given an infinite buffer. These results are used to model the aggregation of traffic from many large buffered electronic packet-based routers into a single limited-size OBS buffer. We then verify the closed-form theoretical results through simulation of a first-order auto-regressive (AR) time series.
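As a rough, single-interval illustration of this style of Gaussian dimensioning argument (it is not the queuing formulae of Chapter 6, which account for correlation over time and are due to Professor Addie and collaborators), one can estimate the chance that the packets arriving in one service interval exceed what the server can drain plus the free buffer space. All numbers below are invented for the example.

```python
import math

def gaussian_overflow(mean, variance, service, buffer_size):
    """P(arrivals in one interval > service + buffer), with arrivals ~ N(mean, variance)."""
    slack = service + buffer_size - mean
    return 0.5 * math.erfc(slack / math.sqrt(2.0 * variance))  # Gaussian tail Q(slack / sigma)

# Example: mean 40 packets per interval, variance 100, service 50 packets, buffer 30 packets
print(gaussian_overflow(mean=40, variance=100, service=50, buffer_size=30))  # ~3e-5
```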

1.4 Contributions of this Thesis

Below is a list of the major contributions of this thesis. Contributions are sorted in approximate order of appearance, with Section numbers indicating where the point is first discussed in the thesis.

1. Identification of synchronization induced losses in timer-based burst assembly implementations and simulations (Section 3.3.3).

2. Development and analysis of a new burst assembly algorithm for OBS, Off Timer Burst Assembly (OTBA), that introduces randomness into burst departure timing and thereby avoids synchronization (Section 3.4).

3. Comparative analysis of Transmission Control Protocol (TCP) models for different ranges of packet loss values (Section 4.4).

4. Development and analysis of a coupled OBS-TCP calculation framework that finds the fixed-point input rates of composite TCP sources within OBS bursts and the associated burst loss rates (Section 4.6).

5. Analysis of the performance of OBS using Off Timer Burst Assembly over TCP (Section 4.8.7).

6. Development, analysis and simulation-based verification of a novel contention avoidance algorithm: Shortest Path Prioritized Deflection Routing (Section 5.3.4).

7. Introduction of analytical techniques to measure packet loss in queues with Gaussian distributed input traffic and their application to OBS ingress buffer dimensioning (Section 6.3).

1.5 Publications by the Author Related to this Thesis

1. C. Cameron, A. Zalesky, and M. Zukerman, "Prioritized deflection routing in optical burst switching networks," IEICE Transactions on Communications, vol. E88-B, no. 5, pp. 1861-1867, May 2005 [37]. Earlier version in Proc. International Conference on the Optical Internet (COIN), Yokohama, Japan, Jul. 2004 [35].

2. C. Cameron, J. Choi, S. Bilgrami, H. L. Vu, M. Zukerman, and M. Kang, "Fixed-point performance analysis of TCP over optical burst switched networks," OSA Optics Express, vol. 13, no. 23, pp. 9167-9174, Nov. 2005 [32]. Earlier version in Proc. ATNAC, Sydney, Australia, Dec. 2004 [31].

3. C. Cameron, A. Zalesky, and M. Zukerman, "Shortest path prioritized random deflection routing in optical burst switched networks," in ICST International Workshop on Optical Burst Switching (WOBS), San Jose, USA, Oct. 2004 [36].

4. C. Cameron, J. White, and M. Zukerman, "Off-timer burst assembly in OBS networks," in Proc. ATNAC, Sydney, Australia, Dec. 2004 [34].

5. J. Choi, H. L. Vu, C. Cameron, M. Zukerman, and M. Kang, "The effect of burst assembly on performance of optical burst switched networks," Lecture Notes in Computer Science, vol. 3090, pp. 729-739, 2004 [49]. Earlier version in Proc. International Conference on Information Networking (ICOIN), Busan, Korea, Feb. 2004 [48].

6. R. Addie, C. Cameron, C. Foh, and M. Zukerman, "Conditional independence in Gaussian queues," submitted to Transactions on Networking, January 2003, currently in review [6].

1.6 Other publications by the Author

1. C. Cameron, S. Low, and D. Wei, "High Density Model for Server Allocation and Placement," in Proc. IEEE International Workshop CAMAD, New York, USA, May 2002 (invited paper).

2. C. Cameron, S. Low, and D. Wei, "High Density Model for Server Allocation and Placement," in Proc. ACM SIGMETRICS, Los Angeles, USA, June 2002 [33].

3. C. Cameron, S. Low, and D. Wei, "High Density Model for Server Allocation and Placement," in IEEE Information Theory Workshop, Bangalore, India, Oct. 2002 (invited paper).


Chapter 2 All-Optical Networks

2.1 Introduction

Recent improvements in Wavelength Division Multiplexing (WDM) have enabled a single fiber to carry over 128 independent wavelengths and aggregate data rates of over 1Tb/s, using commercially available equipment. Experimental results from NEC and Alcatel show that the potential capacity per fiber may be much higher, with successful transmission of 10 Tb/s over more than 256 independent wavelengths [25, 77]. However, at these speeds, converting data between the optical and electronic domains becomes a critical performance bottleneck. Current network protocols may not be able to scale, in terms of cost, size, processing speed and power consumption. In order to fully realize potential fiber bandwidth and WDM gains, the number of such conversions must be minimized. Furthermore, due to the success of the Internet, especially over the last 10 years, the Internet Protocol (IP) [57] has become the dominant protocol for new network services [92]. The combination of these factors has led to extensive research into all-optical networks using IP control and signaling mechanisms, research that aims to leverage the synergies between both technologies to create a cheap, dynamic, simple, low-power, robust and extremely fast communication medium. In this Chapter, we present a historical overview of the development of competing optical paradigms over the last 10 years. We then concentrate on a particular technology called Optical Burst Switching (OBS) and explore key protocols and terminology from this technology. We conclude the chapter by introducing key problems with OBS, referring the reader to relevant sections of this dissertation that solve these problems.

2.2 Current High Speed Networks

Synchronous Optical Network (SONET) [116] and the closely related Synchronous Digital Hierarchy (SDH) standards are the predominant technologies in today's carrier networks [39, 107]. In this Section, we describe the operation of SONET and show that it is unable to scale up to utilize fully the large bandwidth of optical fibers. For more details, refer to comprehensive descriptions of SONET in [27, 83, 157]. It is important to note that in this Section, we use the notation SONET to refer both to frame formats and to the commonly used physical layer infrastructure that SONET was designed to run on. As mentioned above, all-optical networks are transparent and are therefore data format independent. While the data carried inside optical streams in an all-optical network may indeed be SONET formatted, the associated SONET protocols are restricted to nodes at the edge of an all-optical network and therefore do not affect the operation of the all-optical network. In summary, we present SONET as an example of a network protocol that carries critical control information inside the framing format, control information that needs to be read at each node in a SONET network.

SONET employs sophisticated multiplexing techniques to interleave synchronous streams of electronic data from the basic signal rate of approximately 50Mb/s (STS-1) up to a maximum theoretical rate of approximately 40Gb/s (STS-768), a speed not yet reached in commercial systems. SONET is a synchronous system with frames sent every 125µs. Lower speed streams are mapped into virtual tributaries (VTs) designed to carry 1.5 (VT1.5), 2 (VT2), 3 (VT3) and 6Mb/s (VT6) streams. A VT group consists of either four VT1.5s, three VT2s, two VT3s or one VT6. The payload of a basic STS-1 stream consists either of a single STS-1 Synchronous Payload Envelope (SPE) or 7 byte-interleaved VT groups. To achieve higher speeds, individual STS-1 frames are aggregated together using byte-interleaving or through the use of larger frame sizes, usually referred to as concatenated frames. The two main node types in a SONET network are Add/Drop Multiplexers (ADMs) and Digital Cross Connects (DCCs). ADMs are designed to pick out one or more low-speed streams from a high-speed stream and, similarly, to insert one or more low-speed streams into a high-speed stream. A DCC is a more advanced node that, in addition to ADM functionality, can groom traffic. Grooming allows composite low-speed streams to be individually switched, resulting in fine-grained control at the expense of increased complexity and port count.
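The "approximately 50Mb/s" and "approximately 40Gb/s" figures quoted above follow directly from the STS-1 base line rate of 51.84 Mb/s, of which every higher SONET rate is an integer multiple. A quick check of the common levels:

```python
STS1_MBPS = 51.84  # SONET STS-1 base line rate in Mb/s

# Common SONET levels are N x STS-1; STS-48 ~ 2.5 Gb/s, STS-192 ~ 10 Gb/s, STS-768 ~ 40 Gb/s.
for n in (1, 3, 12, 48, 192, 768):
    print(f"STS-{n:<3} = {n * STS1_MBPS:9.2f} Mb/s  (~{n * STS1_MBPS / 1000:.2f} Gb/s)")
```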

2.2.1 Shortcomings of SONET

The success of SONET has been largely due to the comprehensive functionality of the additional control information carried along with the frame [1]. This overhead includes functions to manage performance, faults, configuration and security, but has a significant drawback: to control the network, this overhead, and therefore each frame, needs to be read at each node. This means that each frame must be received in the optical domain, converted to electrical form and then retransmitted in the optical domain. This process is called Optical-Electronic-Optical (OEO) conversion. In addition to these conversions at every node, several electrical regenerators may need to be placed between each node to restore the output signal level, reshape the pulses and retime the signal. As a consequence, high speed SONET, such as STS-768, is prohibitively expensive due to the large number of high speed OEO converters required. An important side-effect of OEO conversion is that the process is code, protocol and timing sensitive. In addition, SONET does not have any signaling mechanisms for dynamic creation of circuits. The combination of these characteristics results in provisioning and upgrading being extremely complicated and lengthy processes, often taking up several weeks or even several months [104]. The coarse granularity of SONET may also cause significant inefficiency. For example, a customer can only upgrade from an STS-48 (2.5Gb/s) to an STS-192 (10Gb/s) if more capacity is required. Another significant inefficiency is due to the Time Division Multiplexing (TDM) nature of SONET. Even if only a fraction of the capacity is being used to transmit useful data, the excess capacity is not available to other users. Each connection is logically circuit switched and therefore aggregating many connections gives no multiplexing gain. As a consequence, SONET networks must be dimensioned with respect to peak load for each of its composite streams. Furthermore, SONET is a single wavelength technology. Given that more than 256 wavelengths can be used simultaneously on a single fiber, this limitation has forced the rapid development of alternative network infrastructures and protocols.
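A toy calculation, with invented numbers, illustrates the multiplexing gain that peak-rate TDM provisioning forgoes: if many bursty streams are each reserved at their peak rate, the reserved capacity is far larger than what a shared (packet-style) link would need to carry the same traffic at a high confidence level.

```python
import random

random.seed(1)

# 100 bursty streams, each averaging 100 Mb/s but peaking at 1 Gb/s (assumed values).
n, peak, mean = 100, 1.0, 0.1   # rates in Gb/s

tdm_capacity = n * peak         # circuit/TDM style: reserve every stream's peak -> 100 Gb/s

# Crude packet-style estimate: sample the aggregate of on-off streams, take a high percentile.
samples = []
for _ in range(10_000):
    active = sum(1 for _ in range(n) if random.random() < mean / peak)  # each stream 'on' 10% of time
    samples.append(active * peak)
shared_capacity = sorted(samples)[int(0.999 * len(samples))]            # 99.9th percentile of demand

print(tdm_capacity, shared_capacity)   # e.g. 100.0 Gb/s reserved vs roughly 20 Gb/s shared
```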

2.3 Future Optical Networks

In the previous Section, we showed that SONET is unsuitable for future, high speed networks for a multitude of reasons. Instead, what is required is a set of protocols and associated network infrastructure that overcome the problems with SONET, yet do not introduce significant new problems themselves. More precisely, the ultimate research goal is the development of a new scheme that does not require extensive OEO conversions, can be rapidly provisioned and upgraded, is independent of payload data formats, uses bandwidth efficiently and, most importantly, can scale to large numbers of wavelengths per fiber. This scheme will be most useful in cases of high levels of aggregation of users and is therefore of greatest importance in the network core. Current research focuses on three main technologies that solve most or all of the above problems: Optical Circuit Switching (OCS), Optical Burst Switching (OBS) and Optical Packet Switching (OPS). Extrapolating the historical trend from circuit switched voice to current packet based data networks, we believe that high speed optical networks will follow a similar trend, beginning from the circuit paradigm of OCS and eventually evolving to the packet-like technologies of OPS. This view is mirrored in several other publications, including [39, 53, 187, 193]. While OCS implementation is well underway, OPS is still very much a long-term research area due to current optical technology limitations. OBS, on the other hand, has many operational testbeds, including two the author has personally visited as an academic visitor at the Information and Communication University in South Korea and the University of Tokyo in Japan [171], yet is still far away from standardization. This combination of protocol immaturity and implementation existence enables research in this field potentially to have a large impact; accordingly, this dissertation concentrates on OBS.

2.3.1 Enabling Technologies

The key problem in scaling up today's optical networks over several orders of magnitude is that there is a fundamental limit on the number of OEO conversions within a feasible network infrastructure.


The development of all optical amplification, optical switching and wavelength conversion has removed much of the burden of OEO conversion, restricting such conversions to the edge of the network. In this section, we describe the historical technical progression in these technology areas and the subsequent impact on all-optical network development.

All Optical Amplifiers

In the past, to overcome optical signal degradation as it travels down the fiber and through network components, optical signals needed to be electronically regenerated. This process involves receiving the weakened optical signal, converting it to electronic form, then amplifying, retiming and reshaping this electronic signal, which is then used to drive a laser that outputs a restored version of the optical signal. As mentioned above, such conversions are expensive, signal format dependent and require large amounts of power. The development of the Erbium-Doped Fiber Amplifier (EDFA) in the late 80s drastically reduced the need for electronic regeneration. This device is capable of amplifying many wavelengths simultaneously, yet is insensitive to bit-rates, modulation formats and power levels. Further research in dispersion compensation, Fiber Bragg Gratings and new fiber types that reduce non-linearities has led to current experimental links carrying hundreds of channels at 10Gb/s with the distance between electrical regenerators in the order of thousands of kilometers [157].

All-optical Switches

Now that regenerators could be removed from optical links, switching nodes became the electronic bottleneck. If no conversion of a data stream to electronic form occurs within a switching element, this element is called an all-optical switch. Furthermore, due to optical technology constraints, data within optical signals can not be read in the optical domain. Therefore all-optical switching from input wavelength to output wavelength is called transparent switching, in contrast to opaque switching where conversion to the electronic domain is required for the switching process. Assuming control information can be separated from the main data signal and received electronically at each node, the required functionality of an all-optical switch is simply being able to transparently switch a given input wavelength on a given input fiber to a desired output wavelength on a desired output link. Several technologies that achieve this goal have been recently developed, including micro-electro-mechanical switches (MEMS). This technology is already employed in commercially available switches such as Lucent's LambdaRouter, which was recently sold to Japan Telecom to connect major metropolitan areas across Japan^1. MEMS consists of an array of tiny mirrors that move when an electrical current is applied. By adjusting the tilting angle of one or more mirrors, optical signals can be switched from input to output fibers. 3D MEMS is an extension of this technique in which mirrors are positioned in a three dimensional matrix and rotate on two axes, enabling mappings between a much greater number of input and output ports [142]. Calient Networks' DiamondWave PXC photonic switch is an example of currently available switches utilizing 3D MEMS technology to achieve 256x256 switching capacity^2. Researchers at Lucent believe that multi-thousand port fabrics appear to be physically realizable, with the potential of switching capacity 2000 times greater than that of currently struggling electronic fabrics [26]. Furthermore, the average loss experienced by a MEMS switch is extremely low. For example, Lucent's LambdaRouter averages around -1.25dB loss [26]. There are also several other all-optical switching technologies under development using thermo-optical waveguides [164], bubbles of fluid [72], Semiconductor Optical Amplifiers (SOAs) and electro-optic lithium niobate (LiNbO3) [82]. The latter two are capable of switching times in the nanosecond range; however, SOAs add significant amounts of noise to optical signals, while LiNbO3 switches cause approximately 8dB of loss, limiting their scalability. In addition, both of these fast technologies are polarization sensitive [62]. Bidirectional switches using fiber Bragg gratings and optical circulators are also in development for use in future all-optical ring networks [201].

^1 http://www.lucent.com/press/1001/011009.nsa.html
^2 http://www.calient.net/files/DW PXC May2004.pdf

Wavelength Conversion

Wavelength conversion enables optical signals to be switched between different input and output wavelengths. The case where any input wavelength can be mapped to any output wavelength is denoted as full wavelength conversion, and the case where switching is limited to a subset of output wavelengths is called partial wavelength conversion. The simplest way to achieve full wavelength conversion is to convert the optical signal to electronic form and use that signal to drive a tunable laser [76]. Achieving this in the optical domain is considerably more difficult. Many different techniques to achieve this goal have been developed, leveraging Cross Gain Modulation (XGM) [184], Cross Phase Modulation (XPM) [61] and Four-wave Mixing (FWM) [209]. All three techniques increase the intensity of the signal within an SOA and then leverage the non-linear properties of optical devices and fibers to change the output wavelength of the signal.

XGM leverages the dependence of the SOA gain on the input power. This dependence happens on an extremely fast time scale and therefore tracks the input signal bit-by-bit. Due to this effect, insertion of a lower power probe at frequency fp will result in the probe experiencing low gain for an input bit of 1 and high gain for an input bit of 0, essentially reproducing the signal on another wavelength. This method has the disadvantage of introducing significant amounts of pulse distortion and phase changes in the output signal due to rapid changes in the refractive index of the SOA caused by carrier density changes.

XPM uses this change in carrier density to modulate the phase of the probe and thereby induce wavelength conversion. This phase change is converted to the required intensity modulation by splitting the probe signal inside a Mach-Zehnder interferometer. Compared to XGM above, the use of an interferometer greatly improves the quality of the output signal [181].

FWM refers to a third-order non-linearity in optical fibers in which three optical waves of frequencies fi, fj and fk produce an additional wave of frequency fi + fj - fk. In the case of wavelength conversion, this non-linearity can be leveraged to produce any desired output wavelength. Let the input signal frequency be fi and the probe frequency be fp. In this case, FWM will produce signals with frequencies of 2fi - fp and 2fp - fi. This non-linear effect is amplified due to the intensity of the signals within the SOA; however, the greater the difference in frequency, the less efficient the conversion from pump energy to output energy. Therefore, only limited wavelength conversion is possible using FWM. For example, an experiment in 1998 could only achieve a conversion range of 3.2nm using 100Gb/s input signals [181]. A wider range is possible for slower bit rate input signals: 24.6nm for 40Gb/s and over 80nm for 2.5Gb/s [181]. It has also been shown recently that applying an assist beam with a short wavelength can greatly improve the conversion efficiency and reduce the receiver power penalty, thus enabling lower gain, commercial SOAs to be used [97].

Optical wavelength converters are still an immature technology and most companies that were developing them, including Genoa, Luxcore Networks, Opto Speed SA and Optivation Inc., have switched to less risky products or gone bankrupt. Despite this downturn, research and development continues in companies such as Kamelian Ltd, which is currently selling XPM wavelength converters. Assuming this technology becomes commercially available, optical signals will no longer be constrained to traversing the network on a fixed wavelength. In the following Sections, we show that this added flexibility will greatly reduce blocking when setting up circuit switched connections [194] and also reduce loss due to contention in OBS and OPS.
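To make the degenerate FWM relation quoted above concrete, the short sketch below computes where the two mixing products land for an illustrative signal/probe pair; the specific wavelengths are invented for the example, and, as noted above, only a limited conversion range around the inputs is efficient in practice.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def fwm_products(signal_nm, probe_nm):
    """Degenerate four-wave mixing: new waves at 2*f_signal - f_probe and 2*f_probe - f_signal.
    Returns their wavelengths in nm."""
    f_s = C / (signal_nm * 1e-9)
    f_p = C / (probe_nm * 1e-9)
    return tuple(C / f * 1e9 for f in (2 * f_s - f_p, 2 * f_p - f_s))

# Example: a 1550 nm input signal mixed with a 1552 nm probe
print(fwm_products(1550.0, 1552.0))  # products near 1548 nm and 1554 nm
```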

2.4 Optical Circuit Switching

It is widely accepted, for the reasons described above, that future high speed networks will need to be able to switch optical signals at intermediate nodes without conversion to the electrical domain. In this case, one main function that needs to be present within the network is the ability to determine the appropriate output ports for the corresponding input ports. This leads to two separate approaches: circuit switching or packet switching. The former reserves capacity from an ingress to an egress node, statically configuring the switch matrix for the length of the connection, while in the latter the switch matrix may be dynamically reconfigured for each incoming packet. In this Section we give an overview of the development of circuit switching in the optical domain.

2.4.1 Lightpaths and Wavelength Routing

The ultimate goal of all-optical circuit switched networks is to construct an end-to-end path by reserving a wavelength, or set of wavelengths, to enable the optical signal to travel through the network without OEO conversions [108]. This process is often referred to as Wavelength routing and each connection is denoted as a Lightpath [44]. Key problems in Optical Circuit Switching research concentrate largely on control mechanisms for the creation and dynamic management of circuits.

When creating circuits within an all-optical network, wavelengths at each link need to be reserved and the switches informed of the input-output mappings. Choosing which wavelengths to reserve on which links is traditionally called the Routing and Wavelength Assignment (RWA) problem. Assuming full wavelength conversion, RWA is identical to allocating circuits in non-optical networks, a problem that has been researched for decades. However, in the case of limited or no wavelength conversion, allocating optimally is extremely difficult. Indeed, it has been shown that, in the case of no wavelength conversion, minimizing the number of wavelengths needed to carry all demands, even in the case of static demand, is NP-complete [44]. Although several heuristics have been suggested, including longer-paths-first [44], minimizing the number of wavelengths can still result in huge numbers of wavelengths. To avoid this, the basic problem can be reformulated by bounding the maximum number of wavelengths while changing the objective function to the maximization of carried traffic [156], minimizing the number of ports required [133] or minimizing the bottleneck link utilization [119]. In the case of variable traffic demand, the RWA problem becomes even more complex. Most approaches decouple the routing and wavelength assignment problems, first selecting a route for the circuit, then selecting a wavelength along that route, or several wavelengths in the case of limited or full wavelength conversion [44]. Heuristics used to choose the wavelengths include First-Fit, which selects the first available wavelength from an arbitrarily indexed list of all wavelengths [44], Most-Used, which selects the available wavelength that is most used in the network [130], and Max-Sum, which selects the wavelength that maximizes the sum over all path capacities after the connection is established [169]. A comprehensive summary of the most popular RWA solutions can be found in [204].
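As a concrete illustration of the simplest of these heuristics, the sketch below applies First-Fit along a fixed route, assuming no wavelength conversion so that the same wavelength index must be free on every link of the path; the link names and wavelength count are invented for the example.

```python
def first_fit_wavelength(route, free, num_wavelengths):
    """Return the lowest-indexed wavelength free on every link of `route` and reserve it,
    or return None if the request must be blocked. `free[link]` is a set of free indices."""
    for w in range(num_wavelengths):
        if all(w in free[link] for link in route):
            for link in route:            # reserve the chosen wavelength along the whole path
                free[link].remove(w)
            return w
    return None

# Example: a 3-link route in a 4-wavelength system
free = {"a-b": {0, 1, 2, 3}, "b-c": {1, 2, 3}, "c-d": {0, 2, 3}}
print(first_fit_wavelength(["a-b", "b-c", "c-d"], free, 4))  # -> 2, the first index free on all links
```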

Furthermore, in order to efficiently utilize the network, circuits themselves need to be efficiently utilized. While high levels of aggregation in the core ensure that individual user changes do not significantly influence the short-term traffic statistics, changes in aggregate network use due to changes in content popularity, network capacity and demand for content may require reallocation of network resources even at short time scales. For example, during working hours, a company may require a high capacity circuit but would prefer not to pay for this capacity during the night when it is not in use. Similarly, if a group of servers suddenly becomes extremely popular, a service provider may want to create more circuits to carry the extra traffic. One extreme case of this phenomenon in recent times was the release of the Kenneth Starr report on Mr Clinton in September 1998, which caused severe network congestion at media websites [155] and over 20 million attempted connections through the MCI Interconnect Gateway linking the White House to the Internet^3.

^3 http://www.wired.com/news/technology/0,1282,14981,00.html

2.4.2 Peer and Overlay Models

The interaction between different layers in a protocol stack is also very important in the context of routing optical signals. There are two main interconnection models between the network, or IP, layer and the lower link and physical layers: overlay and peer [21]. In the peer model, each optical switch is an IP-addressable entity and the optical and network layers share common routing and signaling protocols. At the other extreme, the optical and network layers in the overlay model are completely disconnected and only exchange information through a clearly defined network interface (NI). Dynamic routing algorithms for circuit establishment and blocking performance in both overlay and peer models are presented in [206].

2.4.3 Out of Band Control

As seen above, choosing which wavelengths and links to dedicate to a circuit is a complex problem. In addition, once the path has been decided, the corresponding switches need to be informed in order to implement this decision. Unlike traditional SONET networks, the control information can no longer be sent in conjunction with the data, due to the transparent nature of all-optical switches. Such complex information must be converted to electronic form for processing, but converting every optical signal just to extract control information has previously been shown not to scale well. Sending the control information on a separate channel, or Out-Of-Band (OOB), solves this dilemma, with only one wavelength needing to be converted instead of all wavelengths. Furthermore, if the overhead required by the control information is small in relation to the circuit capacity, this control channel can be significantly slower, with correspondingly low OEO cost. The presence of an OOB control channel enables much of the functionality that was tightly embedded into legacy protocols, such as SONET, to be reintegrated into the network. Technologies such as Multiprotocol Lambda Switching (MPλS) [17] and Automatically Switched Optical Network (ASON) [2, 3] are developing standards that aim at defining the content and operation of these control messages over all-optical networks. It is expected that the major telecommunications companies are likely to adopt ASON, an International Telecommunications Union (ITU) standard [84], over MPλS, which was developed by the Internet Engineering Task Force (IETF). However, recent improved communication between the two groups has helped remove serious signaling incompatibilities^4, making a combined solution more likely for future network implementations.

^4 ftp://sg15opticalt:otxchange@ftp.itu.int/tsg15opticaltransport/COMMUNICATIONS/liaisons/TD77 3 Annex E.html

However, as the number of wavelengths per fibre and the associated number of lightpaths required to be managed grows, the ability of circuit switching to scale is questionable. In Chapter 5 we show that traffic-aware routing algorithms easily become unstable. A recent investigation into robustness and the Internet also suggests that complex systems, including the modern Internet, exhibit emergent phenomena: empirical discoveries that come as a complete surprise, defy conventional wisdom and cannot be explained within the framework of mathematical models due to their reliance on the large scale nature of the system [183]. Given that once the circuit, or lightpath, has been established it is very difficult to change either the routing or the wavelengths used along the path without significant disruption, it is very important to choose initial values carefully. In today's large networks, this system optimization is largely done by human traffic engineers due to fear among network operators that automated solutions will possibly be unstable in practice, yielding both sub-optimal performance and poor reliability, a fear grounded in unsuccessful experiments with adaptive routing in the ARPAnet [111]. Guaranteeing stability for complex ASON-style networks may prove to be particularly difficult.

Furthermore, circuit switching is burdened by a fundamental problem. Circuits, by definition, need to be provisioned for peak traffic intensity levels if loss is to be bounded over short to medium time scales. Therefore, in non-peak periods, much of this allocated capacity may be unused, yet unavailable for other circuits in the network. To achieve useful levels of statistical multiplexing through capacity sharing, some type of packet switching must be used.

2.5 Optical Packet Switching

Circuit switching is inherently inefficient given time-varying traffic intensity, as the capacity reserved by the circuit is not shared. This loss in statistical multiplexing capacity was the main motivation behind the development of packet based networks in the electrical domain and may cause a similar paradigm shift within the optical domain. Ultimately, cost is the determining factor in the choice of network protocols. For example, during the period from the 1960s to 1995, the cost per bit of leased lines decreased very slowly, halving every 79 months [159], much more slowly than the decrease in computer cost. Therefore, adding computing to the network in the form of packet switching functionality was seen to be economically advantageous. Currently, the converse is true: the development of WDM has led to a major decrease in the cost of long haul networks, estimated in 2000 to halve every 12 months [159]. However, the bandwidth of fiber is finite, albeit large. Accordingly, future networks may once more be bandwidth constrained, such that efficient use of the network in the form of packet switching may become a financially attractive option.

The key difference between packet switching and circuit switching is that in the former, the routing of data is determined by the label or header of a discrete group of bits, while the latter simply maps an input port to an output port. As packets are routed individually, many packets from different sources, going to different destinations, can share a common wavelength, leading to potentially high levels of statistical multiplexing and associated efficiency gains. There are three main limitations in optical packet switching that are not present in the electronic equivalent: the lack of Random Access Memory (RAM) for buffering [189], the lack of sophisticated optical processing, and relatively slow switching speeds.

2.5.1 Optical Buffering

It is currently impossible both to store an optical signal indefinitely and to randomly access stored optical signals. In electrical packet switches, to avoid contention between packets arriving at similar times and destined for the same output link, packets can be queued for later transmission when the corresponding output link becomes free. In optical packet switches, such queuing of packets is not currently possible. Although there have been some promising discoveries, such as the chiropticene switch [162], optical RAM is still in the early stages of development and may never be achievable.

A limited form of buffering is achievable in the optical domain: optical signals can be delayed by a fixed time period by sending them down an optical fiber that loops back to the input port. Such loops are called Fiber Delay Lines (FDLs). The delay is simply the length of the loop divided by the speed of light; for example, 3 km of fiber would give an approximate delay of 10 μs, or approximately the time taken for 10 packets of 1.5 kB to be transmitted on a 10 Gb/s link. However, 3 km is quite a lot of fiber to install on every output port, and to achieve variable delays many different length FDLs must be used, adding to the complexity. Maintaining temperature stability is also difficult across long sections of fiber [138]. To illustrate the extent of the problem, early switch designs required more than 40 FDLs for a packet loss probability of 10⁻¹⁰ at a load of 0.8, assuming no wavelength conversion [93].
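As a quick sanity check on these figures, the short sketch below reproduces the arithmetic with the example values from the text; the propagation speed used is the free-space speed of light, matching the approximation above, and is an assumption rather than a measured fibre parameter.

```python
# Illustrative FDL arithmetic (example values from the text above).
C = 3e8                  # propagation speed assumed in the text's approximation [m/s]
LOOP_LENGTH = 3_000      # fibre delay line length [m]
LINE_RATE = 10e9         # link rate [bit/s]
PACKET_BITS = 1_500 * 8  # 1.5 kB packet [bits]

delay = LOOP_LENGTH / C                       # time spent in the loop [s]
packets_buffered = delay * LINE_RATE / PACKET_BITS

print(f"FDL delay: {delay * 1e6:.1f} us, "
      f"~{packets_buffered:.0f} packets of 1.5 kB at 10 Gb/s")
```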

To reduce the amount of fiber needed to implement FDLs, recirculating loops can be used, such that the delay is additionally multiplied by the number of circulations. Additional optical amplification is also required to compensate for losses incurred on each circulation, leading to increased levels of Amplified Spontaneous Emission (ASE) noise. One efficient way around this limitation is to use non-circulating, or travelling-type, FDLs that are shared amongst all the input packet streams through wavelength conversion [207], or a combination of recirculating and travelling-type FDLs [208].


2.5.2 Packet Header Processing

Packets contain information that enables them to be correctly routed through a network. Ideally, this information would be read in the optical domain; however, although all-optical header processing has been an active area of research for over 10 years, much more progress is required for this technology to become feasible [62]. In a similar way to Optical Circuit Switching, header information can be sent such that it can be converted to electronic form independently of the associated data. The two main methods currently used in OPS are (i) appending a serial header at a lower bit rate and (ii) subcarrier multiplexing the header with the baseband payload such that the header occupies a spectrally higher position. However, with both methods, OEO conversions are still required at each hop at every wavelength as the control information is carried with the packet, or In-Band.

2.5.3 Switching Speed Limitations

Assuming packet sizes continue to be limited to around 1.5 kB due to Ethernet framing size, switching has to be in the nanosecond range. On the extreme end of the scale, packets can be as small as 44 bytes, with an associated transmission time of about 35 ns at 10 Gb/s. Most of the technologies discussed in Section 2.3.1 are several orders of magnitude too slow, with the possible exception of SOA and LiNbO3 approaches, both of which are still very immature technologies [62] and require some form of synchronization at the input of the switch matrix [138]. One approach to reducing the switching overhead is to aggregate packets to form sufficiently large packets such that the switching time becomes a negligible fraction of the packet transmission time. For example, a packet size of 1.5 MB takes approximately 1 ms to send at 10 Gb/s. In the following Section, we introduce a technology that aggregates packets at the edge of the network into distinct groups of packets, called bursts.

2.6 Optical Burst Switching

It became apparent in the late 1990s that Optical Packet Switching required much more advanced optical technology than was either currently available or even being developed at the time [174]. In order to leverage the potential multiplexing gains offered by many-wavelength WDM systems (while taking into account the limitations of current technology, especially slow switching times and the need for electronic header processing), a practical approach combining out-of-band electronic control with all-optical switching of data was suggested. Optical Burst Switching (OBS) was the name given to this merging of Optical Circuit Switching and Optical Packet Switching technologies. In the remainder of this Chapter, we describe the main features of OBS and identify critical problems.

2.6.1 Burst Assembly

Semiconductor Optical Amplifier (SOA) and electro-optic lithium niobate (LiNbO3) all-optical switches are capable of switching times in the nanosecond range but have serious problems as discussed in Section 2.3.1. Assuming these problems will not be quickly overcome, the time required to reconfigure an all-optical switch matrix is a significant fraction of, or even more than, the time taken to transmit an IP packet. Therefore, to achieve useful levels of efficiency, packets must be aggregated at the edge of an OBS network. The node where packets are aggregated is called an ingress node. After being switched through the OBS network, successfully received bursts are disaggregated into packets. The final node is called an egress node. A sample path in an OBS network is shown in Figure 2.1.



Figure 2.1: Example Path (2-7-9-12) in OBS Network

2.6.2 Path Reservation

In order to achieve statistical multiplexing gains, the entire capacity of a network must remain unsegmented such that there is a single pool of unused bandwidth that is universally available. In the case of circuit switching, any unused bandwidth in a circuit is inaccessible to other circuits and therefore bursty traffic distributions result in very low utilization of the network. Early burst switching technologies, called Tell-and-Wait (TaW) and Tell-and-Go (TaG), were developed in the early 1990s to reduce this inefficiency in Asynchronous Transfer Mode (ATM) networks [182]. Both Tell-and-Wait and Tell-and-Go attempt to reserve a short-term circuit to deliver a burst of cells such that network capacity can be shared and subsequent multiplexing gains achieved. Similar random access techniques can be found in a diverse range of networking fields, such as ALOHA in wireless systems and CSMA-CD in low speed wired Ethernet.

TaW sends a short request message that attempts to reserve bandwidth at each switch in the path. If the reservation is successful, an acknowledgement (ACK) is sent from the final node to the origin of the request message and the burst is immediately sent on receipt of this ACK. If a reservation cannot be made at any of the nodes in the path, a Negative Acknowledgment (NACK) is returned to the origin of the request message along the reverse path and previously made reservations are freed. TaG, on the other hand, does not reserve any bandwidth in advance and sends the burst whenever it is ready. Upon arrival of the header at an intermediate node in the path, capacity on the corresponding output link is reserved, given that sufficient capacity is available. In the case that sufficient capacity is not available, the burst is discarded and only the header forwarded to the final node, which then returns a NACK. The performance of these two protocols depends on the propagation delay of the path and the size of the burst. For large propagation delays with respect to the burst size, TaG outperforms TaW and vice-versa.

In the late 1990s, the ideas behind TaG were applied to all-optical networks and renamed Terabit Burst Switching (TBS) [174]. Path reservation in TBS differed from TaG only in that the burst header was sent OOB on a separate control channel and the burst was buffered in FDLs while the switching fabric was configured. ATM was chosen as the standard format of the control channel and the short request messages were subsequently renamed Burst Header Cells. Around the same time, another protocol was independently developed that does not require any optical buffering at intermediate nodes. This path reservation scheme was named Just-Enough-Time (JET) [195] and has since become the de-facto path reservation standard in what is now called Optical Burst Switching, a term introduced by the authors of JET [152, 153]. The notational dependence on ATM was also removed, with Burst Header Cell renamed to Burst Header Packet (BHP), mirroring a general trend at that time away from ATM and towards Internet Protocol (IP) networks.

JET removes the need for all-optical buffering at any node in the path by including timing information in the BHP and incorporating a delay between the sending of the BHP and the burst, such that the burst can be buffered in electronic form. On arrival at an intermediate node, the BHP attempts to make a reservation on the corresponding output link and wavelength for an interval of time in the future. This interval begins at the time that the burst will arrive at that node and ends immediately after the complete burst has been switched through the node. The delay between the sending of the BHP and the actual burst is called an offset time. This delay must be at least as long as the maximum BHP processing time taken at each intermediate node multiplied by the hop-count of the complete path, or if this is unknown, the diameter of the network can be used as an upper bound. If the offset time is too short, the burst may catch up with the BHP and subsequently be incorrectly switched at downstream nodes. Figure 2.2 shows an example timeline of an OBS burst transmission. Offset times also affect the priority of a burst. JET reservations are made on arrival of the BHP, therefore BHPs that arrive a relatively long time before their corresponding burst implicitly take priority over later arriving BHPs. This characteristic has been used in several papers, including [59, 179, 196], to prioritize certain bursts and thereby realize different levels of QoS.
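As a simple illustration of the offset-time rule just described, the following one-liner computes the minimum offset; the per-hop BHP processing time and hop count below are arbitrary example values, not figures from this thesis.

```python
def minimum_offset(bhp_proc_time_per_hop, hop_count):
    """Smallest JET offset that prevents the burst overtaking its BHP (sketch)."""
    return bhp_proc_time_per_hop * hop_count

# e.g. 10 us of BHP processing per node over a 5-hop path -> 50 us offset
print(minimum_offset(10e-6, 5))
```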

2.6.3 Contention Resolution

If a reservation cannot be made at an intermediate node, contention is said to occur and the BHP and corresponding burst must be dropped if the contention cannot be resolved. Therefore, effective contention resolution is critical to restrict packet loss in OBS networks to reasonably low levels.


Figure 2.2: Example Timeline of an OBS Burst Transmission.

There are four main methods for resolving contention:

1. Wavelength conversion: on contention, try to make a reservation on a different output wavelength on the desired output link.

2. Fiber Delay Line (FDL): on contention, try to make a reservation on the desired output wavelength on the desired output link, but at a different time.

3. Deflection routing: on contention, try to make a reservation on the desired output wavelength on a different output link.

4. Preemption: on contention, remove the contending reservation, then make a reservation on the desired output wavelength on the desired output link.


The use of wavelength conversion and FDLs has been comprehensively researched and both have been found to be extremely helpful in reducing loss in OBS networks, but both have significant disadvantages. FDLs are bulky, add delay and require more complex reservation algorithms, and wavelength conversion is currently an immature and expensive technology [41]. Deflection routing and preemption are techniques that have often been overlooked due to perceived complexity and instability at high loads. In Chapter 5 we develop a novel algorithm called Shortest Path Prioritized Random Deflection Routing (SP-PRDR) that combines wavelength conversion, deflection routing and preemption contention resolution schemes to significantly reduce blocking rates in OBS networks, yet maintains stability.

2.7 OBS Loss Calculation

In this Section, we describe a method to calculate loss in OBS networks that was first introduced in [210]. This method can be used to compute the loss probability on a single link of an OBS network in the absence of any of the performance enhancing algorithms introduced in this thesis, and therefore serves as a useful baseline from which to assess the potential gains of each algorithm.

2.7.1 On-Off Source Model

We consider a finite number of sources, NM, and a finite number of sinks, M. Each source generates an independent data stream destined for a unique sink. All traffic destined for a common sink is buffered together, such that there are M independent buffers, each buffering N streams. Each stream has a Poisson distributed packet arrival rate of λ_p with a fixed service rate of μ_p. Every T seconds, the corresponding buffered packets are aggregated into a burst and passed on to the scheduler for possible transmission; therefore the burst arrival rate, λ_B, is a function solely of the timer value T. The burst service rate, μ_B, also depends on the statistics of the input traffic:

λ_B = 1/T,   (2.1)

μ_B = μ_p / (N T λ_p).   (2.2)

The burst input load is then

ρ_B = λ_B / μ_B = N λ_p / μ_p = N ρ_p.   (2.3)

Figure 2.3: On-off model mean values.

This model can also be represented using on and off period distributions, as shown in Figure 2.3. In this case, the inverse of the average on period is equal to μ_B, while the inverse of the average off period is equal to 1/(T - 1/μ_B).

2.7.2 Engset Model

The above on-off source model is most often used to calculate blocking probability using the Engset loss formulae [12]. In this case, λ is the Engset arrival rate per free customer, μ is the service rate, M is the limited number of customers and K is the number of servers in the system. It was shown in [210] that this model overestimates the loss of an OBS system due to the fact that, within the Engset framework, a blocked customer stays free and keeps attempting at the same intensity. This is not the case in basic OBS, as bursts are dropped upon contention and therefore behave as if they are being served by a dummy server.


Figure 2.4: Simplified Single-link Topology

2.7.3 2D Markov Chain Framework

For the remainder of this Section, we explore a more accurate model that uses a two dimensional Markov chain approach to model an OBS single link topology with K output wavelengths, as shown in Figure 2.4. In the case of a practical multi-link OBS network, this topology can be interpreted as generalizing the behaviour of the output of an optical burst switch within the network. There are three types of customers: (1) busy (bursts that are being transmitted), (2) free (unused input link), and (3) blocked (bursts that are being dumped). Let π_{i,j} be the steady state probability where i (0 ≤ i ≤ K) is the number of busy customers and j (0 ≤ j ≤ M-K) is the number of frozen customers (sources who transmit blocked bursts). To ensure that the steady state of the Markov chain exists, we restrict the values of N, λ_p and μ_p such that ρ_p < 1.


We now have the following steady state equations:

[(M-i-j)λ_B + (i+j)μ_B] π_{i,j} = (M-i+1-j) λ_B π_{i-1,j} + (j+1) μ_B π_{i,j+1} + (i+1) μ_B π_{i+1,j},   (2.4)

and

[(M-K-j)λ_B + (K+j)μ_B] π_{K,j} = (M-K+1-j) λ_B π_{K-1,j} + (j+1) μ_B π_{K,j+1} + (M-K+1-j) λ_B π_{K,j-1},   (2.5)

and a normalization equation,

Σ_{i=0}^{K} Σ_{j=0}^{M-K} π_{i,j} = 1.   (2.6)

The input load is given by

T_o = Σ_{i=0}^{K} Σ_{j=0}^{M-K} (M-i-j) ρ_B π_{i,j},   (2.7)

the carried load is given by

T_c = Σ_{i=0}^{K} Σ_{j=0}^{M-K} i π_{i,j},   (2.8)

and the blocking, or equivalently loss, probability is obtained by

B = (T_o - T_c) / T_o.   (2.9)

2.7.4 Computational Methods

On inspection, the above formulae do not depend on both λ_B and μ_B, but only on the ratio λ_B/μ_B = ρ_B. Equations 2.4 and 2.5 can therefore be rewritten as:

[(M-i-j)ρ_B + i + j] π_{i,j} = (M-i+1-j) ρ_B π_{i-1,j} + (j+1) π_{i,j+1} + (i+1) π_{i+1,j},   (2.10)

and

[(M-K-j)ρ_B + K + j] π_{K,j} = (M-K+1-j) ρ_B π_{K-1,j} + (j+1) π_{K,j+1} + (M-K+1-j) ρ_B π_{K,j-1}.   (2.11)

Matrix Inversion

Equations 2.10 and 2.11 can be rearranged such that they can be represented by the matrix equation Ax = b, where x = [π_{0,0}, ..., π_{0,M-K}, π_{1,0}, ..., π_{1,M-K}, ..., π_{K,M-K}]^T and b = 0. The additional constraint from Equation 2.6 is included by substituting one of the rows of A with all ones and setting the corresponding element in b also to one. The solution for x, and the corresponding values of π_{i,j}, can then be found through matrix inversion:

x = A^{-1} b.   (2.12)

Iteration

Given the ranges of i and j, (0 ≤ i ≤ K) and (0 ≤ j ≤ M-K), the above matrix A is of size (K+1)(M-K+1) × (K+1)(M-K+1). As the values of M and K increase, inverting this matrix becomes slower and slower. For example, for M = 128, K = 64 and ρ_B = 1, A has more than 4000 rows and columns, and inversion in Matlab on a 2.8 GHz computer with 768 MB RAM requires several minutes of CPU time and an additional several hundred megabytes of RAM on top of the base Matlab requirements. The inversion cannot even be calculated for larger values due to memory constraints. Given that optical fibers have been shown to be able to carry over 256 independent wavelengths simultaneously, the matrix inversion method is inadequate for analyzing arbitrary OBS links. Instead, we use an iterative method from [167]. Note that Equation 2.10 can be rearranged in the form:

π_{i,j} = [(M-i+1-j) ρ_B π_{i-1,j} + (j+1) π_{i,j+1} + (i+1) π_{i+1,j}] / [(M-i-j)ρ_B + i + j],   (2.13)

with appropriate exceptions for the i = 0, i = K, j = 0 and j = M-K cases. Given initial values of π_{i,j}, iterative substitutions from i = 0 to K and j = 0 to M-K using Equation 2.13 determine new values for π_{i,j}. This process can be repeated until convergence is obtained. For calculations in this thesis, we found that initially setting π_{i,j} = 1 for all i, j, and exiting the iterative method when the maximum change in any of the π_{i,j} values within a given iteration loop was less than 10⁻⁸, yielded sufficiently fast convergence times and accurate results. Repeating the above example of M = 128, K = 64, ρ_B = 1 gave a result accurate to within 7 significant figures after less than one second of CPU time and used a negligible amount of additional RAM for the required 675 iterations.
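A minimal sketch of this iterative procedure is given below. It assumes out-of-range states are treated as zero, performs the in-place (Gauss-Seidel style) sweeps described above, and normalizes at the end using Equation 2.6; the function name and convergence constants are illustrative, not part of the original implementation. Calling, for instance, obs_loss_iterative(128, 64, 1.0) corresponds to the M = 128, K = 64, ρ_B = 1 example above.

```python
import numpy as np

def obs_loss_iterative(M, K, rho_B, tol=1e-8, max_iters=100_000):
    """Iteratively solve the 2D Markov chain of Section 2.7.3 and return the
    burst blocking probability from Equations 2.7-2.9 (illustrative sketch)."""
    pi = np.ones((K + 1, M - K + 1))          # initial guess: pi[i, j] = 1
    for _ in range(max_iters):
        max_change = 0.0
        for i in range(K + 1):
            for j in range(M - K + 1):
                # neighbouring probabilities, zero outside the state space
                below = pi[i - 1, j] if i > 0 else 0.0
                right = pi[i, j + 1] if j < M - K else 0.0
                above = pi[i + 1, j] if i < K else 0.0
                if i < K:
                    # interior states: Equation 2.13
                    num = (M - i + 1 - j) * rho_B * below + (j + 1) * right + (i + 1) * above
                    den = (M - i - j) * rho_B + i + j
                else:
                    # boundary i = K: blocked arrivals add the pi[K, j-1] term (Eq. 2.11)
                    left = pi[K, j - 1] if j > 0 else 0.0
                    num = ((M - K + 1 - j) * rho_B * below + (j + 1) * right
                           + (M - K + 1 - j) * rho_B * left)
                    den = (M - K - j) * rho_B + K + j
                new = num / den if den > 0 else 0.0
                max_change = max(max_change, abs(new - pi[i, j]))
                pi[i, j] = new
        if max_change < tol:
            break
    pi /= pi.sum()                            # normalization, Equation 2.6
    i_idx = np.arange(K + 1)[:, None]
    j_idx = np.arange(M - K + 1)[None, :]
    T_o = ((M - i_idx - j_idx) * rho_B * pi).sum()   # offered load, Eq. 2.7
    T_c = (i_idx * pi).sum()                         # carried load, Eq. 2.8
    return (T_o - T_c) / T_o                         # blocking probability, Eq. 2.9
```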

2.7.5 A Loss vs. Load Example

All the above formulae involve the parameter ρ_B. A more commonly used parameter is utilization, denoted by ρ. Unlike ρ_B, which represents the average on time divided by the average off time, ρ represents the average on time divided by the total time. Conversion between these two parameters is simple:

ρ = av. on-time / (av. on-time + av. off-time) = ρ_B / (1 + ρ_B).   (2.14)



Figure 2.5: Sample OBS Loss vs. Load Graph, M = 128.

As the number of sources, M, and the number of wavelengths, K, may be different, the output link load, often simply called the load, is given by the equation:

link load = ρ M / K.   (2.15)

An example result showing the loss probability for M = 128 and different values of load and output wavelengths (K) is shown in Figure 2.5. It can be seen from this graph that OBS has quite high loss even for moderate loads and large numbers of wavelengths.

2.8 Conclusion

In this chapter we presented a historical overview of competing optical paradigms over the last 10 years. We explored key protocols and defined the terminology of one such technology: Optical Burst Switching (OBS). It can be seen from the result shown in Figure 2.5 that standard OBS has quite high loss even for moderate loads and large numbers of wavelengths. In the following chapters of this thesis we measure the impact of this loss on higher level protocols such as the Transmission Control Protocol (TCP) and develop new techniques to reduce this loss and bring OBS towards feasibility.


Chapter 3 Burst Assembly


3.1 Introduction

In this chapter we examine burst assembly algorithms and explore a simulation based performance study of the two main algorithms: threshold-based and timer-based burst assembly. We show that the latter burst assembly algorithm may induce extremely high and deterministic levels of burst loss due to timing synchronicity, and present a new dynamic algorithm, Off Timer Burst Assembly (OTBA), that links burst size and burst injection times to input traffic intensity, introducing required randomness in the network to greatly reduce burst loss.

3.2 Burst Assembly Algorithms

Switching speeds are a significant bottleneck in all-optical networks. For example, a network operating at 10 Gbps per wavelength needs to switch at time scales in the order of nanoseconds to handle very small packets such as Transmission Control Protocol (TCP) acknowledgements, or even standard 1.4 kB IP packets. Current state of the art, commercially available all-optical switches have switching speeds of barely less than 1 ms, approximately one thousand times slower [90]. Given rapid technological advancement over the next few years, several orders of magnitude improvement may eventually enable packet by packet switching; however, in developing feasible medium term solutions, overcoming this bottleneck is critically important. One solution, used by OBS, is to aggregate, or assemble, packets into bursts at ingress nodes, such that the switching times, and associated guard band inefficiencies, are negligible with respect to the burst sizes. Continuing the previous example, a 10 MB burst takes several milliseconds to transmit at 10 Gb/s, such that a switching delay of several hundred microseconds is approximately ten percent of the burst's transmission time. The two main burst assembly algorithms are Threshold-based and Timer-based.
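The guard-band argument can be made concrete with a short calculation; the reconfiguration time below is an assumed value standing in for "several hundred microseconds", and the burst sizes are the example figures used in the text.

```python
# Illustrative guard-band arithmetic for the aggregation argument above.
LINE_RATE = 10e9            # link rate [bit/s]
SWITCH_TIME = 500e-6        # assumed switch reconfiguration time [s]

for burst_bytes in (1_500, 1_500_000, 10_000_000):   # IP packet, 1.5 MB burst, 10 MB burst
    tx_time = burst_bytes * 8 / LINE_RATE
    overhead = SWITCH_TIME / (SWITCH_TIME + tx_time)
    print(f"{burst_bytes / 1e6:6.3f} MB: transmit {tx_time * 1e3:6.2f} ms, "
          f"switching overhead {overhead:5.1%}")
```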

3.2.1 Threshold-based Burst Assembly

The threshold-based burst assembly algorithm requires a single, configurable parameter, B_th^i [bytes], for each queue i. This parameter corresponds roughly to the length, in bytes, of each burst sent from that queue [176]. As packets arrive in their corresponding queues, the aggregate size of the assembled packets is monitored for each queue. If an arriving packet causes this aggregate size to exceed B_th^i for queue i, the burst is scheduled to be sent and the packets comprising this burst are immediately removed from queue i. It is not specified whether the arriving packet triggering this event is appended to the outgoing burst or left in the queue; however, for commonly used values of B_th^i, the difference between these two cases is negligible. This algorithm offers no delay guarantees. Under low input offered load, the scheme may need to wait for a long period of time until B_th^i is reached. However, under high input offered load, the threshold will be quickly reached, minimizing delay.



Figure 3.1: Sample path for Threshold Burst Assembly Model

In summary, this algorithm produces fixed sized bursts with random inter-departure times. Usually the threshold value is constant for all input queues and is commonly denoted by B_th. A sample path for a single queue is shown in Figure 3.1.

3.2.2 Timer-based Burst Assembly

In a similar way to the above threshold-based algorithm, timer-based algorithms also require a single, configurable parameter. The parameter for each queue is a time period, T^i [s], corresponding roughly to the inter-departure times of the bursts for queue i. In timer-based assembly, a timer for each queue is started at the initialization of the system and immediately after the previous burst for that queue is scheduled to be sent [202], or in some versions, immediately after the first packet arrives after the queue has been emptied [58, 80]. At the expiration of this timer (after time T^i) the burst assembler generates a burst containing all the packets in the buffer at that point. With respect to delay, the performance of this scheme is the converse of the threshold-based scheme described above. Under low input offered load, timer-based burst assembly guarantees a fixed maximum delay. However, under high input offered load, it may generate bursts that are quite large, perhaps unnecessarily increasing delay. In summary, this algorithm produces randomly sized bursts with fixed inter-departure times. Usually the timer value is constant for all input queues and is commonly denoted as T.



Figure 3.2: Sample path for Timer Burst Assembly Model

A sample path for a single queue is shown in Figure 3.2.

3.2.3 Hybrid Burst Assembly

The threshold-based and timer-based algorithms can be used simultaneously to leverage the advantages of both schemes. In this hybrid system, the burst is sent when either of the constraints is reached [197, 202]. For example, for periods of low input load, the timer would expire first, resulting in deterministic burst spacing but randomly sized (small) bursts. For periods of high input load the threshold would be reached first, resulting in randomly spaced but constant sized (large) bursts. Borrowing an analogy from [192], this aggregation mechanism is like a bus system. At any time, there is one bus with one or more empty seats waiting for passengers for each destination. A bus has a maximum capacity for passengers (threshold) and leaves periodically (timer). If a bus is full before its scheduled departure time, it will leave early and the next empty bus will pull into the station.
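To make the interplay of the two parameters concrete, the sketch below implements a hybrid assembler for a single queue; class and method names are illustrative, the driving event loop is omitted, and the immediate-reset timer variant is used (the alternative of resetting only on the next packet arrival is discussed in Section 3.2.2). Setting the threshold very large reduces it to a pure timer-based assembler, and setting the timer very large reduces it to a pure threshold-based one.

```python
import time

class HybridBurstAssembler:
    """Minimal sketch of hybrid (threshold + timer) burst assembly for one queue."""

    def __init__(self, b_th_bytes, t_seconds, send_burst):
        self.b_th = b_th_bytes          # threshold B_th [bytes]
        self.t = t_seconds              # timer value T [s]
        self.send_burst = send_burst    # callback that hands the burst to the scheduler
        self.queue, self.bytes_queued = [], 0
        self.deadline = time.monotonic() + t_seconds

    def on_packet(self, packet_bytes):
        """Buffer an arriving packet; emit a burst if the threshold is exceeded."""
        self.queue.append(packet_bytes)
        self.bytes_queued += packet_bytes
        if self.bytes_queued >= self.b_th:
            self._emit()

    def on_tick(self):
        """Poll the timer (called periodically by the surrounding event loop)."""
        if time.monotonic() >= self.deadline and self.queue:
            self._emit()

    def _emit(self):
        self.send_burst(self.queue)
        self.queue, self.bytes_queued = [], 0
        self.deadline = time.monotonic() + self.t   # immediate timer reset
```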

3.3 Simulation-based Study

Packets travelling through an OBS network are buffered only at the ingress and egress nodes. Within the network, bursts are not buffered at each node but transparently switched in the optical domain. Under this assumption of bufferless operation, the relationship between the offered traffic and the burst loss probability should depend only on the amount of offered traffic, according to the insensitivity property [58, 109]. Indeed, most Optical Burst Switched analytical models, including those introduced in Section 2.7, only include an intensity parameter and, as such, are burst length and timing invariant. In this Section, we examine simulation results from [48] and [49] to test if the choice of burst assembly parameters affects the probability of burst loss on a single link of an OBS network. We use a simulation package written by Mr JungYul Choi during his visit to the University of Melbourne to compare average packet delay, average burst size and burst loss probability for threshold-based and timer-based assembly across a range of parameter values. The simulation results are then compared against both the Engset model and the 2D Markov Chain model, both introduced in Section 2.7.

3.3.1 Simulation Environment

The Engset model and the two-dimensional Markov Chain model are burst assembly algorithm invariant and therefore insufficient for analyzing possible differences between timer and threshold-based algorithms. Simulations, on the other hand, enable more complex OBS features to be evaluated through direct manipulation of parameter values. The simulations in this Section were performed using the topology from Section 2.7.3, reproduced in Figure 3.3, with the following assumptions:

- Simulation length = 8 × 10⁶ packet arrivals
- Full wavelength conversion
- M = 6
- K = 3
- Link capacity/wavelength = 1 Gb/s
- Packets arrive according to a Poisson process with rate λ
- Packet sizes i.i.d. exponential, mean = 1 kB
- First packet exceeding the threshold is sent with the burst
- Timer reset when a packet arrives in an empty buffer
- Timers initially synchronized

Figure 3.3: Simplified Single-link Topology

These simulations measured burst loss probability for different timer values and threshold values as a function of the load on the output link.

3.3.2 Simulation Results

The main difference between threshold and timer-based burst assembly lies in their approaches to delay and burst size. As mentioned in Section 3.2.2, timer-based algorithms are often used to upper-bound the delay experienced by any packet in an OBS system. This upper-bound is simply the burst assembly time, T. For a fixed interval, the distribution of Poisson arrivals is effectively uniform, therefore the average delay is T/2.




Figure 3.4: Average Delay: Threshold-based

On the other hand, for the threshold-based algorithm, the packet delay is a random variable, dependent on the threshold size B_th. The average packet delay from the simulation is shown in Figure 3.4. Similarly, the average burst size for the threshold-based algorithm is clearly B_th. However, burst sizes for the timer-based algorithm are random variables, dependent on the burst assembly time T. The average burst size from the simulation is shown in Figure 3.5. These two results can also be used to show the effect of both timer and threshold parameters in the case of hybrid burst assembly. The average delay of such a system can be read from Figure 3.4 for traffic loads yielding threshold-based delays below the chosen T, and the average burst size can be read from Figure 3.5 for lower traffic loads. The other two key results are shown in Figure 3.6 and Figure 3.7.



Figure 3.5: Average Burst Size: Timer-based

In the case of threshold-based burst assembly, changing the threshold size had no effect on the loss performance, as predicted by the insensitivity property [58, 109]. However, the timer-based simulation showed increasing loss probability as the timer value increased. This result was directly caused by synchronization within the simulation and served to motivate the remainder of this Chapter, in which we introduce a new timer-based burst assembly algorithm that removes synchronization, thereby improving performance.

3.3.3 Timer-based Synchronization

In the above simulations, all timers for each buffer are initialized to the same value and therefore expire at the same time, causing an initial loss of 50 percent of the bursts as there are twice as many buffers as output wavelengths. This loss gradually decreases as the average inter-arrival times of the bursts are not exactly T, but T + Δ, where Δ is a random variable equal to the time taken for a new packet to arrive in the empty buffer.



Figure 3.6: Blocking Probability: Threshold-based



Figure 3.7: Blocking Probability: Timer-based


Due to the assumption that packet arrivals follow a Poisson process, the inter-departure distribution of bursts is exponential and memoryless. The mean of Δ is therefore equal to the mean of the packet inter-arrival times, or equivalently:

E[Δ] = 1 / E[λ],   (3.1)

where E[λ], in this simulation, is given by:

E[λ] = (Link capacity/wavelength) / (Average Packet Size) = 1000000/8 [s⁻¹].   (3.2)

For large values of λ and T, E[Δ] is extremely small in comparison with T and therefore the synchronization lasts for a significant amount of time, on the order of T/E[Δ], leading to higher average loss probabilities as shown in Figure 3.7. It is important to note that if the timer is immediately reset after burst aggregation, instead of waiting for the next packet arrival, this synchronization would continue indefinitely, resulting in a deterministic blocking probability of 0.5, independent of the input load. Re-running the simulation over a much longer period gives a result that is identical to the result from the threshold-based scheme shown in Figure 3.6, i.e. all curves collapse to a single line, consistent with the well known insensitivity property of bufferless networks [58, 109].

3.4 Off-Timer Burst Assembly

We showed in the previous Section that care needs to be taken, both in implementations and simulations, to avoid synchronization and subsequent high loss rates. In the remainder of this Chapter, we introduce a low complexity solution that alleviates timing synchronization problems.


3.4.1 Timer-based Synchronization

In timer-based burst assembly, a timer is started at the initialization of the system and often immediately after the previous burst is sent. At the expiration of this timer, the burst assembler generates a burst containing all the packets in the buffer at that point. This problem of synchronization, and subsequent high burst blocking rate, in the case of immediate reset was first introduced in [175] and was solved in that paper by randomizing the burst offset times. Randomizing burst offset times changes the priority of the bursts. Indeed, increasing burst offset times has been proposed as an effective method of prioritizing different classes of traffic [196], as a BHP corresponding to a burst with a long offset time is able to make reservations in advance of those with short offset times and therefore has a better chance of the reservation being successful. In this Section, we introduce the necessary randomness without affecting priority by using fixed duration off-periods to control the size of the bursts. Instead of resetting the timer on arrival of a new packet or immediately after removing the burst from the queue, we wait until the whole burst has been sent on the corresponding output link, then reset the timer. We name this new method Off Timer Burst Assembly (OTBA). For any source-destination pair within an OBS network, the traffic between the source and the destination can be viewed as an on-off process in time at the output link: either a burst is being sent or not. OTBA requires only one, fixed, deterministic control parameter: the desired duration of the corresponding off period, or T_off. The length, in time, of the on period for the nth burst, X_n, is determined by the amount of traffic arriving in the time period X_{n-1} + T_off. The sending times of bursts will therefore fluctuate based on the input traffic load for the corresponding source-destination pair. In real backbone networks, traffic demands between different edge nodes have been shown to be highly non-uniform [23].



Figure 3.8: Sample path for a source-destination pair using the Off-Timer Burst Assembly Model

Therefore, it is expected that burst sending times across all source-destination pairs sharing a common output link will become unsynchronized, resulting in reduced blocking probability when compared with the standard timer-based assembly mechanism. Figure 3.8 shows an example realization within the Off-Timer Burst Assembly Model. The off time period is denoted by T_off and the first 3 bursts in this sample path are denoted by X_1, X_2, X_3.
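The following sketch captures the OTBA timer rule for a single source-destination queue; class and method names are illustrative and the surrounding event loop is omitted. The only change from a standard timer is in the last line of on_timer_expiry: the off-timer T_off is restarted only after the burst has finished transmission, so successive departures are spaced X_{n-1} + T_off apart rather than a constant T.

```python
class OtbaAssembler:
    """Minimal sketch of Off Timer Burst Assembly (OTBA) for one queue."""

    def __init__(self, t_off, link_rate_bps, send_burst):
        self.t_off = t_off
        self.link_rate = link_rate_bps
        self.send_burst = send_burst
        self.queue_bits = 0
        self.next_departure = t_off            # first expiry of the off-timer

    def on_packet(self, packet_bits):
        self.queue_bits += packet_bits         # packets only accumulate; they never trigger a burst

    def on_timer_expiry(self, now):
        burst_bits, self.queue_bits = self.queue_bits, 0
        self.send_burst(burst_bits)
        on_period = burst_bits / self.link_rate             # X_n: time the burst occupies the link
        self.next_departure = now + on_period + self.t_off  # restart T_off only after the burst ends
```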

3.5 OTBA Properties

It was shown in Section 3.3 that the loss on a single OBS link is independent of threshold and timer values, if the simulation is run for a sufficient length of time. Similarly, the key result of interest for OTBA is not long term loss, as it is parameter invariant, but the time taken to break synchronicity in the worst case where the timers are initially fully synchronized.

3.5.1 Simulation Environment

We developed a new simulation to show the dynamics of OTBA, using the topology from Figure 3.3 and with the following assumptions:

- Full wavelength conversion
- M = 6
- K = 3
- Link capacity/wavelength = 1 Gb/s
- Packets arrive according to a Poisson process with rate λ
- Packet sizes i.i.d. exponential, mean = 1 kB
- Timer reset after the previous burst is sent
- Timers initially synchronized

3.5.2 Steady State Timer Value

Given that the burst inter-departure times depend on the size of the previous bursts, the steady state burst inter-departure time T_off + X can be derived from the state equation:

X_n = ρ (X_{n-1} + T_off).   (3.3)

In steady state, the sizes of the bursts X_n and X_{n-1} are equal, leading to the equation

X = ρ (X + T_off),   (3.4)

and the steady state burst inter-departure time

X + T_off = T_off / (1 - ρ).   (3.5)

These formulae were verified using the simulator introduced in Section 3.5.1. Sample paths for T_off = 7 ms, ρ = 0.25 (Figure 3.9) and T_off = 4 ms, ρ = 0.1 (Figure 3.10) both show fast convergence to the theoretical values predicted by Equation 3.5 of 9.33 ms and 4.44 ms respectively, with desirable high variability around this equilibrium point.
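The convergence can be checked directly by iterating the mean-value recursion of Equation 3.3; the short sketch below (illustrative only, with no packet-level randomness) reproduces the two equilibrium values quoted above.

```python
def otba_interdeparture(t_off, rho, n=100, x0=0.0):
    """Iterate X_n = rho * (X_{n-1} + T_off) (Eq. 3.3, mean behaviour) and
    return the resulting inter-departure time X_n + T_off (check of Eq. 3.5)."""
    x = x0
    for _ in range(n):
        x = rho * (x + t_off)
    return x + t_off

print(otba_interdeparture(7.0, 0.25))   # -> 9.33 ms, i.e. T_off / (1 - rho)
print(otba_interdeparture(4.0, 0.10))   # -> 4.44 ms
```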



Figure 3.9: Sample Path for T_off = 7 ms, ρ = 0.25



Figure 3.10: Sample Path for T_off = 4 ms, ρ = 0.1


3.5.3 Breaking Synchronicity

To demonstrate the speed with which OTBA removes synchronicity, we simulated 1000 parallel burst assembly processes using the above assumptions and calculated the sample variance of burst departure times using the standard equation:

S_n² = (1/N) Σ_{i=1}^{N} (x_i - m)²,   (3.6)

where N = 1000 and m is the mean burst departure time. In the case of standard timer-based burst assembly, the variance of burst departure times is on the order of 1/λ². The variance of burst departure times for OTBA is much larger, as shown in Figure 3.11. Due to the timers being initially synchronized, the variance starts from zero; however, within a few burst departure times the variance has increased significantly, implying that the synchronization between the bursts has been removed. The worst case of synchronization results in all bursts periodically arriving at the same time. In this case, the number of bursts that can be sent on a single fiber is exactly K, independent of the offered load ρ, such that the blocking probability of the system is constant and equal to (M - K)/M, for M > K. For M = 6 and K = 3, the blocking probability is equal to 0.5. In contrast, the blocking probability using OTBA, as predicted by the insensitivity property described in Section 3.3, is equal to that of the threshold-based system of Figure 3.6. For all values of ρ < 0.5, the blocking probability is less than 0.2; a significant improvement.

3.6 Conclusion

Burst assembly is a key aspect of OBS that aims to compensate for the slow speeds of currently available all-optical switching technologies. In this Chapter we showed that the loss on a single OBS link is independent of burst size or timer duration, given no synchronization.



Figure 3.11: Sample Variance of Burst Departure Times for 1000 parallel burst assembly processes, T_off = 7 ms, ρ = 0.25

While threshold-based burst assembly yields unsynchronized burst departure times, delay is unbounded in the limit of the packet arrival rate (λ) going to zero, a problem that is solved by timer-based algorithms. On the other hand, the two standard timer-based algorithms were shown to have significant, long lasting synchronization and correspondingly high loss. In the case where the timer is immediately reset on creation of a burst, this synchronization does not dissipate and loss on the OBS link remains high and deterministic. The other standard timer-based algorithm waits for the arrival of a new packet in the empty buffer before resetting the timer. Assuming that packet arrivals follow a Poisson process, this time is extremely small for moderate levels of load. Indeed, we showed through exploration of simulations by JungYul Choi that synchronization is still present after 8 × 10⁶ packet arrivals.


By using a new burst assembly algorithm, Off Timer Burst Assembly (OTBA), synchronization is removed within a few burst transmission times, as confirmed by the rapidly increasing sample burst departure variance. In addition, the next Chapter introduces additional performance benefits of OTBA in the presence of the Transmission Control Protocol.


Chapter 4 TCP Over OBS


4.1 Introduction

A trace taken by the Cooperative Association for Internet Data Analysis in 1998 showed that 95% of the bytes, 90% of the packets, and 80% of the flows on the examined link used variants of the Transmission Control Protocol (TCP) [54]. While the uptake of new applications using the User Datagram Protocol (UDP), including streaming media and grid computing, seems to have increased over the last few years, measurements from 2003 show that TCP applications, including peer-to-peer file-sharing and the world wide web, continue to dominate [70]. Assuming that this dominance will continue, at least in the near future, it is crucial to be able to analyze the performance of TCP over new network architectures and protocols. This chapter analyzes current versions of TCP and combines a widely verified source rate TCP model with an OBS loss model, previously introduced in Section 2.7, to find fixed-point input loads and loss rates for TCP over OBS. Finally, we explore the effect of Off-timer-based burst assembly on TCP performance.


4.2 History of TCP

TCP was originally introduced in 1981 as a congestion control and avoidance mechanism in the U.S. Defense Advanced Research Projects Agency research program to investigate techniques and technologies for interlinking packet networks of various kinds [151]. Congestion control is a recovery mechanism that helps the network clear congestion, while congestion avoidance tries to prevent the network from entering a congested state [106, 134]. Figure 4.1 shows a sample throughput curve: congestion avoidance allows the network to be operated around the Knee, while congestion control tries to avoid the Cliff.

Figure 4.1: Graphical Representation of Network Throughput [105]

The initial version of TCP was found to misbehave under certain network conditions, causing the operating point to drop well below the Knee, or even past the Cliff. Many versions have since been introduced and optimizing TCP is still a very active area in the research community. The first indication that improvements were needed came in October 1986, when the throughput between LBL and UC Berkeley dropped from 32 kbps to 40 bps, a 99.9% reduction [101]. This increase in network load resulting in a decrease in the useful work done in the network was defined as congestion collapse [134]. There are two main causes of congestion collapse: (i) unnecessarily retransmitting packets that were either in transit or that had already been received, and (ii) delivering packets through the network that are dropped before reaching their final destination. The main goals of TCP are to maximize fairness and throughput while avoiding congestion collapse.

4.3 TCP Today

Current implementations of the TCP protocol are divided into four main intertwined algorithms: slow start, congestion avoidance, fast retransmit and fast recovery [13, 68, 166]. Excellent, but dated, descriptions of these mechanisms can be found in [165] and [185]. Traditionally, a unit of data encapsulated by TCP is called a segment. Following the notation of [64, 124, 141, 199], we refer to TCP segments as packets.

4.3.1 Slow start

TCP is an advanced and complicated window-based flow control algorithm. Original versions of TCP used a receiver controlled window to control the rate of the TCP source: the TCP source transmits using this advertised window size, rwnd, as soon as the connection has been established [151]. After a packet is received, an acknowledgement (ACK) is sent back to the TCP source, indicating the number of the last consecutively received byte plus one.


It was observed that new packets should be injected into the network at the rate at which these ACKs are returned, to avoid possible overflow in intermediate routers [101]. Acting on this observation, another window was added, called cwnd, or the congestion window. When a new connection is established, cwnd is initialized to one packet, or more precisely the number of bytes in one segment, and during the transfer, the TCP source window size is set to the minimum of rwnd and cwnd. Each time an ACK is received, cwnd is increased by one packet, leading to exponential growth of cwnd and the associated sender rate.

4.3.2 Congestion Avoidance

This exponential growth, left unchecked, would quickly lead to short time-scale congestion and packets being dropped by the network. To reduce the probability of packet loss, a slow start threshold size, ssthresh, was introduced. Once cwnd > ssthresh, cwnd is only increased by one packet per RTT, a linear, not exponential, increase. The main assumption in the initial development of TCP was that packet loss is very small, more specifically much less than 1% [166], and represents congestion in the network. This congestion can be detected at the TCP source by the retransmission timer associated with the segment timing out or the receipt of three duplicate ACKs. Upon notification of congestion at the TCP source, ssthresh is set to approximately half of the current window size (min(cwnd/2, rwnd/2, 2)) and cwnd is reset to one packet, forcing the TCP source back into the slow start phase.

4.3.3 Fast Retransmit

It is possible that the receipt of three duplicate acknowledgements does not indicate congestion. For example, packet reordering within the network is one effect that may cause spurious duplicate acknowledgements. TCP was modified to retransmit the possibly missing segment on receipt of three duplicate ACKs without waiting for the retransmission timer to expire. This added algorithm is called Fast Retransmit. It is important to note that the research community is now divided as to whether reordering is a common [20] or uncommon [98] event in real networks. A more detailed discussion of packet reordering follows in Chapter 5.

4.3.4 Fast Recovery

With networks growing faster and faster, the TCP algorithm was modified to allow high throughput under moderate congestion. This modification is called Fast Recovery. After the Fast Retransmit stage, several adjustments are made to ssthresh and cwnd, then the TCP source enters the Congestion Avoidance stage, not Slow Start. These adjustments are as follows [166] (a simple code sketch of these rules is given after the list):

- On receipt of the third duplicate ACK, set ssthresh to cwnd/2.
- Retransmit the missing segment.
- Set cwnd = ssthresh + 3 × packetsize.
- For each additional duplicate ACK received, increment cwnd by one packet.
- On receipt of an ACK that acknowledges new data, set cwnd equal to ssthresh and enter the Congestion Avoidance stage.
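The sketch below condenses the window adjustments listed above, together with the slow start and congestion avoidance rules of Sections 4.3.1-4.3.2, into code. It is a simplified, single-flow illustration of Reno-style behaviour with window sizes measured in packets, not a complete TCP implementation.

```python
class RenoWindow:
    """Simplified Reno congestion window dynamics (units: packets), for illustration."""

    def __init__(self, ssthresh=64):
        self.cwnd = 1.0
        self.ssthresh = ssthresh
        self.in_fast_recovery = False

    def on_new_ack(self):
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh           # deflate window, return to congestion avoidance
            self.in_fast_recovery = False
        elif self.cwnd < self.ssthresh:
            self.cwnd += 1.0                    # slow start: exponential growth per RTT
        else:
            self.cwnd += 1.0 / self.cwnd        # congestion avoidance: ~1 packet per RTT

    def on_duplicate_ack(self, dup_count, retransmit):
        if dup_count == 3:                      # fast retransmit
            self.ssthresh = max(self.cwnd / 2, 2)
            retransmit()                        # resend the (possibly) missing segment
            self.cwnd = self.ssthresh + 3       # inflate by the three duplicate ACKs
            self.in_fast_recovery = True
        elif dup_count > 3 and self.in_fast_recovery:
            self.cwnd += 1                      # each extra duplicate ACK releases one packet

    def on_timeout(self):
        self.ssthresh = max(self.cwnd / 2, 2)
        self.cwnd = 1.0                         # fall back to slow start
        self.in_fast_recovery = False
```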

4.3.5 New Fast Retransmit/Fast Recovery Algorithms

When multiple packets are dropped from a single window of data, the ACK for the packet transmitted using the original Fast Retransmit algorithm may not acknowledge all the packets transmitted before the Fast Retransmit stage was entered. To leverage more effectively the information contained in these partial ACKs, the Fast Retransmit and Fast Recovery algorithms were modified. With these modifications, TCP is able to recover without a timeout (TO) by retransmitting one lost packet per RTT upon receiving each partial ACK, without waiting for three duplicate ACKs. Additionally, cwnd is not halved until all lost packets from the window have been retransmitted [199].

4.3.6 Naming TCP Variants

There are many implementations of TCP, each operating slightly differently [144], and even some with significant problems [147]. Grouping similar implementations together gives three main variants of TCP that are currently deployed, each using a subset of the four main intertwined algorithms. Tahoe [101] incorporates slow start, congestion avoidance and fast retransmit; Reno [103] incorporates all four algorithms; and New Reno [95] incorporates the standard Slow Start and Congestion Avoidance algorithms but also uses the modified Fast Retransmit and Fast Recovery algorithms discussed in Section 4.3.5.

4.3.7 Selective Acknowledgements - SACK

As mentioned in Section 4.3.5, TCP may experience poor performance when multiple packets are lost from a single window. The Selective Acknowledgements (SACK) option is an attempt to remedy this problem [125]. Assuming both the TCP source and receiver have successfully negotiated a SACK-enabled connection, every ACK includes information about all segments that have arrived successfully, not just the last consecutive segment received. This enables the TCP source to specifically resend all the potentially lost segments without waiting a round trip time to find out about each segment or unnecessarily resending segments which have been correctly received. Simulations have shown this scheme to have significantly higher throughput if more than one packet is dropped in a single window of data [64].

4.4 TCP Models

As can be seen from Section 4.3, the details of the TCP protocols are complicated. However, all current implementations of TCP conform to two basic rules: (i) if a packet is lost, the cwnd is roughly halved and (ii) during congestion avoidance, cwnd is increased by one packet per window of data. Using these rules, steady state models for TCP have been posted on mailing lists [123], formally published [67] and further refined, or independently derived [114, 126, 139, 141]. The most important factors in each model are the assumptions introduced to lower the complexity and therefore enable the authors to produce closed-form results. Concentrating on the simplest model derived in [123] and [67], the main assumption is that a single packet is dropped each time the congestion window is increased to W packets and never for smaller window sizes. This implies a continuous cycle of increasing cwnd by one each RTT until the window size reaches W, at which time a packet is dropped, cwnd is halved and the cycle is renewed. The number of packets sent by the TCP source per cycle is therefore (from Figure 4.2):

W/2 + (W/2 + 1) + (W/2 + 2) + ... + W ≈ (3/8) W².   (4.1)

With an average packet drop rate of p, the maximum cwnd is therefore

W ≈ √(8/(3p)).   (4.2)



Figure 4.2: Calculating Number of TCP Packets Sent/Cycle

The average cwnd, given steady state, is 0.75W, therefore the average sending rate of a TCP source, S [packets/s], is

S ≈ 0.75 W / RTT.   (4.3)

Substituting for W from (4.2),

S ≈ (1/RTT) √(1.5/p).   (4.4)
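As a quick numerical illustration of Equation 4.4, the loss rate and round-trip time below are arbitrary example values, not measurements from this thesis.

```python
from math import sqrt

def tcp_rate_simple(p, rtt):
    """Steady-state TCP sending rate in packets/s, Equation 4.4 (ignores timeouts)."""
    return sqrt(1.5 / p) / rtt

# Example: 1% loss, 100 ms RTT -> ~122 packets/s (~1.5 Mb/s for 1.5 kB packets)
print(tcp_rate_simple(p=0.01, rtt=0.1))
```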

It is important to note that this model ignores timeouts and subsequent slow-start dynamics. However, if the maximum cwnd is larger than 3 packets, timeouts never occur in this model as the TCP source switches to the Fast Recovery stage after receiving 3 duplicate acknowledgements. From Equation 4.2, timeouts can be ignored for W equal to 4 or higher and p less than 8/(3·4²), or approximately 16%. Furthermore, it is also assumed that the TCP source is never constrained by the advertised receiver window, rwnd, and that the TCP source has sufficiently large amounts of data to send such that multiple cycles of the Congestion Avoidance algorithm occur and steady state is reached [126]. Timeouts nevertheless are quite common in real world packet traces. In [141], the authors develop an improved model that incorporates the impact of timeouts, leading to a new TCP throughput equation:

S ≈ 1 / [ RTT √(2p/3) + T₀ min(1, 3√(3p/8)) p (1 + 32p²) ],   (4.5)

where T₀ is the time-out period. This new model also includes implicit assumptions about how loss occurs in the network. Unlike Equation 4.4, packets are not assumed to be lost when a certain window size is reached. Instead, a bursty loss distribution is assumed. In this model, a round is defined by the TCP activity from the transmission of W packets to the receipt of the first acknowledgement. Packet losses are assumed to be correlated among the back-to-back transmissions within a round: if a packet is lost, all remaining packets transmitted until the end of that round are also lost. A more detailed stochastic analysis, with a more realistic error model, can be found in [140, 141], and a comparison of this stochastic model with Equation 4.5 shows a close match. The only difference between Equations 4.4 and 4.5 is that the latter has an extra term in the denominator. Therefore, the effect of including timeouts in the TCP model can be calculated from the following fraction:

T₀ min(1, 3√(3p/8)) p (1 + 32p²) / [ RTT √(2p/3) ].   (4.6)

For small values of this fraction, the additional term in the denominator is correspondingly small and therefore has little effect.
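The size of the timeout correction can be checked directly for any (p, RTT) pair with the short sketch below; it evaluates Equations 4.5 and 4.6 with T₀ = 1 s, as in the results that follow, and the function names are illustrative.

```python
from math import sqrt

T0 = 1.0   # retransmission time-out [s], as recommended in [146]

def tcp_rate_full(p, rtt):
    """Steady-state TCP rate [packets/s] including timeouts, Equation 4.5."""
    basic = rtt * sqrt(2 * p / 3)
    timeout_term = T0 * min(1.0, 3 * sqrt(3 * p / 8)) * p * (1 + 32 * p ** 2)
    return 1.0 / (basic + timeout_term)

def timeout_ratio(p, rtt):
    """Added term over basic term, Equation 4.6."""
    basic = rtt * sqrt(2 * p / 3)
    timeout_term = T0 * min(1.0, 3 * sqrt(3 * p / 8)) * p * (1 + 32 * p ** 2)
    return timeout_term / basic

for p in (0.001, 0.01, 0.1):
    print(p, round(timeout_ratio(p, rtt=0.1), 3))   # the ratio grows quickly with loss
```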


Figure 4.3: Induced Error in TCP Model Ignoring Time Outs - Large p

Figure 4.3 shows that for large values of loss, the second term of the denominator in Equation 4.5 is quite significant. This shows that in the case of high loss, ignoring timeouts by using Equation 4.4 may greatly overestimate the sending rate of a TCP source. However, Figure 4.4 shows that for moderate and low levels of loss, the simpler formula is sufficiently accurate. For example, for RTTs between 50 ms and 150 ms, the second term of the denominator in Equation 4.5 is less than 5% of the entire denominator of Equation 4.4. In both results, T₀ is set to a value of 1 s, as recommended in [146]. Several independent empirical and simulation studies of the model in Equation 4.5, and the model in Equation 4.4 for small p, have shown that these models provide a good fit to the observed send rate of TCP sources under a wide variety of network conditions [29, 141, 154].

Figure 4.4: Induced Error in TCP Model Ignoring Time Outs - Small p (loss p vs. ratio (added term)/(basic term), for RTT = 50ms, 100ms and 150ms)

4.4.1 TCP Friendliness

A TCP-friendly flow was first defined as a flow with an arrival rate less than or equal to the arrival rate of a conformant TCP connection in the same circumstances [67]. Using the original notation, the rate, T [bytes/s], of a TCP-friendly flow was required to conform to:

    T ≤ 1.5·sqrt(2/3) · B / ( R·sqrt(p) )   (4.7)

for packets of size B [bytes], a packet loss rate of p and a round-trip time, including queuing delay, of R seconds. Note that as 1.5·sqrt(2/3) = sqrt(1.5), and assuming fixed or zero queuing delay, the formula previously introduced in Equation 4.4 is identical to the TCP-friendly equation at equality. Therefore, the basic models and corresponding formulae presented above can also be used to denote an upper bound on the throughput of TCP-friendly flows.

4.4.2 Modelling SACK

Neither Equation 4.4 nor Equation 4.5 attempts to model the effect of enabling the SACK option. If the loss rate is sufficiently high that the chance of more than one packet being lost in one window is significant, both of the previously introduced models may be inaccurate. A recent study has found that SACK is widely advertised (approximately 80% of all TCP traffic measured [150]). Indeed, it is enabled by default on most modern operating systems. Simulation studies of SACK over OBS can be found in [198] and [199], but as yet no closed-form method of calculating SACK throughput has been developed, rendering it unamenable to analysis.

4.4.3 Parallel Sources

All the models introduced above apply to a single TCP source. Parallel sources were studied extensively in the context of the unfairness caused by simultaneously initiating several TCP connections to transfer one file [87, 88]. In these papers the authors showed that parallel streams can be represented accurately by simply multiplying either of Equations 4.4 or 4.5 by the number of parallel sources, N. It is extremely important to note that in this case, p is also a function of N and signifies the conditional loss rate for a specific source, given that there are N sources.

4.4.4 Bounds on TCP loss

Although TCP has been designed such that loss indicates congestion, it has been shown that TCP checksum failures are also a significant cause of loss within today's Internet [168]. When looking at several Internet traces, it was found that between 1 packet in 1,100 and 1 in 32,000 failed the TCP checksum, even on links where link-level CRCs should have caught all but 1 in 4 billion errors. These checksum failures were due to: (i) end-host hardware errors, (ii) end-host software errors, (iii) router memory errors and (iv) link errors, especially on links using Van Jacobson Header Compression [102]. Due to the apparent prevalence of these congestion-unrelated loss events, analyzing the performance of networks at extremely low levels of loss may yield overly optimistic results in real world implementations.

4.5 TCP over OBS

TCP over OBS is a relatively unexplored area. Previous papers include comparisons between TCP variants [199], guidelines for burst assembly algorithms [38, 58, 85, 192] and examinations of the effect of different TCP window sizes [192]. Both TCP and OBS have many configurable parameters, including maximum TCP window size, TCP segment size, packet arrival distribution, burst-assembly algorithm, propagation delay, network topology and link capacities. There are also many different TCP variants, including Tahoe, Reno, New Reno and SACK, that behave slightly differently when a packet is dropped. With all these degrees of freedom, it is often misleading to compare results from different experiments and therefore highly desirable to develop flexible models that can be easily tuned to a wide range of TCP and OBS parameters. In this Section, we introduce a flexible scheme to couple the simple OBS model from Section 2.7 and the TCP model from Equation 4.4 that can be easily extended to incorporate more complex OBS and TCP models as they are developed.

4.5.1 TCP source classifications in OBS

TCP sources in OBS networks have been categorized into three main groups: fast, slow and medium [58, 198]. A fast source has an access bandwidth so high that it emits all the segments of its current congestion window (cwnd) within the aggregation time of a burst, while a slow source has at most one packet per burst. A medium source bridges the gap, having more than one packet but less than the complete window in each burst.

4.5.2 TCP and Loss - Simulation Methods

At first glance, it is tempting to use TCP traces from real networks in order to obtain a more realistic traffic characterization. However, OBS networks have not yet been deployed and, even if they had, the resulting traces would be shaped [69]. A trace reflects the conditions in the network at the time the connection was measured. Due to the feedback incorporated in the rate control mechanisms of TCP, traces cannot be reused in another context as the sources would have behaved differently. Instead, models that characterize source behaviour must be employed either to generate the traffic for a simulation or to produce theoretical results. Developing a useful traffic model is a key step in examining any network's performance. Modeling packet arrivals as a Poisson process was extensively used in the early nineties, before it was found that traffic exhibited self-similar, non-Poisson, characteristics [148]. Since then, many papers have introduced and examined different models; however, because of its simplicity, the Poisson model is still used today. Interestingly, even in the ns-2 simulations of [85], each TCP source was described as generating traffic with exponentially distributed inter-arrival times. Assuming the authors are referring to packet arrivals being generated as a Poisson process, it is important to note that these packets are queued at the source, such that the output of the source is directly controlled by the TCP protocol. As described in Section 4.2, TCP has historically been used for non-realtime file transfers, including File Transfer Protocol (FTP) and Hypertext Transfer Protocol (HTTP). These applications have fixed sized chunks of data that are transferred as fast as possible and, as such, are limited by the physical characteristics of the end point nodes and the available network capacity. Within this paradigm, the bottleneck is not the packet generation rate at the source, but the loss in the OBS network. To show the impact of loss on TCP throughput, simulations were run for varying levels of OBS network loss, from 0.001 to 0.01 [85]. In a similar fashion, ns-2 simulations in [198] fix p for a single link and find the corresponding TCP throughput for Reno, New Reno and SACK over Reno. The conclusion in this paper is that SACK performs better for the same loss, but nothing is said about how the choice of TCP variant influences the loss experienced by the TCP sources.

4.6 Linking TCP and OBS Models

Much research into Optical Burst Switching follows a standard pattern: take the original protocol [152], add some enhancements, then produce input load versus loss probability performance graphs, assuming the input load can be fixed and known. But even for moderate loads, burst blocking probabilities have been shown to be very high. It is important to note that the sending rate of a TCP source is tightly coupled to packet loss within the network: a high rate of packet loss will cause a sender to slow down, thereby reducing the network load and decreasing subsequent packet loss rates. Therefore, if all endpoints are transferring large files using TCP as the Transport Layer protocol, the range of feasible input loads is reduced to a single fixed operating point. A useful result is therefore not the traditional load vs. loss curve, but the expected TCP input rate vs. network parameters such as round-trip delay and wavelengths per link.

4.6.1 OBS Network Topology

Blocking probability in OBS networks is also tightly coupled to transient traffic matrices and the network topology. Choosing a single definitive matrix and topology is an intractable problem, as explained in [69]. In order to limit the dimensionality of the parameter space, the remainder of this Chapter collapses the network problem to a single link and an ingress OBS node. Although this may limit the direct applicability of the results, it clearly isolates the theoretical process, the main contribution of this Section, from distracting OBS and TCP minutiae. There have been many simulations of TCP over OBS, including [58, 85, 199], but none find the fixed point input TCP load or loss analytically. More precisely, the remainder of this Chapter focuses upon a simplified single-link topology with K output wavelengths and full wavelength conversion: any input wavelength can be switched to any output wavelength. Note that small values of K also correspond to the case of no wavelength conversion with K physical output fibers. The ingress node contains a finite number of buffers, M, each with a finite number of TCP sources, N, such that the total number of TCP sources is N·M. The number of buffers, M, is chosen such that the bottleneck and subsequent loss will be in the OBS network, not at the buffers. This implies a many-to-one mapping between buffers and egress nodes: all packets queued in a common buffer are destined for a common egress node; however, there may be multiple buffers that map to a single egress node to meet QoS requirements for certain traffic streams. A graphical representation of this single-link topology is shown in Figure 4.5.

4.6.2 OBS On-Off Source Traffic Model

Each TCP source has a Poisson distributed packet arrival rate of λ_p with a fixed service rate of μ_p, and all sources are aggregated using the timer-based burst assembly algorithm: every T seconds, buffered packets are aggregated into a burst and passed on to the scheduler for transmission. For more details on timer-based assembly, refer to Chapter 3.

Figure 4.5: Simplified single-link topology (N·M TCP sources feeding M buffers, which share an OBS scheduler with K output wavelengths)

The burst arrival rate, λ_B, and the burst service rate, μ_B, are therefore:

    λ_B = 1/T,   μ_B = μ_p / (N·T·λ_p),   (4.8)

or, using on and off period distributions, with the on period being the burst transmission time 1/μ_B and the off period the remainder of the timer interval,

    λ_B = 1 / (T - 1/μ_B),   with μ_B unchanged.   (4.9)
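A small numerical sketch of this mapping follows (Python). The parameter values are assumptions chosen only for illustration; the symbol names mirror Equations 4.8 and 4.9.

    def burst_parameters(lambda_p, mu_p, n_sources, timer):
        """Map per-source packet parameters to burst-level on/off parameters
        for timer-based assembly (Equation 4.8).  lambda_p, mu_p in packets/s."""
        lambda_b = 1.0 / timer                            # one burst per timer period
        mu_b = mu_p / (n_sources * timer * lambda_p)      # service rate of the aggregated burst
        on_period = 1.0 / mu_b                            # burst transmission time
        off_period = timer - on_period                    # idle time between bursts (Eq. 4.9)
        return lambda_b, mu_b, on_period, off_period

    # Example: 100 sources, 1 ms timer, 10 Gb/s wavelength with 1 KB packets.
    print(burst_parameters(lambda_p=1e3, mu_p=1.25e6, n_sources=100, timer=1e-3))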

4.6.3 OBS Loss Model

The framework introduced in Chapter 2 calculates the blocking probability for a single-link OBS network from the on/off parameters λ_B and μ_B defined in Equation 4.9 and values of K output wavelengths and M buffers, where K < M. Values of K ≥ M are outside the scope of this Chapter, as M is chosen such that loss occurs in the OBS network, not from buffer overflow. The case where loss occurs from buffer overflow is dealt with in Chapter 6.3.

4.6.4 TCP Model

As shown in Figure 4.4, for loss rates of p < 10^-3, the simple model for a saturated source in Equation 4.4 can be used to calculate the TCP sending rate per source, S [pkt/s], for a particular loss rate, p, and Round Trip Time, RTT. For higher values of loss, timeouts must be considered and Equation 4.5 substituted for Equation 4.4. The choice of Equation 4.4 implies that, for accurate results, each TCP source is classified as slow, such that there is at most one packet from each source per burst injected into the OBS network. As the model of [210] allows arbitrary burst assembly times, the burst assembly time was chosen to be sufficiently small. It is also important to note that Equation 4.4 gives the send rate of a TCP source and not the throughput, defined as the amount of data received at the destination as a function of time. A closed-form equation for TCP throughput can be found in [141]. This paper also showed that the difference between the throughput and the send rate is negligible for loss rates under 1%, hence the send rate is a good approximation for throughput for sufficiently low levels of loss.

4.6.5 Finding the Fixed Point

The fixed point for the input load of TCP over OBS networks can be found by repeatedly applying the calculations from Section 4.6.3 and Equation 4.4 until sufficient convergence is reached. The relationship between the equations is outlined in Figure 4.6 and consists of four main steps: setting the input rates per source (TCP), calculating the corresponding burst distributions (OBS), calculating the resulting loss (OBS), and calculating the expected sending rate for this loss (TCP), which is then used as the new input load. As mentioned above, the composite TCP sources are assumed to be slow and therefore the burst loss probability approximates the packet loss probability. A sketch of this iteration is given below.
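A minimal sketch of this fixed-point calculation follows (Python). The blocking model of Chapter 2 is abstracted behind a placeholder; an Erlang-B approximation is substituted purely for illustration, and all parameter values below are assumptions rather than the settings used in this Chapter.

    from math import sqrt

    def erlang_b(offered_load, k):
        """Stand-in for the Chapter 2 blocking model: Erlang-B recursion for
        an offered load (in Erlangs) on k wavelengths."""
        b = 1.0
        for i in range(1, k + 1):
            b = offered_load * b / (i + offered_load * b)
        return b

    def fixed_point(rtt, k, n_sources, capacity_pkts):
        """Find p such that the OBS loss produced by the TCP load at loss p
        equals p itself (the intersection in Figure 4.7), by log-scale bisection."""
        def loss_given_p(p):
            rate = sqrt(1.5 / p) / rtt                    # Eq. 4.4, packets/s per source
            offered = n_sources * rate / capacity_pkts    # offered load in Erlangs
            return erlang_b(offered, k)

        lo, hi = 1e-12, 0.999
        for _ in range(100):
            mid = sqrt(lo * hi)
            if loss_given_p(mid) > mid:                   # produced loss exceeds assumed loss
                lo = mid
            else:
                hi = mid
        p = sqrt(lo * hi)
        return sqrt(1.5 / p) / rtt, p

    # Illustrative values: 1600 sources, 10 wavelengths of 1.25e6 pkt/s each.
    print(fixed_point(rtt=0.1, k=10, n_sources=1600, capacity_pkts=1.25e6))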

Figure 4.6: Calculating fixed point of TCP input load and OBS loss (TCP parameters (λ_p, μ_p) -> OBS parameters (λ_B, μ_B) -> OBS performance (p) -> TCP performance (S), repeated until convergence)

A more intuitive way of representing the relationship between TCP and OBS is to overlay both loss and load graphs, as in Figure 4.7. The intersection point corresponds to the fixed point solution.

Figure 4.7: Graphical method to find fixed point loss (loss p vs. load per source [Mb/s], showing the TCP curve, the OBS curve and their intersection at the fixed point): M = 16, K = 10, RTT = 0.1s.

4.7 Numerical Results

This Section presents numerical results for fixed point loss and input loads over a range of numbers of buffers, M, output wavelengths, K, and Round Trip Times, RTT. Additional parameters were fixed: N = 100 sources per buffer, output capacity per wavelength = 10Gb/s, TCP packet size = 1KB and timer value T = 1ms. It is important to note that the number of parallel TCP sources that are carried over an OBS network, in this case N·M, affects the performance of each individual source, as described in Section 4.4.3. Fixed values used in published simulation results range from one [192, 199] and three [85] to 500 sources [89]. In one case, the number of parallel TCP sources is not defined [58]. As backbone networks in the Internet carry thousands of flows per minute [74], with a significant number having lifetimes of hours and carrying a high proportion of the total traffic [30], it is important to measure the long-term performance of many aggregated TCP flows, not just an isolated few. Figure 4.8 shows the calculated fixed point input rate of each TCP stream for several values of Round Trip Time (RTT) and from one to five output wavelengths. Increasing the number of output wavelengths was found to increase the TCP source input rates dramatically, much more so than decreasing the round trip time. The reason for this increase is the equally dramatic decrease in blocking probability as the number of output wavelengths increases, as shown in Figure 4.9. From this we conclude that minimizing loss is critically important when designing OBS variants and therefore delay-inducing optimizations such as deflection routing [35] can still be quite helpful. Still, for OBS systems with no wavelength conversion and a small number of output fibers, the utilization of the network is very low, as shown in Figure 4.10.

Figure 4.8: Fixed point input load per TCP source [Mb/s] vs. number of output links K, for RTT = 50ms, 100ms, 150ms and 200ms. M = 16, no wavelength conversion.

4.7.1 Including Time Outs

The above results used Equation 4.4 to calculate the TCP sending rate, with subsequently high loss rates. In the case of loss greater than 10^-3, the simple model for a saturated source is inappropriate and the more complex model that includes timeouts should be used. The results from the numerical calculations using Equation 4.5 are reproduced in Figures 4.11 and 4.12. In the case of K = 1, the fixed point loads and loss rates using Equation 4.5 are noticeably reduced in comparison to those calculated using Equation 4.4. For higher values of K, the choice of model does not significantly affect the results.

Figure 4.9: Fixed point loss vs. number of output links K, for RTT = 50ms, 100ms, 150ms and 200ms. M = 16, no wavelength conversion.

4.7.2 Large Number of Wavelengths

We now present results for a large number of output wavelengths, with wavelength conversion. Figure 4.13 shows the fixed point input TCP load per source versus the number of output wavelengths, K, for a range of numbers of buffers, M. Across all values of M, the fixed point load increases steeply as extra wavelengths become available; however, for large values of M, such as M = 96, the incremental benefit of adding wavelengths is much less than for smaller values of M. As discussed in Section 4.6.3, values of M < K are outside the scope of this Chapter as M is chosen such that any bottleneck occurs in the OBS link, not at the buffers. Figure 4.14 shows the fixed point loss corresponding to the fixed point load in Figure 4.13. As the number of buffers, M, increases, the number of output wavelengths, K, required to give a constant loss also increases, but at a slower rate than the change in M. For example, to achieve a loss rate of 10^-4, only 8 wavelengths are needed for M = 16, but 22 wavelengths are needed if M = 96.

Figure 4.10: Total fixed point input load [Gb/s] vs. number of output wavelengths K, for RTT = 50ms, 100ms, 150ms and 200ms (annotated utilization = 2.3/(2*10) = 11%). M = 16, no wavelength conversion.

Both results show that OBS scales well as the number of output wavelengths, K, increases. Indeed, for values of K less than 5, each TCP source is restricted to rates of under 5Mbps with high loss of over 10^-3. As the number of output wavelengths increases beyond 20, all values of M give a reasonable loss of less than 10^-4, suggesting that OBS networks require large numbers of output wavelengths, with full wavelength conversion, to support TCP traffic. When summing over all input sources (N·M), the total input rate is independent of the number of buffers M, as shown in Figure 4.15. This result enables an OBS network to be dimensioned to achieve a specific total fixed-point input rate.

Figure 4.11: Fixed point input load per TCP source [Mb/s] vs. number of output links K, for RTT = 50ms to 200ms. M = 16, no wavelength conversion, timeouts included.

Furthermore, for low values of loss, Figure 4.15 also gives an estimate of utilization. For example, for K = 40 and M = 96, the total fixed point TCP input load is 221 Gbps, or a utilization of 221/(40·10) ≈ 55%. Similarly, for K = 10 and M = 32, the corresponding utilization is 27.2/(10·10) ≈ 27%. Achieving consistently high utilization with minimal loss is a very difficult challenge. In the Sprint backbone, diurnal variations can be as high as 7:1 and weekend traffic is significantly lower than on weekdays [74]. This traffic variability is often managed by over-provisioning the network. Indeed, most of the backbone links monitored in [99] were utilized under 50%, and less than 10% of the links experienced utilization higher than 50% in any given 5 minute interval. In this Section we showed that TCP over OBS performance enables levels of network utilization comparable to current networks, given a sufficient number of wavelengths and full wavelength conversion, suggesting that OBS is indeed a viable option for future all-optical network deployment.

Figure 4.12: Fixed point loss vs. number of output links K, for RTT = 50ms to 200ms. M = 16, no wavelength conversion, timeouts included.

4.8 Burst Assembly and TCP

As mentioned earlier in this Chapter, the majority of the traffic currently carried over the Internet uses TCP to avoid network congestion and guarantee reliable delivery of data. The impact of different burst assembly algorithms on TCP performance has therefore become a widely explored area as OBS research matures, yet results are mixed and often contradictory. For example, increasing the burst delay was shown to significantly improve TCP throughput in [85, 89], but to have no effect for low loads and even to decrease performance for average offered transmitter loads of around 0.5 [192]. A more balanced approach was taken in [58], where it was shown that while increasing loss correlation through increasing burst delay improved TCP performance, the additional delay incurred may in fact cause decreased TCP performance. It suggests that the burst delay be approximately ten to twenty percent of the round trip time (RTT). A much looser upper bound of the TCP retransmit timeout value minus the round trip time is suggested in [38]. It is also recommended in [89] that the burst delay not be too large, to ensure that TCP's pipelining mechanism does not break, but no maximum value was given. In this Section, we survey the different approaches to TCP and burst assembly and describe the performance gains and losses with respect to key TCP and burst assembly parameter ranges.

Figure 4.13: Fixed point input load per TCP source [Mb/s] vs. number of output wavelengths K, for M = 16, 32, 64 and 96.

Figure 4.14: Fixed point loss vs. number of output wavelengths K, for M = 16, 32, 64 and 96.

Figure 4.15: Total fixed point TCP load [Gb/s] vs. number of output wavelengths K, for M = 16, 32, 64 and 96, with utilizations of 221/(40*10) = 55% and 27/(10*10) = 27% annotated.

4.8.1 Fast, Slow or Medium TCP Sources?

The effect of burst assembly on TCP depends on the types of the TCP sources being assembled. Using the notation introduced above, slow sources experience a packet loss event every 1/p sent packets, assuming burst loss events are independent. Fast sources, however, lose their entire cwnd every (1/p - 1) cwnds successfully delivered. This means that, in the case of fast sources, the fast recovery and fast retransmit mechanisms of recent TCP variants are not used. Instead the fast source eventually times out and begins slow-start. This difference in behaviour means that care must be taken when both evaluating and tuning the performance of TCP sources over OBS. Schemes that optimize parameters for fast TCP sources may be harmful for slow sources and vice-versa. Furthermore, when dimensioning an OBS network, the choice of burst assembly parameters, including Burst Assembly Time, T, and Burst Threshold Size, Bth, directly influences the categorization of the burst's component TCP sources. It is therefore important to explore the impact of OBS on slow, medium and fast TCP sources to develop guidelines that help choose burst sizes explicitly and, implicitly, determine the type of the component TCP sources.

The interaction between burst size and TCP source types can be seen in the following examples. All OBS proposals to date use the OBS network paradigm as a replacement for networks with very high levels of aggregation. As such, the capacity of a wavelength in the OBS network is assumed to be much higher than the access network capacity and therefore TCP sources may be limited to the slow and possibly medium categories. Let the access capacity of a TCP source be C_TCP and the maximum window size be Wm. Assuming the entire access capacity is used for a single stream, the lower bound on the aggregation timeout value for the flow to be fast equals Wm/C_TCP. While values in today's access networks require extremely long aggregation times for TCP sources to be classified as fast, an example given in [199] suggests that fast flows may be a feature of future networks. This example lets C_TCP = 2.5Gbps and Wm = 2Mb, leading to TCP sources being labelled as fast for assembly times of over 0.8ms. It is important to note that while the access bandwidth was scaled, the assembly time was not. Given recent advances in switching technologies, future assembly times could be much less than one millisecond. In summary, classifying TCP sources as fast, slow or medium is a function of the access bandwidth, the assembly time and the maximum window size. In dimensioning an OBS network, one must predict the first and the last parameters, then choose an appropriate assembly time to achieve an optimal classification and subsequently good performance. The following sections introduce competing positive and negative effects as functions of all three parameters.
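The classification rule above can be summarised in the following short sketch (Python). The one-packet threshold for a slow source follows the definition in Section 4.5.1; the packet size and parameter values are assumptions for illustration.

    def classify_source(access_rate_bps, cwnd_bits, assembly_time_s, packet_bits=8000):
        """Classify a TCP source as 'slow', 'medium' or 'fast' from how much of its
        congestion window fits into one burst assembly period."""
        bits_per_burst = min(access_rate_bps * assembly_time_s, cwnd_bits)
        if bits_per_burst <= packet_bits:
            return "slow"      # at most one packet per burst
        if bits_per_burst >= cwnd_bits:
            return "fast"      # the whole window fits in one burst
        return "medium"

    # Example from [199]: 2.5 Gb/s access, 2 Mb window, 1 ms assembly time -> fast.
    print(classify_source(2.5e9, 2e6, 1e-3))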

4.8.2 Delay Penalty

The most obvious impact of burst assembly on TCP is increased delay for individual packets. As described earlier in this Chapter, TCP source rates, S, scale with respect to the Round Trip Time (RTT) and the timeout period, T0, as follows:

    S ∝ 1/RTT and S ∝ 1/T0,   assuming constant loss p;   (4.10)

that is, the sending rate falls as either the round trip time or the timeout period grows. Burst assembly increases RTT and in some cases T0, thereby decreasing the sending rates of the burst's composite TCP sources. For networks with small propagation delays, this extra queuing delay may significantly decrease TCP performance. It is therefore important to restrict packet assembly times such that the proportional increase in delay is small. This need for an upper bound on delay is another reason why threshold-based burst assembly algorithms are inappropriate. Indeed, as burst sizes are tightly coupled to input TCP rates, increased delay would decrease input rates, further increasing the delay for successive bursts. Over a short period of time, this burstification delay may become very large, with correspondingly low utilization of the network. A simulation study of FTP transfer times over an OBS network in [192] is an illustrative example of the delay impact of burst assembly on TCP. In this work, the authors assume a network RTT of 3ms and employ a threshold-based burst assembly algorithm. The access network is assumed to be 100Mbps. For loads higher than 0.2, setting the threshold larger than 10 packets significantly increases the duration of the FTP transfer.

4.8.3 Delayed First Loss (DFL) Gain

Conversely, for fast and medium sources, or equivalently, for large burst sizes, burst assembly may increase TCP source rates and network utilization. A heuristic description of this phenomenon can be found in [85]: with larger burst sizes, fewer bursts are generated for the same input traffic rate and thus fewer bursts are dropped during the simulation duration; hence there are fewer reductions in window size and therefore higher throughput. This simple explanation can be further extended using the notation and concepts introduced earlier in this Chapter. Short bursts, or slow sources, lose a packet every 1/p sent packets, assuming burst loss events are independent. Long bursts, or fast sources, lose their entire cwnd every (1/p - 1) cwnds. In the former case, cwnd is halved and increases linearly. In the latter case, cwnd is reduced to one and increases exponentially until the congestion avoidance stage is entered. Although this implies a period of reduced input TCP rates, the longer period between losses allows cwnd to increase to a higher value than for short bursts or slow sources [90], as the first loss between two Triple Duplicate (TD) loss events is delayed.

This potential performance gain is appropriately called the delayed first loss (DFL) gain [198]. The latter paper introduces the concept of a virtual burst size, L, which denotes the number of packets belonging to a single TCP source in a burst. For example, a TCP source with L ≤ 1 is slow and a TCP source with L = cwnd is fast. It was shown in [200], both by simulation and analysis, that the DFL gain is proportional to L.

4.8.4 Correlation Benefit

As described earlier in this Chapter, different implementations of TCP often have different mechanisms for dealing with loss. More specifically, Reno variants halve the congestion window for each packet loss, therefore a burst loss will dramatically lower cwnd for the composite medium and fast TCP sources. New Reno handles multiple losses more efficiently by not halving cwnd for successive losses, but it only retransmits one lost packet per RTT. Indeed, if many packets belonging to a single TCP source are lost, the source could time out before all lost packets are retransmitted in the fast recovery stage. In this case, the performance of New Reno could be even worse than that of Reno. The effect of these TCP-specific retransmission strategies is often referred to as the retransmission penalty [198]. Following the notation of [58], the combined effect of this retransmission penalty and the DFL gain is the correlation benefit [58]. It is suggested in [198] that the correlation gain is always positive, as long as the burst size is reasonably large. An example of size sufficiency is given as L > 3, implying that burst assembly parameters should be chosen such that most TCP sources can be classified as medium or fast.

4.8.5 Burstification Factor

Another common term used to describe the effect of burst assembly on TCP performance is the Burstification Factor, the relative increase in composite TCP source sending rates due to aggregation [58]. By extending the sending rate models introduced earlier in this Chapter, the authors derive equivalent sending rate models for fast and slow TCP sources that are assembled into bursts. While slow TCP sources have a burstification factor of one (i.e. they are unaffected by burst assembly), fast TCP sources can have burstification factors much larger than one. The theoretical send rate equation for fast TCP sources was derived in [58] to be:

    S = (1 - p + p^3) / ( (1 + Δ)·RTT_0·( p(1-p)^2 + p^3·f(p) ) ),       for p > 1/Wm,
    S = ( Wm(1-p) + p^2 ) / ( (1 + Δ)·RTT_0·( (1-p)^2 + p^2·f(p) ) ),    otherwise,     (4.11)

where RTT_0 denotes the round trip time without burst assembly, Δ the ratio between the round trip times with and without burst assembly, p the packet loss probability, Wm the maximum TCP window size, and f(p) is a polynomial defined by:

    f(p) = 1 + p + 2p^2 + 4p^3 + 8p^4 + 16p^5 + 32p^6.   (4.12)
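A short sketch of Equation 4.11, as reconstructed above, follows (Python); the parameter values are purely illustrative.

    def f(p):
        """Polynomial from Equation 4.12."""
        return 1 + p + 2*p**2 + 4*p**3 + 8*p**4 + 16*p**5 + 32*p**6

    def fast_source_rate(p, rtt0, delta, wm):
        """Send rate of a 'fast' TCP source under burst assembly (Equation 4.11).
        rtt0: round trip time without assembly; delta: the RTT ratio term;
        wm: maximum window size in packets."""
        if p > 1.0 / wm:
            return (1 - p + p**3) / ((1 + delta) * rtt0 * (p * (1 - p)**2 + p**3 * f(p)))
        return (wm * (1 - p) + p**2) / ((1 + delta) * rtt0 * ((1 - p)**2 + p**2 * f(p)))

    # Example: p = 0.01, RTT_0 = 100 ms, 10% RTT increase, Wm = 128 packets.
    print(fast_source_rate(0.01, 0.1, 0.1, 128))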

Figure 4.16 shows the Correlation Benefit of fast sources for varying levels of loss and maximum window size Wm. The basic model from Equation 4.4 was used as a baseline reference. As expected, the Correlation Benefit is maximized when p equals the inverse of Wm, and the performance of the fast source is greatly increased by the burst assembly process for moderate levels of loss. For extremely low levels of loss, the rate is limited by the maximum window size Wm. It was suggested in [38] that the optimal burstification delay increases with the number of TCP sources; however, the TCP input rate per source was not explicitly analyzed.

Figure 4.16: Correlation Benefit for Fast TCP Sources (loss p vs. ratio of TCP source rates, fast source model / basic model (slow source), for Wm = 64, 128 and 256 packets)

4.8.6 Dynamic Burst Sizing

The impact of burst assembly algorithms on TCP performance is an active area of research within the OBS space. Many attempts have been made to tune the burst size in order to optimize TCP throughput, with the general consensus that larger bursts give higher throughput up to a point, after which the added delay causes the throughput to decrease. Exactly where this cross-over point is located seems highly dependent on the specific simulation environments under examination, due to the high dimensionality of the parameter space. Indeed, [198] presents optimal threshold values for a fixed burst loss rate of 0.01, a parameter exceedingly hard to adjust due to the feedback effects of TCP. TCP connections are of finite length with time dependent rates, resulting in medium time scale variations in input loads. Given that TCP seems sensitive to both the burst size and the network conditions, dynamic adjustment of the burst size would be extremely useful. Dynamic burst assembly was first introduced in [38]. The argument presented in the paper was that after a large burst, it is very likely that the TCP sources feeding that burst will increase their rates, and therefore increasing the burst size is better. Similarly, small bursts imply that many TCP sources are in the TCP slow start stage. In order to quickly increase the window size of these sources, bursts should be sent out as fast as possible and therefore the burst size should be reduced. Using the notation introduced above, this algorithm can be seen as switching between fast and slow TCP types, in order to leverage the advantages of both. The paper noted that this method is very complex; however, alternate, lower complexity solutions were not simulated. Given that threshold-based burst assembly algorithms are impractical due to their lack of a delay upper bound, the only control mechanism available to tune burst size is the choice of burst timer T. By setting a short timer, one would expect small bursts and, similarly, as the timer period increases, one would expect larger bursts. Due to the tight coupling to transient traffic statistics, this is not always the case. However, it has also been suggested that TCP performance is more sensitive to the assembly period than to the burst length [38]. This paper presented simulation results showing that dynamically modifying assembly times based on recent burst sizes yields significant performance gains. The dynamic algorithm of [38] calculated the assembly period according to the following equation:

    T_i = α · (Average Burst Length)_i / (K·C),   (4.13)

where K is the number of wavelengths, C is the capacity per wavelength on the output link and α ≥ 1 is a user defined parameter called the assembly factor, equal to the average number of buffers sharing the output link.

Calculating the Average Burst Length also requires two smoothing parameters to put more weight on the most recently sampled burst, and another two parameters to enforce upper and lower thresholds that keep the assembly period within a reasonable range. This algorithm adds significant complexity to the burst assembly because of the necessity of calculating the size of the next burst immediately after the previous burst is sent. Furthermore, the user configurable parameter space includes the assembly factor, two smoothing values and two threshold values. Optimally tuning traffic dependent parameters has previously been shown to be extremely difficult in the case of RED [52, 163]. Although RED is included in most currently installed routers, its fragility to misconfiguration has led to minimal activation within operational networks; indeed, [128] strongly recommends that network providers not turn it on. A sketch of such a dynamic timer is given below.
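The following sketch (Python) illustrates Equation 4.13. The exponentially weighted smoothing and the specific bound values are assumptions chosen for illustration; [38] uses its own smoothing and threshold parameters, which are not reproduced here.

    class DynamicTimer:
        """Sketch of the dynamic assembly timer of Equation 4.13."""

        def __init__(self, alpha, k, capacity, weight=0.25, t_min=1e-4, t_max=1e-2):
            self.alpha = alpha          # assembly factor (>= 1)
            self.k = k                  # number of output wavelengths
            self.capacity = capacity    # capacity per wavelength [bits/s]
            self.weight = weight        # weight on the most recent burst (assumed EWMA)
            self.t_min, self.t_max = t_min, t_max
            self.avg_burst_bits = 0.0

        def next_period(self, last_burst_bits):
            # Smooth the burst length, then apply Equation 4.13 and clamp the result.
            self.avg_burst_bits = (self.weight * last_burst_bits
                                   + (1 - self.weight) * self.avg_burst_bits)
            t = self.alpha * self.avg_burst_bits / (self.k * self.capacity)
            return min(max(t, self.t_min), self.t_max)

    timer = DynamicTimer(alpha=4, k=8, capacity=10e9)
    print(timer.next_period(last_burst_bits=8e6))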

4.8.7 Off-Timer Burst Assembly and TCP

In Chapter 3, we introduced a dynamic burst assembly algorithm, Off Timer Burst Assembly (OTBA). Unlike the complex algorithm proposed in [38], OTBA requires only one parameter, the timer value T, yet provides the necessary correlation between burst size and traffic intensity: OTBA increases the burst size after a long burst and, conversely, sends bursts at a higher rate after short bursts.

4.9 TCP alternatives

Despite recent success in modifying TCP to scale up to high-speed networks, the continued dominance of TCP cannot be guaranteed. New real-time multimedia applications, including Voice over IP, Internet Radio and Video on Demand, are becoming popular as access bandwidth to end-users increases. From an end-user's perspective, it may not be in their best interest to avoid network congestion if that requires a decrease in their individual transmission rate. Additionally, non-TCP tools to transfer files reliably at constant (high) rates are available today^1. The problem of avoiding congestion collapse despite trends away from end-to-end congestion control was recently examined in [67], which argued that recognizing the essential role of end-to-end congestion control for best-effort traffic, and strengthening incentives for using it, are critical issues as the Internet expands. The main assumption of this paper is that the Internet will continue to become congested due to a scarcity of bandwidth, an assumption that may not continue to hold. In fact, it is widely known that the average utilization in current core networks in the Internet is often under 10% [74]. It is clear that incentives are required for users to implement end-to-end congestion control. Currently, several router mechanisms have been suggested that restrict the bandwidth of flows using a disproportionate share of the bandwidth in times of congestion [172]. The lack of deployment of these mechanisms within commercial networks suggests that the danger of congestion collapse may be exaggerated and that the popularity of TCP may be waning. Nevertheless, it is much too soon to predict the death of TCP. What is certain is that, as new applications are developed, the role and implementation of TCP in future networks will change to adapt to dominant application and network requirements.

4.10 Conclusion

In this Chapter, we combined two source rate TCP models with an OBS model to find fixed point loads and losses. New versions of TCP will most likely replace the current implementations, making many of the specialized TCP optimizations for OBS obsolete. Notwithstanding, it will always be desirable to be able to measure the expected throughput of TCP over OBS networks. As new versions of TCP arise, applying a fixed-point approach to find loss and loads will still remain possible, given more accurate substitutes for Equation 4.4 or Equation 4.5.

We also investigated the effect of burst assembly algorithms on TCP performance. It was found that dynamic burst sizing improves TCP performance. Off Timer Burst Assembly (OTBA) was reintroduced as an algorithm that increases burst sizes after a long burst and, conversely, sends bursts at a higher rate after short bursts, the required correlation for optimal TCP performance.

^1 http://www.digitalfountain.com

Chapter 5
Contention Resolution

5.1 Introduction

If a reservation cannot be made at an intermediate node, contention is said to occur, and the BHP and corresponding burst must be dropped if the contention cannot be resolved. Effective contention resolution is therefore critical to restrict packet loss in OBS networks to reasonably low levels. There are four main methods for resolving contention:

1. Wavelength conversion: On contention, try to make a reservation on a different output wavelength on the desired output link.

2. Fiber Delay Line (FDL): On contention, try to make a reservation on the desired output wavelength on the desired output link, but at a different time.

3. Deflection routing: On contention, try to make a reservation on the desired output wavelength on a different output link.

4. Preemption: On contention, remove the contending reservation, then make a reservation on the desired output wavelength on the desired output link.

Up to this point in the thesis, only wavelength conversion has been used as a contention resolution mechanism, due to its modeling simplicity. FDLs have been comprehensively researched and found to be extremely helpful in reducing loss in OBS networks [191], yet deflection routing and preemption are techniques that have often been overlooked due to perceived complexity and instability at high loads. In this Chapter we develop a novel algorithm called Shortest Path Prioritized Random Deflection Routing (SP-PRDR) that combines the wavelength conversion, deflection routing and preemption contention resolution schemes to significantly reduce blocking rates in OBS networks while maintaining stability.

5.2 Deflection Routing

Currently, most traffic on the Internet is sent using reliable control protocols that retransmit lost data [70]. Assuming this dominance will continue, at least in the near future, minimizing loss probability when designing new network architectures and protocols is a very important goal. In this Section, we focus on one such technique that aims to reduce the loss probability at the cost of increasing the round trip time: deflection routing. We borrow the definition of deflection routing from an earlier paper [50]: when a control packet arrives at a node and the link corresponding to the next-hop entry in the routing table is fully occupied, it is redirected onto a different unoccupied output link, or dropped if all output links are occupied. Deflection routing was first studied on simple, uniform topologies such as ShuffleNet [71, 94] and the Manhattan Street Network [127]. Whereas these topologies are amenable to mathematical analysis, deflection protocols are highly sensitive to topological structure, with the result that simulations using more realistic and complex topologies are often necessary [149]. For example, deflection routing performs well on the Manhattan Street Network and ShuffleNet, even under heavy load [50, 112], but recent simulations using a variety of complex networks show that deflecting rather than dropping can cause higher burst loss probability at high loads [180, 203], but much lower burst loss probabilities for networks with few wavelengths and under light load [96, 180, 190, 191].

5.2.1 Deflection Types

The choice of which output links and wavelengths to use for deflected bursts is critical to the overall performance of the network. There are three main types of deflection protocols: fixed alternate routing [96, 112, 190, 191, 203], dynamic traffic-aware [118, 173] and random [180]. The more popular approach is to deflect along a fixed alternate path, either on a hop-by-hop basis [112, 190, 191] or by storing at each node both the complete primary path and the complete alternate path from itself to every possible destination node in the network [96]. While the latter clearly becomes infeasible as the network grows, even the hop-by-hop approaches have a significant disadvantage: the choice of a good alternate path is very difficult due to the tight coupling between subsequent burst loss probabilities, traffic matrices and network topology.

5.2.2 Fixed Alternate Deflection

Algorithms to choose explicit fixed alternate routes vary considerably, from simple heuristics such as the link-disjoint next-shortest path [203] to the more complex, yet still ad-hoc, approach of requiring alternate routes to return to the original next-hop node in less than three hops [190, 191]. In some cases, the algorithm is not given at all [96, 112]. The important aspect of all of these algorithms is that the deflection paths do not change. Fixed deflection routing has also been shown to destabilize OBS networks at high loads, yielding higher burst loss probabilities than if bursts were not deflected but simply dropped [18, 42, 180, 203]. As link utilizations in the Internet core have been shown to vary widely even over moderate time scales [24], possible instability in OBS networks is a major concern. Two approaches have been suggested to stabilize deflection routing schemes: wavelength reservation [203] and sender check/retransmission messaging functions [180]. The former approach, in a similar way to classical trunk reservation in circuit switched networks, limits the amount of deflection at high load by reserving wavelengths on each link for the exclusive use of primary bursts. The latter approach avoids deflections for one-hop paths by holding bursts at the sender until a wavelength is free. Bursts that are deflected back to their sender are also not transmitted immediately but are similarly held.

5.2.3 Dynamic Traffic-Aware Deflection

Traffic-aware deflection is even more tightly coupled to transient traffic matrices and the network topology and is consequently prone to instability: continual oscillation between congested and uncongested link states [16]. Although determining an optimal deflection route can be reduced to a simple integer linear programming optimization problem, in which a cost function that accounts for the new hop count and contention rate is minimized [117], such approaches are very sensitive to the traffic demand, a variable that is often extremely hard to measure or predict. In most cases, the majority of bursts are carried on the primary routes; therefore, to spread the load effectively and robustly, the primary routing algorithm must also be dynamic and traffic dependent. Deflection algorithms can be implemented with any primary routing algorithm and therefore are also applicable in this case.

5.2.4 Out Of Order Packets

Each OBS burst consists of an aggregated group of packets that share a common egress node and arrived at the ingress node at roughly the same time. In the case that some bursts are deflected and others are not, the constituent packets may arrive out of order at their final destinations. This phenomenon can be seen in today's networks, even though explicit deflection routing has not been widely deployed. Indeed, measurements taken in 1997 found that 2.0% of all the data packets examined arrived out of order [145]. Additional measurements in 1999 showed that the probability of a TCP session experiencing reordering is over 90% [20]. Although one cannot claim that the individual sites participating in these measurement frameworks are representative of Internet sites in general, several techniques in use in today's Internet have been identified as causing significant packet reordering, suggesting that packet reordering is indeed a common occurrence in today's Internet.

One such technique is load balancing within a switch, which can allow a newer packet to be placed in a long queue while an older packet is placed in a short queue [19]. Another is packet striping across multiple links, which may cause reordering in the case of different queue lengths on each link. In a survey of 38 major ISPs conducted in mid 1997 [78], only two of the smaller ISPs did not have parallel paths between nodes [20]. Parallelism has been found to cause poor TCP performance [20], prompting modifications to TCP to make its fast retransmission algorithm more tolerant to reordering [28].

Deflection routing may cause additional packet reordering. However, it is important to note that TCP requires the receipt of 3 duplicate ACKs to indicate a loss. Assuming deflection is a rare event, the composite TCP sources are classified as slow and the difference between primary and deflection path lengths is small, the out-of-order burst may arrive before the creation of the third duplicate ACK.

5.3 Shortest Path Prioritized Deflection

Despite the potential negative impact on TCP, deflection routing may be able to reduce the packet loss rate for OBS traffic. In this Section, we introduce a new deflection protocol for OBS, Shortest Path Prioritized Random Deflection Routing (SP-PRDR), that aims to lower burst loss probabilities while using only limited state information from traditional Internet Protocol technologies. We show, through analysis and simulation, that loss in OBS networks is indeed significantly reduced by SP-PRDR for loads that previously gave moderate or low losses in the unmodified case. In the simulation examples studied, by using SP-PRDR we are able to increase the input load by approximately 15-20% while maintaining a constant burst loss probability of 10^-3. Additionally, unlike other schemes, we show that the worst case burst loss probability of SP-PRDR is provably upper-bounded by the burst loss probability of standard OBS. Previous OBS papers combining deflection routing and priority differ in that they choose which burst or which burst segment to deflect based on a fixed, pre-determined priority label [177, 178, 188]. In contrast, our method requires no QoS differentiation at the edge of the OBS network. Instead, we introduce a priority system whereby the priority of a burst is dynamically determined based on the number of previous deflections experienced by that burst.

5.3.1 Node Architecture

Technology trends point to network intelligence moving up to IP [92]. An appropriate choice to study is therefore the Smart Routers, Simple Optics architecture, in which each network node is both an IP router and an optical layer cross-connect [92]. In the case of no contention, we route traffic on a hop-by-hop basis with next-hop paths chosen by an unweighted shortest path algorithm, as in OSPF [132]. It is important to note that contention between locally generated bursts and cross-traffic bursts can be a significant fraction of the overall loss probability. However, as bursts can be buffered at the edges in electronic form, an optimal initial offset time can be chosen to minimize these loss events. As this optimization is independent of routing, it can be implemented in parallel with any routing scheme and therefore will not be considered further.

5.3.2 Routing Notation

Routing algorithms can be most succinctly described by representing a network as a standard graph. Using the notation from [22], we define a network as a unidirectional graph G = (N, A) with a finite nonempty set N of nodes and a collection A of pairs of distinct nodes from N. Each pair of nodes in A is called an ordered arc, or a link. Note that (na, nb) and (nb, na) are different arcs. A walk is a sequence of nodes (n1, n2, ..., nk), such that each of the neighbouring pairs (n1, n2), (n2, n3), ..., (nk-1, nk) is an arc of G. As we allow loops, unlike [22], we treat a path as equivalent to a walk. The routing of each burst follows the reservations, and therefore the routing, of the initial control packet. For the remainder of the Chapter, we refer to bursts being dropped and deflected, abstracting away the implementation details: control packets being dropped (bursts dropped) and reservations being made for different output links (bursts deflected).

5.3.3 Shortest Path Random Deflection Routing

Many variants of Optical Burst Switching have been proposed, each with different reservation, contention and routing protocols. In this Section, we add a deflection routing algorithm to the most common OBS reservation protocol, Just-Enough-Time (JET) [195]. JET has one main feature: delayed reservation. After the burst is generated at the ingress node, a control packet is sent to reserve bandwidth on each link for the duration of the burst, after which the bandwidth is freed and therefore available for future bursts. The burst is then held at the ingress node for a fixed amount of time, equal to or greater than the total processing time of the control packet along the entire path, before being sent into the network. If a control packet is unable to schedule a burst at an intermediate node because the corresponding output link is fully reserved at any time during the desired period, a contention is said to occur and the control packet and the corresponding burst must be re-routed or dropped (blocked). To simplify notation, as the burst always follows the path of the control packet, we ignore the control packet and refer only to the bursts. In this Section, we outline a new approach that re-routes bursts in cases of contention. For each node, ni, the shortest path to every other node, nj, j ≠ i, is calculated and used to route bursts in the contention-free case. We assume a distributed algorithm, such that each node only contains enough information to route a burst to the corresponding next node. In the case of contention, the contending burst is deflected to a random free output link - the deflection link. We assume the set of free links is known and that each link in the set has an equal probability of being chosen. If all output links have previously established overlapping reservations, and therefore none is free, the burst is dropped. Upon arrival at the randomly chosen node, the burst is then routed using the shortest path from that node to the burst's destination. For example, let the network topology be NSFNET T3, as shown in Figure 5.1. The shortest path between nodes 2 and 6 is (2, 7, 6). Let a burst from node 2 experience a contention on link (2, 7) and let both links (2, 1) and (2, 3) be free. If link (2, 3) is chosen as the deflection link, the new complete path is (2, 3, 5, 6), as the shortest path from node 3 to node 6 is (3, 5, 6).

Figure 5.1: Sample OBS Network - NSFNET T3, comprising 13 nodes and 32 directed links.

5.3.4 Shortest Path Prioritized Random Deflection Routing (SP-PRDR)

As discussed in the introduction, Shortest Path Random Deflection Routing has been shown to be unstable at high loads [180]. To solve this stability problem, and thereby achieve loss probabilities less than or equal to those of Shortest Path Routing without deflection, we explicitly differentiate between deflected and undeflected bursts by assigning different priorities to the corresponding reservations: P_low for the former and P_high for the latter. In the case that a future non-deflected control packet with priority P_high experiences contention, if there is an overlapping reservation of priority P_low, this reservation is preempted; otherwise the control packet is deflected and its priority lowered to P_low. Note that if a low priority reservation is preempted, any downstream reservations will then be unused and may unnecessarily block other deflected bursts. It is very important to note that these surplus reservations can themselves be preempted by higher priority bursts and therefore do not introduce significant inefficiency. Indeed, in preliminary experiments, having more than two priority levels was found to give very minimal performance improvement; therefore, only two are used in this thesis. Further, it is important to note that the SP-PRDR algorithm allows header packets from P_high bursts to preempt a P_low burst even after the P_low burst has arrived at the node and is in the process of being switched to an output link. The probability of this occurring increases with the number of hops already traversed by the P_high burst. Deflected bursts have, by definition, a longer average path length, and therefore the time between the header packet and the burst arriving is usually shorter than for undeflected bursts, reducing the chance of this scenario, especially for small burst sizes. In preliminary experiments, allowing or disallowing in-flight preemption had little effect and therefore the default algorithm described above was used. As every deflected burst can be preempted by a non-deflected burst, the loss probability using this new algorithm is strictly less than or equal to that of the no-deflection case, where a burst experiencing contention is immediately dropped. In other words, by deflecting the burst and lowering its priority, it is possible for that burst to be received successfully while having no negative impact on the blocking of the non-deflected bursts. A sketch of the per-node contention decision is given below.
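The contention-handling decision at an intermediate node can be summarised in the following sketch (Python). The simplified Reservation and Link structures are placeholders introduced purely for illustration; in particular, reservation overlap in time is abstracted into a single contended interval per link.

    import random
    from dataclasses import dataclass, field

    LOW, HIGH = 0, 1   # reservation priorities: deflected vs. undeflected bursts

    @dataclass
    class Reservation:
        burst_id: int
        priority: int

    @dataclass
    class Link:
        reservations: list = field(default_factory=list)
        capacity: int = 1          # reservable wavelengths in the contended interval

        def is_free(self):
            return len(self.reservations) < self.capacity

    def handle_contention(burst_id, priority, desired, other_links):
        """SP-PRDR decision when the desired output link is fully reserved.
        Returns the link that carries the burst, or None if it is dropped."""
        if priority == HIGH:
            victim = next((r for r in desired.reservations if r.priority == LOW), None)
            if victim is not None:                       # preempt one low-priority burst
                desired.reservations.remove(victim)
                desired.reservations.append(Reservation(burst_id, HIGH))
                return desired
        free = [link for link in other_links if link.is_free()]
        if not free:                                     # nowhere to deflect: drop
            return None
        deflection = random.choice(free)                 # random free link, demoted priority
        deflection.reservations.append(Reservation(burst_id, LOW))
        return deflection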

5.3.5 Managing Preemption

When a low priority burst reservation is preempted, the new high priority reservation will re-configure the switching fabric of the intermediate OBS node to ensure that its corresponding burst is successfully transmitted. As a result, the low priority burst will be truncated and therefore only partially received at the corresponding egress node. Knowledge of this preemption is localized at the intermediate OBS node at which the contention occurred, so the egress node expects a complete, untruncated burst, an expectation that is not met. There are two main solutions to this problem. The egress node can either attempt to recover as many individual packets from the truncated burst as possible, or drop the entire truncated burst it receives. In the former case, packet boundaries must be clearly detectable and each packet must contain some redundant data, such as a CRC checksum, to enable the receiver to verify its integrity. Corrupt packets can then be dropped at the egress node. In the latter case, a CRC checksum of the entire burst can be included in the header packet. On receipt of a truncated burst, the subsequent checksum failure would cause the entire burst to be dropped. It is important to note that although preempting bursts may lead to inefficient use of downstream reservations, these reservations are also able to be preempted and therefore only block other low priority reservations.

5.3.6 Offset Time

If all links have equal weight, it is also possible to choose an appropriate offset time for systems using SP-PRDR that avoids the insufficient offset time problem outlined in [96]. Let the time taken to process a control packet reservation request at each node be δ, the initial offset time be T, the diameter of the network be D and the number of allowed deflections be d. Now, as mentioned above, after a control packet is deflected from node na to a random output node nb, it is subsequently routed along the shortest path from this new node to its destination. Therefore, if all paths are weighted equally, the length of this new path cannot be more than the length of the old path plus two: in the worst case, the control packet loops back from nb to na. Therefore, a suitable minimum offset time is

$$ T = (D + 2d)\,\delta. \qquad (5.1) $$
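To make Equation 5.1 concrete, here is a small worked example with illustrative values that are not taken from the simulations in this chapter. For a network of diameter D = 4 hops, a deflection budget of d = 2 and a per-node control-packet processing time of δ = 10 µs,

$$ T = (4 + 2 \times 2) \times 10\,\mu\mathrm{s} = 80\,\mu\mathrm{s}. $$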

If a weighted shortest path algorithm is used, Equation 5.1 is no longer valid. Note that the diameter, D, is equal to the maximum of the shortest path distances between all possible pairs of nodes of a graph, where the distance between two nodes na and nb in a weighted graph is the sum of the weights of the edges of a shortest path between them. For a weighted graph, this is no longer equal to the length of the path. Additionally, replacing D with a new quantity, the maximum hop count of all weighted shortest paths, D', is also incorrect. A counterexample is given in Figure 5.2.


Figure 5.2: Sample network with weighted links. The weighted shortest path from node 1 to node 5 is (1, 2, 3, 4, 5). Now let a burst from node 1 to node 5 be deflected to node 6 upon arrival at node 4. The complete path is now (1, 2, 3, 4, 6, 2, 3, 4, 5). In this case D' = 4 but the length of the path is now 9. Therefore, if weights are used in the shortest path routing computation, additional safeguards, such as time-to-live hop counters, are needed.

5.3.7 Fibre Delay Lines

JET also includes an optional feature, burst arrival postponing, that further reduces loss probabilities by using Fibre Delay Lines (FDLs) at intermediate nodes. While it has been shown that FDLs decrease the probability of burst loss [43], they currently require complex hardware and electronic controls, yet are only able to provide delays in the order of tens of microseconds, as the speed of light means extremely long fibre spools are required. It is important to note that our new routing algorithm can easily be adapted to leverage FDL burst loss reductions by randomly deflecting only if the FDL is occupied or the output link is still fully reserved after the FDL delay.


5.3.8 Routing Loops

A given path P is said to have a routing loop if a duplicate node exists in the path. In traditional buffered packet switched networks, loops waste network resources unnecessarily and all routing algorithms strive to be loop-free. In OBS networks, loops can be viewed in a similar way to an FDL, but instead of a dedicated fibre, a blocked burst uses the network itself as a buffer. Within this paradigm, routing loops caused by deflections can be useful and are therefore allowed.

5.4 SP-PRDR Simulation

We illustrate the benefit of SP-PRDR for two different routing matrices, consisting of approximately 20 Origin-Destination (O-D) pairs with equal and constant input loads, chosen to represent a variety of path lengths and link-sharing degrees. We use the NSFNET T3 topology, with 80 wavelengths per directed link, as shown in Figure 5.1; the full paths are shown in Table 5.1 and graphically overlayed on the NSFNET topology in Figures 5.3 and 5.4. Care was also taken to represent different network states: unbalanced (Routing Matrix 1) and balanced (Routing Matrix 2), as shown in Table 5.2. In this table we tally the number of traffic flows incident on every link in the network. In the unbalanced case there can be one to four flows on each link, while in the balanced network almost all links have two flows. As the loads for each O-D pair are equal and constant, the number of flows in Table 5.2 roughly corresponds to the link utilization, assuming no loss. SP-PRDR uses the Shortest-Path algorithm to route undeflected bursts through the network. In the simulations we give preference to nodes with a lower number in the case of equidistant paths, therefore a specific set of O-D pairs yields a unique routing matrix. Subsequent simulation results are also dependent on the O-D pair choice.

Routing Matrix 1                Routing Matrix 2
Label   Path                    Label   Path
A1      (1,4,6)                 A2      (1,4,6)
B1      (2,7,9,12,13)           C2      (8,5,3,2,1)
C1      (8,5,3,2,1)             D2      (9,12,11,10)
D1      (9,12,11,10)            E2      (13,8,5,6)
E1      (13,8,5,6)              F2      (7,6,5,8,10)
F1      (7,6,5,8,10)            G2      (3,2,7,9)
G1      (3,2,7,9)               H2      (12,9,7,2)
H1      (12,9,7,2)              J2      (4,6,7)
I1      (11,10,8,5,3)           K2      (8,10,11)
                                L2      (11,12,13)
                                M2      (12,11,10,8)

Table 5.1: Global routing table for simulated Origin-Destination pairs. All traffic flows are bi-directional.

Flows   Routing Matrix 1   Routing Matrix 2
0       0                  0
1       6                  2
2       6                  13
3       2                  1
4       1                  0

Table 5.2: Number of bidirectional O-D pair flows for each link in the network.

Figure 5.3: Number of flows per link (RM1)

Figure 5.4: Number of flows per link (RM2)

It is not feasible to explore all possible sets of O-D pairs; instead we concentrate on a sensitive parameter found when initially exploring the O-D pair parameter space: network balance. As outlined in [149], collapsing the parameter space to form parsimonious models is crucial to isolate invariants. Note that an informal proof that SP-PRDR has equal or lower burst loss probability than standard OBS has already been presented in Section 5.3.4. The simulation results serve as an illustration of the benefit gained in balanced and unbalanced examples. We used the same simulation framework from [203] to simulate the performance of the network. This framework introduces several approximations in an attempt to accurately model OBS networks while maintaining feasible simulation times. Bursts are generated by independent Poisson processes, burst transmission times on each link are independent and exponentially distributed with a common mean, and deflected bursts are generated according to independent Poisson processes, not more complex and accurate two-state Markov modulated Poisson processes. In the case of preempted bursts, we assume the truncated burst is dropped at the egress node.


5.5 Results

We calculated the burst loss probability performance, the average number of hops taken for successful burst transmissions and the link utilizations of SP-PRDR. A maximum of one and two deflections were allowed. These results were compared with SP-RDR and standard SP (with no deflection). The results for both Routing Matrices are plotted below in Figure 5.5 to Figure 5.11. Simulations were run for sufficient time to ensure that error bars corresponding to 95 percent confidence intervals, using a Gaussian approximation across 5 independent runs, were smaller than the shapes used to mark the data points, except at the lowest value of load. Due to computational time constraints, probabilities lower than 10^-5 are not precise and probabilities lower than 10^-6 are not included in the results. While the results are presented in terms of average input load per O-D pair, the approximate network load can be derived from the number of flows per link in Table 5.2. Routing Matrix 1 has an average of 1.87 flows per link and Routing Matrix 2 has an average of 1.94 flows per link. To convert from average input load to approximate network load, x-axis values must be multiplied by the average number of flows per link and divided by the number of wavelengths per link (80). For example, in Figure 5.6, an x value of 35 corresponds to an approximate network load of 0.85. Confirming what was previously noted in [180], the loss probability for SP-RDR is even higher than the fixed, non-deflecting case, for both routing matrices, as shown in Figure 5.5 and Figure 5.6. For the balanced case, the instability is even more apparent as deflected bursts are no longer sent over under-utilized links and therefore have a much higher chance of blocking non-deflected bursts. However, SP-PRDR significantly reduces the loss probability for low input loads, while it is equal to the non-deflection case for higher loads.
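As a quick arithmetic check of the load conversion described above, using the Routing Matrix 2 figures quoted in the text:

$$ \frac{35 \times 1.94}{80} \approx 0.85. $$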


Figure 5.5: Average O-D Pair Burst Loss Probability vs. Average Input Load per O-D pair, Routing Matrix 1 (unbalanced)

Figure 5.6: Average O-D Pair Burst Loss Probability vs. Average Input Load per O-D pair, Routing Matrix 2 (balanced)


Figure 5.7: Average Number of Hops for successful transmissions vs. Average Input Load per O-D pair, Routing Matrix 1

Figure 5.8: Average Number of Hops for successful transmissions vs. Average Input Load per O-D pair, Routing Matrix 2


Figure 5.9: Utilization averaged over all links vs. Average Input Load per O-D pair, Routing Matrix 1

Figure 5.10: Utilization averaged over all links vs. Average Input Load per O-D pair, Routing Matrix 2

Figure 5.11: Maximum Link Utilization vs. Average Input Load per O-D pair, Routing Matrix 2 (link 5-8)

This dependence on load can be seen more clearly in Figure 5.7 and Figure 5.8. As the input load increases, many more deflections occur in the non-prioritized case, while bursts are preempted and consequently dropped for SP-PRDR. Both deflection schemes utilize the network more than the no-deflection case, as shown in Figure 5.9, Figure 5.10 and Figure 5.11. This is intuitively obvious as both schemes deflect bursts instead of dropping them, adding additional traffic to the network. However, by using SP-PRDR, this additional traffic, while adding to overall network utilization, can be preempted by the non-deflected bursts and therefore does not induce instability and consequent higher loss. The performance improvement achieved by introducing SP-PRDR is best described by the following numeric examples. From Figure 5.5, using SP-PRDR enables the average load to be increased from 15 to 18, a gain of 20%, while maintaining a constant average loss probability of 10^-3. In the balanced case corresponding to Figure 5.6, it is possible to increase the average load from 21 to 24, or approximately 15%, while maintaining a loss of 10^-3.

5.6 Conclusion

In this Chapter we developed a novel algorithm called Shortest Path Prioritized Random Deflection Routing (SP-PRDR) that combines wavelength conversion, deflection routing and preemption contention resolution schemes to significantly reduce blocking rates in OBS networks, yet maintain stability.


Chapter 6 Gaussian Queuing in OBS


6.1 Introduction

An OBS ingress node can be viewed as a queuing system. When packets arrive at an ingress node, they are placed in buffers awaiting transmission to their new destinations. If, within a certain period of time, the number of arriving packets destined for a common egress node, which may come from different and often bursty sources, is sufficiently large, the corresponding buffer may overflow and packets are dropped. OBS network designers are faced with the challenge of how to provide the required quality of service yet operate an economically viable network. In particular, there is a need for effective design and dimensioning tools for OBS ingress buffers in order to minimize the potential cost of such nodes, due to the need for a large number of buffers: one per egress node and QoS class pair. Bursty traffic characterization and performance evaluation of statistical multiplexers are typically used to build such tools, and packet-based tool development has been a hot research area for the last 20 years. However, within the OBS paradigm, it is often assumed that ingress buffers will never overflow and that loss occurs solely due to contention inside the network. This assumption


was implicitly used in Section 4.6.3 by forcing the number of buffers M to be larger than the number of wavelengths K and emptying buffers to generate each burst. It is not always true that ingress buffers do not overflow; indeed, there are advantages to ensuring that overflow does occur. To share the output link resources equally, the size of each burst must be restricted. For example, in the case of timer-based burst assembly, after the timer expires the entire buffer may not be assembled into a burst. Instead, only a fraction corresponding to the amount of link resources allocated to that buffer could be aggregated, with the remaining packets staying in the queue until the process repeats after another timer period. In this way, usage of the output link resources can be shared more fairly amongst the competing buffers. Viewed from a network provider's perspective, limiting buffer access to the output links enables fine-grained control over network usage, especially when the corresponding ingress and egress nodes connect into another provider's network and the path carries only transit traffic. In this Chapter, we show that due to high levels of multiplexing, the number of packets arriving at an ingress buffer over a fixed period of time tends to a Gaussian distribution. With this in mind, we develop methods to approximate packet loss given Gaussian traffic and restricted buffer service rates. These approximations are used to dimension appropriately sized buffers in OBS ingress nodes.

6.2 Long Range Dependence

When a certain property of an object is preserved with respect to scaling in time, it is said to exhibit self-similarity. Recent research has shown that network traffic appears bursty over a wide range of time scales and that certain second order statistics are time-scale invariant over certain ranges, for example [55, 79, 120, 121, 135]. Precise definitions for self-similarity, including the Hurst parameter H, can be found in [143].

A related property that has been found in traffic traces is Long Range Dependence [47, 91]. For all second-order self-similar processes, the autocorrelation function, r(k), tends to

$$ r(k) \sim H(2H-1)k^{2H-2}, \qquad k \to \infty. \qquad (6.1) $$

If r(k) decays slowly enough such that it is non-summable, e.g. hyperbolically, the process is said to be Long Range Dependent (LRD). Conversely, the process is Short Range Dependent (SRD) if the autocorrelation function is summable. Note that over the normal range of interest, 1/2 < H < 1, self-similarity and long range dependence are equivalent and therefore, with significant abuse of notation, these terms are often used interchangeably in the research literature.
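For readers who wish to check whether a given trace is plausibly LRD, a standard quick diagnostic, not one used in this thesis, is the aggregated-variance estimate of H shown below. The snippet assumes only that the trace is a sequence of per-interval packet or byte counts.

```python
import numpy as np

def hurst_aggregated_variance(x, levels=(1, 2, 4, 8, 16, 32, 64)):
    """Crude Hurst estimate: for second-order self-similar traffic the variance of
    the block-averaged series X^(m) scales as m^(2H-2), so H = 1 + slope/2."""
    x = np.asarray(x, dtype=float)
    ms, variances = [], []
    for m in levels:
        n_blocks = len(x) // m
        if n_blocks < 2:
            continue
        blocks = x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        ms.append(m)
        variances.append(blocks.var())
    slope, _ = np.polyfit(np.log(ms), np.log(variances), 1)
    return 1.0 + slope / 2.0

# Sanity check: SRD white noise should give an estimate close to H = 0.5.
rng = np.random.default_rng(0)
print(hurst_aggregated_variance(rng.normal(size=100_000)))
```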

6.2.1 Internet Traffic - SRD or LRD?

The strong interest in modelling Asynchronous Transfer Mode (ATM) networks in the 1990s led to a demand for performance analysis tools to guarantee Quality of Service for varying source data traffic loads. The key question was: "How many sources can be admitted for a fixed buffer size with a specified cell blocking probability?" [51]. One way of solving this problem was to assign each source an effective bandwidth [63, 110] and then compare the sum of input sources with an appropriately chosen threshold as the corresponding Call Admission Control (CAC) policy. These provisioning results were originally based on classical SRD Poisson results [113] but, as traffic carried on networks tends away from voice towards data, this SRD approximation has been shown to be less and less accurate, underestimating performance measures such as average packet delay and maximum queue size [148], even for ATM networks [86, 160, 161]. Furthermore, Fowler and Leland show that congested periods can be quite long with concentrated losses and that changes in buffer size and number of active connections have quite unexpected results [73]. A comprehensive summary of related research over the last 20 years can be found in [158]. Traffic in the backbone of the Internet seems to have distinctly Gaussian characteristics, especially for high capacity links [45, 46, 75, 205]. However, the validity of SRD traffic modelling is being vigorously debated: both LRD and SRD Gaussian models have been used to produce useful results. Moreover, links with medium or low aggregation levels have been found not only to exhibit LRD characteristics but also to be distinctly non-Gaussian.

6.2.2 Multi-scaling behaviour

If the corresponding Hurst parameter of a traffic trace varies significantly with time-scale range changes, it is multi-scaled. Internet traffic not only exhibits long range dependent behaviour, but over small time intervals, with a transition phase around the round-trip time, traffic appears to be multi-scaled with a non-Gaussian, log-normal distribution at small time scales [66, 75, 122]. Recent research has found that TCP connection arrivals also exhibit self-similarity [65]. Indeed, Norros suggests that the traditional notion of loss probability loses its objective character in the connectionless context, since the sources are able to avoid extensive losses by changing their own behaviour [136]. Multi-scale traffic models need many parameters, are not amenable to exact queuing analysis and therefore developing feasible approximations is an active research area. Liu and Baras develop a new framework by solely characterizing the different orders of moments, enabling non-asymptotic closed-form analysis [122]. This multi-scale behaviour has been verified with Sprint traffic traces and leveraged to provide provisioning results to minimize latency in the core network [75]. In another paper [205], the multi-scale nature was replicated with different Hurst parameters for small and large time scales. However, in stark contrast to the other papers, the authors also showed that highly aggregated network core traffic has Gaussian-like marginal densities, even at small millisecond time scales. With the recent discovery of multi-scaling properties at small time scales, it has become important to clearly define the time-scale of interest and take care in extrapolating results over wide time intervals. To be more explicit, LRD queuing results may only be valid for time-scales larger than the round-trip time, especially on low bandwidth links. At the very least, the parameters used to define the approximate distribution of the traffic must match the time-scale of interest [135]. It has been shown through analysis and simulation that, for time-scales shorter than the burst assembly timer period T, where T = 2 ms, the arrival process does not show long-range dependence [100]. Furthermore, it was claimed in [80] that burst assembly algorithms can reduce long-range dependence. Although recent papers dispute this claim [186, 197], it does seem that the jury is out on the relevance of LRD within the OBS paradigm. We concentrate on SRD traffic models in the following Sections.

6.3 Gaussian Queuing

As mentioned above, current traffic in the backbone of the Internet seems to have Gaussian characteristics. As OBS networks are also intended for carrying traffic with high levels of aggregation, it is to be expected that incoming traffic will exhibit similar Gaussian characteristics. Several OBS research papers support this hypothesis. In [100], the authors conclude that, for a high-speed network with small buffers, the arrival process of bursts at intermediate OBS nodes can safely be assumed to be Poisson distributed with Gaussian sized bursts for practical engineering purposes. This suggests that, given a timer-based burst assembly policy, each ingress node periodically generates Gaussian sized bursts. Therefore, for finite sized ingress buffers, determining the corresponding queuing performance given an input Gaussian process is important in dimensioning ingress nodes. Explicit results for the statistics of the unfinished work distribution of a single server queue (SSQ) with an infinite buffer fed by Gaussian traffic were derived in [8], using the Semi Markov Process (SMP) approach introduced in [15]. This Section extends this work to finite queues, using the fundamental assumption that the waiting time, or unfinished work, distribution has a dominant exponential tail. In other words, for large enough t, the tail of the complementary waiting time distribution of a semi-Markov queue takes the form c e^(-s*t). We derive an accurate solution for c and compare with previous solutions [131, 158], which have the drawback of focusing exclusively on tail behaviour and may produce misleading results, as shown in [51].

6.3.1 Research Overview

Calculating the output variance of SRD Gaussian traffic is very difficult and has been extensively researched over the last 40 years, following the discovery by Miller that the stationary complementary waiting time, or unfinished work distribution, has a dominant negative exponential tail [129]:

$$ \Pr(V > x) \approx c\,e^{-s^{*}x}, \qquad \text{for } x \text{ large.} \qquad (6.2) $$

Arjas and Speed provided a method for calculating c and s* in [14, 15]; however, this method requires a complete spectral factorization and is therefore seldom practical for complex problems. In [4, 5], Abate, Choudhury and Whitt further justified the validity of the exponential tail limit using exponential change of measure arguments. They also comment that determining an appropriate constant c is often much more difficult than the appropriate decay rate s*, and therefore simplify this exponential approximation, removing the dependence on c by setting it to 1:

$$ \Pr(V > x) \approx e^{-s^{*}x}, \qquad \text{for } x \text{ large.} \qquad (6.3) $$

Large deviations theory [40, 60, 81, 131] also reached this simplified approximation, albeit through the result

$$ x^{-1}\log \Pr(V > x) \to -s^{*}, \qquad \text{as } x \to \infty. \qquad (6.4) $$

While this simpler approximation is very amenable to mathematical analysis and can even be extended to multiple service classes with priorities [170], it has been shown in [51] to produce misleading results over a wide range of parameters, especially where there is a large number of independent sources. Extreme Value Theory [46, 47] has been used to calculate stronger asymptotic results for the tail probability of a Gaussian queuing system. While the results were derived by taking the queue length to infinity, they provide accurate estimates of the tail probability even for small queue sizes. These results also show a negative exponential tail.

6.3.2 The Model

Grossglauser and Bolot provide an excellent description of what constitutes a useful model: "A model is a tool for decision making. Thus, its quality depends on the quality of the decisions it leads to rather than on its closeness to physical reality" [86]. The discrete model presented below is only one of many models developed over the last twenty years but it has been extensively used, with minor variations, to accurately predict performance measures [68, 10, 11]. Consider a First-In-First-Out (FIFO) SSQ. Let time be divided into fixed length sampling intervals, corresponding to the buffer timer period, T. The model allows arbitrary choice of interval length. Let A_n be a continuous random variable representing the amount of work entering the system during the nth sampling interval. Let B_n be a continuous random variable representing the amount of work which can be processed by the server per sampling interval. We assume here that the service takes place at the end of the interval. As usual, the utilization, ρ, is defined by

$$ \rho = \frac{E\{A_n\}}{E\{B_n\}}. \qquad (6.5) $$

Let the sequence of continuous random variables, Y_n, be the net input process defined by

$$ Y_n = A_n - B_n, \qquad n \ge 0. \qquad (6.6) $$

Let V_n represent the unfinished work at the beginning of the nth sampling interval. The system's unfinished work process, for the case of an infinite buffer, satisfies Lindley's recurrence relation:

$$ V_{n+1} = (V_n + Y_n)^{+}, \qquad n \ge 0, \qquad (6.7) $$

where V_0 = 0 and where X^+ = X if X ≥ 0 and X^+ = 0 otherwise. Letting Z denote the set of integers and R the real line, we now suppose that (Y_n), n in Z, is a stationary, ergodic, Gaussian discrete-time process with mean m and variance σ² < ∞. Any such process can be represented [115] in the form

$$ Y_n = m + \sum_{k=0}^{\infty} a_k U_{n-k}, \qquad n \in \mathbb{Z}, \qquad (6.8) $$

where {U_n}, n in Z, are mutually independent Gaussian random variables with mean zero and variance one, a_k in R, k ≥ 0, and Σ_{k=0}^∞ a_k² = σ² < ∞. A necessary and sufficient condition for queueing stability is m < 0: the mean arrival rate is lower than the mean service rate. Let X_n denote the sequence {U_k}, k ≤ n, from Equation 6.8. Then (X_n, Y_n) is a semi-Markov sequence (a semi-Markov process sampled just after every state transition) with (X_n) as the underlying Markov process. The net input process, Y_n, can equivalently be interpreted as the time between transitions of the underlying Markov process. We now need to consider the asymptotic variance rate (AVR) of an arbitrary stochastic process {Z_n}, referred to from now on as AVR({Z_n}). It is defined by

$$ \lim_{k \to \infty} \frac{\mathrm{VAR}\left\{ \sum_{n=1}^{k} Z_n \right\}}{k}. \qquad (6.9) $$

In particular, let v = AVR({Y_n}) = (Σ_{k=0}^∞ a_k)² be the asymptotic variance rate of the net input process. The autocovariance sum S, defined by S = Σ_{k=1}^∞ Cov(Y_n, Y_{n+k}), is related to the asymptotic variance rate by

$$ v = \sigma^2 + 2S. \qquad (6.10) $$

The asymptotic variance rate of a process may be infinite, in which case the autocovariance sum will also be infinite. In the case where v < ∞, the so-called non-fractal, or SRD, case, with which we shall be concerned in this Chapter, the estimation of the tail of the unfinished work distribution can be expressed simply in terms of the traffic parameters m, σ² and v, together with the size of the buffer in the case where it is finite. We assume a constant service rate, μ. In this case (A_n), n in Z, is a stationary, ergodic, Gaussian discrete-time process with mean m + μ and variance σ². This is equivalent to B_n remaining constant and equal to μ. Therefore the three critical traffic parameters become the arrival rate λ = E{A_n}, σ² = Var{A_n} and v = AVR({A_n}). These critical parameters are additive: the superposition of j independent processes with parameters λ_i, σ_i², v_i, i = 1, ..., j, is given by λ = Σ_{i=1}^j λ_i, σ² = Σ_{i=1}^j σ_i², and v = Σ_{i=1}^j v_i. Daley [56] has shown that AVR({A_n}) = AVR({O_n}), where O_n is the amount of work departing during the nth sampling interval. This result implies that two out of the three parameters, namely λ and v, are unchanged when the arrival process is filtered through a queue. The tail of the unfinished work distribution can also be expressed in terms of these critical parameters. In the case of dominant tailed traffic, a simple first order autoregressive Gaussian process, which can realize any feasible values of these parameters, can be accurately used to model stationary ergodic Gaussian traffic for the purpose of queueing performance evaluation [9].

6.3.3 Simulation Model

A first order AR time series representing the arrival process was used to validate the analytical calculations, i.e.

$$ A_n = aA_{n-1} + bU_n, \qquad (6.11) $$

where U_n is a sequence of independent Gaussian random variables with mean ν and unit variance, and a and b are real numbers with |a| < 1. The parameters needed to fit the theoretical model with this AR process were calculated from the following simultaneous equations taken from [9]:

$$ a = \frac{S}{S + \sigma^2}, \qquad (6.12) $$

$$ b = \sqrt{\sigma^2(1 - a^2)}, \qquad (6.13) $$

$$ \nu = \frac{\lambda(1 - a)}{b}. \qquad (6.14) $$

Note that, using this model, σ equals the standard deviation of the input process.
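The sketch below illustrates how such a fitted AR(1) source can be combined with the Lindley recursion of Section 6.3.2 to estimate loss at a finite buffer by simulation. The fitting relations are the standard AR(1) identities written from first principles (they are intended to play the role of Equations 6.12-6.14 but are not copied from [9]), the buffer-management convention (drop overflow after arrivals, then serve) is a modelling choice, and negative arrival work is clipped to zero, which the Gaussian model itself does not do. The parameter values at the bottom are hypothetical.

```python
import numpy as np

def fit_ar1(lam, sigma2, S):
    """Match A_n = a*A_{n-1} + b*U_n, U_n ~ N(nu, 1), to mean lam, variance sigma2
    and autocovariance sum S, using S = sigma2*a/(1-a) and Var(A) = b^2/(1-a^2)."""
    a = S / (S + sigma2)
    b = np.sqrt(sigma2 * (1.0 - a * a))
    nu = lam * (1.0 - a) / b
    return a, b, nu

def simulate_loss(lam, sigma2, S, service, buffer_size, n=500_000, seed=1):
    """Estimate the loss ratio of a finite-buffer queue fed by Gaussian AR(1) traffic."""
    a, b, nu = fit_ar1(lam, sigma2, S)
    rng = np.random.default_rng(seed)
    u = rng.normal(nu, 1.0, size=n)
    A = np.empty(n)
    A[0] = lam
    for i in range(1, n):                   # AR(1) arrival process
        A[i] = a * A[i - 1] + b * u[i]
    v, lost, offered = 0.0, 0.0, 0.0
    for amount in A:
        amount = max(amount, 0.0)           # clip negative work (modelling choice)
        offered += amount
        v += amount
        if v > buffer_size:                 # buffer overflow: excess work is dropped
            lost += v - buffer_size
            v = buffer_size
        v = max(v - service, 0.0)           # service at the end of the interval
    return lost / offered

# Illustrative run: ten aggregated sources at utilization 0.5 (hypothetical values).
print(simulate_loss(lam=25.0, sigma2=100.0, S=250.0, service=50.0, buffer_size=80.0))
```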

6.3.4 An Approximation for c

As mentioned above, the tail of the complementary waiting time distribution of a semi-Markov queue takes the form c e^(-s*t). The exponent, s*, was previously derived in [8] to be

$$ s^{*} = \frac{-2m}{\sigma^2 + 2S} = \frac{-2m}{v}. \qquad (6.15) $$

For the finite buffer case, an approximation for c, denoted ĉ, can be calculated from the following coupled equations, given a buffer size K and a fixed service rate μ, such that K̂ = K - μ. The complete derivation of these formulae by Addie can be found in [6].

D=

erf( Km ) + ces 2

(2erf( u2 ) erf( Ku ) erf( uK )) 2 2

u erfc( 2 ) + erfc( Ku ) 2

(6.16)

c= where

s ((m) (K m) DZ) , erf( u2 ) + s es K Z u=

(6.17)

2m , v Z = 2(u) (u K) (K u) + K,

(6.18) (6.19)

and 1 (x) = 2
0

ye

(y+x)2 2 2

x2 x x dy = e 22 erfc( ). (6.20) 2 2 2

Using the above equations, the loss probability can be calculated using the following two equations: E{L } , Ploss = m+ where E{L } D(u) + ce s
K

(6.21)

u (erf( 2 ) s (u))

s (1 es K )

(6.22)

where ψ(x) is defined in Equation 6.20. Note that in practice, K must be designed to be much greater than μ. Additionally, for very high utilizations, these formulae do not provide good answers as they stand. In the example considered below, negative values are obtained for ĉ and D when the utilization is above 0.9. This can be explained by the fact that the effective utilization in a system with this nominal utilization should be reduced by the amount of loss experienced in the system. This adjustment requires solving a fixed point problem for the loss (see Appendix F, written by Addie, in [6] for details). In Table 6.1 below, we show the calculated loss for σ² = 70.8, v = 575.8, K = 130 and μ = 30, the same parameter values used in [8]. When the fixed-point adjustment is made, the results are in good agreement with simulation. Note that the length of the simulation runs is of the order of 10^7 sampling intervals, so they cannot accurately measure loss probabilities of the order of 10^-7 or less.

Utilization   Ploss simulation   Ploss approx. without FP   Ploss approx. with FP
0.05          0.0                1.80 x 10^-8               1.80 x 10^-8
0.10          0.0                2.90 x 10^-8               2.90 x 10^-8
0.20          0.0                1.37 x 10^-7               1.37 x 10^-7
0.30          0.0                7.60 x 10^-6               7.60 x 10^-6
0.40          3.2 x 10^-6        4.18 x 10^-6               4.18 x 10^-6
0.50          1.8 x 10^-5        2.16 x 10^-5               2.16 x 10^-5
0.60          7.9 x 10^-5        1.02 x 10^-4               1.02 x 10^-4
0.70          4.8 x 10^-4        4.43 x 10^-4               4.41 x 10^-4
0.80          2.9 x 10^-3        1.84 x 10^-3               1.80 x 10^-3
0.90          1.4 x 10^-2        1.16 x 10^-2               9.18 x 10^-3
0.95          2.7 x 10^-2        -                          2.92 x 10^-2
0.98          3.7 x 10^-2        -                          5.199 x 10^-2

Table 6.1: Analytic vs. Simulation Results for Ploss.
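The fixed-point adjustment itself is a simple iteration, independently of the particular loss formula used. The sketch below is generic: loss_approximation is a hypothetical callable standing in for an implementation of the coupled equations above, and the toy loss curve in the usage line is made up purely to show the iteration converging.

```python
def fixed_point_loss(loss_approximation, offered_load, tol=1e-10, max_iter=100):
    """Iterate P <- loss(offered_load * (1 - P)) until the loss estimate converges.
    'loss_approximation' maps an effective load to a loss probability."""
    p = 0.0
    for _ in range(max_iter):
        p_new = loss_approximation(offered_load * (1.0 - p))
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p

# Toy illustration with a made-up loss curve (not the thesis formulae):
print(fixed_point_loss(lambda rho: max(0.0, rho - 0.8) * 0.5, offered_load=0.95))
```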

6.3.5 The Output Variance

Utilization   Approximation   Simulation
0.98          6.77            6.74
0.95          16.02           16.15
0.9           29.14           29.45
0.8           48.02           48.30
0.7           59.31           59.38
0.6           65.48           65.30
0.5           68.56           68.74
0.4           69.95           70.04
0.3           70.51           70.57

Table 6.2: Analytic vs. Simulation Results for the Output Variance.

Being able to calculate the corresponding parameter values of the output process is important when we extend the single server queue results to performance evaluation of a network of queues. For example, given an upstream measurement of the three key traffic parameters, useful estimates of the modified values for these parameters on arrival at the ingress router can be determined. It was shown in [56] that the asymptotic variance rate of A_n is equal to the asymptotic variance rate of the work departing during the nth sampling interval in infinite buffer FIFO SSQs. As there is no loss, the mean is also unchanged. Therefore, calculation of the output process parameters reduces to the calculation of the output variance. Once again, we present only the final formula: for more details, refer to the derivation by Addie in [6]. The output variance is given by:
2 out = 2

2 + m2 m 2(m) 2 c m m2 erfc( ) + (u) e 22 2 s s 2 2 (6.23)

where ψ(x) is defined in Equation 6.20. To test the accuracy of the above formula, we used the parameter values of [8] and the previous Section: σ² = 70.8, v = 575.8, K = 130 and μ = 30, and calculated the output variance both from Equation 6.23 and through simulation. The buffer size K is taken to be 130 and μ = 30, so K̂ = 100. From the results shown in Table 6.2 we can see that Equation 6.23 is very close to the simulated values.


Figure 6.1: Simulation Topology. LAN Routers 1 to 10, with infinite sized buffers, feed packets into a single finite OBS ingress buffer, which emits bursts.

6.4 Networks of Queues: Simulations

As mentioned above, accurate measurements of σ², m and v may not be available at the ingress node but further upstream within the access network. In this Section, we present a simple network topology and compare the theoretical results calculated using the introduced approximation formulae with long-sample simulation runs of input SRD Gaussian processes generated from first order auto-regressive (AR) time series. The chosen topology, shown in Figure 6.1, models the aggregation of many large buffered packet-based routers, modelled by infinite buffers, into a single limited-size ingress router buffer. The same input parameters were used for each packet-based router. We assume in the following analysis and simulation that the buffer in each packet-based router is sufficiently dimensioned such that the probability of overflow is negligible and therefore the infinite buffer approximation is appropriate. We include two comparisons of particular interest: the effect of ingress node buffer size and the effect of service rate on loss probability at the OBS ingress node buffer. The other free parameters, m, σ² and v, were set to typical values (described in detail below) and the upstream packet routers' buffer sizes set to infinity. It was found through extensive exploration of the parameter space that the modification of these free parameters had little effect on the general shape of the graphs; however, care was taken not to overload the input queues so as to preserve the Gaussian nature of the traffic. As the utilization of a queue increases, the output process becomes deterministic in a manner which is not correctly modelled by a Gaussian process and the approximations used to derive the formulae become invalid. This weakness of the Gaussian model is mitigated when many similar traffic streams are aggregated because the resulting process is then closer to Gaussian. However, decreasing the number of input nodes at the second stage reduces the Gaussian nature of the aggregated flow. Thus, we cannot expect the formulae to be accurate when the first stage of queueing is heavily loaded and the number of aggregated streams at the second stage is low. The output variance of the packet routers was calculated using Equation 6.23. This variance was multiplied by ten and used as the input parameter for the OBS ingress node buffer. The other input parameters to the packet routers are unmodified by infinite buffer queuing and therefore, after multiplying by ten, were used with the new variance to calculate the loss rate at the OBS ingress node buffer from the coupled Equations 6.16 and 6.17. In packet-based routers, all packets are queued in the same buffer regardless of their final destination. It is important to note that both simulations required one hundred million loops and twenty independent runs to achieve average errors of less than one in ten million. Due to this complexity, each simulation run took over three hours on a 2.4 GHz Intel Pentium IV processor. In contrast, calculating estimate values from the above formulae took less than one second to compute on an identically configured machine.
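The two-stage calculation just described can be summarised as the short sketch below. Both helper callables, output_variance and loss_probability, are hypothetical stand-ins for implementations of Equation 6.23 and of the coupled Equations 6.16-6.17 respectively; only the parameter bookkeeping (the output variance replaces the input variance, the mean and AVR are unchanged, and all three parameters are summed over the ten routers) is taken from the text above.

```python
def ingress_loss_two_stage(router_params, n_routers, ingress_service, ingress_buffer,
                           output_variance, loss_probability):
    """Sketch of the network-of-queues calculation used in this Section.
    router_params = (mean, variance, AVR, service rate) for one upstream router."""
    lam, sigma2, v, service = router_params
    sigma2_out = output_variance(lam, sigma2, v, service)   # Equation 6.23 (hypothetical impl.)
    # Mean and asymptotic variance rate pass through an infinite-buffer queue unchanged,
    # and all three parameters are additive over the independent upstream streams.
    agg_lam, agg_sigma2, agg_v = n_routers * lam, n_routers * sigma2_out, n_routers * v
    return loss_probability(agg_lam, agg_sigma2, agg_v, ingress_service, ingress_buffer)
```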


Figure 6.2: Loss vs. Buffer Size. For each upstream router, m = 2.5, σ² = 10, v = 100 and service rate 5. For the ingress node, the service rate is 50.

6.4.1 Ingress Node Buffer Size

Being able to dimension ingress node buffer sizes is a key requirement for OBS network designers and vendors. To check the accuracy of the above formulae, we compared results from the formulae with the first order AR model introduced in Section 6.3.3, using the topology shown in Figure 6.1. For each upstream router, m = 2.5, σ² = 10, v = 100 and the service rate is 5. For the ingress node, the service rate is 50 and the buffer size, K, was varied from 70 to 90. The results are shown in Figure 6.2. In this case, the simulation result closely matched the theory, giving only slightly larger buffers for a given desired loss rate. For example, for a loss probability of 10^-4, theory predicted that a buffer size of 77 was required, while simulations found that a size of 80 was required, a difference of less than 4%.


Figure 6.3: Loss vs. Service Rate. For each upstream router, m = 4, σ² = 10, v = 100 and service rate 10. For the ingress node, K = 200.

6.4.2 Effect of Service Rate

As mentioned above, due to the large number of buffers within an OBS node, the size of each burst can be restricted in order to share the output link resources more equally. Using the notation introduced in this chapter, this restriction is equivalent to modifying the service rate. In a similar way to the previous section, we compared results from the above formulae with the first order AR model introduced in Section 6.3.3, using the topology shown in Figure 6.1. For each upstream router, m = 4, σ² = 10, v = 100 and the service rate is 10. For the ingress node, K = 200 and the service rate was varied from 50 to 64. The results are shown in Figure 6.3. In this case, the simulation result also closely matched the theory. For low values of the service rate, and correspondingly high loss probability, the theory slightly underestimated the simulated loss, while for high service rate values the loss probability was overestimated by theory. The largest difference was for 5 x 10^-6, where the corresponding service rates differed by less than 4%.

6.5 Conclusion

Usage of the output link resources can be shared more fairly amongst the competing buffers by restricting the size of each burst. Viewed from a network provider's perspective, limiting buffer access to the output links enables fine-grained control over network usage, especially when the corresponding ingress and egress nodes connect into another provider's network and the path carries only transit traffic. In this case, the ability to dimension ingress node buffers and offered service rates is required. In this Chapter, we showed that due to high levels of multiplexing, the number of packets arriving at an ingress buffer over a fixed period of time tends to a Gaussian distribution. With this in mind, we developed methods to approximate packet loss given Gaussian traffic and restricted buffer service rates. These approximations were used to dimension appropriately sized buffers in OBS ingress nodes accurately and quickly in comparison to lengthy simulations.


Chapter 7 Conclusion
OBS has been proposed as a future high-speed switching technology for all-optical networks that may be able to efficiently utilize extremely high capacity links without the need for data buffering or optical-electronic conversions at intermediate nodes. Packets arriving at an OBS ingress node that are destined for the same egress OBS node and belong to the same Quality of Service (QoS) class are aggregated and sent in bursts. At intermediate nodes, the data within the optical signal is transparently switched to the next node according to forwarding information contained within a control packet preceding the burst. At the egress node, the burst is subsequently de-aggregated and forwarded electronically. Unlike classical circuit switching, contention between bursts may cause loss within the network. The main problem of OBS is that this loss is quite high, even for moderate input loads. By measuring, managing and reducing loss, the novel techniques introduced in this thesis overcome serious hurdles in current Optical Burst Switching proposals, helping to bring OBS towards feasibility and eventual deployment. The following Sections highlight the major results obtained in each part of the dissertation.


7.1 Burst Assembly

We showed that timer algorithms may induce extremely high and deterministic levels of burst loss due to timing synchronicity. We presented a new dynamic algorithm, Off Timer Burst Assembly (OTBA), that links burst size and burst injection times to input traffic intensity, introducing the required randomness in the network to greatly reduce burst loss due to timing synchronicity.

7.2 TCP

We combined several source rate TCP models with an OBS loss model to find fixed-point input loads and loss rates for TCP over OBS where individual TCP sources have at most one packet in each burst. It was found that TCP can utilize OBS networks at comparable levels to current networks, given a large enough number of output wavelengths.

7.3 Deflection Routing

We developed a novel algorithm, Shortest Path Prioritized Random Deflection Routing (SP-PRDR), that combines wavelength conversion, deflection routing and preemption contention resolution schemes to reduce burst loss rates in OBS networks significantly, yet maintains stability, even at high loads.

7.4 Ingress Buffer Dimensioning

Limiting buffer access to the output links is equivalent to establishing a virtual circuit from that buffer to its common destination, enabling fine-grained control over network usage. We showed that due to high levels of multiplexing, the number of packets arriving at an ingress buffer over a fixed period of time tends to a Gaussian distribution, and developed methods to approximate packet loss given this Gaussian traffic and restricted buffer service rates. These approximations were used to dimension appropriately sized buffers in OBS ingress nodes.

7.5 Further Work

There is a significant amount of work to be done to ensure that Optical Burst Switching becomes a feasible option for future high-speed networks. Protocols, simulations and analyses introduced in this thesis serve as stepping stones towards this goal, solving some problems and introducing new areas of study within the field of OBS. With any new technology, extensive testing is required before wide-spread deployment. Within the OBS paradigm, the testing of new protocols under realistic network conditions is an extremely challenging problem. Most importantly, the question of what constitutes realistic network conditions becomes intractable when considering future networks. Nevertheless, conducting comprehensive simulations under various conditions greatly adds to the impact of any theoretical modeling and calculation. Simulations can be extremely difficult, especially with something as complex as a large, high-speed communication network such as the Internet [69]. Using the simulations presented in this thesis as a base, further work in increasing the simulation complexity may yield a deeper understanding of OBS dynamics. One such study to increase the complexity of the presented simulations is currently underway in collaboration with the Information and Communications University, Daejeon, South Korea. In this project, we aim to incorporate an OBS network model within the ns-2 simulator (http://www.isi.edu/nsnam/ns/) to observe the dynamics of real TCP applications over an OBS network. Further work in this area must also focus on the financial incentives, or disincentives, to deploy OBS networks. Optical technology is currently technologically immature, making this task even more difficult. Ultimately, even if OBS networks become technically feasible, this will not guarantee deployment. In order to have significant impact, solutions must offer a lower cost/performance ratio than other competing technologies. Calculating accurate estimates of the financial cost of ideas presented in this thesis, and indeed of all OBS research ideas, would provide a useful filter, removing costly impractical schemes and highlighting cost-effective schemes for further development. Optical Burst Switching has become a very active research field in the last five years, yielding hundreds of papers and a diverse range of solutions. The novel ideas in this thesis add to this large, and growing, body of knowledge and, by providing many opportunities for further work, should continue to bring OBS towards feasibility.


Bibliography
[1] SONET transport systems: Common generic criteria, Telcordia Technologies GR-253-CORE Issue 2, Revision 2, 1999. [2] ITU-T recommendation G.807/Y.1302. requirements for automatic transport networks (ASTN), July 2001. [3] ITU-T recommendation G.8080/Y.1304. architecture for the automatically switched optical networks, amendement 1, Mar. 2003. [4] J. Abate, G. L. Choudhury, and W. Whitt, Asymptotics for steady-state tail probabilities in structured markov queuing models, Stochastic Models, vol. 10, pp. 99143, 1994. [5] , Exponential approximations for tail probabilities in queues, I: Waiting times, Operations Research, vol. 43, no. 5, pp. 885901, 1995. [6] R. Addie, C. Cameron, C. Foh, and M. Zukerman, Conditional independence in gaussian queues, submitted January 2003 to IEEE/ACM Transactions on Networking, currently in review. [7] R. Addie, T. Neame, and M. Zukerman, Performance evaluation of a queue fed by a Poisson Pareto burst process, Computer Networks, vol. 40, no. 3, pp. 377397, Oct. 2002. [8] R. Addie and M. Zukerman, An approximation for performance evaluation of stationary single server queues, IEEE/ ACM Transactions on Communications, vol. 42, no. 12, pp. 31503160, Dec. 1994. [9] , Performance evaluation of a single server AutoRegressive queue, Australian Telecommunication Research, vol. 28, no. 1, 1994.


[10] , Queueing performance of a tree type ATM network, in Proc. IEEE INFOCOM, Toronto, Canada, June 1994. [11] R. Addie, M. Zukerman, and T. Neame, Broadband trac modeling: Simple solutions to hard problems, IEEE Communications Magazine, vol. 36, no. 8, pp. 8895, Aug. 1998. [12] H. Akimaru and K. Kawashima, Teletrac: Theory and Application, 2nd ed. Springer-Verlag Telos, 1999. [13] M. Allman, V. Paxson, and W. Stevens, TCP congestion control, RFC 2581, Apr. 1999. [14] E. Arjas, On a fundamental identity in the theory of semiMarkov processes, Advances in Applied Probability, vol. 4, pp. 258270, 1972. [15] E. Arjas and T. Speed, Symmetric Wiener-Hopf factorizations in Markov-additive processes, Z. Wahrscheinlichkeitstheorie und Verwandt Gebeite, vol. 26, pp. 105118, 1973. [16] D. Awduche, A. Chiu, A. Elwalid, I. Widjaja, and X. Xiao, Overview and principles of Internet trac engineering, RFC 3272, May 2002. [17] D. Awduche and Y. Rekhter, Multi-protocol lambda switching: Combining MPLS trac engineering control with optical crossconnects, IEEE Communications Magazine, vol. 39, no. 3, pp. 111116, Mar. 2001. [18] M. Baresi, S. Bregni, A. Pattavina, and G. Vegetti, Deection routing eectiveness in full-optical IP packet switching networks, in Proc. IEEE International Conference on Communications (ICC), Anchorage, USA, May 2003, pp. 1360 1364. [19] J. Bellardo and S. Savage, Measuring packet reordering, in Proc. ACM Internet Measurement Workshop, Marseille, France, Nov. 2002, pp. 97105. [20] J. Bennett, C. Partridge, and N. Shectman, Packet reordering is not pathological network behavior, IEEE/ACM Transactions on Networking, vol. 7, no. 6, pp. 789798, Dec. 1999.


[21] G. M. Bernstein, J. Yates, and D. Saha, IP/MPLS-centric control and management of optical transport networks, IEEE Communications Magazine, vol. 38, no. 10, pp. 161 167, Oct. 2000. [22] D. Bertsekas and R. Gallager, Data Networks, 2nd ed. Prentice Hall, 1992. [23] S. Bhattacharyya, C. Diot, J. Jetcheva, and N. Taft, POPlevel and Access-Link-Level trac dynamics in a tier-1 POP, in ACM SIGCOMM Internet Measurement Workshop, San Francisco, USA, Nov. 2001. [24] , Geographical and temporal characteristics of interPOP ows: View from a single POP, European Transactions on Telecommunications, vol. 13, no. 1, pp. 522, Feb. 2002. [25] S. Bigo, Y. Frignac, G. Charlet, W. Idler, S. Borne, H. Gross, R. Dischler, W. Poehlmann, P. Tran, C. Simonneau, D. Bayart, G. Veith, A. Jourdan, and J.-P. Hamaide, 10.2 Tbit/s (256x42.7 Gbit/s PDM/WDM) transmission over 100 km Teralight ber with 1.28 bit/s/Hz spectral eciency, in Proc. Optical Fiber Communications Conference (OFC), Anaheim, USA, vol. 4, Mar. 2001. [26] D. Bishop, C. Giles, and G. Austin, The Lucent LambdaRouter: MEMS technology of the future here today, IEEE Communications Magazine, vol. 40, no. 3, pp. 7579, Mar. 2002. [27] U. Black, Optical Networks. Third Generation Transport Systems. Prentice Hall, 2002. [28] E. Blanton and M. Allman, On making TCP more robust to packet reordering, ACM SIGCOMM Computer Communications Review, vol. 32, Jan. 2002. [29] J. Bolliger, T. Gross, and U. Hengartner, Bandwidth modeling for network-aware applications, in Proc. IEEE INFOCOMM, New York City, USA, Mar. 1999. [30] N. Brownlee and K. Clay, Understanding internet trac streams: Dragonies and tortoises, IEEE Communications, vol. 40, no. 10, pp. 110117, Oct. 2002.


[31] C. Cameron, J. Choi, S. Bilgrami, H. L. Vu, M. Zukerman, and M. Kang, Fixed-point performance analysis of TCP over optical burst switched networks, in Proc. ATNAC, Sydney Australia, Dec. 2004. [32] , Fixed-point performance analysis of TCP over optical burst switched networks, OSA Optics Express, vol. 13, no. 23, pp. 91679174, 2005. [33] C. Cameron, S. Low, and D. Wei, High density model for server allocation and placement, in ACM SIGMETRICS, Marina Del Rey, 2002. [34] C. Cameron, J. White, and M. Zukerman, O-timer burst assembly in OBS networks, in Proc. ATNAC, Sydney Australia, Dec. 2004. [35] C. Cameron, A. Zalesky, and M. Zukerman, Optical burst switching - prioritized deections, in Proc. International Conference on the Optical Internet (COIN), Yokohama, Japan, July 2004. [36] , Shortest path prioritized random deection routing in optical burst switched networks, in ICST International Workshop on Optical Burst Switching (WOBS), San Jose, USA, Oct. 2004. [37] , Prioritized deection routing in optical burst switching networks, IEICE Transactions, May 2005. [38] X. Cao, J. Li, Y. Chen, and C. Qiao, Assembling TCP/IP packets in optical burst switched networks, in Proc. IEEE GLOBECOM, Taipei, Taiwan, vol. 3, Nov. 2002, pp. 2808 2812. [39] D. Cavendish, Evolution of optical transport technologies: From SONET/SDH to WDM, IEEE Communications Magazine, pp. 164172, June 2000. [40] C. S. Chang, Stability, queue length and delay of deterministic and stochastic queueing networks, IEEE Transactions on Automatic Control, vol. 39, pp. 913931, 1994.


[41] Y. Chen, C. Qiao, and X. Yu, Optical burst switching: A new area in optical networking research, IEEE Network, pp. 1623, May 2004. [42] Y. Chen, H. Wu, D. Xu, and C. Qiao, Performance analysis of optical burst switched node with deection routing, in IEEE International Conference on Communications (ICC), vol. 2, May 2003, pp. 13551359. [43] I. Chlamtac, A. Fumagalli, and C.-J. Suh, Multibuer delay line architectures for ecient contention resolution in optical switching nodes, IEEE Transactions on Communications, vol. 48, no. 12, pp. 20892098, Dec. 2000. [44] I. Chlamtac, A. Ganz, and G. Karmi, Lightpath communications: An approach to high bandwidth optical WANs, IEEE Transactions on Communication, vol. 40, pp. 11711182, July 1992. [45] J. Choe and N. B. Shro, A new method to determine the queue length distribution at an ATM multiplexer, in Proc. IEEE INFOCOM, Kobe, Japan, Apr. 1997. [46] , A central-limit-theorem-based approach for analyzing queue behavior in high-speed networks, IEEE/ACM Transactions on Networking, vol. 6, no. 5, pp. 659671, Oct. 1998. [47] , Queueing analysis of high-speed multiplexers including long-range dependent arrival processes, in Proc. IEEE INFOCOM, New York City, USA, Mar. 1999. [48] J. Choi, H. L. Vu, C. Cameron, M. Zukerman, and M. Kang, The eect of burst assembly on performance of optical burst switched networks, Feb. 2004. [49] , The eect of burst assembly on performance of optical burst switched networks, Lecture Notes in Computer Science, vol. 3090, pp. 729739, 2004. [50] A. K. Choudhury and V. O. K. Li, An approximate analysis of the performance of deection routing in regular networks, IEEE Journal on Selected Areas in Communications, vol. 11, no. 8, pp. 13021316, Oct. 1993.


[51] G. L. Choudhury, D. M. Lucantoni, and W. Whitt, Squeezing the most out of ATM, IEEE/ACM Transactions on Communications, vol. 44, no. 2, pp. 203217, Feb. 1996. [52] M. Christiansen, K. Jeay, D. Ott, and F. D. Smith, Tuning RED for web trac, IEEE/ACM Transactions on Networking, vol. 9, no. 3, pp. 249264, June 2001. [53] T. Cinkler, S. Bjrnstad, D. Careglio, D. Colle, C. Gauger, M. Karasek, A. Kuchar, S. de Maesschalck, F. Matera, C. Mauz, and M. Settembre, On the future of the optical infrastructure - COST 266 views, in Proc. International Conference on Transparent Optical Networks (ICTON), Warsaw, Poland, Apr. 2002, pp. 8792. [54] K. Clay, G. Miller, and K. Thompson, The nature of the beast: Recent trac measurements from an Internet backbone, 1998. [Online]. Available: http: //www.caida.org/outreach/papers/1998/Inet98/Inet98.html [55] M. E. Crovella and A. Bestavros, Self-similarity in World Wide Web trac: Evidence and possible causes, IEEE/ ACM Transactions on Networking, vol. 5, no. 6, pp. 835846, Dec. 1997. [56] D. J. Daley, Queueing output processes, Advances in Applied Probability, vol. 8, pp. 395415, 1976. [57] DARPA, Internet protocol, RFC 791, Sept. 1981. [58] A. Detti and M. Listanti, Impact of segments aggregation on TCP Reno ows in optical burst switching networks, in Proc. IEEE INFOCOM, New York City, USA, 2002, pp. 18031812. [59] K. Dolzer, C. Gauger, J. Spaeth, and S. Bodamer, Evaluation of reservation mechanisms for optical burst switching, International Journal of Electronics and Communications, vol. 55, no. 1, pp. 1826, 2001. [60] N. G. Dueld and N. OConnell, Large deviations and overow probabilities for the general single-server queue, with applications, Proc. Cambridge Philisophical Society, vol. 118, pp. 363374, 1995.


[61] T. Durhuus, B. Mikkelsen, C. Joergensen, S. L. Danielsen, and K. E. Stubkjaer, All-optical wavelength conversion by semiconductor optical ampliers, IEEE Journal of Lightwave Technology, vol. 14, p. 942954, June 1996. [62] T. S. El-Bawab and J.-D. Shin, Optical packet switching in core networks: Between vision and reality, IEEE Communications Magazine, vol. 40, no. 9, pp. 6065, Sept. 2002. [63] A. Elwalid and D. Mitra, Eective bandwidth of general markovian trac sources and admission control of high speed networks, IEEE/ACM Transactions on Networking, vol. 1, no. 3, pp. 329343, June 1993. [64] K. Fall and S. Floyd, Simulation-based comparisons of Tahoe, Reno, and SACK TCP, Computer Communication Review, vol. 26, no. 3, pp. 521, July 1996. [65] A. Feldmann, A. Gilbert, and W. Willinger, Data networks as cascades: explaining the multifractal nature of internet WAN trac, in Proc. ACM SIGCOMM, Vancouver, Canada, 1998, pp. 4255. [66] A. Feldmann, A. Gilbert, W. Willinger, and T. Kurtz, The changing nature of network trac: Scaling phenomena, Computer Communication Review, vol. 28, no. 2, pp. 529, Apr. 1998. [67] S. Floyd and K. Fall, Promoting the use of end-to-end congestion control in the internet, IEEE/ACM Transactions on Networking, vol. 7, no. 4, pp. 458472, Aug. 1999. [68] S. Floyd, T. Henderson, and A. Gurtov, The NewReno modication to TCPs fast recovery algorithm, RFC 3782, Apr. 2004. [69] S. Floyd and V. Paxson, Diculties in simulating the Internet, IEEE/ACM Transactions on Networking, vol. 9, no. 4, pp. 392403, Aug. 2001. [70] M. Fomenkov, K. Keys, D. Moore, and K. Clay, Longitudinal study of Internet trac in 1998-2003. [Online]. Available: http://www.caida.org/outreach/papers/ 2003/nlanr/nlanr overview.pdf


[71] F. Forghieri, A. Bononi, and P. R. Prucnal, Analysis and comparison of hot-potato and single-buffer deflection routing in very high bit rate optical mesh networks, IEEE Transactions on Communications, vol. 43, no. 1, pp. 88-98, Jan. 1995.
[72] J. E. Fouquet, S. Venkatesh, M. Troll, D. Chen, H. F. Wong, and P. W. Barth, A compact, scalable cross-connect switch using total internal reflection due to thermally-generated bubbles, in Proc. Lasers and Electro-Optics Society Annual Meeting, Florida, USA, Dec. 1998, pp. 169-170.
[73] H. Fowler and W. Leland, Local area network traffic characteristics, with implications for broadband network congestion management, IEEE Journal on Selected Areas in Communications, vol. 9, no. 7, pp. 1139-1149, Sept. 1991.
[74] C. Fraleigh, S. Moon, B. Lyles, C. Cotton, M. Khan, D. Moll, R. Rockell, T. Seely, and C. Diot, Packet-level traffic measurements from the Sprint IP backbone, IEEE Network, vol. 17, no. 6, pp. 6-16, Nov. 2003.
[75] C. Fraleigh, F. Tobagi, and C. Diot, Provisioning IP backbone networks to support latency sensitive traffic, in Proc. IEEE INFOCOM, San Francisco, USA, Apr. 2003.
[76] M. Fujiwara, N. Shimosaka, M. Nishio, S. Suzuki, S. Yamazaki, S. Murata, and K. Kaede, A coherent photonic wavelength-division switching system for broad-band networks, IEEE Journal of Lightwave Technology, vol. 8, no. 3, pp. 416-422, Mar. 1990.
[77] K. Fukuchi, T. Kasamatsu, M. Morie, R. Ohhira, T. Ito, K. Sekiya, D. Ogasahara, and T. Ono, 10.92 Tb/s (273 × 40 Gb/s) triple-band/ultra-dense WDM optical-repeatered transmission experiment, in Proc. Optical Fiber Communications Conference (OFC), Anaheim, USA, vol. 4, Mar. 2001.
[78] R. Gareiss, Is the internet in trouble?, Data Communications Magazine, Sept. 1997.
[79] M. Garrett and W. Willinger, Analysis, modeling and generation of self-similar VBR video traffic, in Proc. ACM SIGCOMM, London, UK, Sept. 1994, pp. 269-280.
[80] A. Ge, F. Callegati, and L. S. Tamil, On optical burst switching and self-similar traffic, IEEE Communications Letters, vol. 4, no. 3, pp. 98-100, Mar. 2000.
[81] P. Glynn and W. Whitt, Logarithmic asymptotics for steady-state tail probabilities in a single-server queue, Studies in Applied Probability, vol. 17, pp. 107-128, 1993.
[82] W. Goralski, Optical Networking and WDM. Osborne/McGraw-Hill, 2001.
[83] W. Goralski, SONET/SDH, 3rd ed. Osborne/McGraw-Hill, 2002.
[84] G. Goth, Dynamic optical networks' time might (finally) be approaching, IEEE Internet Computing, vol. 8, no. 3, pp. 9-12, May 2004.
[85] S. Gowda, R. K. Shenai, K. M. Sivalingam, and H. C. Cankaya, Performance evaluation of TCP over optical burst-switched (OBS) WDM networks, in Proc. IEEE International Conference on Communications (ICC), Anchorage, USA, vol. 2, May 2003, pp. 1433-1437.
[86] M. Grossglauser and J. C. Bolot, On the relevance of long-range dependence in network traffic, vol. 7, no. 5, pp. 629-640, May 1999.
[87] T. J. Hacker, B. D. Athey, and B. D. Noble, The end-to-end performance effects of parallel TCP sockets on a lossy wide-area network, in Proc. International Parallel and Distributed Processing Symposium (IPDPS), Fort Lauderdale, USA, Apr. 2002.
[88] T. J. Hacker, B. D. Noble, and B. D. Athey, The effects of systemic packet loss on aggregate TCP flows, in Proc. ACM/IEEE International Conference on Supercomputing, Baltimore, USA, Nov. 2002, pp. 1-15.
[89] J. He and S.-H. G. Chan, TCP and UDP performance for Internet over optical packet-switched networks, in Proc. IEEE International Conference on Communications (ICC), Anchorage, USA, vol. 2, May 2003, pp. 1350-1354.
[90] J. He and S.-H. G. Chan, TCP and UDP performance for Internet over optical packet-switched networks, Computer Networks, vol. 45, pp. 505-521, Apr. 2004.
[91] D. Heyman and T. Lakshman, What are the implications of long-range dependence for VBR-video traffic engineering, IEEE/ACM Transactions on Networking, vol. 4, no. 3, pp. 301-317, June 1996.
[92] G. Hjalmtysson, J. Yates, S. Chaudhuri, and A. Greenberg, Smart routers - simple optics: an architecture for the optical Internet, IEEE Journal of Lightwave Technology, vol. 18, pp. 1880-1891, Dec. 2000.
[93] M. G. Hluchyj and M. Karol, Queuing in high-performance packet switching, IEEE Journal on Selected Areas in Communications, vol. 6, pp. 1587-1597, Dec. 1988.
[94] M. G. Hluchyj and M. Karol, Shufflenet: An application of generalized perfect shuffles to multihop lightwave networks, in Proc. IEEE INFOCOM, New Orleans, USA, Mar. 1988, pp. 379-390.
[95] J. Hoe, Improving the start-up behavior of a congestion control scheme for TCP, in Proc. ACM SIGCOMM, Stanford, USA, Aug. 1996.
[96] C. Hsu, T. Liu, and N. Huang, Performance analysis of deflection routing in optical burst-switched networks, in Proc. IEEE INFOCOM, New York City, USA, June 2002, pp. 66-73.
[97] D. Z. Hsu, S. L. Lee, P. M. Gong, Y. M. Lin, S. S. W. Lee, and M. C. Yuang, High-efficiency wide-band SOA-based wavelength converters by using dual-pumped four-wave mixing and an assist beam, IEEE Photonics Technology Letters, vol. 16, no. 8, pp. 1903-1905, Aug. 2004.
[98] G. Iannaccone, S. Jaiswal, and C. Diot, Packet reordering inside the Sprint backbone, Sprintlabs, Tech. Rep. TR01-ATL-062917, June 2001.
[99] S. Iyer, S. Bhattacharyya, N. Taft, N. McKeown, and C. Diot, An approach to alleviate link overhead as observed on an IP backbone, in Proc. IEEE INFOCOM, San Francisco, USA, Apr. 2003.
[100] M. Izal and J. Aracil, On the influence of self-similarity on optical burst switching traffic, in Proc. IEEE GLOBECOM, Taipei, Taiwan, vol. 3, Nov. 2002, pp. 2308-2312.
[101] V. Jacobson, Congestion avoidance and control, Computer Communication Review, vol. 18, no. 4, pp. 314-329, Aug. 1988.
[102] V. Jacobson, Compressing TCP/IP headers for low-speed serial links, RFC 1144, Feb. 1990.
[103] V. Jacobson, Modified TCP congestion avoidance algorithm, end2end-interest mailing list, Apr. 1990. [Online]. Available: ftp://ftp.isi.edu/end2end/end2end-interest-1990.mail
[104] R. Jain and S. Dharanikota, Internet protocol over DWDM - recent developments, trends and issues, Global Optical Communications - Business Briefing, July 2001.
[105] R. Jain and K. K. Ramakrishnan, Congestion avoidance in computer networks with a connectionless network layer: Concepts, goals and methodology, in Proc. IEEE Computer Networks Symposium, Washington D.C., Apr. 1988, pp. 134-143.
[106] R. Jain, K. K. Ramakrishnan, and D. Chiu, Congestion avoidance in computer networks with a connectionless network layer, DEC Technical Report TR-506, 1988.
[107] M. Jones, R. Butler, and W. Szeto, Sprint long distance network survivability: today and tomorrow, IEEE Communications Magazine, vol. 37, no. 8, pp. 58-62, Aug. 1999.
[108] E. Karasan and E. Ayanoglu, Performance of WDM transport networks, IEEE Journal on Selected Areas in Communications, vol. 16, no. 7, Sept. 1998.
[109] F. P. Kelly, Reversibility and Stochastic Networks. John Wiley & Sons, 1979.
[110] G. Kesidis, J. Walrand, and C. S. Chang, Effective bandwidth for multiclass fluids and other ATM sources, IEEE/ACM Transactions on Networking, vol. 1, no. 4, pp. 424-428, Aug. 1993.
[111] A. Khanna and J. Zinky, The revised ARPANET routing metric, in Proc. ACM SIGCOMM, Austin, USA, Sept. 1989, pp. 45-56.
[112] S. Kim, N. Kim, and M. Kang, Contention resolution for optical burst switching networks using alternative routing, in Proc. IEEE International Conference on Communications (ICC), New York City, USA, vol. 5, Apr. 2002, pp. 2678-2681.
[113] L. Kleinrock, Queueing Systems, Volume I: Theory. John Wiley & Sons, 1975.
[114] T. V. Lakshman and U. Madhow, The performance of TCP/IP for networks with high bandwidth-delay products and random loss, IEEE/ACM Transactions on Networking, vol. 6, no. 3, pp. 336-350, June 1997.
[115] J. Lamperti, Stochastic Processes - a Survey of the Mathematical Theory. Springer-Verlag, 1976.
[116] K. Lee and T. Aprille, SONET evolution: the challenges ahead, in Proc. IEEE GLOBECOM, Phoenix, USA, vol. 2, Dec. 1991, pp. 736-740.
[117] S. K. Lee, H. S. Kim, J. S. Song, and D. Griffith, A study on deflection routing in optical burst-switched networks, Photonic Network Communications, vol. 6, no. 1, pp. 51-59, 2003.
[118] S. K. Lee, K. Sriram, H. S. Kim, and J. S. Song, Contention-based limited deflection routing in OBS networks, in Proc. IEEE GLOBECOM, San Francisco, USA, Dec. 2003.
[119] S. S. W. Lee, M. C. Yuang, P. L. Tien, and S. H. Lin, A Lagrangean relaxation-based approach for routing and wavelength assignment in multigranularity optical WDM networks, IEEE Journal on Selected Areas in Communications, vol. 22, no. 9, pp. 1741-1751, Nov. 2004.
[120] W. Leland, M. Taqqu, W. Willinger, and D. Wilson, On the self-similar nature of Ethernet traffic (extended version), IEEE/ACM Transactions on Networking, vol. 2, no. 1, pp. 1-15, Feb. 1994.
[121] N. Likhanov, B. Tsybakov, and N. D. Georganas, Analysis of an ATM buffer with self-similar input traffic, in Proc. IEEE INFOCOM, Boston, USA, Apr. 1995.
[122] N. X. Liu and J. S. Baras, Statistical modeling and performance analysis of multi-scale traffic, in Proc. IEEE INFOCOM, San Francisco, Apr. 2003.
[123] J. Mahdavi and S. Floyd, TCP-friendly unicast rate-based flow control, technical note sent to the end2end-interest mailing list, Jan. 8, 1997.
[124] M. Mathis and J. Mahdavi, Forward acknowledgement: Refining TCP congestion control, in Proc. ACM SIGCOMM, Stanford, USA, Aug. 1996.
[125] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow, TCP selective acknowledgement options, RFC 2018, Oct. 1996.
[126] M. Mathis, J. Semke, and J. Mahdavi, The macroscopic behaviour of the TCP congestion avoidance algorithm, ACM SIGCOMM Computer Communication Review, vol. 27, no. 3, July 1997.
[127] N. Maxemchuk, Regular mesh topologies in local and metropolitan area networks, AT&T Technical Journal, vol. 65, pp. 1659-1685, Sept. 1985.
[128] M. May, J. Bolot, C. Diot, and B. Lyles, Reasons not to deploy RED, in Proc. IEEE International Workshop on Quality of Service (IWQoS), London, UK, June 1999.
[129] H. D. Miller, Absorption probabilities for sums of random variables defined on a finite Markov chain, in Proc. Cambridge Philosophical Society, vol. 58, 1962, pp. 286-298.
[130] A. Mokhtar and M. Azizoglu, Adaptive wavelength routing in all-optical networks, IEEE/ACM Transactions on Networking, vol. 6, pp. 197-206, Apr. 1998.
[131] M. Montgomery and G. deVeciana, On the relevance of time scales in performance oriented traffic characterizations, in Proc. IEEE INFOCOM, San Francisco, USA, Mar. 1996.
[132] J. Moy, OSPF version 2, RFC 2328, Apr. 1998.
[133] N. Nagatsu, S. Okamoto, and K. Sato, Optical path cross-connect system scale evaluation using path accommodation design for restricted wavelength multiplexing, IEEE Journal on Selected Areas in Communications, vol. 14, no. 5, pp. 893-902, 1996.
[134] J. Nagle, Congestion control in TCP/IP internetworks, RFC 896, Jan. 1984.
[135] A. L. Neidhardt and J. L. Wang, The concept of relevant time scales and its application to queuing analysis of self-similar traffic (or is Hurst naughty or nice?), in Proc. ACM SIGMETRICS, Madison, USA, June 1998, pp. 222-232.
[136] I. Norros, On the use of fractional Brownian motion in the theory of connectionless networks, IEEE Journal on Selected Areas in Communications, vol. 13, no. 6, pp. 953-962, Aug. 1995.
[137] A. M. Odlyzko, Internet traffic growth: Sources and implications, in Proc. SPIE Optical Transmission Systems and Equipment for WDM Networking II, vol. 5247, 2003, pp. 1-15.
[138] M. J. O'Mahony, D. Simeonidou, D. K. Hunter, and A. Tzanakaki, The application of optical packet switching in future communication networks, IEEE Communications Magazine, vol. 39, no. 3, pp. 128-135, Mar. 2001.
[139] T. Ott, J. Kemperman, and M. Mathis, The stationary distribution of ideal TCP congestion avoidance. [Online]. Available: http://citeseer.ist.psu.edu/ott96stationary.html
[140] J. Padhye, V. Firoiu, and D. F. Towsley, A stochastic model of TCP Reno congestion avoidance and control, Tech. Rep. UMASS-CS-TR-1999-02, 1999.
[141] J. Padhye, V. Firoiu, D. F. Towsley, and J. F. Kurose, Modeling TCP Reno performance: A simple model and its empirical validation, IEEE/ACM Transactions on Networking, vol. 8, no. 2, pp. 133-145, Apr. 2000.
[142] J. Pang, Q. Zou, Z. Tan, X. Qian, L. Liu, and Z. Li, A novel single-chip fabrication technique for three-dimensional MEMS structures, in Proc. International Conference on Solid-State and Integrated Circuit Technology (ICSICT), Beijing, China, Oct. 1998, pp. 936-938.
[143] K. Park and W. Willinger, Eds., Self-similar network traffic and performance evaluation. Wiley-Interscience, 2000.
[144] V. Paxson, Automated packet trace analysis of TCP implementations, in Proc. ACM SIGCOMM, Cannes, France, Sept. 1997.
[145] V. Paxson, End-to-end internet packet dynamics, in Proc. ACM SIGCOMM, Sept. 1997.
[146] V. Paxson and M. Allman, Computing TCP's retransmission timer, RFC 2988, Nov. 2000.
[147] V. Paxson, M. Allman, S. Dawson, W. Fenner, J. Griner, I. Heavens, K. Lahey, J. Semke, and B. Volz, Known TCP implementation problems, RFC 2525, Mar. 1999.
[148] V. Paxson and S. Floyd, Wide area traffic: The failure of Poisson modeling, IEEE/ACM Transactions on Networking, vol. 3, no. 3, pp. 226-244, June 1995.
[149] V. Paxson and S. Floyd, Why we don't know how to simulate the internet, in Proc. 1997 Winter Simulation Conference, Atlanta, USA, Dec. 1997, pp. 1037-1044.
[150] K. Pentikousis and H. Badr, Quantifying the deployment of TCP options - a comparative study, IEEE Communications Letters, vol. 8, no. 10, pp. 647-649, Oct. 2004.
[151] J. Postel, Transmission control protocol - DARPA internet program protocol specification, RFC 793, Sept. 1981.
[152] C. Qiao and M. Yoo, Optical burst switching (OBS) - a new paradigm for an optical internet, Journal of High Speed Networks, vol. 8, no. 1, pp. 69-84, 1999.
[153] C. Qiao and M. Yoo, Choices, features and issues in optical burst switching, SPIE Optical Networking Magazine, vol. 1, pp. 36-44, Apr. 2000.
[154] L. Qiu, Y. Zhang, and S. Keshav, On individual and aggregate TCP performance, Cornell CS Tech. Rep. TR99-1744, May 1999.
[155] J. S. Quarterman, Starr day, SunExpert Magazine, pp. 50-52, Nov. 1998.
[156] R. Ramaswami and K. N. Sivarajan, Routing and wavelength assignment in all-optical networks, IEEE/ACM Transactions on Networking, vol. 3, no. 5, pp. 489-500, May 1995.
[157] R. Ramaswami and K. N. Sivarajan, Optical Networks: A Practical Perspective, 2nd ed. Morgan Kaufmann Publishers, 2002.
[158] J. Roberts, U. Mocci, and J. Virtamo, Eds., Broadband Network Teletraffic, Final Report of Action COST 242. Springer, 1996.
[159] L. G. Roberts, Beyond Moore's law: Internet growth trends, Computer, vol. 33, no. 1, pp. 117-119, Jan. 2000.
[160] B. K. Ryu and A. Elwalid, The importance of long-range dependence of VBR video traffic in ATM traffic engineering: myths and realities, in Proc. ACM SIGCOMM, Stanford, USA, Aug. 1996, pp. 3-14.
[161] N. B. Shroff and M. Schwartz, Improved loss calculations at an ATM multiplexer, IEEE/ACM Transactions on Networking, vol. 6, no. 4, pp. 411-421, Aug. 1998.
[162] R. R. Shumaker, Optoelectric tautomeric compositions, US Patent 5237067, Aug. 1993.
[163] H. Sirisena, A. Haider, and K. Pawlikowski, Auto-tuning RED for accurate queue control, in Proc. IEEE GLOBECOM, Taipei, Taiwan, Nov. 2002.
[164] O. B. Spahn, C. Sullivan, J. Burkhart, C. Tigges, and E. Garcia, GaAs-based microelectromechanical waveguide switch, in Proc. IEEE/LEOS International Conference on Optical MEMS (MOEMS), Kauai, USA, Aug. 2000, pp. 41-42.
[165] W. Stevens, The Protocols (TCP/IP Illustrated, Volume 1), 2nd ed. Addison-Wesley Pub Co, 1993.
[166] W. Stevens, TCP slow start, congestion avoidance, fast retransmit, and fast recovery algorithms, RFC 2001, Jan. 1997.
[167] W. J. Stewart, Introduction to the Numerical Solution of Markov Chains. Princeton University Press, 1994.
[168] J. Stone and C. Partridge, When the CRC and TCP checksum disagree, in Proc. ACM SIGCOMM, Stockholm, Sweden, Sept. 2000.
[169] V. Subramanian and R. Barry, Wavelength assignment in fixed routing WDM networks, in Proc. IEEE International Conference on Communications (ICC), Montreal, Canada, Nov. 1997, pp. 406-410.
[170] V. Subramanian and R. Srikant, Statistical multiplexing with priorities: Tail probabilities of queue lengths, workloads and waiting times, in Proc. IEEE Conference on Decision and Control, San Diego, USA, 1997.
[171] Y. Sun, T. Hashiguchi, N. Yoshida, X. Wang, H. Morikawa, and T. Aoyama, Architecture and design issues of an optical burst switched network testbed, in Proc. OECC/COIN, Yokohama, Japan, July 2004, pp. 386-387.
[172] A. Tang, J. Wang, and S. H. Low, Understanding CHOKe: Throughput and spatial characteristics, IEEE/ACM Transactions on Networking, vol. 12, no. 4, pp. 694-707, Aug. 2004.
[173] G. P. V. Thodime, V. M. Vokkarane, and J. P. Jue, Dynamic congestion-based load balanced routing in optical burst-switched networks, in Proc. GLOBECOM, San Francisco, USA, Dec. 2003, pp. 2628-2632.
[174] J. Turner, Terabit burst switching, Journal of High Speed Networks, vol. 8, no. 1, pp. 3-16, Jan. 1999.
[175] S. Verma, H. Chaskar, and R. Ravikanth, Optical burst switching: A viable solution for terabit IP backbone, IEEE Network, vol. 11, no. 6, pp. 48-53, Nov. 2000.
[176] V. Vokkarane, K. Haridoss, and J. Jue, Threshold-based burst assembly policies for QoS support in optical burst-switched networks, in Proc. SPIE OPTICOM, Boston, USA, vol. 4874, July 2002, pp. 125-136.
[177] V. Vokkarane and J. Jue, Prioritized routing and burst segmentation for QoS in optical burst-switched networks, in Proc. Optical Fiber Communication Conference (OFC), Anaheim, USA, Mar. 2002, pp. 221-222.
[178] V. Vokkarane and J. Jue, Prioritized burst segmentation and composite burst assembly techniques for QoS support in optical burst-switched networks, IEEE Journal on Selected Areas in Communications, vol. 21, no. 7, pp. 1198-1209, Sept. 2003.
[179] H. L. Vu and M. Zukerman, Blocking probability for priority classes in optical burst switching networks, IEEE Communications Letters, vol. 6, no. 5, pp. 214-216, May 2002.
[180] X. Wang, H. Morikawa, and T. Aoyama, Burst optical deflection routing protocol for wavelength routing WDM networks, Optical Networks Magazine, vol. 3, no. 6, pp. 12-19, Nov. 2002.
[181] I. White, R. Penty, M. Webster, Y. J. Chai, A. Wonfor, and S. Shahkooh, Wavelength switching components for future photonic networks, IEEE Communications Magazine, vol. 40, no. 9, pp. 74-81, Sept. 2002.
[182] I. Widjaja, Performance analysis of burst admission-control protocols, IEE Proc. Communications, vol. 142, pp. 7-14, Feb. 1995.
[183] W. Willinger and J. Doyle, Robustness and the internet: Design and evolution, Mar. 2002. [Online]. Available: http://netlab.caltech.edu/FAST/papers
[184] A. Willner and W. Shieh, Optimal spectral and power parameters for all-optical wavelength shifting: Single stage, fanout and cascadability, IEEE Journal of Lightwave Technology, vol. 13, pp. 771-781, May 1995.
[185] G. Wright and W. Stevens, The Implementation (TCP/IP Illustrated, Volume 2), 1st ed. Addison-Wesley Pub Co, 1995.
[186] Y. Xiong, M. Vandenhoute, and H. C. Cankaya, Control architecture in optical burst switched WDM networks, IEEE Journal on Selected Areas in Communications, vol. 18, no. 10, pp. 1838-1851, Oct. 2000.
[187] F. Xue and S. J. B. Yoo, High-capacity multiservice optical label switching for the next-generation Internet, IEEE Communications Magazine, vol. 42, no. 5, pp. S16-S22, May 2004.
[188] S. Yao, S. J. B. Yoo, and B. Mukherjee, A comparison study between slotted and unslotted all-optical packet-switched network with priority-based routing, in Proc. Optical Fiber Communications Conference (OFC), Anaheim, USA, Mar. 2001.
[189] S. Yao, S. Dixit, and B. Mukherjee, Advances in photonic packet switching: An overview, IEEE Communications Magazine, vol. 38, pp. 84-94, Feb. 2000.
[190] S. Yao, B. Mukherjee, S. J. B. Yoo, and S. Dixit, All-optical packet-switched networks: a study of contention-resolution schemes in an irregular mesh network with variable-sized packets, in Proc. SPIE OPTICOM, Dallas, USA, Oct. 2000.
[191] S. Yao, B. Mukherjee, S. J. B. Yoo, and S. Dixit, A unified study of contention-resolution schemes in optical packet-switched networks, IEEE Journal of Lightwave Technology, vol. 21, no. 3, pp. 672-683, Mar. 2003.
[192] S. Yao, F. Xue, B. Mukherjee, S. J. B. Yoo, and S. Dixit, Electrical ingress buffering and traffic aggregation for optical packet switching and their effect on TCP-level performance in optical mesh networks, IEEE Communications Magazine, vol. 40, no. 9, pp. 66-72, Sept. 2002.
[193] S. Yao, S. J. B. Yoo, and B. Mukherjee, All-optical packet switching for metropolitan area networks: Opportunities and challenges, IEEE Communications Magazine, vol. 39, no. 3, pp. 142-148, Mar. 2001.
[194] J. M. Yates, J. Lacey, M. Rumsewicz, and M. A. Summerfield, Performance of networks using wavelength converters based on four-wave mixing in semiconductor optical amplifiers, IEEE Journal of Lightwave Technology, vol. 17, no. 5, pp. 782-791, May 1999.
[195] M. Yoo and C. Qiao, Just-enough-time (JET): a high speed protocol for bursty traffic in optical networks, 1997 Digest of the IEEE/LEOS Summer Topical Meetings, Montreal, Canada, pp. 26-27, Aug. 1997.
[196] M. Yoo, C. Qiao, and S. Dixit, QoS performance of optical burst switching in IP-over-WDM networks, IEEE Journal on Selected Areas in Communications, vol. 18, no. 10, pp. 2062-2071, Oct. 2000.
[197] X. Yu, Y. Chen, and C. Qiao, Performance evaluation of optical burst switching with assembled burst traffic input, in Proc. IEEE GLOBECOM, Taipei, Taiwan, Nov. 2002, pp. 2318-2322.
[198] X. Yu, J. Li, X. Cao, Y. Chen, and C. Qiao, Traffic statistics and performance evaluation in optical burst switched networks, IEEE Journal of Lightwave Technology, vol. 22, no. 12, pp. 2722-2738, Dec. 2004.
[199] X. Yu, C. Qiao, and Y. Liu, TCP implementations and false time out detection in OBS networks, in Proc. IEEE INFOCOM, Hong Kong, China, Mar. 2004.
[200] X. Yu, C. Qiao, Y. Liu, and D. Towsley, Performance evaluation of TCP implementations in OBS networks, Dept. Computer Science and Engineering, State University of New York at Buffalo Technical Report 2003-13, 2003.
[201] H. Yuan, W. D. Zhong, and W. Hu, FBG-based bidirectional optical cross connects for bidirectional WDM ring networks, IEEE Journal of Lightwave Technology, vol. 22, no. 12, pp. 2710-2721, Dec. 2004.
[202] M. C. Yuang, J. Shil, and P. L. Tien, QoS burstification for optical burst switched WDM networks, in Proc. Optical Fiber Communications Conference (OFC), Anaheim, USA, Mar. 2002.
[203] A. Zalesky, H. L. Vu, Z. Rosberg, E. W. M. Wong, and M. Zukerman, Reduced load Erlang fixed point analysis of optical burst switched networks with deflection routing and wavelength reservation, in Proc. First International Workshop on Optical Burst Switching (WOBS), Dallas, USA, Oct. 2003.
[204] H. Zang, J. P. Jue, and B. Mukherjee, A review of routing and wavelength assignment approaches for wavelength-routed optical WDM networks, SPIE Optical Networks Magazine, vol. 1, no. 1, pp. 47-60, Jan. 2000.
[205] Z. Zhang, V. J. Ribeiro, S. Moon, and C. Diot, Small-time scaling behaviours of internet backbone traffic: An empirical study, in Proc. IEEE INFOCOM, San Francisco, USA, 2003.
[206] W. D. Zhong, X. Niu, B. Chen, and A. K. Bose, Performance comparison of overlay and peer models in IP/MPLS over optical networks, Photonic Network Communications, vol. 9, no. 1, pp. 121-131, 2005.
[207] W. D. Zhong and R. S. Tucker, Wavelength routing-based photonic packet buffers and their applications in photonic packet switching systems, IEEE Journal of Lightwave Technology, vol. 16, no. 10, pp. 1737-1745, Oct. 1998.
[208] W. D. Zhong and R. S. Tucker, A new wavelength-routed photonic packet buffer combining traveling delay lines with delay-line loops, IEEE Journal of Lightwave Technology, vol. 19, no. 8, pp. 1085-1092, Aug. 2001.
[209] J. Zhou, N. Park, K. J. Vahala, M. A. Newkirk, and B. I. Miller, Four-wave mixing wavelength conversion efficiency in semiconductor traveling-wave amplifiers measured to 65 nm of wavelength shift, IEEE Photonic Technology Letters, vol. 6, pp. 984-987, Aug. 1994.
[210] M. Zukerman, E. Wong, Z. Rosberg, G. M. Lee, and H. L. Vu, On teletraffic applications to OBS, IEEE Communications Letters, vol. 8, no. 2, pp. 116-118, Feb. 2004.