Transport Layer
Computer Networking: A Top-Down Approach
7th edition
Jim Kurose, Keith Ross
Pearson/Addison Wesley
April 2016
They obviously represent a lot of work on our part. In return for use, we only ask the following:
• If you use these slides (e.g., in a class) that you mention their source (after all, we’d like people to use our book!)
• If you post any slides on a www site, that you note that they are adapted from (or perhaps identical to) our slides, and note our copyright of this material.
Thanks and enjoy! JFK/KWR
All material copyright 1996-2016
J.F Kurose and K.W. Ross, All Rights Reserved
Transport Layer 2-1
Chapter 3: Transport Layer
our goals:
understand principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control
learn about Internet transport layer protocols:
• UDP: connectionless transport
• TCP: connection-oriented reliable transport
• TCP congestion control
services not available from the Internet transport protocols:
• delay guarantees
• bandwidth guarantees
[Figure: application processes (P2 to P6) on hosts, each above transport, network, link, and physical layers; server: IP address B]
UDP segment format: [Figure: header fields include length and checksum, followed by application data (payload)]
why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small header size
• no congestion control: UDP can blast away as fast as desired
wraparound 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum          1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum     0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
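The wraparound, sum, and checksum rows above can be reproduced with a short sketch of the 16-bit one's-complement checksum. The two input words (0xE666 and 0xD555) are an assumption: the addends do not appear in the rows above, and these values were chosen because their sum gives exactly the 17-bit wraparound row. The function names are illustrative.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into bit 0."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wraparound of the carry bit
    return total


def internet_checksum(words):
    """Checksum is the one's complement of the wrapped 16-bit sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


# Assumed example words (not given in the text above).
words = [0xE666, 0xD555]

raw_sum = sum(words)                      # 17-bit value before wraparound
wrapped = ones_complement_sum16(words)    # 16-bit sum after wraparound
checksum = internet_checksum(words)

print(format(raw_sum, "017b"))
print(format(wrapped, "016b"))
print(format(checksum, "016b"))

# Receiver-side check: words plus checksum add up to all ones.
receiver_ok = ones_complement_sum16(words + [checksum]) == 0xFFFF
```

At the receiver, adding all received words together with the checksum should give sixteen 1 bits; any other result signals a detected error.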
[Figure: sending side (sender) and receiving side (receiver)]
rdt_rcv(rcvpkt) &&
notcorrupt(rcvpkt)
extract(rcvpkt,data)
deliver_data(data)
udt_send(ACK)
extract(rcvpkt,data)
deliver_data(data)
sndpkt = make_pkt(ACK, chksum)
udt_send(sndpkt)
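The receiver transition above (on an uncorrupted packet: extract, deliver, send ACK) can be sketched as runnable code. This is a minimal illustration, not the slides' implementation: the `Packet` type, the toy `simple_checksum`, and the NAK reply on corruption are assumptions added so the example is self-contained.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    data: bytes
    checksum: int


def simple_checksum(data):
    """Toy checksum (assumption): sum of bytes modulo 2**16."""
    return sum(data) & 0xFFFF


def make_pkt(data):
    return Packet(data, simple_checksum(data))


def notcorrupt(pkt):
    return simple_checksum(pkt.data) == pkt.checksum


def rdt_rcv(rcvpkt, deliver_data, udt_send):
    """Handle one received packet as in the transition above."""
    if notcorrupt(rcvpkt):
        data = rcvpkt.data          # extract(rcvpkt, data)
        deliver_data(data)          # deliver_data(data)
        udt_send("ACK")             # udt_send(ACK)
    else:
        udt_send("NAK")             # assumed reply for a corrupted packet


delivered, sent = [], []
rdt_rcv(make_pkt(b"hi"), delivered.append, sent.append)   # intact packet
rdt_rcv(Packet(b"hi", 0), delivered.append, sent.append)  # wrong checksum
```

Passing `deliver_data` and `udt_send` as callables keeps the sketch independent of any real socket or channel.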
one packet per RTT:
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} \approx 0.00027$
3 packets per RTT:
$U_{sender} = \frac{3L/R}{RTT + L/R} = \frac{0.024}{30.008} \approx 0.0008$
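The utilization arithmetic above can be checked numerically. The sketch below assumes L = 8000 bits, R = 1 Gbps, and RTT = 30 ms; these inputs are not stated in the text above and were picked because they give L/R = 0.008 ms and RTT + L/R = 30.008 ms. The function name and the `window` parameter are illustrative.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy transmitting.

    window is the number of packets sent back to back per RTT.
    """
    transmit_s = packet_bits / rate_bps          # L / R
    return window * transmit_s / (rtt_s + transmit_s)


# Assumed inputs: 8000-bit packets, 1 Gbps link, 30 ms RTT.
u_one = sender_utilization(8000, 1e9, 0.030)
u_three = sender_utilization(8000, 1e9, 0.030, window=3)

print(round(u_one, 5), round(u_three, 5))
```

Sending three packets per RTT triples the numerator while the denominator stays the same, so utilization rises by a factor of three.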
[Figure: TCP segment fields shown include data, checksum, urg pointer, window size N, and acknowledgements]
User types ‘C’: Seq=42, ACK=79, data = ‘C’
host ACKs receipt of ‘C’, echoes back ‘C’: Seq=79, ACK=43, data = ‘C’
host ACKs receipt of echoed ‘C’: Seq=43, ACK=80
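The sequence and acknowledgement numbers in this exchange follow one rule: an ACK number is the sequence number of the next byte expected, so it equals the received sequence number plus the payload length. A small sketch, with hypothetical function names, that reproduces the three segments:

```python
def ack_for(seq, payload):
    """ACK number = sequence number of the next byte expected."""
    return seq + len(payload)


def telnet_echo(client_seq, server_seq, char):
    """Return the three segments as (Seq, ACK, data) tuples."""
    first = (client_seq, server_seq, char)                 # user types a character
    echo = (server_seq, ack_for(client_seq, char), char)   # host ACKs and echoes it
    final = (ack_for(client_seq, char),                    # client ACKs the echo
             ack_for(server_seq, char), b"")
    return [first, echo, final]


segments = telnet_echo(42, 79, b"C")
for seq, ack, data in segments:
    print(f"Seq={seq}, ACK={ack}, data={data!r}")
```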
[Figure: SampleRTT and EstimatedRTT, RTT (milliseconds) versus time (seconds)]
TCP round trip time, timeout
timeout interval: EstimatedRTT plus “safety margin”
• large variation in EstimatedRTT -> larger safety margin
estimate SampleRTT deviation from EstimatedRTT:
$\mathrm{DevRTT} = (1-\beta)\cdot\mathrm{DevRTT} + \beta\cdot|\mathrm{SampleRTT}-\mathrm{EstimatedRTT}|$
(typically, $\beta = 0.25$)
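The timeout computation can be sketched as a small estimator class. Several details are assumptions that the text above does not state: the EstimatedRTT smoothing weight of 0.125, the safety margin of 4 times DevRTT, the initial DevRTT of half the first sample, and updating DevRTT before EstimatedRTT (all following the conventions of RFC 6298). Only the DevRTT weight of 0.25 comes from the text.

```python
class RttEstimator:
    """Exponentially weighted RTT estimate with a deviation-based margin."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha                    # weight for EstimatedRTT (assumed)
        self.beta = beta                      # weight for DevRTT (from the text)
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2       # assumed initial deviation

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new timeout interval."""
        # DevRTT uses the EstimatedRTT from before this sample.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        # EstimatedRTT plus a safety margin; the factor 4 is an assumption.
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
timeout = est.update(200.0)              # one unusually slow sample
```

A single slow sample moves EstimatedRTT only slightly but raises DevRTT, so the timeout grows mainly through the safety margin.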
[Figure: TCP retransmission scenarios. Seq=92, 8 bytes of data is sent and retransmitted after a timeout; ACK=100 and ACK=120 are returned, some ACKs are lost (X); SendBase moves from 92 to 120; cumulative ACK]
TCP ACK generation [RFC 1122, RFC 2581]
[Figure: ACK=100 arrives four times before the timeout expires; Seq=100, 20 bytes of data]
flow control
receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast
[Figure: receiver protocol stack, with application above IP code and network, receiving data from sender]
2-way handshake:
[Figure: “Let’s talk” / “OK”, both sides ESTAB; choose x, req_conn(x), acc_conn(x), both sides ESTAB]
Q: will 2-way handshake always work in network?
• variable delays
• retransmitted messages (e.g. req_conn(x)) due to message loss
• message reordering
• can’t “see” other side
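2-way handshake failure scenarios:
[Figure: client chooses x and sends req_conn(x); server replies acc_conn(x), both ESTAB; client retransmits req_conn(x); connection x completes, client terminates, server forgets x; the retransmitted req_conn(x) arrives and the server enters ESTAB: half open connection! (no client!)]
[Figure: same exchange with data: client sends data(x+1), server accepts; client retransmits req_conn(x) and data(x+1); connection x completes, client terminates, server forgets x; the retransmitted req_conn(x) and data(x+1) arrive, and the server enters ESTAB and accepts data(x+1)]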
TCP 3-way handshake
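client: Socket clientSocket = new Socket("hostname","port number");
• closed -> SYN sent: send SYN(seq=x)
• SYN sent -> ESTAB: receive SYNACK(seq=y,ACKnum=x+1), send ACK(ACKnum=y+1)
server: Socket connectionSocket = welcomeSocket.accept();
• closed -> listen: no action (Λ)
• listen -> SYN rcvd: receive SYN(x), send SYNACK(seq=y,ACKnum=x+1), create new socket for communication back to client
• SYN rcvd -> ESTAB: receive ACK(ACKnum=y+1), no action (Λ)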
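The numbering in the SYN, SYNACK, and ACK messages above can be sketched as a tiny function. The dictionary layout and the function name are illustrative assumptions; only the relationships ACKnum = x+1 and ACKnum = y+1 come from the handshake shown above.

```python
def three_way_handshake(x, y):
    """Return the three handshake segments for initial sequence numbers x and y."""
    syn = {"flags": "SYN", "seq": x}                                   # client -> server
    synack = {"flags": "SYNACK", "seq": y, "acknum": syn["seq"] + 1}   # server -> client
    ack = {"flags": "ACK", "acknum": synack["seq"] + 1}                # client -> server
    return [syn, synack, ack]


# Example initial sequence numbers (arbitrary values chosen for illustration).
syn, synack, ack = three_way_handshake(x=100, y=300)
```

Each side acknowledges the other's initial sequence number plus one, which is what lets a server tell a fresh request from a stale, retransmitted one.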
[Figure: closing a TCP connection]
• LAST_ACK: FINbit=1, seq=y sent; can no longer send data
• TIMED_WAIT: ACKbit=1; ACKnum=y+1 sent; timed wait for 2*max segment lifetime
• CLOSED
[Figure: TCP server lifecycle]
[Figure: TCP client lifecycle]
no retransmission
[Figure: Host A, Host B; λout and delay versus λin, up to R/2]
sender sends only when router buffers available
[Figure: Host A, Host B; no buffer space!]
Causes/costs of congestion: scenario 2
Idealization: known loss
• packets can be lost, dropped at router due to full buffers
• when sending at R/2, some packets are retransmissions
[Figure: λout versus λin, up to R/2; Host A, Host B; free buffer space!]
Causes/costs of congestion: scenario 2
Realistic: duplicates
• packets can be lost, dropped at router due to full buffers
• sender times out prematurely
• when sending at R/2, some packets are retransmissions
[Figure: λout versus λin, up to R/2; timeout; copy; λ'in; Host A, Host B; free buffer space!]
“costs” of congestion:
more work (retrans) for given “goodput”
unneeded retransmissions: link carries multiple copies of pkt
• decreasing goodput
[Figure: Host C, Host D; λout versus λin’, both axes up to C/2]
Origins of “TCP” (Cerf & Kahn, ’74)
Congestion collapse observed, ’86
1988
...
[Figure: two routers, each with an input buffer, fabric, and output buffer; queued packets wait in a (FIFO) queue]
[Figure: TCP’s “sawtooth” behavior, cwnd versus time]
$\text{rate} \approx \frac{\text{cwnd}}{\text{RTT}}$ bytes/sec
[Figure: timeline marked in successive RTTs]
available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to respectable rate
increase rate exponentially until first loss event or when threshold reached
• double cwnd every RTT
• done by incrementing cwnd by 1 MSS for every ACK received
[Figure: TCP congestion control FSM, slow start and congestion avoidance states]
• initial transition into slow start: cwnd = 1 MSS; ssthresh = 64 KB; dupACKcount = 0
• slow start, duplicate ACK: dupACKcount++
• slow start, new ACK: cwnd = cwnd+MSS; dupACKcount = 0; transmit new segment(s), as allowed
• slow start, cwnd > ssthresh: no action (Λ); move to congestion avoidance
• timeout (two transitions in the figure, same actions): ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment
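The slow-start and timeout transitions above can be traced with a per-RTT simulation. This is a simplified sketch under stated assumptions: cwnd is counted in whole MSS units, one step is one RTT rather than one ACK, the congestion-avoidance increase of 1 MSS per RTT is an assumption (the transitions above only name that state), and the example ssthresh of 8 MSS and the timeout position are made up for illustration.

```python
def simulate_cwnd(rtts, ssthresh, timeout_at):
    """Return cwnd (in MSS) at the start of each RTT.

    timeout_at is a set of RTT indices at which a timeout occurs.
    """
    cwnd = 1                                  # cwnd = 1 MSS on entry to slow start
    trace = []
    for rtt in range(rtts):
        trace.append(cwnd)
        if rtt in timeout_at:
            ssthresh = max(cwnd // 2, 1)      # ssthresh = cwnd/2
            cwnd = 1                          # back to slow start
        elif cwnd < ssthresh:
            cwnd *= 2                         # slow start: double every RTT
        else:
            cwnd += 1                         # congestion avoidance (assumed +1 MSS per RTT)
    return trace


trace = simulate_cwnd(rtts=10, ssthresh=8, timeout_at={5})
print(trace)
```

The trace shows exponential growth up to the threshold, linear growth after it, and a restart from 1 MSS with a halved threshold after the timeout.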
TCP Congestion Control: details
[Figure: sender sequence number space: last byte ACKed; sent, not-yet ACKed (“in-flight”); last byte sent; cwnd]
sender limits transmission: bytes in flight (last byte sent minus last byte ACKed) ≤ cwnd
TCP sending rate:
• roughly: send cwnd bytes, wait RTT for ACKS, then send more bytes
$\text{rate} \approx \frac{\text{cwnd}}{\text{RTT}}$ bytes/sec
• initially cwnd = 1 MSS
• double cwnd every RTT
• done by incrementing cwnd for every ACK received
summary: initial rate is slow but ramps up exponentially fast
Implementation:
• variable ssthresh
• on loss event, ssthresh is set to 1/2 of cwnd just before loss event
BANDWIDTH x DELAY
LFN (“elephant”): BD > 10^5 Bytes
How much data should be pushed in the pipe in order to never stall?
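One way to answer the question above is to compute the bandwidth-delay product: the amount of data that must be in flight to keep the link busy for a full round trip. The sketch below uses integer arithmetic; the 1 Gbps rate and 30 ms RTT are assumed example values, and the function name is illustrative.

```python
def bandwidth_delay_bytes(bandwidth_bps, rtt_ms):
    """Bytes in flight needed to fill the pipe for one RTT."""
    bits_in_flight = bandwidth_bps * rtt_ms // 1000   # bits sent during one RTT
    return bits_in_flight // 8


# Assumed example: 1 Gbps link with a 30 ms round-trip time.
bdp = bandwidth_delay_bytes(10**9, 30)
is_lfn = bdp > 10**5     # threshold as written in the text above
print(bdp, is_lfn)
```

A sender whose window is smaller than this value stalls waiting for ACKs before the pipe is full.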
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R; axis: Connection 1 throughput, up to R]
Fairness (more)
Fairness and UDP
• multimedia apps often do not use TCP
  • do not want rate throttled by congestion control
• instead use UDP:
  • send audio/video at constant rate, tolerate packet loss
Fairness, parallel TCP connections
• application can open multiple parallel connections between two hosts
• web browsers do this
• e.g., link of rate R with 9 existing connections:
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of 20 connections)
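The parallel-connection example can be checked with exact fractions. The sketch assumes the idealized case in which every TCP connection through the bottleneck gets an equal share of R; the function name is illustrative.

```python
from fractions import Fraction


def app_share(new_connections, existing_connections):
    """Fraction of link rate R obtained by an app opening new_connections.

    Assumes each connection on the bottleneck gets an equal share.
    """
    total = new_connections + existing_connections
    return Fraction(new_connections, total)


one_tcp = app_share(1, 9)       # app opens a single connection
eleven_tcp = app_share(11, 9)   # app opens eleven parallel connections
print(one_tcp, eleven_tcp)
```

With equal per-connection shares, an application's portion of the link grows with the number of connections it opens, which is why per-connection fairness does not imply per-application fairness.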
[Figure: IP datagram with ECN=00 and ECN=11]
Hall of fame