Transport Layer
Issues
But first,
a general overview of networks (and the Internet)
Telecommunication networks:
r Circuit-switched networks
r Packet-switched networks
What is the Internet?
r The largest network of networks in the world.
r Uses TCP/IP protocols and packet switching.
r Runs on any communications substrate.
*** Internet History ***
1945 1995
Review: Transport Layer 8
Copyright 2002, William F. Slater, III, Chicago, IL, USA
From Simple, But Significant Ideas Bigger Ones Grow: 1940s to 1969
r "We will prove that packet switching works over a WAN."
r "We can access information using electronic computers."
[Timeline figure: 1945 to 1969]
[Timeline figure: 1970 to 1995, building on ideas from the 1940s to 1969]
The Creation of the Internet
Internet Pioneers
Marc Andreessen (Mosaic/Netscape)
Growth of Internet Hosts *, Sept. 1969 - Sept. 2002
[Figure: number of hosts (0 to 250,000,000) plotted against time period, from Sept. 1969 to Sept. 1, 2002. Chart by William F. Slater, III.]
The Internet was not known as "The Internet" until January 1984, at which time
there were 1000 hosts that were all converted over to using TCP/IP.
[Figure: the five-layer Internet stack (application, transport, network, link, physical) beside a stack that also shows presentation and session layers]
Internet protocol stack
r application: supporting network applications
m FTP, SMTP, HTTP
r transport: host-host data transfer
m TCP, UDP
r network: routing of datagrams from source to destination
m IP, routing protocols e.g. OSPF, BGP
r link: data transfer between neighboring network elements
m PPP, Ethernet
r physical: bits “on the wire”
[Figure: stack of layers, top to bottom: application, transport, network, link, physical]
Outline
r 1. Transport-layer services
r 2. Multiplexing and demultiplexing
r 3. Connectionless transport: UDP
r 4. Principles of reliable data transfer
r 5. Connection-oriented transport: TCP
r 6. TCP congestion control
r 7. TCP fairness and delay performance
[Figure: on each host, a process (controlled by app developer) uses a socket to reach TCP with buffers and variables (controlled by OS); the two hosts communicate across the Internet]
Transport services and protocols
r provide logical communication between app processes running on different hosts
r transport protocols run in end systems
m send side: breaks app messages into segments, passes to network layer
m rcv side: reassembles segments into messages, passes to app layer
[Figure: logical end-end transport between two hosts; the nodes along the path show only the network, data link, and physical layers]
Connection-oriented demux
[Figure: processes P1, P3, and P4; segments labeled SP: 80, DP: 9157 and SP: 80, DP: 5775]
UDP: User Datagram Protocol [RFC 768]
r “no frills,” “bare bones” Internet transport protocol
r “best effort” service, UDP segments may be:
m lost
m delivered out of order to app
r connectionless:
m no handshaking between UDP sender, receiver
m each UDP segment handled independently of others
Why is there a UDP?
r no connection establishment (which can add delay)
r simple: no connection state at sender, receiver
r small segment header
r no congestion control: UDP can blast away as fast as desired
UDP: more
r often used for streaming multimedia apps
m loss tolerant
m rate sensitive
r other UDP uses including
m DNS – why?
UDP segment format (32 bits wide):
m source port #, dest port #
m length (in bytes of UDP segment, including header), checksum
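The small UDP header described above can be sketched in a few lines. This is only an illustration, assuming four 16-bit big-endian fields in the order source port, dest port, length, checksum; the helper names build_udp_segment and parse_udp_segment are hypothetical, and the checksum is left at 0 rather than computed.

```python
import struct

# Four 16-bit big-endian fields: source port, dest port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Prefix payload with an 8-byte UDP-style header (checksum is a placeholder)."""
    length = UDP_HEADER.size + len(payload)  # length covers header plus data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_segment(segment):
    """Split a segment back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(
        segment[:UDP_HEADER.size]
    )
    payload = segment[UDP_HEADER.size:length]
    return src_port, dst_port, length, checksum, payload


segment = build_udp_segment(5353, 53, b"query")
print(len(segment), parse_udp_segment(segment))
```

The header stays at 8 bytes regardless of payload size, which is the "small segment header" point made in the list above.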
Reliable data transfer: getting started
[Figure: send side and receive side]
We’ll:
r incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
r What is unreliability?
m Bit error
m Packet loss – congestion
m Delay – too long
rdt1.0: reliable transfer over a reliable channel
r underlying channel perfectly reliable
m no bit errors
m no loss of packets
r separate FSMs for sender, receiver:
m sender sends data into underlying channel
m receiver reads data from underlying channel
[FSM fragment, receiver side] rdt_rcv(rcvpkt) && notcorrupt(rcvpkt): extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
rdt2.0 has a fatal flaw!
rdt2.1: receiver, handles garbled ACK/NAKs
State: Wait for 0 from below
r rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
m extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
m next state: Wait for 1 from below
r rdt_rcv(rcvpkt) && corrupt(rcvpkt)
m sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
r rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
m sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
State: Wait for 1 from below
r rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
m extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
m next state: Wait for 0 from below
r rdt_rcv(rcvpkt) && corrupt(rcvpkt)
m sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
r rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
m sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
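The rdt2.1 receiver can be sketched as a two-state machine keyed on the expected sequence bit. This is a minimal illustration, not the slides' code: the Packet class, its corrupt flag (standing in for a failed checksum), and the string return values "ACK"/"NAK" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int               # 0 or 1
    data: str
    corrupt: bool = False  # hypothetical stand-in for a failed checksum


class Rdt21Receiver:
    def __init__(self):
        self.expected = 0    # state: waiting for seq 0 or seq 1 from below
        self.delivered = []  # data passed up to the application

    def rdt_rcv(self, pkt):
        if pkt.corrupt:
            return "NAK"     # garbled packet: ask for a resend
        if pkt.seq == self.expected:
            self.delivered.append(pkt.data)     # new data: deliver it
            self.expected = 1 - self.expected   # flip to the other state
        # A duplicate (unexpected seq) is ACKed again but not redelivered.
        return "ACK"


rcv = Rdt21Receiver()
print(rcv.rdt_rcv(Packet(0, "a")), rcv.rdt_rcv(Packet(0, "a")), rcv.delivered)
```

The second call shows the case the sequence number exists for: a retransmitted pkt0 is acknowledged without being delivered twice.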
rdt 2.1 in action (cont)
Scenario c) ACK corrupted:
sender: send pkt0
receiver: rcv pkt0, send ACK0
(X: ACK corrupted in transit)
sender: rcv ACK0 (corrupted), resend pkt0
receiver: rcv pkt0, send ACK0
sender: rcv ACK0, send pkt1
receiver: rcv pkt1, send ACK1
rdt 2.2 in action
Scenario c) ACK corrupted:
sender: send pkt0
receiver: rcv pkt0, send ACK0
(X: ACK0 corrupted in transit)
sender: rcv ACK0 (corrupted), resend pkt0
receiver: rcv pkt0, send ACK0
sender: rcv ACK0, send pkt1
receiver: rcv pkt1, send ACK1
rdt3.0: channels with errors and loss
Sender FSM:
State: Wait for call 0 from above
r rdt_send(data)
m sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer
m next state: Wait for ACK0
r rdt_rcv(rcvpkt)
m Λ
State: Wait for ACK0
r rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
m Λ
r timeout
m udt_send(sndpkt); start_timer
r rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
m stop_timer
m next state: Wait for call 1 from above
State: Wait for call 1 from above
r rdt_send(data)
m sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer
m next state: Wait for ACK1
r rdt_rcv(rcvpkt)
m Λ
State: Wait for ACK1
r rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
m Λ
r timeout
m udt_send(sndpkt); start_timer
r rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
m stop_timer
m next state: Wait for call 0 from above
Performance of rdt3.0
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} \approx 0.00027$
Pipelined protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
m range of sequence numbers must be increased
m buffering at sender and/or receiver
Pipelining: increased utilization
[Figure: sender and receiver timelines; first packet bit transmitted at t = 0, last bit transmitted at t = L / R]
Increase utilization by a factor of 3:
$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} \approx 0.0008$
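The two utilization figures (stop-and-wait and a pipeline of 3 packets) can be reproduced with a short calculation. This is a sketch assuming both times are given in the same unit; the function name sender_utilization and its window parameter are made up for the example, and the cap at 1.0 is an added guard for windows large enough to keep the link always busy.

```python
def sender_utilization(transmit_time, rtt, window=1):
    """Fraction of time the sender is busy: window * (L/R) / (RTT + L/R)."""
    busy = window * transmit_time
    cycle = rtt + transmit_time
    return min(1.0, busy / cycle)  # utilization cannot exceed 100%


# Values from the slides: L/R = 0.008 and RTT = 30 (same time unit).
print(round(sender_utilization(0.008, 30), 5))            # stop-and-wait
print(round(sender_utilization(0.008, 30, window=3), 4))  # 3 packets in flight
```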
Go-Back-N
Sender:
r k-bit seq # in pkt header
r “window” of up to N consecutive unACKed pkts allowed (sliding window)
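A Go-Back-N sender with this sliding window can be sketched as follows. It is a simplified illustration under stated assumptions: sequence numbers are unbounded integers rather than k-bit values, ACKs are cumulative, a timeout resends every unACKed packet in the window, and the `sent` list is a hypothetical stand-in for the channel.

```python
class GbnSender:
    def __init__(self, window_size):
        self.N = window_size
        self.base = 0        # oldest unACKed sequence number
        self.next_seq = 0    # next sequence number to use
        self.buffer = {}     # seq -> data kept for possible resend
        self.sent = []       # log of sequence numbers put on the channel

    def rdt_send(self, data):
        if self.next_seq >= self.base + self.N:
            return False     # window full: refuse the data
        self.buffer[self.next_seq] = data
        self.sent.append(self.next_seq)
        self.next_seq += 1
        return True

    def ack(self, n):
        # Cumulative ACK: everything up to and including n is acknowledged.
        if n >= self.base:
            for seq in range(self.base, n + 1):
                self.buffer.pop(seq, None)
            self.base = n + 1

    def timeout(self):
        # Go back N: resend every packet that is still unACKed.
        for seq in range(self.base, self.next_seq):
            self.sent.append(seq)


s = GbnSender(4)
accepted = [s.rdt_send(x) for x in "abcdef"]
s.ack(1)
s.rdt_send("e")
s.timeout()
print(accepted, s.sent)
```

With N = 4, the fifth and sixth sends are refused until ACK 1 slides the window forward; the timeout then resends packets 2, 3, and 4.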
GBN: sender extended FSM
rdt_send(data)
GBN in action: cumulative ACK
Sender: send pkt0, send pkt1, send pkt2, send pkt3, rcv ACK0, send pkt4, rcv ACK5, send pkt6, send pkt7, send pkt8, send pkt9
Receiver: rcv pkt0, send ACK0; rcv pkt1, send ACK1; rcv pkt2, send ACK2; rcv pkt3, send ACK3
(loss) X marks a loss in the figure
GBN in action
Sender: send pkt0, send pkt1, send pkt2, send pkt3, pkt2 timeout, send pkt2,3,4,5
Receiver: rcv pkt0, send ACK0; rcv pkt1, send ACK1; rcv pkt5, discard, send ACK2
Selective Repeat
sender
data from above:
r if next available seq # in window, send pkt
timeout(n):
r resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N-1]:
r mark pkt n as received
r if n smallest unACKed pkt, advance window base to next unACKed seq #
receiver
pkt n in [rcvbase, rcvbase+N-1]
r send ACK(n)
r out-of-order: buffer
r in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
r ACK(n)
otherwise:
r ignore
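The receiver rules for selective repeat can be sketched as below. This is an illustration only, assuming unbounded integer sequence numbers (so the wraparound dilemma does not arise here); the class name SrReceiver and the convention of returning the ACKed number, or None for "ignore", are choices made for the example.

```python
class SrReceiver:
    def __init__(self, window_size):
        self.N = window_size
        self.rcv_base = 0    # next in-order sequence number expected
        self.buffer = {}     # out-of-order packets waiting for a gap to fill
        self.delivered = []  # data passed up to the application, in order

    def receive(self, n, data):
        if self.rcv_base <= n <= self.rcv_base + self.N - 1:
            self.buffer[n] = data
            # Deliver the in-order run starting at rcv_base, if any.
            while self.rcv_base in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1
            return n         # send ACK(n)
        if self.rcv_base - self.N <= n <= self.rcv_base - 1:
            return n         # already delivered: re-ACK so the sender advances
        return None          # otherwise: ignore


r = SrReceiver(4)
print(r.receive(1, "b"), r.delivered)  # buffered, nothing delivered yet
print(r.receive(0, "a"), r.delivered)  # gap filled, both delivered
```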
Selective repeat:
dilemma
Example:
r seq #’s: 0, 1, 2, 3
r window size=3
r receiver sees no
difference in two
scenarios!
r incorrectly passes
duplicate data as new
in (a)
Q: what relationship between seq # size and window size? Will this happen in GBN?
r Efficiency
m No loss
m Loss
• Bursty loss
• Sporadic loss
r Resource consumption
m Buffer space
m Timer
• How to implement multi-timers ?
TCP segment structure
Segment layout (32 bits wide):
m source port #, dest port #
m sequence number
m acknowledgement number
m head len, not used, flags U A P R S F, Receive window
m checksum, Urg data pnter
m Options (variable length)
m application data (variable length)
Field notes:
r sequence number, acknowledgement number: counting by bytes of data (not segments!)
r URG: urgent data (generally not used)
r ACK: ACK # valid
r PSH: push data now (generally not used)
r RST, SYN, FIN: connection estab (setup, teardown commands)
r Receive window: # bytes rcvr willing to accept
r checksum: Internet checksum (as in UDP)
TCP reliable data transfer
r TCP creates rdt service on top of IP’s unreliable service
r Pipelined segments
r Retransmissions are triggered by:
m timeout events
m duplicate acks
TCP sender (simplified)
NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum
event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
        start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)
event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer
event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
Comment:
• SendBase-1: last cumulatively ack’ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
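The three sender events can be turned into a small runnable sketch. It is a simplified model, not a TCP implementation: the timer is just a boolean, the `wire` list is a hypothetical log of (sequence number, length) pairs handed to IP, and the class and method names are invented for the example.

```python
class SimplifiedTcpSender:
    def __init__(self, initial_seq):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked = {}           # seq -> data not yet acknowledged
        self.timer_running = False
        self.wire = []              # log of (seq, length) passed to IP

    def on_app_data(self, data):
        self.unacked[self.next_seq] = data
        if not self.timer_running:
            self.timer_running = True          # start timer
        self.wire.append((self.next_seq, len(data)))
        self.next_seq += len(data)             # sequence numbers count bytes

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)            # smallest unACKed seq number
            self.wire.append((seq, len(self.unacked[seq])))
            self.timer_running = True          # restart timer

    def on_ack(self, y):
        if y > self.send_base:                 # ACK covers new data
            self.send_base = y
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for seq in done:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(92)
sender.on_app_data(b"x" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)   # Seq=100, 20 bytes
sender.on_ack(120)              # cumulative ACK covers both segments
print(sender.send_base, sender.timer_running, sender.wire)
```

The byte counts match the Seq=92 / Seq=100 / ACK=120 numbers used in the retransmission scenarios: a single cumulative ACK of 120 acknowledges both segments even if ACK=100 never arrives.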
[Figure: TCP retransmission scenarios, Host A sending to Host B over time]
Lost ACK scenario: A sends Seq=92, 8 bytes data; B's ACK=100 is lost (X); Seq=92 times out and A resends Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100.
Premature timeout: A sends Seq=92, 8 bytes data and Seq=100, 20 bytes data; B replies ACK=100 and ACK=120; Seq=92 times out before the ACKs arrive, so A resends Seq=92, 8 bytes data; SendBase moves from 100 to 120; B replies ACK=120.
TCP retransmission scenarios (more)
[Figure: Host A sending to Host B over time]
Cumulative ACK scenario: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. B's ACK=100 is lost (X), but ACK=120 arrives before the timeout, so SendBase = 120.
Fast Retransmit
r Time-out period may be relatively long:
m eRTT+4DevRTT
m long delay before resending lost packet
r Solution: Fast Retransmit
m Hint: GBN
[Figure: GBN in action]
Fast Retransmit
r Time-out period may be relatively long:
m eRTT + 4*DevRTT
m long delay before resending lost packet
r Detect lost segments via duplicate ACKs.
m Sender often sends many segments back-to-back
m If segment is lost, there will likely be many duplicate ACKs.
r If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
m fast retransmit: resend segment before timer expires
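Duplicate-ACK detection can be sketched as a counter over the stream of ACK values. This example assumes the trigger is the third duplicate after the original ACK (one common reading of "3 ACKs for the same data"); the function name and the dup_threshold parameter are hypothetical.

```python
def fast_retransmit_points(acks, dup_threshold=3):
    """Return (index, ack_value) pairs where a fast retransmit would fire."""
    retransmits = []
    last_ack = None
    dup_count = 0
    for i, y in enumerate(acks):
        if y == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                # Segment starting at byte y is presumed lost: resend it now.
                retransmits.append((i, y))
        else:
            last_ack = y   # new ACK value: reset the duplicate counter
            dup_count = 0
    return retransmits


print(fast_retransmit_points([100, 120, 120, 120, 120, 140]))
```

The retransmit fires once per run of duplicates, before any timer would expire.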
TCP Round Trip Time and Timeout
Q: how to estimate RTT?
[Figure: RTT (milliseconds), ranging from 100 to 300, plotted against time (seconds)]
TCP Round Trip Time and Timeout
EstimatedRTT = (1- α)* EstimatedRTT + α*SampleRTT
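The EstimatedRTT update is an exponentially weighted moving average, and it can be combined with the EstimatedRTT + 4*DevRTT timeout form mentioned under Fast Retransmit. In this sketch only the EstimatedRTT formula comes from the slides; the DevRTT update rule, the order of the two updates, and the default values alpha = 0.125 and beta = 0.25 are assumptions chosen as typical values.

```python
def update_rtt(estimated_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """Fold one SampleRTT into the running estimates and derive a timeout."""
    # EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT
    estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    # Assumed deviation update: EWMA of |SampleRTT - EstimatedRTT|.
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
    timeout = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout


est, dev, timeout = update_rtt(100.0, 10.0, 200.0)
print(est, dev, timeout)
```

A single large sample moves the estimate only one eighth of the way toward it, while the deviation term widens the timeout margin.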
Principles of Congestion Control
Congestion:
r informally: “too many sources sending too much data too fast for network to handle”
r Solution
m Sender controls sending rate
r different from flow control!
m Flow control: not overwhelm receiver
m Congestion control: not overwhelm network
r another top-10 problem!
TCP Congestion Control
r end-end control (no network assistance)
r sender limits transmission:
LastByteSent - LastByteAcked ≤ CongWin
m taking RcvWindow into account: LastByteSent - LastByteAcked ≤ min { RcvWindow, CongWin }
r CongWin is dynamic, function of perceived
network congestion
m Too high a rate -> congestion
m Too low a rate -> low network utilization
three mechanisms:
m AIMD (additive increase multiplicative decrease)
m slow start
m conservative after timeout events
1. TCP AIMD
r additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
r multiplicative decrease: cut CongWin in half after loss event
[Figure: sawtooth pattern of the congestion window (8, 16, 24 Kbytes) over time]
2. TCP Slow Start (more)
r When connection begins, increase rate exponentially until first loss event:
m double CongWin every RTT
m done by incrementing CongWin for every ACK received
r Summary: initial rate is slow but ramps up exponentially fast
[Figure: Host A sends one segment, then two segments, then four segments to Host B in successive RTTs]
Refinement (more)
Q: Threshold: when will exponential increase switch to linear?
A: When CongWin gets to 1/2 of its value before timeout.
Implementation:
r Variable Threshold
r At a loss event, Threshold is set to 1/2 of CongWin
[Figure: congestion window size (segments) vs. transmission round (1 to 15) for TCP Tahoe, with the threshold and a TimeOut marked]
TCP congestion behavior (2)
[Figure: congestion window size (segments) vs. transmission round (1 to 15) after 3 Dup Ack, showing the threshold and the TCP Tahoe and TCP Reno curves]
Summary: TCP Congestion Control (Reno)
r When CongWin is below Threshold, sender in
slow-start phase, window grows exponentially.
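The window behavior described in this part of the deck (slow start below Threshold, additive increase above it, Threshold halved at a loss event) can be traced round by round. This is a rough model under stated assumptions: the window is counted in whole segments, doubling is capped at Threshold, a timeout always resets the window to 1, and after 3 duplicate ACKs Reno resumes from the new Threshold while Tahoe restarts from 1. The function name and the loss_events mapping are invented for the example.

```python
def congestion_window_trace(rounds, loss_events, threshold=8, reno=True):
    """loss_events maps a round index to "timeout" or "3dup"."""
    cwnd = 1
    trace = []
    for rnd in range(rounds):
        trace.append(cwnd)
        event = loss_events.get(rnd)
        if event is not None:
            threshold = max(cwnd // 2, 1)      # Threshold = 1/2 of CongWin
            # Reno halves on 3 dup ACKs; Tahoe and timeouts restart at 1.
            cwnd = threshold if (event == "3dup" and reno) else 1
        elif cwnd < threshold:
            cwnd = min(cwnd * 2, threshold)    # slow start: exponential
        else:
            cwnd += 1                          # congestion avoidance: linear
    return trace


print(congestion_window_trace(12, {7: "3dup"}, reno=True))
print(congestion_window_trace(12, {7: "3dup"}, reno=False))
```

Both traces are identical up to the loss in round 7; they differ only in how far the window drops afterwards.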
TCP Fairness
Fair: 1. Equal share
2. Full utilization
Goal: if K TCP sessions share same bottleneck link
of bandwidth R, each should have average rate of
R/K
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]
Why is TCP fair?
Two competing sessions:
r Additive increase gives slope of 1, as throughput increases
r multiplicative decrease decreases throughput proportionally
[Figure: Connection 2 throughput (y) plotted against Connection 1 throughput (x), each axis running up to R, with the equal-share line x = y. Starting from (x0, y0), additive increase moves to (x0+Δ, y0+Δ) and multiplicative decrease moves to (x0/2+Δ/2, y0/2+Δ/2).]
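The convergence argument can be checked numerically with two flows that both follow AIMD. This is a toy model, assuming synchronized rounds, identical RTTs, and a loss for both flows whenever their combined rate exceeds the capacity; the function name and parameters are hypothetical.

```python
def aimd_two_flows(x, y, capacity, rounds, increase=1.0):
    """Evolve two throughputs under additive increase, multiplicative decrease."""
    for _ in range(rounds):
        if x + y > capacity:
            x, y = x / 2, y / 2                 # both halve: gap is halved too
        else:
            x, y = x + increase, y + increase   # both add: gap is unchanged
    return x, y


x, y = aimd_two_flows(80.0, 10.0, capacity=100.0, rounds=200)
print(x, y, abs(x - y))
```

Additive steps keep the difference between the flows constant and each halving cuts it in half, so the pair drifts toward the x = y line.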
Why is TCP fair?
D.M. Chiu and R. Jain, "Analysis of the Increase and Decrease
Algorithms for Congestion Avoidance in Computer Networks,"
Computer Networks and ISDN Systems, pp. 1-14, 1989.
Fairness (more)
Fairness and UDP
r Multimedia apps often do not use TCP
m do not want rate throttled by congestion control
r Instead use UDP:
m pump audio/video at constant rate, tolerate packet loss
r Research area: TCP friendly, more on later
Fairness and parallel TCP connections
r nothing prevents app from opening parallel connections between 2 hosts.
r Web browsers/FTP client do this
m NetAnts, GetRight
r Example: link of rate R with 9 ongoing TCP connections;
m new app asks for 1 TCP, gets rate R/10
m new app asks for 11 TCPs, gets > R/2 !
Delay performance
Q: How long does it take to receive an object from
a Web server after sending a request?
Methods
r Measurement
m Ping, traceroute
r Simulation
m Ns-2
r Analytical modeling
m Math
Fixed congestion window (1)
First case:
r WS/R > RTT + S/R: ACK for first segment in window returns before window’s worth of data sent
delay = ?
Fixed congestion window (2)
Second case:
r WS/R < RTT + S/R: wait for ACK after sending window’s worth of data
delay = ?
K = O/(WS)
P = min{Q, K − 1}
Case 1: P = Q
Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
Server idles: P = min{K-1,Q} times
[Figure: client/server timeline: initiate TCP connection, request object, first window, ..., third window = 4S/R]
Case 2: P = K-1
Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start
Server idles: P = min{K-1,Q} times
Example:
• O/S = 3 segments
• K = 2 windows
• Q = 2
• P = min{K-1,Q} = 1
TCP Delay Modeling (contd)
$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement
$2^{k-1} \frac{S}{R}$ = time to transmit the kth window
$\left[ \frac{S}{R} + RTT - 2^{k-1} \frac{S}{R} \right]^{+}$ = idle time after the kth window
[Figure: client/server timeline: initiate TCP connection, request object, first window = S/R, ..., fourth window = 8S/R, complete transmission, object delivered; time at client, time at server]
$$\text{delay} = \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p$$
$$= \frac{O}{R} + 2RTT + \sum_{k=1}^{P} \left[ \frac{S}{R} + RTT - 2^{k-1} \frac{S}{R} \right]$$
$$= \frac{O}{R} + 2RTT + P \left[ RTT + \frac{S}{R} \right] - (2^{P} - 1) \frac{S}{R}$$
How do we calculate K?
$$K = \min\{k : 2^{0} S + 2^{1} S + \cdots + 2^{k-1} S \ge O\}$$
$$= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\}$$
$$= \min\{k : 2^{k} - 1 \ge \frac{O}{S}\}$$
$$= \min\{k : k \ge \log_2(\frac{O}{S} + 1)\}$$
$$= \left\lceil \log_2\left(\frac{O}{S} + 1\right) \right\rceil$$
Calculation of Q, number of idles for infinite-size object, is similar:
$$Q = \max\{q : 2^{q-1} S/R \le RTT + S/R\}$$
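The slow-start delay model can be evaluated numerically. This is a sketch under the model's own assumptions (no loss, no threshold, window doubling every round); K and Q are found by direct search over their defining inequalities rather than by the closed-form logarithms, and the function name and the sample values of S, R, and RTT are made up for the example.

```python
def slow_start_delay(O, S, R, RTT):
    """Return (K, Q, P, delay) for object size O, segment size S, rate R."""
    # K: smallest k with 2^0 S + ... + 2^(k-1) S = (2^k - 1) S >= O
    K = 1
    while (2 ** K - 1) * S < O:
        K += 1
    # Q: largest q with 2^(q-1) S/R <= RTT + S/R
    Q = 1
    while (2 ** Q) * S / R <= RTT + S / R:
        Q += 1
    P = min(K - 1, Q)   # number of times the server actually idles
    delay = O / R + 2 * RTT + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, delay


# Hypothetical numbers: 4288-bit segments, 100 kbps link, 100 ms RTT,
# and an object of 3 segments (so O/S = 3 as in the Case 2 example).
print(slow_start_delay(O=3 * 4288, S=4288, R=100_000, RTT=0.1))
```

With these inputs the search gives K = 2, Q = 2, and P = 1, matching the Case 2 example where O/S = 3 segments.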