
Mobile Ad Hoc Networking

Transport Layer
Issues

Review: Transport Layer 1

Transport Layer Issues


Contents:
r overview: principles behind transport layer services:
m multiplexing/demultiplexing
m reliable data transfer
m flow control
m congestion control
r TCP performance analysis

But first,
a general overview of networks (and the Internet)

Telecommunication networks
r Circuit-switched networks
m FDM
m TDM
r Packet-switched networks
m Networks with VCs
m Datagram networks

What Is the Internet?


r A network of networks, joining many government, university
and private computers together and providing an
infrastructure for the use of E-mail, bulletin boards, file
archives, hypertext documents, databases and other
computational resources
r The vast collection of computer networks which form and
act as a single huge network for transport of data and
messages across distances which can be anywhere from the
same office to anywhere in the world.

Written by William F. Slater, III
1996
President of the Chicago Chapter of the Internet Society
Copyright 2002, William F. Slater, III, Chicago, IL, USA

What is the Internet?
r The largest network of networks in the world.
r Uses TCP/IP protocols and packet switching .
r Runs on any communications substrate.

From Dr. Vinton Cerf, Co-Creator of TCP/IP

Brief History of the Internet

r 1968 - DARPA (Defense Advanced Research Projects Agency) contracts with BBN (Bolt, Beranek & Newman) to create ARPAnet
r 1970 - First five nodes:
m UCLA
m Stanford
m UC Santa Barbara
m U of Utah, and
m BBN
r 1974 - TCP specification by Vint Cerf
r 1984 - On January 1, the Internet with its 1000 hosts converts en masse to using TCP/IP for its messaging

*** Internet History ***

A Brief Summary of the Evolution of the Internet (1945 to 1995)
r 1945: Memex conceived
r 1948: Mathematical Theory of Communication
r 1958: Silicon chip
r 1962: First vast computer network envisioned
r 1964: Packet switching invented
r 1965: Hypertext invented
r 1969: ARPANET
r 1972: TCP/IP created
r 1984: Internet named and goes TCP/IP
r 1989: WWW created
r 1993: Mosaic created
r 1995: Age of eCommerce begins

From Simple, But Significant Ideas Bigger Ones Grow: 1940s to 1969

r We will prove that packet switching works over a WAN.
r Hypertext can be used to allow rapid access to text data.
r Packet switching can be used to send digitized data through computer networks.
r We can accomplish a lot by having a vast network of computers to use for accessing information and exchanging ideas.
r We can do it cheaply by using digital circuits etched in silicon.
r We do it reliably with “bits”, sending and receiving data.
r We can access information using electronic computers.

[Figure: the ideas are stacked as steps along a time axis running from 1945 to 1969.]

From Simple, But Significant Ideas Bigger Ones Grow: 1970s to 1995

r Great efficiencies can be accomplished if we use the Internet and the World Wide Web to conduct business.
r The World Wide Web is easier to use if we have a browser to browse web pages, running in a graphical user interface context.
r Computers connected via the Internet can be used more easily if hypertext links are enabled using HTML and URLs: it’s called the World Wide Web.
r The ARPANET needs to convert to a standard protocol and be renamed to the Internet.
r We need a protocol for efficient and reliable transmission of packets over a WAN: TCP/IP.
r Ideas from 1940s to 1969

[Figure: the ideas are stacked as steps along a time axis running from 1970 to 1995.]

The Creation of the Internet

r The creation of the Internet solved the following challenges:
m Basically inventing digital networking as we know it
m Survivability of an infrastructure to send / receive high-speed electronic messages
m Reliability of computer messaging

Internet Pioneers

r Vannevar Bush (Memex)
r Claude Shannon (Information theory)
r Paul Baran (Packet switching)
r Leonard Kleinrock (Packet switching)
r Ted Nelson (Hypertext)
r Lawrence Roberts (ARPANET)
r Vinton Cerf (TCP/IP)
r Robert Kahn (TCP/IP)
r Tim Berners-Lee (WWW)
r Marc Andreessen (Mosaic/Netscape)

Growth of Internet Hosts, Sept. 1969 - Sept. 2002

[Figure: chart by William F. Slater, III. Number of hosts (0 to 250,000,000) plotted against time period, with annotations "Dot-Com Burst Begins" and "Sept. 1, 2002".]

The Internet was not known as "The Internet" until January 1984, at which time there were 1000 hosts that were all converted over to using TCP/IP.

ISO 7-layer reference model

r application
r presentation
r session
r transport
r network
r link
r physical

Internet protocol stack
r application: supporting network applications
m FTP, SMTP, HTTP
r transport: host-host data transfer
m TCP, UDP
r network: routing of datagrams from source to destination
m IP, routing protocols e.g. OSPF, BGP
r link: data transfer between neighboring network elements
m PPP, Ethernet
r physical: bits “on the wire”

Internet Standardization Process

r All standards of the Internet are published as RFCs (Requests for Comments)
m but not all RFCs are Internet Standards!
m available at http://www.ietf.org
m latest so far: RFC 4333
r A typical (but not the only) way of standardization:
m Internet draft
m RFC
m Proposed standard
m Draft standard (requires 2 working implementations)
m Internet standard (declared by Internet Architecture Board)

Outline
r 1. Transport-layer services
r 2. Multiplexing and demultiplexing
r 3. Connectionless transport: UDP
r 4. Principles of reliable data transfer
r 5. Connection-oriented transport: TCP
r 6. TCP congestion control
r 7. TCP fairness and delay performance

Transport layer – the other side of the door

[Figure: two hosts or servers connected through the Internet. On each, a process (controlled by app developer) sits above a socket, and TCP with buffers and variables (controlled by OS) sits below it.]

r API: (1) choose transport protocol; (2) set parameters

Transport services and protocols
r provide logical communication between app processes running on different hosts
r transport protocols run in end systems
m send side: breaks app messages into segments, passes to network layer
m rcv side: reassembles segments into messages, passes to app layer

[Figure: logical end-end transport between two end systems that run the full stack (application, transport, network, data link, physical); the nodes in between run only network, data link, and physical layers.]

Transport vs. network layer

r network layer: logical communication between hosts
m Point-to-point
r transport layer: logical communication between processes
m relies on, and enhances, network layer services
m also called “End-to-End”

J. Saltzer, D. Reed, and D. Clark. End-to-end arguments in system design. ACM Transactions on Computer Systems, 2(4):277--288, 1984.


How demultiplexing works

r host receives IP datagrams
m each datagram has source IP address, destination IP address
m each datagram carries 1 transport-layer segment
m each segment has source, destination port number (recall: well-known port numbers for specific applications)
r host uses IP addresses & port numbers to direct segment to appropriate socket

TCP/UDP segment format (32 bits wide):
source port # | dest port #
other header fields
application data (message)

Connection-oriented demux

r TCP socket identified by 4-tuple:


m source IP address
m source port number
m dest IP address
m dest port number
r recv host uses all four values to direct
segment to appropriate socket
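
The 4-tuple lookup can be sketched as a dictionary keyed on all four values. This is a toy illustration only: the class names `FourTuple` and `TcpDemux`, and the socket labels, are hypothetical and do not correspond to any real OS API. The addresses and ports are taken from the two-client example in these slides.

```python
from typing import Dict, NamedTuple, Optional


class FourTuple(NamedTuple):
    """Hypothetical key type: the four values that identify a TCP socket."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class TcpDemux:
    """Toy demultiplexer: maps a connection 4-tuple to a socket label."""

    def __init__(self) -> None:
        self._sockets: Dict[FourTuple, str] = {}

    def register(self, key: FourTuple, socket_name: str) -> None:
        self._sockets[key] = socket_name

    def deliver(self, key: FourTuple) -> Optional[str]:
        # All four values must match for the segment to reach a socket.
        return self._sockets.get(key)


demux = TcpDemux()
# Two clients talk to the same server port 80 on host C.
demux.register(FourTuple("A", 9157, "C", 80), "socket-for-A")
demux.register(FourTuple("B", 5775, "C", 80), "socket-for-B")

print(demux.deliver(FourTuple("A", 9157, "C", 80)))
print(demux.deliver(FourTuple("B", 5775, "C", 80)))
```

Both segments carry destination port 80, yet they reach different sockets because the source address and source port differ.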


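Connection-oriented demux

[Figure: two clients (IP: A and IP: B) connect to a server (IP: C) on port 80. Client A sends segments with SP: 9157, DP: 80 and receives replies with SP: 80, DP: 9157. Client B sends segments with SP: 5775, DP: 80 and receives replies with SP: 80, DP: 5775. Processes P1, P3, and P4 mark the endpoints.]
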
Connection-oriented demux

Q: Why use 4-tuple?

Examples:
r Server host may support many simultaneous TCP sockets:
m each socket identified by its own 4-tuple
r Web servers have different sockets for each connecting client
m non-persistent HTTP will have a different socket for each request

UDP: User Datagram Protocol [RFC 768]
r “no frills,” “bare bones” Internet transport protocol
r “best effort” service, UDP segments may be:
m lost
m delivered out of order to app
r connectionless:
m no handshaking between UDP sender, receiver
m each UDP segment handled independently of others

Why is there a UDP?
r no connection establishment (which can add delay)
r simple: no connection state at sender, receiver
r small segment header
r no congestion control: UDP can blast away as fast as desired

UDP: more
r often used for streaming multimedia apps
m loss tolerant
m rate sensitive
r other UDP uses
m DNS – why ?
r reliable transfer over UDP: add reliability at application layer
m application-specific error recovery!

UDP segment format (32 bits wide):
source port # | dest port #
length | checksum
Application data (message)

Length: in bytes of UDP segment, including header

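
The four 16-bit header fields of the UDP segment format can be packed and unpacked with Python's standard `struct` module. This is a minimal sketch: the helper names `build_udp_segment` and `parse_udp_segment` are made up for illustration, the port numbers are arbitrary, and the checksum is left at 0 as a placeholder rather than computed.

```python
import struct

# Four unsigned 16-bit fields in network byte order:
# source port, dest port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_segment(src_port: int, dst_port: int, payload: bytes,
                      checksum: int = 0) -> bytes:
    # The length field counts the header plus the application data.
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_segment(segment: bytes):
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    payload = segment[UDP_HEADER.size:length]
    return src_port, dst_port, length, checksum, payload


segment = build_udp_segment(5353, 53, b"hello")
print(parse_udp_segment(segment))
```

The header is only 8 bytes, which is the "small segment header" advantage listed for UDP.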

Principles of Reliable data transfer

r important in app., transport, link layers
r top-10 list of important networking topics!
r characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started

send side:
r rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
r udt_send(): called by rdt, to transfer packet over unreliable channel to receiver

receive side:
r rdt_rcv(): called when packet arrives on rcv-side of channel
r deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started

We’ll:
r incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
r What is unreliability?
m Bit errors
m Packet loss (congestion)
m Delay (too long)

rdt1.0: reliable transfer over a reliable channel
r underlying channel perfectly reliable
m no bit errors
m no loss of packets
r separate FSMs for sender, receiver:
m sender sends data into underlying channel
m receiver reads data from underlying channel

sender FSM (single state: Wait for call from above)
  event: rdt_send(data)
  actions: packet = make_pkt(data); udt_send(packet)

receiver FSM (single state: Wait for call from below)
  event: rdt_rcv(packet)
  actions: extract(packet,data); deliver_data(data)

rdt2.0: channel with bit errors

sender FSM
state: Wait for call from above
  rdt_send(data)
    sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
    -> Wait for ACK or NAK
state: Wait for ACK or NAK
  rdt_rcv(rcvpkt) && isNAK(rcvpkt)
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call from above

receiver FSM
state: Wait for call from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    udt_send(NAK)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    extract(rcvpkt,data); deliver_data(data); udt_send(ACK)

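
The rdt2.0 FSM relies on a checksum to decide whether a packet is corrupt. One possible choice, assumed here for illustration since the slides do not fix the algorithm at this point, is the 16-bit one's-complement Internet checksum used by UDP and TCP. The `make_pkt` and `corrupt` functions below borrow the FSM's names but are simplified, hypothetical versions that prepend the checksum to the data.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def make_pkt(data: bytes) -> bytes:
    # Simplified packet layout (an assumption): 2-byte checksum, then data.
    return internet_checksum(data).to_bytes(2, "big") + data


def corrupt(pkt: bytes) -> bool:
    stored = int.from_bytes(pkt[:2], "big")
    return stored != internet_checksum(pkt[2:])


pkt = make_pkt(b"hello")
damaged = pkt[:2] + b"jello"  # one byte of the data changed in transit
print(corrupt(pkt), corrupt(damaged))
```

A receiver that finds `corrupt(pkt)` true would send a NAK; otherwise it extracts the data, delivers it, and sends an ACK.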
rdt2.0 has a fatal flaw!

What happens if ACK/NAK corrupted?
r sender doesn’t know what happened at receiver!

What to do?
r sender NAKs for receiver’s ACK/NAK? What if sender NAK corrupted?
r retransmit, assuming it is NAK …
r but this might cause retransmission of correctly received pkt!
- packet duplications !

Handling duplicates:
r sender adds sequence number to each pkt
r sender retransmits current pkt if ACK/NAK garbled
r receiver discards (doesn’t deliver up) duplicate pkt

rdt2.1: sender, handles garbled ACK/NAKs

state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
    -> Wait for ACK or NAK 0
state: Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt)
    -> Wait for ACK or NAK 1
state: Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call 0 from above

rdt2.1: receiver, handles garbled ACK/NAKs

state: Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt,data); deliver_data(data)
    sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
    -> Wait for 1 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
state: Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data); deliver_data(data)
    sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
    -> Wait for 0 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

rdt 2.1 in action

a) operation with no corruption
sender: send pkt0
receiver: rcv pkt0, send ACK0
sender: rcv ACK0, send pkt1
receiver: rcv pkt1, send ACK1
sender: rcv ACK1, send pkt0
receiver: rcv pkt0, send ACK0

b) packet corrupted
sender: send pkt0
receiver: rcv pkt0, send ACK0
sender: rcv ACK0, send pkt1 (corrupted in transit, X)
receiver: rcv pkt1, send NAK1
sender: rcv NAK1, resend pkt1
receiver: rcv pkt1, send ACK1

rdt 2.1 in action (cont)

c) ACK corrupted
sender: send pkt0
receiver: rcv pkt0, send ACK0 (corrupted in transit, X)
sender: rcv ACK0 (corrupted), resend pkt0
receiver: rcv pkt0, send ACK0
sender: rcv ACK0, send pkt1
receiver: rcv pkt1, send ACK1

rdt2.2: a NAK-free protocol

sender FSM fragment
state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)
    -> Wait for ACK 0
state: Wait for ACK 0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    Λ

receiver FSM fragment
state: Wait for 0 from below
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || has_seq1(rcvpkt) )
    udt_send(sndpkt)
transition into Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data); deliver_data(data)
    sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)

rdt 2.2 in action

a) operation with no corruption
sender: send pkt0
receiver: rcv pkt0, send ACK0
sender: rcv ACK0, send pkt1
receiver: rcv pkt1, send ACK1
sender: rcv ACK1, send pkt0
receiver: rcv pkt0, send ACK0

b) packet corrupted
sender: send pkt0
receiver: rcv pkt0, send ACK0
sender: rcv ACK0, send pkt1 (corrupted in transit, X)
receiver: rcv pkt1, send ACK0
sender: rcv ACK0, resend pkt1
receiver: rcv pkt1, send ACK1

rdt 2.2 in action (cont)

c) ACK corrupted
sender: send pkt0
receiver: rcv pkt0, send ACK0 (corrupted in transit, X)
sender: rcv ACK0 (corrupted), resend pkt0
receiver: rcv pkt0, send ACK0
sender: rcv ACK0, send pkt1
receiver: rcv pkt1, send ACK1

rdt3.0: channels with errors and loss

Sender FSM
state: Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer
    -> Wait for ACK0
  rdt_rcv(rcvpkt)
    Λ
state: Wait for ACK0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    Λ
  timeout
    udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    stop_timer
    -> Wait for call 1 from above
state: Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer
    -> Wait for ACK1
  rdt_rcv(rcvpkt)
    Λ
state: Wait for ACK1
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
    Λ
  timeout
    udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    stop_timer
    -> Wait for call 0 from above
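
The alternating-bit behaviour of rdt3.0 can be sketched as a small deterministic simulation. This is an illustration under simplifying assumptions, not the protocol as specified in the FSM: loss is scripted by transmission index instead of being random, corruption is not modeled, and a "timeout" is simply recorded whenever a data packet or its ACK is dropped. The function name `stop_and_wait` and its parameters are hypothetical.

```python
from typing import List, Set, Tuple


def stop_and_wait(messages: List[str], lost_data: Set[int],
                  lost_acks: Set[int]) -> Tuple[List[str], List[str]]:
    """Simulate alternating-bit transfer over a lossy channel.

    lost_data / lost_acks hold the 0-based indices of the data and ACK
    transmissions that the channel drops.
    """
    delivered: List[str] = []
    log: List[str] = []
    expected_seq = 0      # receiver state: next sequence bit it expects
    data_tx = ack_tx = 0  # how many data packets / ACKs were sent so far
    for i, msg in enumerate(messages):
        seq = i % 2
        acked = False
        while not acked:
            log.append(f"send pkt{seq}")
            data_lost = data_tx in lost_data
            data_tx += 1
            if data_lost:
                log.append("timeout")  # no ACK can arrive, so retransmit
                continue
            if seq == expected_seq:
                delivered.append(msg)  # new data: deliver it upward
                expected_seq ^= 1
            # A duplicate is not delivered again, but it is still ACKed.
            ack_lost = ack_tx in lost_acks
            ack_tx += 1
            if ack_lost:
                log.append("timeout")
                continue
            log.append(f"rcv ACK{seq}")
            acked = True
    return delivered, log


delivered, log = stop_and_wait(["a", "b", "c"], lost_data={1}, lost_acks={0})
print(delivered)
print(log)
```

Even with one lost ACK and one lost data packet, each message is delivered exactly once, because the receiver uses the sequence bit to recognize the retransmitted duplicate.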

rdt3.0: Poor performance

Stop-and-Wait: sender sends one packet, then waits for receiver response

Timeline:
m first packet bit transmitted, t = 0
m last packet bit transmitted, t = L / R
m first packet bit arrives at receiver
m last packet bit arrives, send ACK
m ACK arrives, send next packet, t = RTT + L / R

$U_{sender} = \frac{L/R}{RTT + L/R}$

Performance of rdt3.0

r example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:

$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8 \text{ kb/pkt}}{10^9 \text{ b/sec}} = 8 \text{ microsec}$

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008 \text{ msec}}{30.008 \text{ msec}} = 0.00027$

m U_sender: utilization, the fraction of time sender busy sending
m 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
m network protocol limits use of physical resources!
m units: microsec = 10^-6 sec; millisec = ms = 10^-3 sec; Gb, Mb, Kb
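
The utilization arithmetic can be checked with a few lines of Python. This is a small sketch: the function name `sender_utilization` and its `window` parameter are illustrative additions, with `window=1` giving stop-and-wait and larger values giving the pipelined case in which that many packets are sent back to back per round trip.

```python
def sender_utilization(packet_bits: float, rate_bps: float, rtt_s: float,
                       window: int = 1) -> float:
    """Fraction of time the sender is busy transmitting."""
    transmit_time = packet_bits / rate_bps  # L / R, in seconds
    # Utilization cannot exceed 1 however large the window is.
    return min(1.0, window * transmit_time / (rtt_s + transmit_time))


# 1 Gbps link, 30 ms RTT (2 x 15 ms propagation delay), 8000-bit packet.
u1 = sender_utilization(8000, 1e9, 0.030)
u3 = sender_utilization(8000, 1e9, 0.030, window=3)
print(f"stop-and-wait: {u1:.5f}, window of 3: {u3:.5f}")
```

With one packet per round trip the sender is busy about 0.027% of the time; allowing three packets in flight triples that figure.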

Pipelined protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
m range of sequence numbers must be increased
m buffering at sender and/or receiver

Pipelining: increased utilization

Timeline:
m first packet bit transmitted, t = 0
m last bit transmitted, t = L / R
m first packet bit arrives at receiver
m last packet bit arrives, send ACK
m last bit of 2nd packet arrives, send ACK
m last bit of 3rd packet arrives, send ACK
m ACK arrives, send next packet, t = RTT + L / R

Increase utilization by a factor of 3:

$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024 \text{ msec}}{30.008 \text{ msec}} = 0.0008$

r Two generic forms of pipelined protocols: Go-Back-N, selective repeat

Go-Back-N
Sender:
r k-bit seq # in pkt header
r “window” of up to N consecutive unack’ed pkts allowed (sliding window)
r ACK(n): ACKs all pkts up to, including seq # n: “cumulative ACK”
m may receive duplicate ACKs (see receiver)
r timer for the packet of send_base
r timeout(n): retransmit pkt n and all higher seq # pkts in window

GBN: sender extended FSM

initialization (Λ):
  base = 1
  nextseqnum = 1

state: Wait

rdt_send(data):
  if (nextseqnum < base+N) {
    sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum)
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

timeout:
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  …
  udt_send(sndpkt[nextseqnum-1])

rdt_rcv(rcvpkt) && corrupt(rcvpkt):
  (no action shown)

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt):
  base = getacknum(rcvpkt)+1
  if (base == nextseqnum)
    stop_timer
  else
    start_timer
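
The sender FSM above translates fairly directly into a small class. This is a sketch under stated assumptions rather than a complete implementation: the class name `GbnSender` is hypothetical, the timer is reduced to a boolean flag, corruption checking is omitted, sequence numbers do not wrap, and `base` is never moved backwards by an old ACK (a small guard that the FSM itself does not show).

```python
from typing import Callable, Dict, List


class GbnSender:
    def __init__(self, window: int, udt_send: Callable[[int, str], None]) -> None:
        self.N = window
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt: Dict[int, str] = {}  # copies kept for retransmission
        self.timer_running = False
        self.udt_send = udt_send

    def rdt_send(self, data: str) -> bool:
        if self.nextseqnum >= self.base + self.N:
            return False  # window full: refuse_data
        self.sndpkt[self.nextseqnum] = data
        self.udt_send(self.nextseqnum, data)
        if self.base == self.nextseqnum:
            self.timer_running = True  # start_timer
        self.nextseqnum += 1
        return True

    def on_ack(self, acknum: int) -> None:
        # Cumulative ACK: everything up to and including acknum is acked.
        old_base = self.base
        self.base = max(self.base, acknum + 1)
        for seq in range(old_base, self.base):
            self.sndpkt.pop(seq, None)
        # Stop the timer if nothing is outstanding, otherwise restart it.
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self) -> None:
        self.timer_running = True  # start_timer
        for seq in range(self.base, self.nextseqnum):
            self.udt_send(seq, self.sndpkt[seq])  # go back N


sent: List[int] = []
sender = GbnSender(4, lambda seq, data: sent.append(seq))
accepted = [sender.rdt_send(f"msg{i}") for i in range(6)]
sender.on_ack(2)          # pkts 1 and 2 acknowledged, window slides
sender.rdt_send("msg6")   # now fits in the window as seq # 5
sender.on_timeout()       # resend every unacked pkt: 3, 4, 5
print(accepted, sent)
```

The run shows both defining behaviours: data is refused once N packets are outstanding, and a timeout resends the whole unacknowledged window.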

GBN: receiver extended FSM

initialization (Λ):
  expectedseqnum = 1
  sndpkt = make_pkt(0,ACK,chksum)

state: Wait

rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum):
  extract(rcvpkt,data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)
  udt_send(sndpkt)
  expectedseqnum++

default:
  udt_send(sndpkt)

r ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
m may generate duplicate ACKs
m need only remember expectedseqnum
r out-of-order pkt:
m discard (don’t buffer) -> no receiver buffering!
m Re-ACK pkt with highest in-order seq #

GBN in action: cumulative ACK

Sender: send pkt0, send pkt1, send pkt2, send pkt3
Receiver: rcv pkt0, send ACK0; rcv pkt1, send ACK1; rcv pkt2, send ACK2; rcv pkt3, send ACK3
Sender: rcv ACK0, send pkt4
Sender: rcv ACK1, send pkt5
Receiver: rcv pkt4, send ACK4; rcv pkt5, send ACK5
Sender: rcv ACK5 (three ACKs are lost, marked X; the sender receives only ACK0, ACK1, and ACK5, so ACK5 cumulatively acknowledges pkt2 through pkt5)
Sender: send pkt6, send pkt7, send pkt8, send pkt9

GBN in action: premature timeout

Sender: send pkt0
Receiver: rcv pkt0, send ACK0
Sender: send pkt1
Receiver: rcv pkt1, send ACK1
Sender: send pkt2, send pkt3
Receiver: rcv pkt3, discard, send ACK1
Sender: rcv ACK0, send pkt4
Receiver: rcv pkt2, send ACK2
Sender: rcv ACK1, send pkt5
Receiver: rcv pkt4, discard, send ACK2
Receiver: rcv pkt5, discard, send ACK2
Sender: pkt2 timeout, send pkt2,3,4,5

Selective Repeat

r receiver individually acknowledges all correctly received pkts
m buffers pkts, as needed, for eventual in-order delivery to upper layer
r sender only resends pkts for which ACK not received
m sender timer for each unACKed pkt
r sender window
m N consecutive seq #’s
m again limits seq #s of sent, unACKed pkts

Selective repeat: sender, receiver windows

Selective repeat

sender
data from above:
r if next available seq # in window, send pkt
timeout(n):
r resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N-1]:
r mark pkt n as received
r if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
pkt n in [rcvbase, rcvbase+N-1]
r send ACK(n)
r out-of-order: buffer
r in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
r ACK(n)
otherwise:
r ignore
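
The receiver rules can be sketched in a few lines. This is an illustration under simplifying assumptions: the class name `SrReceiver` is hypothetical, sequence numbers are plain integers that never wrap around, and the return value of `on_packet` stands in for sending ACK(n).

```python
from typing import Dict, List, Optional


class SrReceiver:
    def __init__(self, window: int) -> None:
        self.N = window
        self.rcvbase = 0
        self.buffer: Dict[int, str] = {}  # out-of-order pkts awaiting delivery
        self.delivered: List[str] = []

    def on_packet(self, n: int, data: str) -> Optional[int]:
        """Return the ACK number to send, or None to ignore the packet."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer.setdefault(n, data)
            # Deliver any run of in-order packets and advance the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n  # already delivered: re-ACK so the sender can move on
        return None


rcv = SrReceiver(window=4)
print(rcv.on_packet(1, "b"), rcv.delivered)  # out of order: buffered
print(rcv.on_packet(0, "a"), rcv.delivered)  # fills the gap: both delivered
print(rcv.on_packet(0, "a"), rcv.on_packet(9, "z"))
```

Re-ACKing packets below `rcvbase` matters: if the original ACK was lost, the sender's window would otherwise never advance.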

Selective repeat in action

Selective repeat: dilemma
Example:
r seq #’s: 0, 1, 2, 3
r window size = 3
r receiver sees no difference in two scenarios!
r incorrectly passes duplicate data as new in (a)

Q: what relationship between seq # size and window size? Will this happen in GBN?

Go Back N vs. Selective Repeat

r Efficiency
m No loss
m Loss
• Bursty loss
• Sporadic loss
r Resource consumption
m Buffer space
m Timer
• How to implement multi-timers ?


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

r End-to-end, unicast:
m one sender, one receiver
r reliable, in-order byte stream:
m no “message boundaries”
r Pipelined (not stop-and-wait):
m TCP congestion and flow control set window size
r full duplex data:
m bi-directional data flow in same connection
r connection-oriented:
m handshaking (exchange of control msgs) init’s sender, receiver state before data exchange
r flow controlled:
m send & receive buffers
m sender will not overwhelm receiver

[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the receiving application reads data through its socket door.]

TCP segment structure (32 bits wide)

source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | Receive window
checksum | Urg data pnter
Options (variable length)
application data (variable length)

r sequence number, acknowledgement number: counting by bytes of data (not segments!)
r URG: urgent data (generally not used)
r ACK: ACK # valid
r PSH: push data now (generally not used)
r RST, SYN, FIN: connection estab (setup, teardown commands)
r Receive window: # bytes rcvr willing to accept
r checksum: Internet checksum (as in UDP)
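
The fixed 20-byte part of the TCP header can be unpacked with Python's standard `struct` module. This is a minimal sketch: the function name `parse_tcp_header` and the dictionary keys are illustrative choices, options are skipped rather than decoded, and the sample SYN segment is made up for the example.

```python
import struct

# ports (2 x 16 bits), seq # and ACK # (2 x 32 bits),
# header length + flags, receive window, checksum, urgent pointer (4 x 16 bits)
TCP_FIXED_HEADER = struct.Struct("!HHIIHHHH")


def parse_tcp_header(segment: bytes) -> dict:
    (src_port, dst_port, seq, ack, len_flags,
     window, checksum, urg_ptr) = TCP_FIXED_HEADER.unpack_from(segment)
    header_len = (len_flags >> 12) * 4  # head len is counted in 32-bit words
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack, "header_len": header_len,
        "URG": bool(len_flags & 0x20), "ACK": bool(len_flags & 0x10),
        "PSH": bool(len_flags & 0x08), "RST": bool(len_flags & 0x04),
        "SYN": bool(len_flags & 0x02), "FIN": bool(len_flags & 0x01),
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
        "payload": segment[header_len:],
    }


# A hypothetical SYN: head len = 5 words, only the SYN flag set, no data.
syn = TCP_FIXED_HEADER.pack(12345, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(syn)
print(header["seq"], header["SYN"], header["ACK"], header["header_len"])
```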

TCP Connection Setup

Three way handshake:
Step 1: client host sends TCP SYN segment to server
m specifies initial seq #
m no data
Step 2: server host receives SYN, replies with SYNACK segment
m server allocates buffers
m specifies server initial seq. #
Step 3: client receives SYNACK, replies with ACK segment, which may contain data (piggyback)

Q: Is 3-way handshake perfect?

TCP reliable data transfer
r TCP creates rdt service on top of IP’s unreliable service
r Pipelined segments
r Cumulative acks
r TCP uses single retransmission timer
r Retransmissions are triggered by:
m timeout events
m duplicate acks
r Initially consider simplified TCP sender:
m ignore duplicate acks
m ignore flow control, congestion control

TCP sender events:

data rcvd from app:
r Create segment with seq #
r seq # is byte-stream number of first data byte in segment
r start timer if not already running (think of timer as for oldest unacked segment)
r expiration interval: TimeOutInterval

timeout:
r retransmit segment that caused timeout
r restart timer

Ack rcvd:
r If acknowledges previously unacked segments
m update what is known to be acked (cumulative ack)
m start timer if there are outstanding segments

TCP sender (simplified)

NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)

  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer

  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }

} /* end of loop forever */

Comment:
• SendBase-1: last cumulatively ack’ed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked

TCP: retransmission scenarios

lost ACK scenario:
Host A: send Seq=92, 8 bytes data
Host B: send ACK=100 (lost, X)
Host A: Seq=92 timeout, resend Seq=92, 8 bytes data
Host B: send ACK=100
Host A: SendBase = 100

premature timeout:
Host A: send Seq=92, 8 bytes data; send Seq=100, 20 bytes data
Host B: send ACK=100; send ACK=120
Host A: Seq=92 timeout before the ACKs arrive, resend Seq=92, 8 bytes data
Host A: rcv ACK=100, SendBase = 100; rcv ACK=120, SendBase = 120
Host B: send ACK=120 in reply to the resent segment
Host A: SendBase = 120

TCP retransmission scenarios (more)

Cumulative ACK scenario:
Host A: send Seq=92, 8 bytes data; send Seq=100, 20 bytes data
Host B: send ACK=100 (lost, X); send ACK=120
Host A: rcv ACK=120 before timeout, SendBase = 120

TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver | TCP Receiver action
Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed | Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
Arrival of in-order segment with expected seq #. One other segment has ACK pending | Immediately send single cumulative ACK, ACKing both in-order segments
Arrival of segment that partially or completely fills gap | Immediately send ACK, provided that segment starts at lower end of gap
Arrival of out-of-order segment with higher-than-expected seq. #. Gap detected | Immediately send duplicate ACK, indicating seq. # of next expected byte

Fast Retransmit
r Time-out period may be relatively long:
m eRTT+4DevRTT
m long delay before resending lost packet
r Solution: Fast Retransmit
m Hint: GBN

Fast Retransmit
r Time-out period may be relatively long:
m eRTT+4DevRTT
m long delay before resending lost packet
r Detect lost segments via duplicate ACKs.
m Sender often sends many segments back-to-back
m If segment is lost, there will likely be many duplicate ACKs.
r If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
m fast retransmit: resend segment before timer expires

Fast retransmit algorithm:

event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {
    /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3) {
      /* fast retransmit */
      resend segment with sequence number y
    }
  }

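
The simplified TCP sender events and the fast retransmit rule can be combined into one small class. This is a sketch under stated assumptions, not real TCP: the class name `SimplifiedTcpSender` is hypothetical, the timer is a boolean flag, flow and congestion control are ignored, ACK values are assumed to fall on segment boundaries, and the duplicate-ACK counter is reset whenever new data is acknowledged.

```python
from typing import Callable, Dict, List


class SimplifiedTcpSender:
    def __init__(self, initial_seq: int, send: Callable[[int, bytes], None]) -> None:
        self.next_seq_num = initial_seq
        self.send_base = initial_seq
        self.unacked: Dict[int, bytes] = {}  # seq # of first byte -> data
        self.dup_acks = 0
        self.timer_running = False
        self.send = send  # stands in for "pass segment to IP"

    def on_data(self, data: bytes) -> None:
        self.unacked[self.next_seq_num] = data
        if not self.timer_running:
            self.timer_running = True  # start timer
        self.send(self.next_seq_num, data)
        self.next_seq_num += len(data)  # seq #s count bytes, not segments

    def on_timeout(self) -> None:
        if self.unacked:
            seq = min(self.unacked)  # smallest not-yet-acknowledged seq #
            self.send(seq, self.unacked[seq])
            self.timer_running = True

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            self.send_base = y  # cumulative ACK: all bytes below y are acked
            self.dup_acks = 0
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
        else:
            self.dup_acks += 1  # duplicate ACK for already ACKed data
            if self.dup_acks == 3 and y in self.unacked:
                self.send(y, self.unacked[y])  # fast retransmit


sent: List[int] = []
tcp = SimplifiedTcpSender(92, lambda seq, data: sent.append(seq))
tcp.on_data(b"x" * 8)    # Seq=92, 8 bytes
tcp.on_data(b"y" * 20)   # Seq=100, 20 bytes
tcp.on_data(b"z" * 10)   # Seq=120, 10 bytes
tcp.on_ack(100)          # new data acked: SendBase = 100
for _ in range(3):
    tcp.on_ack(100)      # three duplicate ACKs
print(sent, tcp.send_base)
```

The third duplicate ACK triggers a resend of the segment starting at byte 100 without waiting for the timer to expire.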
TCP Round Trip Time and Timeout
Q: how to estimate RTT?

r SampleRTT: measured time from segment transmission until ACK receipt (one RTT sample)
TCP Round Trip Time and Timeout

r Problem 2: SampleRTT will vary -> atypical
m Need the trend of RTT: history -> future
m average several recent measurements, not just current SampleRTT

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; SampleRTT and Estimated RTT in milliseconds (100 to 350) plotted against time in seconds (1 to 106).]

TCP Round Trip Time and Timeout
EstimatedRTT = (1- α)* EstimatedRTT + α*SampleRTT

r typical value: α = 0.125


r influence of past sample decreases exponentially fast
m Exponential weighted moving average
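
The exponential weighted moving average can be applied sample by sample. This is a short sketch: the function name `update_estimated_rtt` and the sample values (in milliseconds) are made up for illustration, while α = 0.125 is the typical value given above.

```python
def update_estimated_rtt(estimated_rtt: float, sample_rtt: float,
                         alpha: float = 0.125) -> float:
    """One EWMA step: EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


estimate = 100.0  # hypothetical starting estimate, in ms
history = []
for sample in [100.0, 180.0, 100.0]:  # hypothetical SampleRTT values, in ms
    estimate = update_estimated_rtt(estimate, sample)
    history.append(estimate)
print(history)
```

A single 180 ms spike moves the estimate by only 10 ms, and its influence then shrinks by a factor of 0.875 with every later sample.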


Principles of Congestion Control

Congestion:
r informally: “too many sources sending too much data too fast for network to handle”
r Solution
m Sender controls sending rate
r different from flow control!
m Flow control: not overwhelm receiver
m Congestion control: not overwhelm network
r another top-10 problem!

Approaches towards congestion control

Two broad approaches towards congestion control:

Network-assisted congestion control:
r routers provide feedback to end systems
m single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
m explicit rate sender should send at
r Fast, accurate, but expensive

End-end congestion control:
r no explicit feedback from network
r congestion inferred from end-system observed loss, delay
r approach taken by TCP

TCP Congestion Control
r end-end control (no network assistance)
r sender limits transmission:
LastByteSent-LastByteAcked
≤ CongWin

RcvWindow?

min { rcwWindow, CongWin }
r CongWin is dynamic, function of perceived
network congestion
m Too high a rate -> congestion
m Too low a rate -> low network utilization
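
The sending limit above can be sketched as a small helper. This is an illustration only: the function and parameter names are assumptions, and all quantities are taken to be in bytes.

```python
def usable_window(last_byte_sent, last_byte_acked, cong_win, rcv_window):
    """Bytes the sender may still transmit right now.

    The amount of unacknowledged data must not exceed
    min(RcvWindow, CongWin), as stated on the slide.
    """
    in_flight = last_byte_sent - last_byte_acked
    limit = min(cong_win, rcv_window)
    return max(0, limit - in_flight)


if __name__ == "__main__":
    # 2000 bytes in flight, CongWin is the tighter limit: 2000 more bytes allowed.
    print(usable_window(5000, 3000, cong_win=4000, rcv_window=8000))
    # Here the receiver window is the tighter limit and is already exceeded.
    print(usable_window(5000, 3000, cong_win=4000, rcv_window=1500))
```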


TCP Congestion Control


How does sender perceive congestion?
r loss event
r TCP sender reduces rate (CongWin) after loss
event
Loss event = timeout or 3 duplicate acks

three mechanisms:
m AIMD (additive increase multiplicative decrease)
m slow start
m conservative after timeout events

1. TCP AIMD
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
multiplicative decrease: cut CongWin in half after loss event

[Figure: congestion window (8, 16, 24 Kbytes) versus time for a long-lived TCP connection, showing the AIMD sawtooth.]
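
The sawtooth can be reproduced with a toy model. This is a sketch under a strong assumption that is not in the slides: a loss event is assumed to occur whenever the window reaches a fixed size `loss_at_mss`. Window sizes are in whole MSS.

```python
def aimd_trace(rounds, start_mss, loss_at_mss):
    """Return the window size (in MSS) at the start of each RTT.

    Assumption: a loss event happens whenever the window has
    reached loss_at_mss; real loss depends on the network.
    """
    window = start_mss
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window >= loss_at_mss:
            window = max(1, window // 2)  # multiplicative decrease: halve
        else:
            window += 1                   # additive increase: 1 MSS per RTT
    return trace


if __name__ == "__main__":
    print(aimd_trace(rounds=10, start_mss=4, loss_at_mss=8))
```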

2. TCP Slow Start


r When connection begins, CongWin = 1 MSS
m Example: MSS = 500 bytes & RTT = 200 msec
m initial rate = 20 kbps
r available bandwidth may be >> MSS/RTT
m desirable to quickly ramp up to respectable rate
r When connection begins, increase rate exponentially fast until first loss event

2. TCP Slow Start (more)
r When connection begins, increase rate exponentially until first loss event:
m double CongWin every RTT
m done by incrementing CongWin for every ACK received
r Summary: initial rate is slow but ramps up exponentially fast

[Figure: timeline between Host A and Host B. Host A sends one segment in the first RTT, two segments in the second RTT, and four segments in the third.]
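
A short sketch of slow start, assuming no loss and one ACK per segment. It also checks the initial-rate example (MSS = 500 bytes, RTT = 200 msec). Function names are illustrative.

```python
def initial_rate_bps(mss_bytes, rtt_seconds):
    """Sending rate with CongWin = 1 MSS: one segment per RTT."""
    return mss_bytes * 8 / rtt_seconds


def slow_start_windows(rtts, initial_mss=1):
    """CongWin (in MSS) at the start of each RTT during slow start.

    Assumptions: no loss, and every segment sent in an RTT is
    acknowledged individually within that RTT.
    """
    cong_win = initial_mss
    sizes = []
    for _ in range(rtts):
        sizes.append(cong_win)
        acks = cong_win            # one ACK per segment sent this round
        for _ in range(acks):
            cong_win += 1          # increment CongWin for every ACK received
    return sizes


if __name__ == "__main__":
    print(initial_rate_bps(500, 0.2))   # about 20000 bits/s, i.e. 20 kbps
    print(slow_start_windows(4))        # window doubles every RTT
```

Incrementing by one MSS per ACK is what produces the doubling: a window of n segments yields n ACKs, so the next window is 2n.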


3. Refinement (TCP Reno)


r After 3 dup ACKs:
m CongWin is cut in half
m window then grows linearly
r But after timeout event:
m CongWin instead set to 1 MSS;
m window then grows exponentially
m to a Threshold, then grows linearly

Philosophy:
• 3 dup ACKs indicates network capable of delivering some segments
• timeout before 3 dup ACKs is “more alarming”

TCP versions:
Tahoe -> Reno -> Sack
Vegas, Westwood … (Nevada)
Refinement (more)
Q: Threshold: When will exponential increase switch to linear?
A: When CongWin gets to 1/2 of its value before timeout.

Implementation:
r Variable Threshold
r At a loss event, Threshold is set to 1/2 of CongWin just before loss event

[Figure: congestion window size (segments, 0 to 14) versus transmission round (1 to 15) for TCP Tahoe and TCP Reno, with the threshold and the TimeOut event marked.]


TCP congestion behavior (1)


[Figure: congestion window size (segments, 0 to 14) versus transmission round (1 to 15), with the threshold and a TimeOut event marked.]

TCP congestion behavior (2)
[Figure: congestion window size (segments, 0 to 14) versus transmission round (1 to 15) for TCP Tahoe, with the threshold and a 3 Dup Ack event marked.]


TCP congestion behavior (3)


[Figure: congestion window size (segments, 0 to 14) versus transmission round (1 to 15) for TCP Tahoe and TCP Reno, with the threshold and a 3 Dup Ack event marked.]

Summary: TCP Congestion Control (Reno)
r When CongWin is below Threshold, sender is in slow-start phase, window grows exponentially.
r When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly.
r When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold.
r When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS.

V. Jacobson, Congestion Avoidance and Control. Proceedings of ACM SIGCOMM '88, Aug. 1988.
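
The four rules can be sketched as a small state holder. This is a simplified illustration, not a real TCP implementation: the class name, the per-round granularity, the integer MSS units, the cap of the doubling step at Threshold, and the treatment of CongWin equal to Threshold as congestion avoidance are all assumptions.

```python
class RenoSender:
    """Toy model of the Reno rules; CongWin and Threshold are in MSS."""

    def __init__(self, threshold=8):
        self.cong_win = 1
        self.threshold = threshold

    def on_round_without_loss(self):
        """Advance one RTT in which every segment was acknowledged."""
        if self.cong_win < self.threshold:
            # Slow start: exponential growth (assumed capped at Threshold).
            self.cong_win = min(2 * self.cong_win, self.threshold)
        else:
            # Congestion avoidance: linear growth, 1 MSS per RTT.
            self.cong_win += 1

    def on_triple_dup_ack(self):
        """Threshold = CongWin/2, CongWin = Threshold."""
        self.threshold = max(1, self.cong_win // 2)
        self.cong_win = self.threshold

    def on_timeout(self):
        """Threshold = CongWin/2, CongWin = 1 MSS."""
        self.threshold = max(1, self.cong_win // 2)
        self.cong_win = 1


if __name__ == "__main__":
    sender = RenoSender(threshold=8)
    trace = []
    for _ in range(7):
        sender.on_round_without_loss()
        trace.append(sender.cong_win)
    sender.on_triple_dup_ack()
    trace.append(sender.cong_win)
    print(trace)
```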

Outline
r 1. Transport-layer services
r 2. Multiplexing and demultiplexing
r 3. Connectionless transport: UDP
r 4. Principles of reliable data transfer
r 5. Connection-oriented transport: TCP
r 6. TCP congestion control
r 7. TCP fairness and delay performance

TCP Fairness
Fair: 1. Equal share
2. Full utilization
Goal: if K TCP sessions share same bottleneck link
of bandwidth R, each should have average rate of
R/K

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.]

Why is TCP fair?
Two competing sessions:
r additive increase gives slope of 1, as throughput increases
r multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes up to R, with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving towards the equal-share line.]
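
The convergence argument can be checked numerically. The sketch below assumes an idealised model that is not spelled out in the slides: both sessions see the same RTT, both add one unit per round, and both halve in the same round whenever their combined throughput exceeds the capacity.

```python
def two_flow_aimd(x, y, capacity, rounds):
    """Evolve the throughputs (x, y) of two AIMD sessions sharing one link.

    Assumption: synchronised loss, i.e. both sessions halve in any
    round where x + y exceeds the link capacity.
    """
    for _ in range(rounds):
        if x + y > capacity:
            x, y = x / 2, y / 2      # multiplicative decrease halves the gap
        else:
            x, y = x + 1, y + 1      # additive increase keeps the gap unchanged
    return x, y


if __name__ == "__main__":
    x, y = two_flow_aimd(10.0, 2.0, capacity=20.0, rounds=200)
    print(x, y, abs(x - y))
```

The difference x - y is untouched by additive increase and halved by every loss, so it shrinks geometrically and the two throughputs approach the line x = y.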


Why is TCP fair?


Known: x0 > y0

[Figure: Connection 2 throughput (y) versus Connection 1 throughput (x), both axes up to R, with the line x = y. Labelled points: (x0, y0), (x0+Δ/2, y0+Δ/2), (x0+Δ, y0+Δ), (x0/2+Δ/2, y0/2+Δ/2).]

Why is TCP fair?
D.M. Chiu and R. Jain, "Analysis of the Increase and Decrease
Algorithms for Congestion Avoidance in Computer Networks,"
Computer Networks and ISDN Systems, pp. 1-14, 1989.

[Figure: Connection 2 throughput versus Connection 1 throughput, both axes up to R, with the line x = y.]


Fairness (more)
Fairness and UDP
r Multimedia apps often do not use TCP
m do not want rate throttled by congestion control
r Instead use UDP:
m pump audio/video at constant rate, tolerate packet loss
r Research area: TCP friendly, more on this later

Fairness and parallel TCP connections
r nothing prevents app from opening parallel connections between 2 hosts.
r Web browsers/FTP clients do this
m NetAnts, GetRight
r Example: link of rate R with 9 ongoing TCP connections;
m new app asks for 1 TCP, gets rate R/10
m new app asks for 11 TCPs, gets > R/2 !
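
The arithmetic in the example can be checked directly, assuming every TCP connection on the link ends up with an equal share of R. The function name is illustrative.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by the new app.

    Assumption: the link is split equally among all TCP connections,
    so an app's share is proportional to how many it opens.
    """
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


if __name__ == "__main__":
    print(app_share(9, 1))    # one connection among ten
    print(app_share(9, 11))   # eleven connections among twenty
```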

Delay performance
Q: How long does it take to receive an object from
a Web server after sending a request?

Methods
r Measurement
m Ping, traceroute
r Simulation
m Ns-2
r Analytical modeling
m Math


Delay modeling – No Congestion


Q: How long does it take to receive an object from a Web server after sending a request?
Ignoring congestion, delay is influenced by:
r TCP connection establishment
r data transmission delay
r slow start

Notation, assumptions:
r Assume one link between client and server of rate R
r S: MSS (bits)
r O: object size (bits)
r no retransmissions (no loss, no corruption)

Window size:
r First assume: fixed congestion window, W segments
r Then dynamic window, modeling slow start

Fixed congestion window (1)

First case:
WS/R > RTT + S/R: ACK for
first segment in window
returns before window’s
worth of data is sent

delay = 2RTT + O/R

Fixed congestion window (2)

Second case:
r WS/R < RTT + S/R: wait
for ACK after sending a
window’s worth of data

delay = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]

K = O/(WS)
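
Both fixed-window cases can be combined in one function. This is a sketch: the function name is illustrative, and rounding K up to a whole number of windows is an assumption added here (the slide writes K = O/(WS)).

```python
import math


def fixed_window_delay(O, S, R, RTT, W):
    """Delay to fetch an object of O bits with a fixed window of W segments.

    O and S are in bits, R in bits per second, RTT in seconds.
    """
    base = 2 * RTT + O / R
    if W * S / R >= RTT + S / R:
        # First case: ACKs return before the window is exhausted, no stalls.
        return base
    # Second case: the server stalls after each window except the last.
    K = math.ceil(O / (W * S))          # assumption: round up to whole windows
    stall = S / R + RTT - W * S / R
    return base + (K - 1) * stall


if __name__ == "__main__":
    # S/R = 1 s, RTT = 3 s, object of 8 segments.
    print(fixed_window_delay(O=8000, S=1000, R=1000, RTT=3, W=2))
    print(fixed_window_delay(O=8000, S=1000, R=1000, RTT=3, W=4))
```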


TCP Delay Modeling: Slow Start (1)


Now suppose window grows according to slow start
But no congestion
Will show that the delay for one object is:

$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$

where P is the number of times TCP idles at server:

$$P = \min\{Q, K - 1\}$$

- Q is the number of times the server idles if the object were of infinite size.
- K is the number of windows that cover the object.

Case 1: P = Q
Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start

Server idles:
P = min{K-1,Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1,Q} = 2

Server idles P=2 times

[Figure: timeline with time at client and time at server. Initiate TCP connection, request object, then first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; RTT marked; complete transmission at server, object delivered at client.]


Case 2: P = K-1
Delay components:
• 2 RTT for connection
estab and request
• O/R to transmit object
• time server idles due to
slow start

Server idles:
P = min{K-1,Q} times

Example:
• O/S = 3 segments
• K = 2 windows
• Q = 2
• P = min{K-1,Q} = 1

Server idles P=1 time

TCP Delay Modeling (contd)
$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement

$2^{k-1}\frac{S}{R}$ = time to transmit the kth window

$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = idle time after the kth window

$$\begin{aligned}
\text{delay} &= \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{idleTime}_p \\
&= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right] \\
&= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}$$

[Figure: timeline with time at client and time at server. Initiate TCP connection, request object, then first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R; RTT marked; complete transmission at server, object delivered at client.]


TCP Delay Modeling (contd)


Recall K = number of windows that cover object

How do we calculate K?

$$\begin{aligned}
K &= \min\{k : 2^{0} S + 2^{1} S + \cdots + 2^{k-1} S \ge O\} \\
&= \min\{k : 2^{0} + 2^{1} + \cdots + 2^{k-1} \ge O/S\} \\
&= \min\{k : 2^{k} - 1 \ge O/S\} \\
&= \min\{k : k \ge \log_2(O/S + 1)\} \\
&= \left\lceil \log_2\!\left(\frac{O}{S} + 1\right) \right\rceil
\end{aligned}$$

Calculation of Q, number of idles for infinite-size object, is similar:

$$Q = \max\{q : 2^{q-1} S/R \le RTT + S/R\}$$
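
The slow-start latency model can be evaluated numerically. The sketch below follows the definitions of K, Q and P given in the slides; the function names and the sample values (S/R = 1 s, RTT = 1.5 s, chosen so that Q = 2 as in the two examples) are assumptions for illustration.

```python
def windows_to_cover(O, S):
    """K: smallest k with 2^k - 1 >= O/S (integer loop avoids log rounding)."""
    k = 1
    while (2 ** k - 1) * S < O:
        k += 1
    return k


def idles_for_infinite_object(S, R, RTT):
    """Q: largest q with 2^(q-1) * S/R <= RTT + S/R."""
    q = 1                                   # q = 1 always satisfies the bound
    while 2 ** q * S / R <= RTT + S / R:    # test the condition for q + 1
        q += 1
    return q


def slow_start_latency(O, S, R, RTT):
    """Latency = 2RTT + O/R + P[RTT + S/R] - (2^P - 1) S/R, P = min{Q, K-1}."""
    K = windows_to_cover(O, S)
    Q = idles_for_infinite_object(S, R, RTT)
    P = min(Q, K - 1)
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R


if __name__ == "__main__":
    # Case 1 example: O/S = 15 segments gives K = 4, and P = Q = 2.
    print(windows_to_cover(15, 1), idles_for_infinite_object(1, 1, 1.5))
    print(slow_start_latency(O=15, S=1, R=1, RTT=1.5))
    # Case 2 example: O/S = 3 segments gives K = 2, and P = K - 1 = 1.
    print(slow_start_latency(O=3, S=1, R=1, RTT=1.5))
```

For the first example the two idle periods are 1.5 s and 0.5 s, so the formula's stall term equals their sum of 2 s on top of 2RTT + O/R = 18 s.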