
Chapter 4: network layer

Network Layer 4-1


Chapter 4: network layer
chapter goals:
 understand principles behind network layer
services:
 network layer service models
 forwarding versus routing
 how a router works
 routing (path selection)
 broadcast, multicast
 instantiation, implementation in the Internet

Network Layer 4-2


Chapter 4: outline
4.1 introduction
4.2 virtual circuit and datagram networks
4.3 what’s inside a router
4.4 IP: Internet Protocol
   datagram format
   IPv4 addressing
   ICMP
   IPv6
4.5 routing algorithms
   link state
   distance vector
   hierarchical routing
4.6 routing in the Internet
   RIP
   OSPF
   BGP
4.7 broadcast and multicast routing

Network Layer 4-3


Network layer
 transport segment from sending to receiving host
 on sending side, encapsulates segments into datagrams
 on receiving side, delivers segments to transport layer
 network layer protocols in every host, router
 router examines header fields in all IP datagrams passing through it

[figure: two end hosts with full protocol stacks (application, transport, network, data link, physical) connected through routers that implement only the network, data link, and physical layers]
Network Layer 4-4
Two key network-layer functions
 forwarding: move packets from router’s input to appropriate router output
 routing: determine route taken by packets from source to dest.
   routing algorithms

analogy:
 routing: process of planning trip from source to dest
 forwarding: process of getting through single interchange

Network Layer 4-5


Interplay between routing and forwarding

routing algorithm determines end-end-path through network
forwarding table determines local forwarding at this router

local forwarding table:
header value   output link
0100           3
0101           2
0111           2
1001           1

[figure: a packet arrives with value 0111 in its header; the router has output links 1, 2, 3, and the table maps 0111 to link 2]

Network Layer 4-6


Connection setup
 3rd important function in some network
architectures:
 ATM, frame relay, X.25
 before datagrams flow, two end hosts and
intervening routers establish virtual connection
 routers get involved
 network vs transport layer connection service:
 network: between two hosts (may also involve intervening
routers in case of VCs)
 transport: between two processes

Network Layer 4-7


Network service model
Q: What service model for “channel” transporting
datagrams from sender to receiver?
example services for individual datagrams:
 guaranteed delivery
 guaranteed delivery with less than 40 msec delay

example services for a flow of datagrams:
 in-order datagram delivery
 guaranteed minimum bandwidth to flow
 restrictions on changes in inter-packet spacing

Network Layer 4-8




Connection, connection-less service
 datagram network provides network-layer
connectionless service
 virtual-circuit network provides network-layer
connection service
 analogous to TCP/UDP connection-oriented /
connectionless transport-layer services, but:
 service: host-to-host
 no choice: network provides one or the other
 implementation: in network core

Network Layer 4-10


Virtual circuits
“source-to-dest path behaves much like telephone
circuit”
 performance-wise
 network actions along source-to-dest path

 call setup, teardown for each call before data can flow
 each packet carries VC identifier (not destination host
address)
 every router on source-dest path maintains “state” for
each passing connection
 link, router resources (bandwidth, buffers) may be
allocated to VC (dedicated resources = predictable
service)
Network Layer 4-11
VC implementation
a VC consists of:
1. path from source to destination
2. VC numbers, one number for each link along path
3. entries in forwarding tables in routers along path
 packet belonging to VC carries VC number
(rather than dest address)
 VC number can be changed on each link.
 new VC number comes from forwarding table

Network Layer 4-12


VC forwarding table
[figure: a VC crosses three links with VC numbers 12, 22, 32; the northwest router has interfaces 1, 2, 3]

forwarding table in northwest router:
Incoming interface   Incoming VC #   Outgoing interface   Outgoing VC #
1                    12              3                    22
2                    63              1                    18
3                    7               2                    17
1                    97              3                    87
…                    …               …                    …

VC routers maintain connection state information!
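
The table lookup above can be sketched in a few lines. This is only an illustration, not how any real switch is implemented: the names `VC_TABLE` and `forward_vc` are hypothetical, and the entries are the four rows shown for the northwest router.

```python
# Hypothetical model of the northwest router's VC forwarding table.
# Key:   (incoming interface, incoming VC number)
# Value: (outgoing interface, outgoing VC number)
VC_TABLE = {
    (1, 12): (3, 22),
    (2, 63): (1, 18),
    (3, 7): (2, 17),
    (1, 97): (3, 87),
}


def forward_vc(in_interface, in_vc):
    """Return (out_interface, out_vc), or None if the router holds no state for this VC."""
    return VC_TABLE.get((in_interface, in_vc))


# A packet arriving on interface 1 with VC number 12 leaves on
# interface 3 carrying the new VC number 22.
print(forward_vc(1, 12))
```

The lookup key includes the incoming interface, so the same VC number can be reused on different links; that per-connection entry is the "state" a VC router must keep.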


Network Layer 4-13
Virtual circuits: signaling protocols
 used to set up, maintain, and tear down VC
 used in ATM, frame-relay, X.25
 not used in today’s Internet

[figure: signaling between the protocol stacks of the sending and receiving hosts]
1. initiate call
2. incoming call
3. accept call
4. call connected
5. data flow begins
6. receive data

Network Layer 4-14


Datagram networks
 no call setup at network layer
 routers: no state about end-to-end connections
 no network-level concept of “connection”
 packets forwarded using destination host address

[figure: sending and receiving hosts, each with a full protocol stack]
1. send datagrams
2. receive datagrams

Network Layer 4-15


Datagram forwarding table
4 billion IP addresses, so rather than list individual destination address, list range of addresses (aggregate table entries)

routing algorithm fills the local forwarding table:
dest address      output link
address-range 1   3
address-range 2   2
address-range 3   2
address-range 4   1

[figure: IP destination address in arriving packet’s header selects one of output links 1, 2, 3]

Network Layer 4-16


Datagram forwarding table
Destination Address Range                   Link Interface

11001000 00010111 00010000 00000000
through                                     0
11001000 00010111 00010111 11111111

11001000 00010111 00011000 00000000
through                                     1
11001000 00010111 00011000 11111111

11001000 00010111 00011001 00000000
through                                     2
11001000 00010111 00011111 11111111

otherwise                                   3

Q: but what happens if ranges don’t divide up so nicely?


Network Layer 4-17
Longest prefix matching
when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.

Destination Address Range              Link interface
11001000 00010111 00010*** ********    0
11001000 00010111 00011000 ********    1
11001000 00010111 00011*** ********    2
otherwise                              3

examples:
DA: 11001000 00010111 00010110 10100001 which interface?
DA: 11001000 00010111 00011000 10101010 which interface?
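
The two example lookups can be checked with a short sketch. It is a linear scan over bit strings, chosen for readability; real routers use specialized structures such as tries or TCAMs. The names `PREFIXES`, `bits`, and `lookup` are made up for this illustration.

```python
def bits(text):
    """Drop the spaces used for readability in the slide notation."""
    return text.replace(" ", "")


# (prefix bits, link interface), taken from the table above with wildcard bits omitted.
PREFIXES = [
    (bits("11001000 00010111 00010"), 0),
    (bits("11001000 00010111 00011000"), 1),
    (bits("11001000 00010111 00011"), 2),
]
DEFAULT_INTERFACE = 3  # the "otherwise" row


def lookup(dest_bits):
    """Return the interface of the longest prefix that matches a 32-bit address string."""
    best_len, best_interface = -1, DEFAULT_INTERFACE
    for prefix, interface in PREFIXES:
        if dest_bits.startswith(prefix) and len(prefix) > best_len:
            best_len, best_interface = len(prefix), interface
    return best_interface


print(lookup(bits("11001000 00010111 00010110 10100001")))  # 0
print(lookup(bits("11001000 00010111 00011000 10101010")))  # 1
```

The second address matches both the 21-bit prefix (interface 2) and the 24-bit prefix (interface 1); the longer match wins.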
Network Layer 4-18
Datagram or VC network: why?
Internet (datagram)
 data exchange among computers
   “elastic” service, no strict timing req.
 many link types
   different characteristics
   uniform service difficult
 “smart” end systems (computers)
   can adapt, perform control, error recovery
   simple inside network, complexity at “edge”

ATM (VC)
 evolved from telephony
 human conversation:
   strict timing, reliability requirements
   need for guaranteed service
 “dumb” end systems
   telephones
   complexity inside network

Network Layer 4-19




Interplay between routing, forwarding
routing algorithm determines end-end-path through network
forwarding table determines local forwarding at this router

local forwarding table:
dest address      output link
address-range 1   3
address-range 2   2
address-range 3   2
address-range 4   1

[figure: IP destination address in arriving packet’s header selects one of output links 1, 2, 3]

Network Layer 4-21


Graph abstraction
[figure: graph with nodes u, v, w, x, y, z and weighted links]

graph: G = (N,E)

N = set of routers = { u, v, w, x, y, z }

E = set of links = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }

aside: graph abstraction is useful in other network contexts, e.g., P2P, where N is set of peers and E is set of TCP connections

Network Layer 4-22


Graph abstraction: costs
c(x,x’) = cost of link (x,x’)
e.g., c(w,z) = 5

cost could always be 1, or inversely related to bandwidth, or directly related to congestion

[figure: the same six-node graph with a cost on every link]

cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp)

key question: what is the least-cost path between u and z ?


routing algorithm: algorithm that finds that least cost path

Network Layer 4-23


Routing algorithm classification
Q: global or decentralized information?
global:
 all routers have complete topology, link cost info
 “link state” algorithms
decentralized:
 router knows physically-connected neighbors, link costs to neighbors
 iterative process of computation, exchange of info with neighbors
 “distance vector” algorithms

Q: static or dynamic?
static:
 routes change slowly over time
dynamic:
 routes change more quickly
   periodic update
   in response to link cost changes
Network Layer 4-24


A Link-State Routing Algorithm
Dijkstra’s algorithm
 net topology, link costs known to all nodes
   accomplished via “link state broadcast”
   all nodes have same info
 computes least cost paths from one node (“source”) to all other nodes
   gives forwarding table for that node
 iterative: after k iterations, know least cost path to k dest.’s

notation:
 c(x,y): link cost from node x to y; = ∞ if not direct neighbors
 D(v): current value of cost of path from source to dest. v
 p(v): predecessor node along path from source to v
 N': set of nodes whose least cost path definitively known
Network Layer 4-26
Dijkstra’s Algorithm
Initialization:
  N' = {u}
  for all nodes v
    if v adjacent to u
      then D(v) = c(u,v)
    else D(v) = ∞

Loop
  find w not in N' such that D(w) is a minimum
  add w to N'
  update D(v) for all v adjacent to w and not in N' :
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N'
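
A runnable sketch of the same idea follows. Two things are assumptions rather than part of the slides: the link costs in `GRAPH` are read off the chapter's six-node example figure, and the minimum is found with a binary heap instead of the linear scan in the pseudocode (so ties may be broken in a different order, although the final costs are the same).

```python
import heapq

# Link costs assumed from the six-node example graph (u, v, w, x, y, z).
GRAPH = {
    "u": {"v": 2, "x": 1, "w": 5},
    "v": {"u": 2, "x": 2, "w": 3},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1, "z": 5},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "y": {"x": 1, "w": 1, "z": 2},
    "z": {"w": 5, "y": 2},
}


def dijkstra(graph, source):
    """Return (D, p): least path costs from source and predecessor of each node."""
    D = {node: float("inf") for node in graph}
    p = {}
    D[source] = 0
    done = set()          # plays the role of N'
    heap = [(0, source)]
    while heap:
        cost, w = heapq.heappop(heap)
        if w in done:
            continue      # stale heap entry for a node already in N'
        done.add(w)
        for v, c_wv in graph[w].items():
            if v not in done and cost + c_wv < D[v]:
                D[v] = cost + c_wv   # D(v) = min(D(v), D(w) + c(w,v))
                p[v] = w
                heapq.heappush(heap, (D[v], v))
    return D, p


D, p = dijkstra(GRAPH, "u")
print(D)
print(p)
```

Tracing `p` back from any destination to `u` gives the shortest-path tree, and the first hop on each path is the forwarding-table entry at `u`.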

Network Layer 4-27


Dijkstra’s algorithm: example
Step   N'        D(v),p(v)   D(w),p(w)   D(x),p(x)   D(y),p(y)   D(z),p(z)
0      u         7,u         3,u         5,u         ∞           ∞
1      uw        6,w                     5,u         11,w        ∞
2      uwx       6,w                                 11,w        14,x
3      uwxv                                          10,v        14,x
4      uwxvy                                                     12,y
5      uwxvyz

notes:
 construct shortest path tree by tracing predecessor nodes
 ties can exist (can be broken arbitrarily)

[figure: six-node graph u, v, w, x, y, z with link costs]
Network Layer 4-28
Dijkstra’s algorithm: another example
Step   N'        D(v),p(v)   D(w),p(w)   D(x),p(x)   D(y),p(y)   D(z),p(z)
0      u         2,u         5,u         1,u         ∞           ∞
1      ux        2,u         4,x                     2,x         ∞
2      uxy       2,u         3,y                                 4,y
3      uxyv                  3,y                                 4,y
4      uxyvw                                                     4,y
5      uxyvwz

[figure: six-node graph u, v, w, x, y, z with link costs]
Network Layer 4-29
Dijkstra’s algorithm: example (2)
resulting shortest-path tree from u:
[figure: tree rooted at u with links (u,v), (u,x), (x,y), (y,w), (y,z)]

resulting forwarding table in u:
destination   link
v             (u,v)
x             (u,x)
y             (u,x)
w             (u,x)
z             (u,x)
Network Layer 4-30
Dijkstra’s algorithm, discussion
algorithm complexity: n nodes
 each iteration: need to check all nodes, w, not in N'
 n(n+1)/2 comparisons: O(n^2)
 more efficient implementations possible: O(n log n)
oscillations possible:
 e.g., suppose link cost equals amount of carried traffic:

[figure: four-node ring A, B, C, D shown in four panels: the initial routing and costs, then three rounds of “given these costs, find new routing… resulting in new costs”]
Network Layer 4-31


Distance vector algorithm
Bellman-Ford equation (dynamic programming)

let
  dx(y) := cost of least-cost path from x to y
then
  dx(y) = min_v { c(x,v) + dv(y) }
where
  c(x,v) = cost to neighbor v
  dv(y) = cost from neighbor v to destination y
  min is taken over all neighbors v of x


Network Layer 4-33
Bellman-Ford example
[figure: six-node example graph with link costs]

clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3

B-F equation says:
du(z) = min { c(u,v) + dv(z),
              c(u,x) + dx(z),
              c(u,w) + dw(z) }
      = min { 2 + 5,
              1 + 3,
              5 + 3 } = 4

node achieving minimum is next hop in shortest path, used in forwarding table
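
The same calculation as a minimal sketch. The function name `bellman_ford_step` is invented for this example; the numbers are the ones used above.

```python
# Link costs from u to its neighbors, and each neighbor's least cost to z.
cost_from_u = {"v": 2, "x": 1, "w": 5}
neighbor_cost_to_z = {"v": 5, "x": 3, "w": 3}


def bellman_ford_step(link_cost, neighbor_dist):
    """Return (least cost, next hop) by minimizing c(u,v) + dv(z) over neighbors v."""
    return min((link_cost[v] + neighbor_dist[v], v) for v in link_cost)


d_u_z, next_hop = bellman_ford_step(cost_from_u, neighbor_cost_to_z)
print(d_u_z, next_hop)  # 4 x
```

Returning the neighbor together with the cost is what makes the result usable in a forwarding table: u forwards packets for z to x.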
Network Layer 4-34
Distance vector algorithm
 Dx(y) = estimate of least cost from x to y
 x maintains distance vector Dx = [Dx(y): y ∈ N ]
 node x:
 knows cost to each neighbor v: c(x,v)
 maintains its neighbors’ distance vectors. For
each neighbor v, x maintains
Dv = [Dv(y): y ∈ N ]

Network Layer 4-35


Distance vector algorithm
key idea:
 from time-to-time, each node sends its own
distance vector estimate to neighbors
 when x receives new DV estimate from neighbor,
it updates its own DV using B-F equation:
Dx(y) ← minv{c(x,v) + Dv(y)} for each node y ∊ N

 under minor, natural conditions, the estimate Dx(y) converges to the actual least cost dx(y)

Network Layer 4-36


Distance vector algorithm
iterative, asynchronous: each local iteration caused by:
 local link cost change
 DV update message from neighbor
distributed:
 each node notifies neighbors only when its DV changes
   neighbors then notify their neighbors if necessary

each node:
  wait for (change in local link cost or msg from neighbor)
  recompute estimates
  if DV to any dest has changed, notify neighbors

Network Layer 4-37


[figure: three nodes with link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7]

initial tables (rows: from; columns: cost to x, y, z)

node x table        node y table        node z table
     x  y  z             x  y  z             x  y  z
x    0  2  7        x    ∞  ∞  ∞        x    ∞  ∞  ∞
y    ∞  ∞  ∞        y    2  0  1        y    ∞  ∞  ∞
z    ∞  ∞  ∞        z    ∞  ∞  ∞        z    7  1  0

after receiving the DVs of y and z, node x recomputes its own row:
Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2
Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3

node x table
     x  y  z
x    0  2  3
y    2  0  1
z    7  1  0
Network Layer 4-38
[figure: three nodes with link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7]

Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2
Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3

tables over time (rows: from; columns: cost to x, y, z)

node x table
          initial      after 1st exchange   after 2nd exchange
          x  y  z      x  y  z              x  y  z
from x    0  2  7      0  2  3              0  2  3
from y    ∞  ∞  ∞      2  0  1              2  0  1
from z    ∞  ∞  ∞      7  1  0              3  1  0

node y table
          initial      after 1st exchange   after 2nd exchange
          x  y  z      x  y  z              x  y  z
from x    ∞  ∞  ∞      0  2  7              0  2  3
from y    2  0  1      2  0  1              2  0  1
from z    ∞  ∞  ∞      7  1  0              3  1  0

node z table
          initial      after 1st exchange   after 2nd exchange
          x  y  z      x  y  z              x  y  z
from x    ∞  ∞  ∞      0  2  7              0  2  3
from y    ∞  ∞  ∞      2  0  1              2  0  1
from z    7  1  0      3  1  0              3  1  0
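
The exchange can be reproduced with a small simulation. It is a sketch under a simplifying assumption: all nodes exchange vectors in lock-step rounds, whereas the real algorithm is asynchronous. `COST`, `initial_vector`, and `run_distance_vector` are names chosen for this example.

```python
INF = float("inf")
NODES = ["x", "y", "z"]
# Link costs of the three-node example: c(x,y) = 2, c(y,z) = 1, c(x,z) = 7.
COST = {"x": {"y": 2, "z": 7}, "y": {"x": 2, "z": 1}, "z": {"x": 7, "y": 1}}


def initial_vector(node):
    """A node initially knows only itself (cost 0) and its direct links."""
    return {dest: 0 if dest == node else COST[node].get(dest, INF) for dest in NODES}


def run_distance_vector():
    """Exchange vectors in synchronous rounds until no vector changes."""
    vectors = {n: initial_vector(n) for n in NODES}
    changed = True
    while changed:
        changed = False
        # Copies of the vectors as "sent" at the start of this round.
        received = {n: dict(v) for n, v in vectors.items()}
        for x in NODES:
            for y in NODES:
                if y == x:
                    continue
                # Bellman-Ford update: Dx(y) = min over neighbors v of c(x,v) + Dv(y)
                best = min(COST[x][v] + received[v][y] for v in COST[x])
                if best != vectors[x][y]:
                    vectors[x][y] = best
                    changed = True
    return vectors


final = run_distance_vector()
print(final)
```

The converged vectors match the last column of the tables: x reaches z at cost 3 (via y) rather than 7 over the direct link.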
Network Layer 4-39
Distance vector: link cost changes
link cost changes:
 node detects local link cost change
 updates routing info, recalculates distance vector
 if DV changes, notify neighbors

[figure: nodes x, y, z with c(y,z) = 1, c(x,z) = 50, and c(x,y) changing from 4 to 1]

“good news travels fast”
t0 : y detects link-cost change, updates its DV, informs its neighbors.
t1 : z receives update from y, updates its table, computes new least cost to x, sends its neighbors its DV.
t2 : y receives z’s update, updates its distance table. y’s least costs do not change, so y does not send a message to z.

Network Layer 4-40


Distance vector: link cost changes
link cost changes:
 node detects local link cost change
 bad news travels slow - “count to infinity” problem!
 44 iterations before algorithm stabilizes: see text

[figure: nodes x, y, z with c(y,z) = 1, c(x,z) = 50, and c(x,y) changing from 4 to 60]

poisoned reverse:
 If Z routes through Y to get to X :
   Z tells Y its (Z’s) distance to X is infinite (so Y won’t route to X via Z)
 will this completely solve count to infinity problem?
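
The effect of poisoned reverse on this three-node example can be explored with a small simulation. It is a simplified sketch: y and z update alternately in rounds, only the distance to x is tracked, and the round count it reports follows this sketch's own counting convention, so it is not expected to equal the iteration count quoted above. The function name `simulate` and its parameters are invented for the illustration.

```python
INF = float("inf")


def simulate(poisoned_reverse, c_xy=60, c_yz=1, c_xz=50, d_y=4, d_z=5):
    """Run alternating updates at y and z after c(x,y) rises from 4 to 60.

    d_y and d_z are the distances to x that y and z held before the change.
    Returns (Dy(x), Dz(x), rounds until the values stop changing).
    """
    z_via_y = True  # before the change, z reaches x through y
    rounds = 0
    while True:
        # y recomputes from the distance z advertises (infinite if poisoned).
        z_adv = INF if (poisoned_reverse and z_via_y) else d_z
        new_y = min(c_xy, c_yz + z_adv)
        y_via_z = c_yz + z_adv < c_xy
        # z recomputes from the distance y advertises (infinite if poisoned).
        y_adv = INF if (poisoned_reverse and y_via_z) else new_y
        new_z = min(c_xz, c_yz + y_adv)
        z_via_y = c_yz + y_adv < c_xz
        if (new_y, new_z) == (d_y, d_z):
            return d_y, d_z, rounds
        d_y, d_z = new_y, new_z
        rounds += 1


plain = simulate(poisoned_reverse=False)
poisoned = simulate(poisoned_reverse=True)
print(plain, poisoned)
```

Both runs end with Dy(x) = 51 and Dz(x) = 50, but without poisoned reverse the estimates creep upward two units per round. Loops involving three or more nodes are not detected by this technique.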

Network Layer 4-41


Comparison of LS and DV algorithms
message complexity
 LS: with n nodes, E links, O(nE) msgs sent
 DV: exchange between neighbors only
   convergence time varies

speed of convergence
 LS: O(n^2) algorithm requires O(nE) msgs
   may have oscillations
 DV: convergence time varies
   may be routing loops
   count-to-infinity problem

robustness: what happens if router malfunctions?
LS:
 node can advertise incorrect link cost
 each node computes only its own table
DV:
 DV node can advertise incorrect path cost
 each node’s table used by others
  • errors propagate thru network

Network Layer 4-42




Hierarchical routing
our routing study thus far - idealization
 all routers identical
 network “flat”
… not true in practice

scale: with 600 million destinations:
 can’t store all dest’s in routing tables!
 routing table exchange would swamp links!

administrative autonomy
 internet = network of networks
 each network admin may want to control routing in its own network

Network Layer 4-44


Hierarchical routing
 aggregate routers into regions, “autonomous systems” (AS)
 routers in same AS run same routing protocol
   “intra-AS” routing protocol
   routers in different AS can run different intra-AS routing protocol

gateway router:
 at “edge” of its own AS
 has link to router in another AS

Network Layer 4-45


Interconnected ASes

[figure: three autonomous systems: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c); inside a router, the intra-AS and inter-AS routing algorithms both feed the forwarding table]

 forwarding table configured by both intra- and inter-AS routing algorithm
   intra-AS sets entries for internal dests
   inter-AS & intra-AS sets entries for external dests
Network Layer 4-46
Inter-AS tasks
 suppose router in AS1 receives datagram destined outside of AS1:
   router should forward packet to gateway router, but which one?

AS1 must:
1. learn which dests are reachable through AS2, which through AS3
2. propagate this reachability info to all routers in AS1
job of inter-AS routing!

[figure: AS1 (1a, 1b, 1c, 1d) connected to AS3 (3a, 3b, 3c) and AS2 (2a, 2b, 2c), with links to other networks]

Network Layer 4-47


Example: setting forwarding table in router 1d
 suppose AS1 learns (via inter-AS protocol) that subnet x
reachable via AS3 (gateway 1c), but not via AS2
 inter-AS protocol propagates reachability info to all internal
routers
 router 1d determines from intra-AS routing info that its
interface I is on the least cost path to 1c
 installs forwarding table entry (x,I)

[figure: the same three-AS topology, with subnet x reachable through AS3]

Network Layer 4-48


Example: choosing among multiple ASes
 now suppose AS1 learns from inter-AS protocol that subnet
x is reachable from AS3 and from AS2.
 to configure forwarding table, router 1d must determine
towards which gateway it should forward packets for dest x
 this is also job of inter-AS routing protocol!
 hot potato routing: send packet towards closest of two
routers.

1. learn from inter-AS protocol that subnet x is reachable via multiple gateways
2. use routing info from intra-AS protocol to determine costs of least-cost paths to each of the gateways
3. hot potato routing: choose the gateway that has the smallest least cost
4. determine from forwarding table the interface I that leads to least-cost gateway. Enter (x,I) in forwarding table
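
The hot potato choice can be sketched as follows. Everything numeric here is hypothetical: the intra-AS costs from router 1d to the two gateways and the interface names `I1` and `I2` are invented to make the example concrete.

```python
# Hypothetical view from router 1d: intra-AS least cost to each gateway that
# can reach subnet x, and the local interface on the least-cost path to it.
INTRA_AS_COST = {"1c": 2, "1b": 1}
INTERFACE_TOWARD = {"1c": "I1", "1b": "I2"}


def hot_potato(gateways):
    """Pick the gateway with the smallest intra-AS cost and the interface toward it."""
    gateway = min(gateways, key=lambda g: INTRA_AS_COST[g])
    return gateway, INTERFACE_TOWARD[gateway]


forwarding_table = {}
gateway, interface = hot_potato(["1c", "1b"])
forwarding_table["x"] = interface  # the (x, I) entry
print(gateway, forwarding_table)
```

The choice depends only on the cost inside AS1; the cost of the rest of the path beyond the gateway is ignored, which is why the packet is handed off "like a hot potato".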

Network Layer 4-50




Internet inter-AS routing: BGP
 BGP (Border Gateway Protocol): the de facto
inter-domain routing protocol
 “glue that holds the Internet together”
 BGP provides each AS a means to:
 eBGP: obtain subnet reachability information from
neighboring ASs.
 iBGP: propagate reachability information to all AS-
internal routers.
 determine “good” routes to other networks based on
reachability information and policy.
 allows subnet to advertise its existence to rest of
Internet: “I am here”
Network Layer 4-52
BGP basics
 BGP session: two BGP routers (“peers”) exchange BGP
messages:
 advertising paths to different destination network prefixes (“path vector”
protocol)
 exchanged over semi-permanent TCP connections

 when AS3 advertises a prefix to AS1:


 AS3 promises it will forward datagrams towards that prefix
 AS3 can aggregate prefixes in its advertisement

[figure: three-AS topology; a BGP message is sent from router 3a in AS3 to router 1c in AS1]

Network Layer 4-53


BGP basics: distributing path information
 using eBGP session between 3a and 1c, AS3 sends prefix
reachability info to AS1.
 1c can then use iBGP to distribute new prefix info to all routers
in AS1
 1b can then re-advertise new reachability info to AS2 over 1b-to-
2a eBGP session
 when router learns of new prefix, it creates entry for
prefix in its forwarding table.

[figure: three-AS topology; eBGP sessions run between gateway routers of neighboring ASes (3a to 1c, 1b to 2a), iBGP sessions run between routers inside an AS]

Network Layer 4-54


Path attributes and BGP routes
 advertised prefix includes BGP attributes
 prefix + attributes = “route”
 two important attributes:
 AS-PATH: contains ASs through which prefix
advertisement has passed: e.g., AS 67, AS 17
 NEXT-HOP: indicates specific internal-AS router to next-
hop AS. (may be multiple links from current AS to next-
hop-AS)
 gateway router receiving route advertisement uses
import policy to accept/decline
 e.g., never route through AS x
 policy-based routing

Network Layer 4-55


BGP route selection
 router may learn about more than 1 route to
destination AS, selects route based on:
1. local preference value attribute: policy decision
2. shortest AS-PATH
3. closest NEXT-HOP router: hot potato routing
4. additional criteria
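
The first three rules can be read as a lexicographic ordering, sketched below. This is an illustration only: the `Route` class, its field names, and the sample routes are hypothetical, and real BGP implementations apply more tie-breakers (the "additional criteria" above).

```python
from dataclasses import dataclass


@dataclass
class Route:
    prefix: str
    local_pref: int     # policy decision; higher is preferred
    as_path: list       # ASs the advertisement has passed through
    next_hop_cost: int  # intra-AS cost to the NEXT-HOP router


def select_route(routes):
    """Prefer highest local preference, then shortest AS-PATH, then closest NEXT-HOP."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.next_hop_cost))


# Hypothetical candidate routes to the same prefix.
candidates = [
    Route("x", 100, ["AS2", "AS17"], 5),
    Route("x", 100, ["AS3"], 9),
    Route("x", 90, ["AS4"], 1),
]
best = select_route(candidates)
print(best)
```

The third candidate has the closest NEXT-HOP but loses on local preference, which shows how policy can dominate performance in inter-AS routing.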

Network Layer 4-56


BGP messages
 BGP messages exchanged between peers over TCP
connection
 BGP messages:
 OPEN: opens TCP connection to peer and authenticates
sender
 UPDATE: advertises new path (or withdraws old)
 KEEPALIVE: keeps connection alive in absence of
UPDATES; also ACKs OPEN request
 NOTIFICATION: reports errors in previous msg; also
used to close connection

Network Layer 4-57


BGP routing policy
[figure: provider networks A, B, C and customer networks W, X, Y; legend distinguishes provider network from customer network]

 A,B,C are provider networks
 X,W,Y are customers (of provider networks)
 X is dual-homed: attached to two networks
   X does not want to route from B via X to C
   .. so X will not advertise to B a route to C

Network Layer 4-58


BGP routing policy (2)
[figure: provider networks A, B, C and customer networks W, X, Y; legend distinguishes provider network from customer network]

 A advertises path AW to B
 B advertises path BAW to X
 Should B advertise path BAW to C?
   No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers
   B wants to force C to route to W via A
   B wants to route only to/from its customers!

Network Layer 4-59


Why different Intra-, Inter-AS routing ?
policy:
 inter-AS: admin wants control over how its traffic
routed, who routes through its net.
 intra-AS: single admin, so no policy decisions needed
scale:
 hierarchical routing saves table size, reduced update
traffic
performance:
 intra-AS: can focus on performance
 inter-AS: policy may dominate over performance

Network Layer 4-60




Broadcast routing
 deliver packets from source to all other nodes
 source duplication is inefficient:

[figure: source duplication (R1 creates and transmits every duplicate itself) versus in-network duplication (duplicates are created inside the network, at R2, on the way to R3 and R4)]

 source duplication: how does source determine recipient addresses?
Network Layer 4-62
In-network duplication
 flooding: when node receives broadcast packet,
sends copy to all neighbors
 problems: cycles & broadcast storm
 controlled flooding: node only broadcasts pkt if it
hasn’t broadcast same packet before
 node keeps track of packet ids already broadcast
 or reverse path forwarding (RPF): only forward packet
if it arrived on shortest path between node and source
 spanning tree:
 no redundant packets received by any node
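
Controlled flooding with packet ids can be sketched as follows. The four-node topology in `LINKS` is hypothetical and deliberately contains a cycle (A, B, C) so that uncontrolled flooding would never stop; the function and variable names are invented for this example.

```python
from collections import deque

# Hypothetical topology with a cycle A-B-C-A and a leaf D attached to B.
LINKS = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B"], "D": ["B"]}


def controlled_flood(source, packet_id):
    """Return (set of nodes reached, number of link transmissions)."""
    seen = {node: set() for node in LINKS}  # packet ids each node already broadcast
    queue = deque([(source, None)])         # (node, neighbor the copy came from)
    transmissions = 0
    while queue:
        node, came_from = queue.popleft()
        if packet_id in seen[node]:
            continue                        # duplicate: do not broadcast again
        seen[node].add(packet_id)
        for neighbor in LINKS[node]:
            if neighbor != came_from:       # do not send back where it came from
                transmissions += 1
                queue.append((neighbor, node))
    reached = {n for n in LINKS if packet_id in seen[n]}
    return reached, transmissions


reached, transmissions = controlled_flood("A", packet_id=1)
print(reached, transmissions)
```

Every node is reached and the flood terminates, but some nodes still receive redundant copies (5 transmissions for 3 receivers here), which is the motivation for forwarding along a spanning tree instead.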

Network Layer 4-63


Spanning tree
 first construct a spanning tree
 nodes then forward/make copies only along
spanning tree

[figure: seven-node network (A to G) with the spanning tree highlighted: (a) broadcast initiated at A, (b) broadcast initiated at D]

Network Layer 4-64


Spanning tree: creation
 center node
 each node sends unicast join message to center
node
 message forwarded until it arrives at a node already
belonging to spanning tree

[figure: (a) stepwise construction of spanning tree (center: E), with join messages numbered 1 to 5; (b) constructed spanning tree]
Network Layer 4-65
Multicast routing: problem statement
goal: find a tree (or trees) connecting routers having local mcast group members
 tree: not all paths between routers used
 shared-tree: same tree used by all group members
 source-based: different tree from each sender to rcvrs

[figure: shared tree versus source-based trees; legend: group member, not group member, router with a group member, router without group member]


Network Layer 4-66
Approaches for building mcast trees
approaches:
 source-based tree: one tree per source
 shortest path trees
 reverse path forwarding
 group-shared tree: group uses one tree
 minimal spanning (Steiner)
 center-based trees

…we first look at basic approaches, then specific protocols adopting these approaches

Network Layer 4-67


Shortest path tree
 mcast forwarding tree: tree of shortest path
routes from source to all receivers
 Dijkstra’s algorithm

[figure: source s and routers R1 to R7; links labeled 1 to 6 in the order they are added by the algorithm; legend: router with attached group member, router with no attached group member, link used for forwarding (i indicates order link added by algorithm)]

Network Layer 4-68


Reverse path forwarding

 rely on router’s knowledge of unicast shortest path from it to sender
 each router has simple forwarding behavior:

if (mcast datagram received on incoming link on shortest path back to sender)
  then flood datagram onto all outgoing links
  else ignore datagram
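
The per-router check can be written as a short function. This is a sketch with invented names: `RPF_INTERFACE` stands in for whatever the unicast routing table says is the interface on the shortest path back to a given source, and the interface names are hypothetical. It also assumes "all outgoing links" excludes the link the datagram arrived on.

```python
# Hypothetical router state: for each source, the interface on this router's
# unicast shortest path back to that source.
RPF_INTERFACE = {"s": "if0"}
INTERFACES = ["if0", "if1", "if2"]


def rpf_forward(source, arrival_interface):
    """Return the interfaces to flood the datagram onto, or [] if it is ignored."""
    if RPF_INTERFACE.get(source) != arrival_interface:
        return []  # did not arrive on the shortest path back to the source
    return [i for i in INTERFACES if i != arrival_interface]


print(rpf_forward("s", "if0"))  # ['if1', 'if2']
print(rpf_forward("s", "if1"))  # []
```

No per-packet history is needed: the decision uses only the unicast routing state the router already has.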

Network Layer 4-69


Reverse path forwarding: example
[figure: source s and routers R1 to R7; legend: router with attached group member, router with no attached group member, datagram will be forwarded, datagram will not be forwarded]

 result is a source-specific reverse SPT
   may be a bad choice with asymmetric links

Network Layer 4-70


Reverse path forwarding: pruning
 forwarding tree contains subtrees with no mcast group
members
 no need to forward datagrams down subtree
 “prune” msgs sent upstream by router with no
downstream group members
[figure: source s and routers R1 to R7; prune messages (P) travel upstream; legend: router with attached group member, router with no attached group member, prune message, links with multicast forwarding]

Network Layer 4-71


Shared-tree: Steiner tree

 Steiner tree: minimum cost tree connecting all routers with attached group members
 problem is NP-complete
 excellent heuristics exist
 not used in practice:
   computational complexity
   information about entire network needed
   monolithic: rerun whenever a router needs to join/leave

Network Layer 4-72


Center-based trees
 single delivery tree shared by all
 one router identified as “center” of tree
 to join:
 edge router sends unicast join-msg addressed to center
router
 join-msg “processed” by intermediate routers and
forwarded towards center
 join-msg either hits existing tree branch for this center,
or arrives at center
 path taken by join-msg becomes new branch of tree for
this router

Network Layer 4-73


Center-based trees: example
suppose R6 chosen as center:

[figure: routers R1 to R7 with R6 as center; paths labeled 1, 2, 3 in the order in which join messages are generated; legend: router with attached group member, router with no attached group member, path order in which join messages generated]

Network Layer 4-74


Internet Multicasting Routing: DVMRP

 DVMRP: distance vector multicast routing protocol, RFC 1075
 flood and prune: reverse path forwarding, source-
based tree
 RPF tree based on DVMRP’s own routing tables
constructed by communicating DVMRP routers
 no assumptions about underlying unicast
 initial datagram to mcast group flooded everywhere
via RPF
 routers not wanting group: send upstream prune msgs

Network Layer 4-75


DVMRP: continued…
 soft state: DVMRP router periodically (1 min.) “forgets” that branches are pruned:
   mcast data again flows down unpruned branch
   downstream router: reprune or else continue to receive data
 routers can quickly regraft to tree
   following IGMP join at leaf
 odds and ends
   commonly implemented in commercial routers

Network Layer 4-76


Tunneling
Q: how to connect “islands” of multicast routers in a
“sea” of unicast routers?

[figure: physical topology versus logical topology]

 mcast datagram encapsulated inside “normal” (non-multicast-addressed) datagram
 normal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast router (recall IPv6 inside IPv4 tunneling)
 receiving mcast router unencapsulates to get mcast datagram
Network Layer 4-77
PIM: Protocol Independent Multicast
 not dependent on any specific underlying unicast
routing algorithm (works with all)
 two different multicast distribution scenarios :

dense:
 group members densely packed, in “close” proximity.
 bandwidth more plentiful

sparse:
 # networks with group members small wrt # interconnected networks
 group members “widely dispersed”
 bandwidth not plentiful

Network Layer 4-78


Consequences of sparse-dense dichotomy:
dense:
 group membership by routers assumed until routers explicitly prune
 data-driven construction of mcast tree (e.g., RPF)
 bandwidth and non-group-router processing profligate

sparse:
 no membership until routers explicitly join
 receiver-driven construction of mcast tree (e.g., center-based)
 bandwidth and non-group-router processing conservative

Network Layer 4-79


PIM- dense mode
flood-and-prune RPF: similar to DVMRP but…
 underlying unicast protocol provides RPF info
for incoming datagram
 less complicated (less efficient) downstream
flood than DVMRP reduces reliance on
underlying routing algorithm
 has protocol mechanism for router to detect it
is a leaf-node router

Network Layer 4-80


PIM - sparse mode
 center-based approach
 router sends join msg to rendezvous point (RP)
   intermediate routers update state and forward join
 after joining via RP, router can switch to source-specific tree
   increased performance: less concentration, shorter paths

[figure: routers R1 to R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]

Network Layer 4-81


PIM - sparse mode
sender(s):
 unicast data to RP, which distributes down RP-rooted tree
 RP can extend mcast tree upstream to source
 RP can send stop msg if no attached receivers
   “no one is listening!”

[figure: routers R1 to R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]



Internetworking
 What is an internetwork?
 An arbitrary collection of networks interconnected to provide some sort of host-to-
host packet delivery service

A simple internetwork where H represents hosts and R represents routers


Internetworking
 What is IP
 IP stands for Internet Protocol
 Key tool used today to build scalable, heterogeneous internetworks
 It runs on all the nodes in a collection of networks and defines the infrastructure that
allows these nodes and networks to function as a single logical internetwork

A simple internetwork showing the protocol layers


IP Service Model
 Two parts
 Global Addressing Scheme
• Provides a way to identify all hosts in the network
 Datagram (Connectionless) model for data delivery
• Best-effort delivery (unreliable service)
• packets may be lost
• packets may be delivered out of order
• duplicate copies of a packet may be delivered
• packets may be delayed for a long time
Packet Format
 Version (4 bits):
 currently 4 or 6.
 Also called IPv4 and IPv6
 Hlen (4 bits):
 number of 32-bit words in
header
 usually 5 32-bit words with no
options
 TOS (8 bits):
 type of service (not widely used)
 Length (16 bits):
 number of bytes in this
datagram including the header
 Ident (16 bits) and
Flags/Offset (16 bits):
 used by fragmentation
Packet Format
 TTL (8 bits):
 number of hops/routers this
packet can travel
• discard the looping packets
 Originally based on time, but changed to a hop count
 Each router decrements it by 1
 Discard the packet when it becomes 0
 A common default is 64
 Problems
• If set too high, a looping packet circulates for a long time before it is discarded
• If set too low, the packet may be discarded before it reaches the destination
Packet Format
 Protocol (8 bits):
 demux key (TCP=6, UDP=17)
 Checksum (16 bits):
 of the header only
 DestAddr & SrcAddr (32 bits each)
 The key for datagram delivery
 Every packet contains a full destination address
 Forwarding/routing decisions are made at each router
 The source address lets the destination identify the sender and reply to it if it wants to
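
The field layout on the Packet Format slides can be sketched in code. This is a minimal illustration, assuming a 20-byte header with no options; the function names `internet_checksum`, `parse_ipv4_header` and `build_sample_header`, and the sample field values, are hypothetical and chosen only for the example.

```python
import socket
import struct

# Fixed 20-byte IPv4 header: Version/Hlen, TOS, Length, Ident,
# Flags/Offset, TTL, Protocol, Checksum, SrcAddr, DestAddr.
HEADER_FORMAT = "!BBHHHBBH4s4s"


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum over the header bytes."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(raw: bytes) -> dict:
    """Split the first 20 bytes of a datagram into the slide's fields."""
    (ver_hlen, tos, length, ident, flags_off,
     ttl, proto, checksum, src, dst) = struct.unpack(HEADER_FORMAT, raw[:20])
    hlen_words = ver_hlen & 0x0F
    return {
        "version": ver_hlen >> 4,
        "hlen_bytes": hlen_words * 4,      # Hlen counts 32-bit words
        "tos": tos,
        "length": length,
        "ident": ident,
        "flags": flags_off >> 13,
        "offset": flags_off & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,                 # demux key, e.g. TCP=6, UDP=17
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        # Summing a header that already contains a valid checksum gives 0.
        "checksum_ok": internet_checksum(raw[:hlen_words * 4]) == 0,
    }


def build_sample_header() -> bytes:
    """Build an illustrative header (all values are made up for the demo)."""
    fields = [0x45, 0, 40, 1, 0, 64, 6, 0,
              socket.inet_aton("10.3.2.4"), socket.inet_aton("128.96.33.81")]
    fields[7] = internet_checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)


header = parse_ipv4_header(build_sample_header())
print(header["version"], header["hlen_bytes"], header["ttl"], header["checksum_ok"])
```

The checksum covers the header only, so a router that decrements TTL must recompute it before forwarding.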
IP Fragmentation and Reassembly
 Each network has some MTU (Maximum Transmission Unit)
 Ethernet (1500 bytes), FDDI (4500 bytes)
 IP packets need to fit in the payload of link-layer frame
 Solutions
• Make all packets small enough to fit every network
• Or fragment large packets into smaller ones and reassemble them later
 Strategy
 Fragmentation occurs in a router when it receives a datagram that it wants to forward over a network whose MTU is smaller than the datagram
 Reassembly is done at the receiving host
 All the fragments carry the same identifier in the Ident field
• Fragments are self-contained datagrams
IP Fragmentation and Reassembly

Suppose PPP has an MTU of 532 bytes (20 header + 512 payload)

IP datagrams traversing the sequence of physical networks


IP Fragmentation and Reassembly

Header fields used in IP fragmentation. (a) Unfragmented packet; (b) fragmented packets.
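
The fragmentation arithmetic can be sketched as follows. The 532-byte MTU comes from the slide; the 1420-byte datagram is an assumed input for illustration, and `fragment` is a hypothetical helper name. The sketch relies on the Offset field counting 8-byte units, so every fragment except the last carries a multiple of 8 payload bytes.

```python
def fragment(total_len: int, mtu: int, header_len: int = 20):
    """Return one (offset_field, payload_bytes, more_fragments) tuple per fragment.

    total_len is the size of the original datagram including its header.
    """
    payload = total_len - header_len
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    sent = 0
    while sent < payload:
        size = min(max_payload, payload - sent)
        more = sent + size < payload      # M flag: more fragments follow
        fragments.append((sent // 8, size, more))
        sent += size
    return fragments


# Assumed example: a 1420-byte datagram (20 header + 1400 payload)
# crossing the PPP link with a 532-byte MTU.
for offset, size, more in fragment(1420, 532):
    print(f"offset={offset} payload={size} more={more}")
```

Every fragment carries the same Ident value as the original datagram, which is what lets the receiving host group them for reassembly.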
Global Addresses
 IP address properties
 globally unique
 hierarchical: network + host
• Network part: identifies the network the host is attached to
• Host part: identifies a unique host on that network
• Ethernet addresses, although globally unique, are flat (no structure and thus no meaning) and cannot be used for routing
 Note that a router is attached to at least two networks, so it must have an IP address
on each port/interface
• Thus it is more precise to think of IP addresses as belonging to interfaces rather than to
hosts
Global Addresses
 Approximately 4 billion IP addresses: half are class A, 1/4 are class B, and 1/8 are class C

(a) Class A (b) Class B (c) Class C


Global Addresses
 Class A was intended for Wide Area Networks
 Thus there should be very few of them
 Class B was intended for modest-size networks (like a campus)
 Class C is for the large number of LANs
 However, these classifications are not flexible, and today’s IP addresses are normally “classless”, as we will see
 Format
 4 bytes, each byte is represented by a decimal number
 Dot notation
• 10.3.2.4
• 128.96.33.81
• 192.12.69.77
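
As a small sketch of the dot notation and the classful split, the three example addresses can be converted to 32-bit values and classified by their leading bits. The helper names `to_int` and `address_class` are hypothetical.

```python
def to_int(dotted: str) -> int:
    """Pack the four decimal bytes of a dotted address into one 32-bit value."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def address_class(dotted: str) -> str:
    """Classify by the leading bits of the first byte."""
    first = int(dotted.split(".")[0])
    if first < 128:      # leading bit 0
        return "A"
    if first < 192:      # leading bits 10
        return "B"
    if first < 224:      # leading bits 110
        return "C"
    return "other"       # not class A, B, or C


for addr in ("10.3.2.4", "128.96.33.81", "192.12.69.77"):
    print(addr, hex(to_int(addr)), address_class(addr))
```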
IP Datagram Forwarding
 Strategy
 every datagram contains destination's address
 if directly connected to destination network, then forward to host
 if not directly connected to destination network, then forward to some router
 forwarding table maps network number into next hop
 each host has a default router
 each router maintains a forwarding table
 Example (router R2)
IP Datagram Forwarding
 Algorithm
if (NetworkNum of destination = NetworkNum of one of my interfaces) then
    deliver packet to destination over that interface
else
    if (NetworkNum of destination is in my forwarding table) then
        deliver packet to NextHop router
    else
        deliver packet to default router

For a host with only one interface and only a default router in its forwarding table, this simplifies to

if (NetworkNum of destination = my NetworkNum) then
    deliver packet to destination directly
else
    deliver packet to default router
Subnetting
 The network number part was designed to uniquely identify exactly one
physical network
 However, this approach has some problems
 A network with only 2 hosts has to have at least a class C network!!
 A network with only 256 hosts has to have at least a class B network!!
 Thus, we will waste our valuable IP address space
 Solution
• Subnetting
Subnetting
 Key Idea
 Allocate a single network number and use it for several physical networks
• called subnets
 Several things need to be done
• Subnets need to be physically close to each other
– From the Internet point of view, they all look like ONE network
– A perfect situation for subnetting is a large campus or corporation
• Configure all nodes on each subnet with a subnet mask
– It masks the network part
– Introduces the subnet number
– All nodes on the same subnet have the same subnet number and the same mask

 The IP address of a node ANDed with the subnet mask gives the subnet number
 IP AND subnet mask = subnet number
Subnetting

Increases the number of networks and reduces the number of hosts
Subnetting
 When a host wants to send a packet to a certain IP address
 First, it does the bitwise AND between its own subnet mask and the destination IP address
 If the result equals the subnet number of the sender, then the destination host is on the same subnet, so the packet can be delivered directly (without a router)
 Else, the packet is sent to a router to be forwarded to another subnet
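
The host-side check can be sketched with Python's standard `ipaddress` module. The addresses and mask below are assumed values for illustration only, and `subnet_number` and `is_local` are hypothetical helper names.

```python
import ipaddress


def subnet_number(ip: str, mask: str) -> str:
    """IP AND subnet mask = subnet number."""
    value = int(ipaddress.IPv4Address(ip)) & int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(value))


def is_local(my_ip: str, my_mask: str, dest_ip: str) -> bool:
    """True if the destination falls in the sender's own subnet."""
    return subnet_number(dest_ip, my_mask) == subnet_number(my_ip, my_mask)


# Assumed host configuration for the demo.
MY_IP, MY_MASK = "128.96.34.15", "255.255.255.128"

for dest in ("128.96.34.100", "128.96.34.139"):
    where = "deliver directly" if is_local(MY_IP, MY_MASK, dest) else "send to router"
    print(dest, subnet_number(dest, MY_MASK), where)
```

Note that the sender applies its own mask to the destination; it does not need to know the mask used on the destination's subnet.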
Subnetting

 Forwarding Table at Router R1


Subnetting
Forwarding Algorithm
D = destination IP address
for each entry <SubnetNum, SubnetMask, NextHop>
    D1 = SubnetMask & D
    if D1 = SubnetNum
        if NextHop is an interface
            deliver datagram directly to destination
        else
            deliver datagram to NextHop (a router)
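
A runnable sketch of the loop above, under stated assumptions: the table entries are illustrative values rather than the contents of the slide's figure, and the fall-through to a default router is an addition borrowed from the earlier classful algorithm. The names `TABLE`, `ip_to_int` and `forward` are hypothetical.

```python
import ipaddress


def ip_to_int(dotted: str) -> int:
    return int(ipaddress.IPv4Address(dotted))


# Each entry is <SubnetNum, SubnetMask, NextHop>; values are assumed.
TABLE = [
    ("128.96.34.0", "255.255.255.128", "interface 0"),
    ("128.96.34.128", "255.255.255.128", "interface 1"),
    ("128.96.33.0", "255.255.255.0", "R2"),
]


def forward(dest: str, table=TABLE, default: str = "default router") -> str:
    """Return the next hop for dest: an interface name or a router name."""
    d = ip_to_int(dest)
    for subnet_num, subnet_mask, next_hop in table:
        d1 = d & ip_to_int(subnet_mask)      # D1 = SubnetMask & D
        if d1 == ip_to_int(subnet_num):
            return next_hop
    return default                            # assumption: no match, use default


for dest in ("128.96.34.15", "128.96.34.130", "128.96.33.81", "10.3.2.4"):
    print(dest, "->", forward(dest))
```

A next hop that names an interface means the destination is on a directly attached subnet; a router name means the datagram is handed to that router.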
Different Protocols
 ARP (Address Resolution Protocol)
 DHCP (Dynamic Host Configuration Protocol)
 ICMP (Internet Control Message Protocol)
Address Resolution Protocol (ARP)
 Map IP addresses into physical addresses
 ARP (Address Resolution Protocol)
 table of IP to physical address bindings
 A node broadcasts a request (who-has / tell) if the required IP address is not in its ARP table
• Ex., who-has 192.168.0.29 tell 192.168.0.1
 target machine (with IP 192.168.0.29 in the example) responds with its physical address (its MAC)
Host IP Configurations
 Most host Operating Systems provide a way to manually configure the IP
information for the host
 Drawbacks of manual configuration
 A lot of work to configure all the hosts in a large network
 Configuration process is error-prone
 Automated Configuration Process is required
 Using the DHCP protocol
Internet Control Message Protocol (ICMP)
 Defines a collection of error messages that are sent back to the source
host whenever a router or host is unable to process an IP datagram
successfully
 Destination host unreachable due to link /node failure
 Reassembly process failed
 TTL had reached 0 (so datagrams don't cycle forever)
 IP header checksum failed

 ICMP-Redirect
 Sent from a router to a source host
 Carries better route information
ARP: address resolution protocol
Question: how to determine interface’s MAC address, knowing its IP address?
ARP table: each IP node (host, router) on LAN has table
 IP/MAC address mappings for some LAN nodes: < IP address; MAC address; TTL>
 TTL (Time To Live): time after which address mapping will be forgotten (typically 20 min)
[Figure: LAN with four nodes; IP addresses 137.196.7.78, 137.196.7.23, 137.196.7.14, 137.196.7.88; MAC addresses 1A-2F-BB-76-09-AD, 71-65-F7-2B-08-53, 58-23-D7-FA-20-B0, 0C-C4-11-6F-E3-98]

Link Layer 5-107


ARP protocol: same LAN
 A wants to send datagram to B
 B’s MAC address not in A’s ARP table.
 A broadcasts ARP query packet, containing B's IP address
 dest MAC address = FF-FF-FF-FF-FF-FF
 all nodes on LAN receive ARP query
 B receives ARP packet, replies to A with its (B's) MAC address
 frame sent to A’s MAC address (unicast)
 A caches (saves) IP-to-MAC address pair in its ARP table until information becomes old (times out)
 soft state: information that times out (goes away) unless refreshed
 ARP is “plug-and-play”:
 nodes create their ARP tables without intervention from net administrator
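
The soft-state behaviour of the ARP table can be sketched as a small cache whose entries expire. This is an illustration only: the class name `ArpCache`, its methods, and the injectable clock are hypothetical, and no real ARP packets are sent. A lookup that returns `None` stands for "broadcast an ARP query".

```python
import time


class ArpCache:
    """Maps IP address -> (MAC address, expiry time); entries time out."""

    def __init__(self, ttl_seconds: float = 20 * 60, clock=time.monotonic):
        self.ttl = ttl_seconds          # 20 minutes, the typical value on the slide
        self.clock = clock              # injectable so expiry can be demonstrated
        self.entries = {}

    def learn(self, ip: str, mac: str) -> None:
        """Cache (or refresh) a mapping, e.g. after receiving an ARP reply."""
        self.entries[ip] = (mac, self.clock() + self.ttl)

    def lookup(self, ip: str):
        """Return the cached MAC, or None if unknown or timed out."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, expires = entry
        if self.clock() >= expires:
            del self.entries[ip]        # soft state: forgotten unless refreshed
            return None
        return mac


# Demo with a fake clock so the timeout can be observed immediately.
now = [0.0]
cache = ArpCache(clock=lambda: now[0])
cache.learn("222.222.222.222", "49-BD-D2-C7-56-2A")
print(cache.lookup("222.222.222.222"))   # cached entry
now[0] += 20 * 60
print(cache.lookup("222.222.222.222"))   # expired, a new query would be needed
```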
Addressing: routing to another LAN
walkthrough: send datagram from A to B via R
 focus on addressing – at IP (datagram) and MAC layer (frame)
 assume A knows B’s IP address
 assume A knows IP address of first hop router, R (how?)
 assume A knows R’s MAC address (how?)

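[Figure: hosts A and B on two LANs joined by router R]
A: 111.111.111.111, 74-29-9C-E8-FF-55
R (interface on A's LAN): 111.111.111.110, E6-E9-00-17-BB-4B
R (interface on B's LAN): 222.222.222.220, 1A-23-F9-CD-06-9B
B: 222.222.222.222, 49-BD-D2-C7-56-2A
other hosts: 111.111.111.112, CC-49-DE-D0-AB-7D; 222.222.222.221, 88-B2-2F-54-1A-0F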



Addressing: routing to another LAN
 A creates IP datagram with IP source A, destination B
 A creates link-layer frame with R's MAC address as dest, frame
contains A-to-B IP datagram
MAC src: 74-29-9C-E8-FF-55
MAC dest: E6-E9-00-17-BB-4B
IP src: 111.111.111.111
IP dest: 222.222.222.222



Addressing: routing to another LAN
 frame sent from A to R
 frame received at R, datagram removed, passed up to IP

frame from A to R:
MAC src: 74-29-9C-E8-FF-55
MAC dest: E6-E9-00-17-BB-4B
IP src: 111.111.111.111
IP dest: 222.222.222.222
datagram passed up to IP at R:
IP src: 111.111.111.111
IP dest: 222.222.222.222



Addressing: routing to another LAN
 R forwards datagram with IP source A, destination B
 R creates link-layer frame with B's MAC address as dest, frame
contains A-to-B IP datagram

MAC src: 1A-23-F9-CD-06-9B


MAC dest: 49-BD-D2-C7-56-2A
IP src: 111.111.111.111
IP dest: 222.222.222.222
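
The walkthrough can be condensed into a short sketch: the IP source and destination stay fixed end to end, while the MAC source and destination are rewritten at each hop. The addresses are the ones shown on the slides; the `Frame` type and `router_forward` function are hypothetical names for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    mac_src: str
    mac_dst: str
    ip_src: str
    ip_dst: str


A_MAC = "74-29-9C-E8-FF-55"
R_LAN1_MAC = "E6-E9-00-17-BB-4B"   # R's interface on A's LAN
R_LAN2_MAC = "1A-23-F9-CD-06-9B"   # R's interface on B's LAN
B_MAC = "49-BD-D2-C7-56-2A"


def router_forward(frame: Frame, out_mac: str, next_hop_mac: str) -> Frame:
    """Re-frame the same datagram for the next link; IP fields are untouched."""
    return replace(frame, mac_src=out_mac, mac_dst=next_hop_mac)


# Hop 1: A addresses the frame to R's MAC, the datagram to B's IP.
hop1 = Frame(A_MAC, R_LAN1_MAC, "111.111.111.111", "222.222.222.222")
# Hop 2: R forwards out its other interface, addressing the frame to B's MAC.
hop2 = router_forward(hop1, R_LAN2_MAC, B_MAC)

print(hop1)
print(hop2)
```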


The Global Internet

The tree structure of the Internet in 1990


The Global Internet

A simple multi-provider Internet


Next Generation IP (IPv6)
Major Features
 128-bit addresses
 Multicast
 Real-time service
 Authentication and security
 Auto-configuration
 End-to-end fragmentation
 Enhanced routing functionality, including support
for mobile hosts
IPv6 Addresses
 Classless addressing/routing (similar to CIDR)
 Notation: x:x:x:x:x:x:x:x (x = 16-bit hex number)
 contiguous 0s are compressed: 47CD::A456:0124
 IPv4-compatible IPv6 address: ::128.42.1.87
 Address assignment
 provider-based
 geographic
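
The notation rules can be checked with Python's standard `ipaddress` module. This is a small sketch using the two example addresses from the slide; the variable names are arbitrary.

```python
import ipaddress

# Zero compression: "::" stands for one run of all-zero 16-bit groups.
addr = ipaddress.IPv6Address("47CD::A456:0124")
exploded = addr.exploded        # all eight groups written out
compressed = addr.compressed    # shortest form, zeros compressed again

# An IPv4 address written in dotted form occupies the low 32 bits.
compat = ipaddress.IPv6Address("::128.42.1.87")
embedded_v4 = ipaddress.IPv4Address(int(compat) & 0xFFFFFFFF)

print(exploded)
print(compressed)
print(embedded_v4)
```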
IPv6 Header
 40-byte “base” header
 Extension headers (fixed order, mostly fixed
length)
 fragmentation
 source routing
 authentication and
security
 other options
