You are on page 1of 10

Hot Potatoes Heat Up BGP

Routing
Renata Teixeira
(UC San Diego)
http://www-cse.ucsd.edu/~teixeira

with
Aman Shaikh (AT&T), Tim Griffin(Intel), and
Jennifer Rexford(AT&T)
30th NANOG Miami, Florida

BGP Routing inside an AS


Most people think
BGP = inter-domain routing
Routing inside an AS = OSPF/IS-IS


But in large transit ASes




Most traffic is routed using BGP


IGP changes affect BGP decisions

30th NANOG

Hot-Potato Routing
dst
New York

San Francisco
ISP network
10

9
Dallas
Hot-potato routing = route to closest exit point
when there is more than one
route to destination

30th NANOG

Hot-Potato Routing
dst
New York

San Francisco

failure
planned maintenance 11
traffic engineering

ISP network

Consequences:
Transient forwarding instability
Traffic shift
Inter-domain routing changes


10

9
Dallas

Routes to thousands
of destinations switch
exit point!!!

30th NANOG

BGP Decision Process


equally good

Ignore if exit point unreachable


Highest local preference
Lowest AS path length
Lowest origin type
Lowest MED (with same next hop AS)
Lowest IGP cost to next hop
Lowest router ID of BGP speaker
5

30th NANOG

Why Care about


Hot Potatoes?
Large number of routes potentially affected


All routes from peers and multi-homed customers




Understanding of impact in real networks




How often hot-potato changes happen in a real


network and how many prefixes do they affect?
What are the convergence delays?


Avoiding routing instability




Operators: avoid hot-potato changes


Vendors: reduce impact of hot-potato changes

30th NANOG

Measuring Hot-Potato
Routing
Collect measurement of both protocols


BGP monitor and OSPF monitor




OSPF LSAs
AT&T backbone
AS 7018

BGP updates
from every PoP

Pre-process each stream




Clustering related messages




Correlate the two streams of data




Match BGP updates with OSPF events


7

30th NANOG

Heuristic for Matching


Stream of OSPF messages
Transform stream of OSPF
messages into routing
changes

refresh

link failure

weight change

chg cost

chg cost

del
Match BGP updates
with OSPF events that
happen close in time

time

Classify BGP updates


by possible OSPF causes
Stream of BGP updates
30th NANOG

Impact of OSPF/BGP
Interaction (June 2003)
High variability according to location and day


Impact on external BGP measurements and customers




location

min

% updates
max

days > 10% of


hot-potato changes

rich peering

0%

5.3%

no peering

0%

46.5%

One LSA can have a big impact




location

LSAs with no impact prefixes impacted by an LSA

rich peering

95.1%

less than 1%

no peering

94.8%

59%
9

30th NANOG

Delay for BGP Routing Change


Router rerunning BGP decision process


Implementation specific


Timer driven (tunable parameter)


Event driven


Internal BGP route propagation delay




Internal BGP hierarchy (route reflectors)




Wait for another router to change best route

Transmitting many BGP messages




Latency for transferring the data

Delays very sensitive to router


implementation decisions and network design!
30th NANOG

10

Delay for First BGP Update

Fraction of
Hot-Potato Changes

Uniform distribution: 60 sec


timer to revisit BGP decision

30th NANOG

Two runs of timer?

60-second timer

Routers in backbone (June)


Routers in backbone (September)
Router in lab
time BGP update time LSA (seconds)

11

Cumulative Number of
Hot-Potato Changes

Transferring Multiple Prefixes

Worst case scenario:


120 sec to revisit BGP decision
80 sec to send multiple updates
Last prefix may take 3 minutes to converge!
81 seconds delay

30th NANOG

time BGP update time LSA (seconds)

12

Implications on Packet
Forwarding
Forwarding plane convergence
FIB update only when both IGP and BGP
converge


Measuring the performance impact




Do we need new measurement techniques?

13

30th NANOG

Forwarding Plane
Convergence
1 - Scan process runs in R2
R1

10

2 - R2 starts using R1 to reach dst

R2
10
X

100

111

3 - R1s scan process can


take up to 60 seconds to run

dst
R1

Packets to dst may


be caught in a loop
for 60 seconds!
30th NANOG

10

R2
111

100

dst
14

Detecting Loops is Hard


Active probing of forwarding path
Probing machines just have one exit point


customer traffic in loop


R1
W1

10

R2
111

100

W2

Operator probes
dst
15

30th NANOG

Avoiding Unnecessary
Hot-Potato Changes
Router connectivity to exit points


Likelihood of hot-potato routing changes

Cost in/out of links during maintenance

30th NANOG

16

Comparison of Network
Designs
dst

NY

dst
SF
NY

SF
10

1000

dst

10

Dallas
10

SF

Dallas

NY

10
10
10
Dallas

Small changes will make Dallas


switch exit points to dst

More robust to intra-domain


routing changes
17

30th NANOG

Careful Cost in/out Links

SF

5
100

5
10
4

NY
10

10
dst

Dallas
Traffic is more predictable
Faster convergence
Less impact on neighbors

30th NANOG

18

Conclusions
Hot potatoes can affect a lot of prefixes and
have high convergence delays
Vendors:


Event-driven implementations
Understand network-wide effect


Operators:
Tuning timer
Network design
Maintenance practices


19

30th NANOG

Future Work
Impact of the IGP-triggered BGP updates


Sudden shifts in the flow of traffic


Forwarding loops during convergence
Externally visible BGP routing changes

30th NANOG

20

10

You might also like